View source: R/predict_occupancy.R
predict_TOP | R Documentation |
Predicts quantitative TF occupancy or TF binding probability using a TOP model trained on ChIP-seq read counts or binary labels.
predict_TOP(
  data,
  TOP_coef,
  tf_name,
  cell_type,
  use_model = c("ATAC", "DukeDNase", "UwDNase"),
  level = c("best", "bottom", "middle", "top"),
  logistic_model = FALSE,
  transform = c("asinh", "log2", "log", "none")
)
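The vector-valued defaults in the usage above (`use_model`, `level`, `transform`) follow a common R idiom in which the function resolves the argument with `match.arg()`, so that omitting the argument selects the first choice. Whether `predict_TOP` does exactly this internally is an assumption. The sketch below uses a hypothetical function, `pick_level`, only to illustrate the idiom.

```r
# Hypothetical helper illustrating the match.arg() idiom; not part of TOP.
pick_level <- function(level = c("best", "bottom", "middle", "top")) {
  # With the argument missing, match.arg() returns the first choice.
  # A supplied value must match one of the listed choices.
  match.arg(level)
}

default_level <- pick_level()          # argument omitted
middle_level  <- pick_level("middle")  # explicit choice

# A value outside the choice list raises an error; capture it as NA here.
invalid_level <- tryCatch(pick_level("lowest"),
                          error = function(e) NA_character_)
```

Under this idiom, leaving `level` unset would be equivalent to passing `level = "best"`.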
data |
A data frame containing motif PWM scores and DNase (or ATAC) bins. |
TOP_coef |
A list containing the posterior mean of TOP regression coefficients. |
tf_name |
TF name to make predictions for.
It will find the model parameters trained for this TF.
This is not needed (not used) when |
cell_type |
Cell type to make predictions for.
It will find the model parameters trained for this cell type.
This is not needed (not used) when |
use_model |
Uses pretrained model if |
level |
TOP model level to use.
Options: ‘best’, ‘bottom’, ‘middle’, or ‘top’.
When |
logistic_model |
Logical. Whether to use the logistic version of the TOP model.
If |
transform |
Type of transformation applied to ChIP-seq read counts
when preparing the input training data.
Options are: ‘asinh’ (asinh transformation),
‘log2’ (log2 transformation),
‘log’ (log transformation),
and ‘none’ (no transformation).
This only applies when |
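As a rough illustration of what the `transform` choices listed in the usage signature mean for read counts, the sketch below applies each transformation and its inverse in base R. The helper names `transform_counts` and `inverse_transform` are hypothetical, and the `+ 1` pseudocount in the log variants is an assumption made here so that zero counts stay finite; the package may handle this differently.

```r
# Hypothetical helpers, not TOP functions. The '+ 1' pseudocount in the
# log variants is an assumption so that zero counts remain finite.
transform_counts <- function(x, transform = c("asinh", "log2", "log", "none")) {
  transform <- match.arg(transform)
  switch(transform,
         asinh = asinh(x),
         log2  = log2(x + 1),
         log   = log(x + 1),
         none  = x)
}

# Maps values on the transformed scale back to the read-count scale.
inverse_transform <- function(y, transform = c("asinh", "log2", "log", "none")) {
  transform <- match.arg(transform)
  switch(transform,
         asinh = sinh(y),
         log2  = 2^y - 1,
         log   = exp(y) - 1,
         none  = y)
}

counts <- c(0, 1, 10, 250)

# Round trip for every option: transform, then invert.
# Each column of 'roundtrip' should reproduce 'counts'.
roundtrip <- sapply(c("asinh", "log2", "log", "none"), function(tr) {
  inverse_transform(transform_counts(counts, tr), tr)
})
```

A model trained on transformed counts produces predictions on the transformed scale, so the option given at prediction time should match the one used in training.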
Returns a list with the following elements:
model |
TOP model name. |
level |
selected hierarchy level. |
coef |
posterior mean of regression coefficients. |
predictions |
a data frame with the data and predicted values. |
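The returned list can be inspected like any other R list. The sketch below builds a mock object with the four documented element names to show typical access patterns. All values are made up, and the column names inside `predictions` (`pwm`, `bin1`, `predicted`) are hypothetical; check the actual output of `predict_TOP` for the real names.

```r
# Mock object mirroring the documented element names. All values, and the
# column names inside 'predictions', are made up for illustration.
result <- list(
  model = "mock_model",
  level = "bottom",
  coef = c(intercept = -1.2, pwm = 0.8, bin1 = 0.5),
  predictions = data.frame(
    pwm = c(10.1, 12.4, 9.7),
    bin1 = c(3, 15, 0),
    predicted = c(0.4, 2.9, 0.1)
  )
)

# Report which model and hierarchy level were used.
used <- paste(result$model, result$level, sep = " / ")

# Rank candidate sites by predicted value, highest first.
ranked <- result$predictions[order(result$predictions$predicted,
                                   decreasing = TRUE), ]
```

Checking `result$level` is useful after a call with `level = 'best'`, since it records which hierarchy level was actually selected.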
## Not run:
# Predicts CTCF occupancy in K562 using the quantitative occupancy model:

# Predicts using the 'bottom' level model
result <- predict_TOP(data, TOP_coef,
                      tf_name = 'CTCF', cell_type = 'K562',
                      level = 'bottom',
                      logistic_model = FALSE,
                      transform = 'asinh')

# Predicts using the 'best' model
# Since CTCF in the K562 cell type is included in training,
# the 'best' model is the 'bottom' level model.
result <- predict_TOP(data, TOP_coef,
                      tf_name = 'CTCF', cell_type = 'K562',
                      level = 'best',
                      logistic_model = FALSE,
                      transform = 'asinh')

# We can use the 'middle' model to predict CTCF in K562
# or in other cell types or conditions
result <- predict_TOP(data, TOP_coef,
                      tf_name = 'CTCF', level = 'middle',
                      logistic_model = FALSE,
                      transform = 'asinh')

# Predicts CTCF binding probability using the logistic version of the model:
# No need to set the 'transform' argument for the logistic model.

# Predicts using the 'best' model
# (the 'bottom' level model here, since CTCF in K562 is included in training)
result <- predict_TOP(data, TOP_coef,
                      tf_name = 'CTCF', cell_type = 'K562',
                      level = 'best',
                      logistic_model = TRUE)

# Predicts using the 'middle' level model
result <- predict_TOP(data, TOP_coef,
                      tf_name = 'CTCF', level = 'middle',
                      logistic_model = TRUE)

# If TOP_coef is not specified, predict_TOP automatically uses the
# pretrained models included in the package.

# Predicts using the pretrained ATAC quantitative occupancy model
result <- predict_TOP(data,
                      tf_name = 'CTCF', cell_type = 'K562',
                      use_model = 'ATAC', level = 'best',
                      logistic_model = FALSE,
                      transform = 'asinh')

# Predicts using the pretrained ATAC logistic model
result <- predict_TOP(data,
                      tf_name = 'CTCF', cell_type = 'K562',
                      use_model = 'ATAC', level = 'best',
                      logistic_model = TRUE)
## End(Not run)