melissa clusters and imputes single cells based on their methylome landscape on specific genomic regions, e.g. promoters, using the Variational Bayes (VB) EM-like algorithm.
X |
The input data, which has to be a list of
length N, where N is the total number of cells. Each element in the list
is another list of length M, where M is the total number of genomic
regions, e.g. promoters. Each element in the inner list is an |
K |
Integer denoting the total number of clusters K. |
basis |
A 'basis' object. E.g. see the create_basis function from the BPRMeth package. If NULL, an RBF object with 3 basis functions will be created. |
delta_0 |
Parameter vector of the Dirichlet prior on the mixing proportions pi. |
w |
Optional, an M x D x K array of the initial parameters, where the first dimension is the genomic regions M, the 2nd the number of covariates D (i.e. basis functions), and the 3rd the clusters K. If NULL, default values will be assigned. |
alpha_0 |
Hyperparameter: shape parameter for Gamma distribution. A Gamma distribution is used as prior for the precision parameter tau. |
beta_0 |
Hyperparameter: rate parameter for Gamma distribution. A Gamma distribution is used as prior for the precision parameter tau. |
vb_max_iter |
Integer denoting the maximum number of VB iterations. |
epsilon_conv |
Numeric denoting the convergence threshold for VB. |
is_kmeans |
Logical, use Kmeans for initialization of model parameters. |
vb_init_nstart |
Number of VB random starts for finding better initialization. |
vb_init_max_iter |
Maximum number of mini-VB iterations. |
is_parallel |
Logical, indicating if code should be run in parallel. |
no_cores |
Number of cores to be used, default is max_no_cores - 1. |
is_verbose |
Logical, print results during VB iterations. |
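The shapes that the arguments above describe can be sketched in base R. This is only an illustration under stated assumptions: the sizes and the hyperparameter values below are hypothetical and are not the package defaults.

```r
# Toy sizes (hypothetical, for illustration only)
K <- 2   # number of clusters
M <- 5   # number of genomic regions
D <- 4   # number of covariates (basis functions); assumed value

# Dirichlet prior on the mixing proportions pi: one entry per cluster
delta_0 <- rep(0.5, K)

# Initial parameters: genomic regions x covariates x clusters
w <- array(0, dim = c(M, D, K))

# Gamma(shape, rate) prior on the precision tau; its mean is shape / rate
alpha_0 <- 0.5
beta_0 <- 10
prior_mean_tau <- alpha_0 / beta_0

dim(w)
```

Objects built this way could then be passed as delta_0, w, alpha_0 and beta_0, provided K and the basis object agree with these dimensions.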
An object of class melissa with the following elements:
W: An (M+1) X K matrix with the optimized parameter values for each cluster, where M is the number of basis functions. Each column of the matrix corresponds to a different cluster k.
W_Sigma: A list with the covariance matrices of the posterior parameter W for each cluster k.
r_nk: An (N X K) responsibility matrix of each observation being explained by a specific cluster.
delta: Optimized Dirichlet parameter for the mixing proportions.
alpha: Optimized shape parameter of the Gamma distribution.
beta: Optimized rate parameter of the Gamma distribution.
basis: The basis object.
lb: The lower bound vector.
labels: Cluster assignment labels.
pi_k: Expected value of mixing proportions.
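As a rough illustration of how some of these elements relate to each other, the base R sketch below uses made-up values for r_nk and delta. It assumes that hard labels follow the largest responsibility and relies on the standard mean of a Dirichlet distribution; it is not necessarily how the package computes labels and pi_k internally.

```r
# Toy responsibilities for N = 4 cells and K = 2 clusters (made-up values)
r_nk <- matrix(c(0.9, 0.1,
                 0.2, 0.8,
                 0.7, 0.3,
                 0.4, 0.6), ncol = 2, byrow = TRUE)

# Hard cluster assignment: the cluster with the largest responsibility
labels <- apply(r_nk, 1, which.max)

# Toy optimized Dirichlet parameters (made-up values)
delta <- c(3, 1)

# Mean of a Dirichlet distribution: each parameter divided by their sum
pi_k <- delta / sum(delta)

print(labels)
print(pi_k)
```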
The modelling and mathematical details for clustering profiles using mean-field variational inference are explained here: http://rpubs.com/cakapourani/ . More specifically:
For Binomial/Bernoulli observation model check: http://rpubs.com/cakapourani/vb-mixture-bpr
For Gaussian observation model check: http://rpubs.com/cakapourani/vb-mixture-lr
C.A.Kapourani C.A.Kapourani@ed.ac.uk
create_melissa_data_obj, partition_dataset, plot_melissa_profiles, impute_test_met, impute_met_files, filter_regions
# Example of running Melissa on synthetic data
# Create RBF basis object with 4 RBFs
basis_obj <- BPRMeth::create_rbf_object(M = 4)
set.seed(15)
# Run Melissa
melissa_obj <- melissa(X = melissa_synth_dt$met, K = 2, basis = basis_obj,
                       vb_max_iter = 10, vb_init_nstart = 1, vb_init_max_iter = 5,
                       is_parallel = FALSE, is_verbose = FALSE)
# Extract mixing proportions
print(melissa_obj$pi_k)