fusedLasso | R Documentation |
Fits generalized fused lasso with either binomial(link="logit")
or Gaussian likelihood, leveraging functions from the
smurf
package.
fusedLasso( sce, formula, model = c("binomial", "gaussian"), genecluster, niter = 1, pen.weights, lambda = "cv1se.dev", k = 5, adj.matrix, lambda.length = 25L, se.rule.nct = 8, se.rule.mult = 0.5, ... )
sce |
A SingleCellExperiment containing assays ( |
formula |
A |
model |
Either |
genecluster |
which gene cluster to run the fused lasso on.
Usually one first identifies an interesting gene cluster pattern by
|
niter |
number of iterations to run; recommended to run 5 times if allelic ratio differences across cell types are within [0.05, 0.1] |
pen.weights |
argument as described in |
lambda |
argument as described in |
k |
number of cross-validation folds |
adj.matrix |
argument as described in |
lambda.length |
argument as described in |
se.rule.nct |
the number of cell types to trigger a different SE-based rule than 1 SE
(to prioritize larger models, less fusing,
good for detecting smaller, e.g. 0.05, allelic ratio differences).
When the number of cell types is less than or equal to this value,
|
se.rule.mult |
the multiplier of the SE in determining the lambda:
the chosen lambda is within |
... |
additional arguments passed to |
Usually, we used a Generalized Fused Lasso penalty for the cell states in order to regularize all possible coefficient differences. Another possibility would be to use the Graph-Guided Fused Lasso penalty to only regularize the differences of coefficients of neighboring cell states.
When using a Graph-Guided Fused Lasso penalty, the adjacency matrix corresponding to the graph needs to be provided. The elements of this matrix are zero when two levels are not connected, and one when they are adjacent.
See the package vignette for more details and a complete description of a use case.
A SummarizedExperiment with attached metadata and colData:
a matrix grouping factor partition
and the penalized parameter lambda
are returned in metadata "partition"
and "lambda"
.
Partition and logistic group allelic estimates are stored in
colData: "part"
and "coef"
.
This function leverages the glmsmurf function from the smurf package. For more details see the following manuscript:
Devriendt S, Antonio K, Reynkens T, et al. Sparse regression with multi-type regularized feature modeling[J]. Insurance: Mathematics and Economics, 2021, 96: 248-261.
glmsmurf
,
glmsmurf.control
,
p
, glm
library(S4Vectors) library(smurf) sce <- makeSimulatedData() sce <- preprocess(sce) sce <- geneCluster(sce, G = seq_len(4)) f <- ratio ~ p(x, pen = "gflasso") # formula for the GFL sce_sub <- fusedLasso(sce, formula = f, model = "binomial", genecluster = 1, ncores = 1) metadata(sce_sub)$partition metadata(sce_sub)$lambda # can add covariates or `gene` to the formula f2 <- ratio ~ p(x, pen = "gflasso") + gene sce_sub <- fusedLasso(sce[1:5,], formula = f2, model = "binomial", genecluster = 1, ncores = 1) # Suppose we have 4 cell states, if we only want neibouring cell states # to be grouped together with other cell states. Note here the names of # the cell states should be given as row and column names. nct <- nlevels(sce$x) adjmatrix <- makeOffByOneAdjMat(nct) colnames(adjmatrix) <- rownames(adjmatrix) <- levels(sce$x) f <- ratio ~ p(x, pen = "ggflasso") # use graph-guided fused lasso sce_sub <- fusedLasso(sce, formula = f, model = "binomial", genecluster = 1, lambda = 0.5, ncores = 1, adj.matrix = list(x = adjmatrix) ) metadata(sce_sub)$partition
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.