checkSeparation: Check for separation of count distributions by variables

View source: R/checkSeparation.R

checkSeparationR Documentation

Check for separation of count distributions by variables

Description

Check the count distributions for each nhood according to a test variable of interest. This is important for checking if there is separation in the GLMM to inform either nhood subsetting or re-computation of the NN-graph and refined nhoods.

Arguments

x

Milo object with a non-empty nhoodCounts slot.

design.df

A data.frame containing meta-data in which condition is a column variable. The rownames must be the same as, or a subset of, the colnames of nhoodCounts(x).

condition

A character scalar of the test variable contained in design.df. This should be a factor variable if it is numeric or character it will be cast to a factor variable.

min.val

A numeric scalar that sets the minimum number of counts across condition level samples, below which separation is defined.

factor.check

A logical scalar that sets the factor variable level checking. See details for more information.

Details

This function checks across nhoods for separation based on the separate levels of an input factor variable. It checks if condition is a factor variable, and if not it will cast it to a factor. Note that the function first checks for the number of unique values - if this exceeds > 50 error is generated. Users can override this behaviour with factor.check=FALSE.

Value

A logical vector of the same length as ncol(nhoodCounts(x)) where TRUE values represent nhoods where separation is detected. The output of this function can be used to subset nhood-based analyses e.g. testNhoods(..., subset.nhoods=checkSepartion(x, ...)).

Author(s)

Mike Morgan

Examples

library(SingleCellExperiment)
ux.1 <- matrix(rpois(12000, 5), ncol=400)
ux.2 <- matrix(rpois(12000, 4), ncol=400)
ux <- rbind(ux.1, ux.2)
vx <- log2(ux + 1)
pca <- prcomp(t(vx))

sce <- SingleCellExperiment(assays=list(counts=ux, logcounts=vx),
                            reducedDims=SimpleList(PCA=pca$x))

milo <- Milo(sce)
milo <- buildGraph(milo, k=20, d=10, transposed=TRUE)
milo <- makeNhoods(milo, k=20, d=10, prop=0.3)
milo <- calcNhoodDistance(milo, d=10)

cond <- rep("A", ncol(milo))
cond.a <- sample(1:ncol(milo), size=floor(ncol(milo)*0.25))
cond.b <- setdiff(1:ncol(milo), cond.a)
cond[cond.b] <- "B"
meta.df <- data.frame(Condition=cond, Replicate=c(rep("R1", 132), rep("R2", 132), rep("R3", 136)))
meta.df$SampID <- paste(meta.df$Condition, meta.df$Replicate, sep="_")
milo <- countCells(milo, meta.data=meta.df, samples="SampID")

test.meta <- data.frame("Condition"=c(rep("A", 3), rep("B", 3)), "Replicate"=rep(c("R1", "R2", "R3"), 2))
test.meta$Sample <- paste(test.meta$Condition, test.meta$Replicate, sep="_")
rownames(test.meta) <- test.meta$Sample

check.sep <- checkSeparation(milo, design.df=test.meta, condition='Condition')
sum(check.sep)


MikeDMorgan/miloR documentation built on Oct. 19, 2024, 8:39 p.m.