singlemodel: Calculate potential fits for a single sample
In ACE: Absolute Copy Number Estimation from Low-coverage Whole Genome Sequencing


singlemodel performs the basic fitting algorithm of ACE on a single sample. Input can be either a template or a QDNAseq-object with the index of the sample specified. Returns a list with input parameters (ploidy, standard, method, and penalty) and model characteristics (the calculated minima, the relative errors corresponding to those minima, and the errors calculated at every cellularity). It also returns the plot associated with the error list. Each minimum is a cellularity at which the error curve reaches a local minimum, as can be seen in the plot.

singlemodel(template, QDNAseqobjectsample = FALSE, ploidy = 2, 
            standard, method = 'RMSE', exclude = c(), 
            penalty = 0, highlightminima = TRUE)

`template`	Object. Either a data frame as created by `objectsampletotemplate`, or a QDNAseq-object
`QDNAseqobjectsample`	Integer. Specifies which sample to analyze from the QDNAseq-object. Required when using a QDNAseq-object as template. Default = FALSE
`ploidy`	Integer. Calculate fits assuming the median of segments has this absolute copy number. Default = 2
`standard`	Numeric. Force the given ploidy to represent this raw value. When omitted, the standard will be calculated from the data
`method`	Character string specifying which error method to use. Can be "RMSE", "SMRE", or "MAE". For more documentation, consult the vignette. Default = "RMSE"
`exclude`	Integer or character vector. Specifies which chromosomes to exclude for model fitting
`penalty`	Numeric value. Penalizes fits at lower cellularities. Suggested values between 0 and 1. Default = 0 (no penalty)
`highlightminima`	Logical. Minima are highlighted in the errorplot by a red color. Default = TRUE

All ACE fitting algorithms work by calculating "expected values" of integer copies given a certain cellularity. They calculate these expected values for 1-12 copies at cellularities 0.05-1 (in increments of 0.01). This has two consequences. First, fits at cellularities below 0.05 are not calculated. These low-cellularity fits do not give very meaningful results and only obscure more plausible fits. Second, 0 copies and >12 copies are not "fitted". This prevents fits that predict many and/or large segments with 0 or >12 copies, which is biologically unlikely. More explanation is given in the vignette.
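The grid described above can be illustrated with a minimal base R sketch. It is not the ACE source code. It assumes that non-tumor cells contribute 2 copies to the mixture, and that the penalty acts by dividing the error by the cellularity raised to the power of the penalty. The helper names `expected_signal` and `rmse_at` and the toy segment values are hypothetical.

```r
# Illustrative sketch only; not the ACE implementation.
# Assumption: normal (non-tumor) cells contribute 2 copies to the signal.
expected_signal <- function(copies, cellularity, ploidy = 2, standard = 1) {
  standard * (copies * cellularity + 2 * (1 - cellularity)) /
    (ploidy * cellularity + 2 * (1 - cellularity))
}

cellularities <- seq(5, 100) / 100   # 0.05 to 1 in steps of 0.01
copies <- 1:12                       # 0 and >12 copies are not fitted

# Rows are cellularities, columns are copy numbers 1 to 12
expected <- outer(cellularities, copies,
                  function(f, p) expected_signal(p, f))

# Made-up segment values and their sizes in bins
seg_values <- c(1, 0.75, 1.25)
seg_bins <- c(1000, 200, 300)

# RMSE-style error at the i-th cellularity.
# Assumption: the penalty divides the distance by cellularity^penalty.
rmse_at <- function(i, penalty = 0) {
  # distance from each segment to the nearest expected value
  d <- sapply(seg_values, function(v) min(abs(v - expected[i, ])))
  sqrt(sum((d / cellularities[i]^penalty)^2 * seg_bins) / sum(seg_bins))
}

errors <- sapply(seq_along(cellularities), rmse_at)
best <- cellularities[which.min(errors)]
```

In this toy case the segments sit exactly on the expected values for 1, 2, and 3 copies at a cellularity of 0.5, so that is where the error curve bottoms out.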

Returns a list, containing

`ploidy`	Absolute copy number that corresponds with the median segment value
`standard`	Raw data value that corresponds to the given ploidy. Unless specified as argument, it is the median segment value
`method`	Applied error method
`penalty`	Applied penalty factor
`minima`	Vector with cellularities at which the error reached a minimum
`rerror`	Vector with relative errors corresponding to the minima
`errorlist`	List of errors at all cellularities tested
`errorplot`	ggplot2-graph of the relative errors calculated at each cellularity
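As a rough illustration of how `minima` and `rerror` relate to `errorlist`, the sketch below picks local minima from a made-up error vector. It assumes that a minimum is a point lower than both of its neighbors and that the relative error is the error divided by the largest error in the list. Both are assumptions for illustration, not a description of the ACE source, and endpoints are ignored for simplicity.

```r
# Hypothetical error values at five consecutive cellularities
cellularity <- c(0.30, 0.31, 0.32, 0.33, 0.34)
error <- c(0.20, 0.12, 0.15, 0.10, 0.18)

# Interior points that are lower than both neighbors
inner <- 2:(length(error) - 1)
is_min <- error[inner] < error[inner - 1] & error[inner] < error[inner + 1]

minima <- cellularity[inner][is_min]          # cellularities at local minima
rerror <- error[inner][is_min] / max(error)   # error relative to the largest error
```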

singlemodel() only needs a data frame with columns named chr and segments. Every row should contain an individual genomic feature, i.e. a bin or a probe. If you have data with each row representing a segment, and the size of the segment given in a column (e.g. NumBins or NumProbes), you can create the data frame as follows (giving the correct variable names of course):

chr <- rep(Chromosome, NumProbes)
segments <- rep(SegmentMean, NumProbes)
# data.frame() rather than cbind(): cbind() would return a matrix
template <- data.frame(chr, segments)
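A small self-contained sketch of this expansion, using a made-up segment-level table. The object name `segs` and all values are hypothetical. The column names follow the ones mentioned above.

```r
# Hypothetical segment-level table; values are made up for illustration
segs <- data.frame(Chromosome = c(1, 1, 2),
                   SegmentMean = c(1.0, 0.8, 1.2),
                   NumProbes = c(3, 2, 4))

# Repeat each segment once per probe so that every row is one genomic feature
template <- data.frame(chr = rep(segs$Chromosome, segs$NumProbes),
                       segments = rep(segs$SegmentMean, segs$NumProbes))
```

The resulting data frame has one row per probe and the two columns, chr and segments, that singlemodel() expects.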

Jos B. Poell

objectsampletotemplate, squaremodel, singleplot

## toy data assuming each chromosome comprises 100 bins
s <- jitter(c(1, 1, 0.8, 1.2, rep(1, 5), 1.4, rep(1, 13)), amount = 0)
n <- c(100, 100, 40, 60, rep(100, 5), 100, rep(100, 13))
df <- data.frame(chr = rep(1:22, each = 100), segments = rep(s, n))
singlemodel(df)
singlemodel(df, ploidy = 3)
singlemodel(df, method = 'MAE', penalty = 0.5)
singlemodel(df, exclude = 1:3)

## using segmented data from a QDNAseq-object
data("copyNumbersSegmented")
singlemodel(copyNumbersSegmented, QDNAseqobjectsample = 2)