squaremodel | R Documentation |
squaremodel
performs a "two-dimensional" fitting algorithm on a single sample. It calculates the error of the fit at each cellularity over a range of "ploidies". Input can be either a template or a QDNAseq-object with the index of the sample specified. Returns a list with input parameters (method, penalty, and penploidy) and model characteristics (an error matrix, a logical matrix specifying minima, a data frame with all information, a data frame with only minima, and a graphical representation of the error matrix).
squaremodel(template, QDNAseqobjectsample = FALSE, prows=100, ptop=5, pbottom=1, method = 'RMSE', exclude = c("X", "Y"), sgc = c(), penalty = 0, penploidy = 0, cellularities = seq(5,100), highlightminima = TRUE, standard)
template |
Object. Either a data frame as created by |
QDNAseqobjectsample |
Integer. Specifies which sample to analyze from the QDNAseqobject. Required when using a QDNAseq-object as template. Default = FALSE |
prows |
Integer. Sets the resolution of the ploidy-axis. Determines how many decrements are used to get from ptop to pbottom (see below). The error matrix therefore has prows + 1 rows. Default = 100 |
ptop |
Numeric. Sets the highest ploidy at which to start testing fits. Default = 5 |
pbottom |
Numeric. Sets the lowest ploidy to be tested. Default = 1 |
method |
Character string specifying which error method to use. For more documentation, consult the vignette. Can be "RMSE", "SMRE", or "MAE". Default = "RMSE" |
exclude |
Integer or character vector. Specifies which chromosomes to exclude for model fitting. Default = c("X", "Y") |
sgc |
Integer or character vector. Specifies which chromosomes occur with only a single copy in the germline |
penalty |
Numeric. Penalizes fits at lower cellularities. Suggested values between 0 and 1. Default = 0 (no penalty) |
penploidy |
Numeric. Penalizes fits that diverge from 2N with the formula (1+abs(ploidy-2))^penploidy. Default = 0 |
cellularities |
Numeric vector. Specifies the cellularities (in percentage) to be tested |
highlightminima |
Logical. Minima are highlighted in the matrixplot by a black dot. Default = TRUE |
standard |
Numeric. Force the ploidy to represent this raw value. When omitted, the standard will be calculated from the data |
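The grid implied by prows, ptop, pbottom and cellularities can be sketched as follows. This is an illustration only: it assumes evenly spaced ploidy steps, the variable names ploidies and step are made up here, and the package may construct its grid differently internally.

```r
# Illustrative sketch, not the package's own code:
# assumes evenly spaced decrements from ptop down to pbottom.
ptop <- 5
pbottom <- 1
prows <- 100
cellularities <- seq(5, 100)

# prows decrements yield prows + 1 ploidy values (the rows of the matrix)
ploidies <- seq(ptop, pbottom, length.out = prows + 1)
step <- (ptop - pbottom) / prows   # size of one decrement

length(ploidies)       # number of rows
length(cellularities)  # number of columns
```

With the defaults this gives 101 ploidy values in steps of 0.04 and 96 cellularities, so narrowing the range between ptop and pbottom increases the resolution for a given prows.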
Unlike other functionality of ACE, squaremodel
does not use the "standard", but it fits all tested ploidies to "standard = 1". It is therefore necessary that segment values are normalized to 1 (which they are by default coming from QDNAseq). The penalty parameter is the same as in singlemodel
. Additionally, it is possible to penalize fits at ploidies diverging from 2N using the penploidy parameter. For other details on the fitting algorithm, see singlemodel
. Range of ploidies is set by parameters ptop and pbottom, and resolution is determined by prows. Resolution on the X-axis can be adapted by changing the cellularities option. To create good contrast in the matrixplot, the color scale derives from the inverse of the error, and the opacity of the dots marking the minima is calculated as min(error)/error.
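To make the two formulas mentioned here concrete, the sketch below evaluates the ploidy penalty factor (1+abs(ploidy-2))^penploidy and the dot opacity min(error)/error on made-up numbers. The error values are hypothetical, and how the factor is combined with the error inside squaremodel is not shown.

```r
# Ploidy penalty factor from the penploidy argument description
ploidy <- c(1, 2, 3, 4)
penploidy <- 0.5
penfactor <- (1 + abs(ploidy - 2))^penploidy
round(penfactor, 3)   # the factor is exactly 1 at 2N and grows away from it

# Opacity of the dots marking minima: min(error)/error
error <- c(0.08, 0.02, 0.05, 0.04)   # hypothetical errors at four minima
opacity <- min(error) / error
opacity               # the best fit gets opacity 1, worse fits are fainter
```

With penploidy = 0 the factor is 1 everywhere, which matches the default of no ploidy penalty.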
Returns a list, containing
method |
Applied error method |
penalty |
Applied penalty factor for low cellularities |
penploidy |
Applied penalty factor for diverging ploidies |
errormatrix |
Numeric matrix with errors of all combinations of ploidy and cellularity |
minimatrix |
Logical matrix indicating whether the combination of ploidy and cellularity represents a minimum |
errordf |
Data frame with columns ploidy, cellularity, error, and minimum |
minimadf |
Same as errordf, but only containing minima and sorted by error |
matrixplot |
ggplot2-graph of the relative errors calculated at each combination of ploidy and cellularity |
squaremodel() only needs a data frame with columns named chr
and segments
. Every row should contain an individual genomic feature, i.e. a bin or a probe. If you have data with each row representing a segment, and the size of the segment given in a column (e.g. NumBins or NumProbes), you can create the data frame as follows (giving the correct variable names of course):
template <- data.frame(chr = rep(Chromosome, NumProbes), segments = rep(SegmentMean, NumProbes))
Alternatively you can look into segmentstotemplate
.
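As a worked example of the conversion above, the sketch below expands a small, made-up segment table into a per-probe template. The column names Chromosome, SegmentMean and NumProbes follow the example; the data frame segs and all values in it are invented for illustration.

```r
# Hypothetical segment-level data: one row per segment
segs <- data.frame(
  Chromosome  = c(1, 1, 2),
  SegmentMean = c(1.0, 1.2, 0.8),
  NumProbes   = c(3, 2, 4)
)

# Repeat each segment's chromosome and mean once per probe,
# so that every row of the template is an individual feature
template <- data.frame(
  chr      = rep(segs$Chromosome, segs$NumProbes),
  segments = rep(segs$SegmentMean, segs$NumProbes)
)

nrow(template)        # equals sum(segs$NumProbes)
table(template$chr)   # probes per chromosome
```

The resulting data frame has the two columns squaremodel() needs, chr and segments, with one row per probe.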
If your data contains sex chromosomes and you wish to include these for model fitting, then make sure to specify exclude = c()
, and sgc = c("X", "Y")
when analyzing data from a male individual.
Jos B. Poell
objectsampletotemplate
, squaremodel
, singleplot
## toy data assuming each chromosome comprises 100 bins
s <- jitter(c(1, 1, 0.8, 1.2, rep(1, 5), 1.4, rep(1, 13)), amount = 0)
n <- c(100, 100, 40, 60, rep(100, 5), 100, rep(100, 13))
df <- data.frame(chr = rep(1:22, each = 100), segments = rep(s, n))
squaremodel(df)$matrixplot
sm <- squaremodel(df, method = 'MAE', penalty = 0.5, penploidy = 0.5)
sm$matrixplot
mdf <- sm$minimadf
head(mdf[order(mdf$error, -mdf$cellularity), ])

## using segmented data from a QDNAseq-object
data("copyNumbersSegmented")
sqm <- squaremodel(copyNumbersSegmented, QDNAseqobjectsample = 2,
                   penalty = 0.5, penploidy = 0.5,
                   ptop = 4.3, pbottom = 1.8, prows = 250)
sqm$matrixplot
mdf <- sqm$minimadf
head(mdf[order(mdf$error, -mdf$cellularity), ])