cleaning_filter
Clean a CNV calls dataset based on measures such as length, number of
calls per sample, and more.
cleaning_filter(
  results,
  min_len = 10000,
  min_NP = 10,
  blacklists = NULL,
  blacklist_samples = NULL,
  blacklist_chrs = NULL
)
results
    a
min_len
    minimum CNV length; any shorter event will be filtered out. Default is 10000.
min_NP
    minimum number of points per CNV; any event called from fewer points will be filtered out. Default is 10.
blacklists
    blacklist, a
blacklist_samples
    character vector containing the sample IDs to filter out.
blacklist_chrs
    character vector containing chromosome names in the package format: 1:22 for autosomes, 23 and 24 for chromosomes X and Y.
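To make the effect of these arguments concrete, here is a minimal base R sketch of the same kind of filtering on a toy table. It is not the package's implementation: the helper `filter_calls_sketch` and the column names `sample_ID`, `chr`, `start`, `end`, `NP` and `length` are assumptions made for illustration only.

```r
# Toy CNV calls table. Column names are hypothetical, chosen for illustration.
calls <- data.frame(
  sample_ID = c("s1", "s1", "s2", "s3", "s3"),
  chr       = c(1, 22, 23, 5, 5),
  start     = c(100000, 500000, 200000, 10000, 900000),
  end       = c(160000, 505000, 300000, 80000, 1000000),
  NP        = c(25, 4, 40, 12, 30),
  stringsAsFactors = FALSE
)
calls$length <- calls$end - calls$start + 1

# Hypothetical helper mirroring the documented arguments.
filter_calls_sketch <- function(calls, min_len = 10000, min_NP = 10,
                                blacklist_samples = NULL,
                                blacklist_chrs = NULL) {
  # Keep events that are long enough and supported by enough points.
  keep <- calls$length >= min_len & calls$NP >= min_NP
  # Drop every call belonging to a blacklisted sample.
  if (!is.null(blacklist_samples)) {
    keep <- keep & !(calls$sample_ID %in% blacklist_samples)
  }
  # Drop every call on a blacklisted chromosome (compared as strings).
  if (!is.null(blacklist_chrs)) {
    keep <- keep & !(as.character(calls$chr) %in% as.character(blacklist_chrs))
  }
  calls[keep, , drop = FALSE]
}

# Default thresholds only.
default_filtered <- filter_calls_sketch(calls)

# Stricter length threshold plus sample and chromosome blacklists.
filtered <- filter_calls_sketch(
  calls,
  min_len = 50000,
  blacklist_samples = "s3",
  blacklist_chrs = c("23", "24")
)
```

In this sketch the thresholds are inclusive, so an event of exactly `min_len` bases or `min_NP` points is kept; whether the package treats the boundary the same way is an assumption.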
This function can be used together with summary in order to clean the
dataset of possible noise and unwanted calls. It is generally recommended to
briefly explore the data using summary and then filter out any unwanted group
of events. The mandatory arguments of the function are "results" and
"min_len"/"min_NP". The default values are the authors' suggested minimal
filtering step; however, it is quite common to filter out anything shorter
than 10 or even 50 kb, and/or any call made by fewer than 10 points. The use
of blacklists of any kind is optional and should be done with caution, as it
can remove potentially biologically relevant events. Over-segmented samples
the user wishes to exclude can be specified via blacklist_samples.
Immunoglobulin regions can be generated with the function immuno_regions,
while telomeric and/or centromeric regions can be obtained with telom_centrom.
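As an illustration of what region-based blacklisting amounts to, the following base R sketch flags calls that overlap any blacklisted interval. It rests on assumptions: the blacklist is taken to be a table with `chr`, `start` and `end` columns, any partial overlap counts as a hit, and the helper `overlaps_blacklist` is hypothetical. The package may represent blacklists and define overlap differently.

```r
# Toy calls and a toy blacklist. All names and coordinates are hypothetical.
calls <- data.frame(
  chr   = c(14, 14, 22, 1),
  start = c(1500000, 5000000, 1900000, 1500000),
  end   = c(1600000, 5100000, 2100000, 1600000)
)
blacklist <- data.frame(
  chr   = c(14, 22),
  start = c(1000000, 2000000),
  end   = c(2000000, 3000000)
)

# TRUE for each call sharing at least one base with a blacklisted region
# on the same chromosome (closed intervals).
overlaps_blacklist <- function(calls, blacklist) {
  vapply(seq_len(nrow(calls)), function(i) {
    any(blacklist$chr == calls$chr[i] &
          blacklist$start <= calls$end[i] &
          blacklist$end >= calls$start[i])
  }, logical(1))
}

hit <- overlaps_blacklist(calls, blacklist)
kept <- calls[!hit, , drop = FALSE]
```

Because even a partial overlap removes the whole call in this sketch, a large event that merely touches a blacklisted region is lost, which is one way a blacklist can discard biologically relevant events.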
The CNVresults object results after the selected filters have been applied.
DT <- cleaning_filter(penn_22, min_len = 50000)