View source: R/ClassifierTrainingData.R
makeTrainingData | R Documentation |
The positive set is CDS; the negative set is the downstream region of 3' UTRs.
makeTrainingData(
tissues = "combined",
features = c("countRFP", "disengagementScores", "entropyRFP", "floss", "fpkmRFP",
"ioScore", "ORFScores", "RRS", "RSS", "startCodonCoverage", "startRegionCoverage",
"startRegionRelative"),
mode = "uORF",
max.artificial.length,
dataFolder = get("dataFolder", envir = .GlobalEnv),
requiredActiveCds = 30
)
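The dataFolder default in the usage above is looked up with get() in the global environment, so a variable named dataFolder has to exist there before the function is called without that argument. The sketch below illustrates this with a hypothetical stand-in function, show_data_folder, that only shares the same default; it is not part of the package, and both folder paths are made up.

```r
# Hypothetical stand-in: same default expression as makeTrainingData(),
# but it only returns the folder it resolved.
show_data_folder <- function(dataFolder = get("dataFolder", envir = .GlobalEnv)) {
  dataFolder
}

# Define the global variable the default relies on (made-up path).
# Note: this overwrites any existing global variable named dataFolder.
assign("dataFolder", file.path(tempdir(), "example_data"), envir = .GlobalEnv)

# With no argument, the default is evaluated lazily and reads the global variable.
from_global <- show_data_folder()

# An explicit argument bypasses the global lookup entirely.
explicit <- show_data_folder(dataFolder = "/some/other/folder")
```

Passing dataFolder explicitly avoids depending on global state, which can be easier to reason about in scripts.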
tissues |
Tissue to train on. Use "combined" if you want all tissues in one. On the first run of training, it is an ORFik experiment. |
features |
Features to train the model on: any of the features created
during ORFik::computeFeatures. Default: the twelve features listed in the usage above. |
mode |
character, default: "uORF"; alternatives: "aCDS" and "CDS". Selects whether to predict on uORFs or on artificial CDS. With "aCDS", it runs twice: once for whole-length CDS and once for truncated CDS, to validate that the model works for short ORFs. "CDS" predicts on whole CDS. |
max.artificial.length |
integer, default: 100. Only applies if mode = "aCDS", so most users can ignore it. When creating artificial ORFs from CDS, this sets how large the largest ORFs can be; the number is 1/6 of the maximum ORF size (maximum size 600 if max.artificial.length is 100). A random size from 6 up to that maximum is sampled: if max.artificial.length is 2, you can get artificial ORFs of size 6, 9 or 12 (6, 6 + (3x1), 6 + (3x2)). |
requiredActiveCds |
numeric, default 30. How many CDSs must be detected as active; this is the minimum size of the positive training set. The function aborts if the number of active CDSs is not bigger than this number. |
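The small worked example in the max.artificial.length description (sizes 6, 9 or 12 for a value of 2) can be reproduced as 6 + 3k for k from 0 to max.artificial.length. The helper below, artificial_orf_sizes, is a hypothetical name used only for illustration. It follows that small example alone; it is not the package's sampling code and does not settle how the stated maximum of 600 for a value of 100 is reached.

```r
# Hypothetical helper: candidate sizes following the documented example,
# 6 + 3 * k for k = 0, 1, ..., max.artificial.length.
artificial_orf_sizes <- function(max.artificial.length) {
  stopifnot(is.numeric(max.artificial.length), max.artificial.length >= 0)
  6 + 3 * seq(0, max.artificial.length)
}

sizes <- artificial_orf_sizes(2)  # 6, 9, 12, as in the description

# Draw one size at random. Indexing with sample.int() avoids the pitfall
# that sample(x, 1) treats a single number x as the range 1:x.
one_size <- sizes[sample.int(length(sizes), 1)]
```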
invisible(NULL); the training data is saved to disc.
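A minimal sketch of a call in "aCDS" mode, using only the arguments shown in the usage. The folder path is a made-up placeholder to be replaced with your own data folder, and the call is guarded so that it only runs when makeTrainingData is actually available in the session.

```r
# Arguments taken from the documented signature; values are illustrative.
args <- list(
  tissues = "combined",
  mode = "aCDS",                        # "uORF" (default), "aCDS" or "CDS"
  max.artificial.length = 100,          # only used when mode = "aCDS"
  dataFolder = "/path/to/data/folder",  # hypothetical placeholder path
  requiredActiveCds = 30
)

# Check the mode against the options named in the documentation.
stopifnot(args$mode %in% c("uORF", "aCDS", "CDS"))

# Run only if the function is available (package attached).
if (exists("makeTrainingData", mode = "function")) {
  do.call(makeTrainingData, args)  # documented to return invisible(NULL)
}
```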