View source: R/create_data_split.r
create.data.split | R Documentation
This function prepares the cross-validation by splitting the data into num.folds training and test folds, repeated num.resample times.
create.data.split(siamcat, num.folds = 2, num.resample = 1,
stratify = TRUE, inseparable = NULL, verbose = 1)
siamcat | object of class siamcat-class
num.folds | integer, number of cross-validation folds (needs to be …), defaults to 2
num.resample | integer, resampling rounds (values …), defaults to 1
stratify | boolean, should the splits be stratified so that an equal proportion of classes is present in each fold? Ignored for regression tasks, defaults to TRUE
inseparable | string, name of the metadata variable to be kept inseparable, defaults to NULL
verbose | integer, control output: …, defaults to 1
This function splits the labels within a siamcat-class object and prepares the internal cross-validation for the model training (see train.model).
The function saves the training and test instances for the different cross-validation folds in the data_split slot of the siamcat-class object, which is a list with four entries:
num.folds - the number of cross-validation folds
num.resample - the number of repetitions for the cross-validation
training.folds - a list containing the indices for the training instances
test.folds - a list containing the indices for the test instances
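The shape of these four entries can be illustrated with a small base R sketch. This is not SIAMCAT's implementation: the helper `make_split`, the toy labels, and the per-class round-robin assignment are all assumptions made for illustration, and the nesting (one list of folds per resampling round) is only one plausible layout.

```r
# Hypothetical helper (not part of SIAMCAT): builds a list shaped like the
# four entries described above, using a simple stratified assignment.
make_split <- function(labels, num.folds = 2, num.resample = 1) {
  n <- length(labels)
  training.folds <- vector("list", num.resample)
  test.folds <- vector("list", num.resample)
  for (r in seq_len(num.resample)) {
    fold.id <- integer(n)
    # assign folds class by class so each fold gets a similar class mix
    for (cl in unique(labels)) {
      idx <- which(labels == cl)
      idx <- idx[sample.int(length(idx))]  # shuffle within the class
      fold.id[idx] <- rep(seq_len(num.folds), length.out = length(idx))
    }
    # test fold f holds the samples assigned to f; training fold f the rest
    test.folds[[r]] <- lapply(seq_len(num.folds),
                              function(f) which(fold.id == f))
    training.folds[[r]] <- lapply(seq_len(num.folds),
                                  function(f) which(fold.id != f))
  }
  list(num.folds = num.folds, num.resample = num.resample,
       training.folds = training.folds, test.folds = test.folds)
}

set.seed(1)
labels <- rep(c(-1, 1), times = c(12, 8))  # toy binary labels
toy_split <- make_split(labels, num.folds = 4, num.resample = 2)
str(toy_split, max.level = 2)
```

In this sketch, `toy_split$test.folds[[r]][[f]]` holds the test indices of fold `f` in resampling round `r`, and the matching entry of `training.folds` holds all remaining indices.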
If the inseparable argument is provided, the data split takes the named metadata variable into account. For example, if the data contains several samples from the same individual, it makes sense to keep all data from that individual within the same fold.
If inseparable is given, the stratify argument will be ignored.
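The idea of keeping samples from one individual together can be sketched in base R as follows. The helper `grouped_folds` and the toy metadata are hypothetical and do not reflect how SIAMCAT implements the inseparable argument; the sketch only shows why whole groups, rather than single samples, become the unit of assignment.

```r
# Hypothetical helper (not part of SIAMCAT): assigns whole groups to folds so
# that samples sharing a value of the grouping variable are never separated.
grouped_folds <- function(groups, num.folds = 2) {
  groups <- as.character(groups)
  ids <- unique(groups)
  stopifnot(length(ids) >= num.folds)    # need at least one group per fold
  ids <- ids[sample.int(length(ids))]    # shuffle the groups
  group.fold <- rep(seq_len(num.folds), length.out = length(ids))
  names(group.fold) <- ids
  unname(group.fold[groups])             # fold id of each sample
}

set.seed(42)
# toy metadata: six individuals, some sampled repeatedly
individual <- c("A", "A", "B", "C", "C", "C", "D", "E", "E", "F")
fold.id <- grouped_folds(individual, num.folds = 3)

# each row (individual) has counts in exactly one column (fold)
table(individual, fold.id)
```

Because each individual is assigned as a unit, this sketch makes no attempt to balance classes across folds, which is in line with stratify being ignored when inseparable is given.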
Returns an object of class siamcat-class with the data_split slot filled.
library(SIAMCAT)
data(siamcat_example)
# simple working example: 10-fold cross-validation, repeated 5 times
siamcat_split <- create.data.split(siamcat_example, num.folds=10,
    num.resample=5, stratify=TRUE)