model_validation: Model Validation


Description

This function validates absolute risk models against the outcomes observed in a cohort study or in a case-control study nested within a cohort.

Usage

ModelValidation(study.data, 
                total.followup.validation = FALSE,
                predicted.risk = NULL, 
                predicted.risk.interval = NULL, 
                linear.predictor = NULL, 
                iCARE.model.object = 
                  list(model.formula = NULL,
                       model.cov.info = NULL,
                       model.snp.info = NULL,
                       model.log.RR = NULL,
                       model.ref.dataset = NULL,
                       model.ref.dataset.weights = NULL,
                       model.disease.incidence.rates = NULL,
                       model.competing.incidence.rates = NULL,
                       model.bin.fh.name = NA,
                       apply.cov.profile  = NULL,
                       apply.snp.profile = NULL, 
                       n.imp = 5, use.c.code = 1,
                       return.lp = TRUE, 
                       return.refs.risk = TRUE),
                number.of.percentiles = 10,
                reference.entry.age = NULL, 
                reference.exit.age = NULL,
                predicted.risk.ref = NULL,
                linear.predictor.ref = NULL,
                linear.predictor.cutoffs = NULL,
                dataset = "Example Dataset", 
                model.name = "Example Risk Prediction Model")            

Arguments

study.data

Data frame which includes the variables below.

  • observed.outcome: 1 if disease has occurred by the end of followup, 0 if censored

  • study.entry.age: age (in years) of entering the cohort

  • study.exit.age: age (in years) of last followup visit

  • time.of.onset: time (in years) of onset of disease; note that all subjects are disease free at the time of entry and for those who do not develop disease by end of followup it is Inf

  • sampling.weights: for a case-control study nested within a cohort study, this is a vector of sampling weights for each subject, i.e., probability of inclusion into the sample
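As a minimal sketch of the layout described above, the following builds a toy study.data frame with the listed columns. All values are made up for illustration, and treating time.of.onset as years since study entry is an assumption here, not something this page states.

```r
# Toy data for illustration only; every value below is made up.
study.data <- data.frame(
  observed.outcome = c(1, 0, 0, 1),
  study.entry.age  = c(50, 55, 60, 62),
  study.exit.age   = c(58, 65, 70, 66)
)

# Assumption: onset is measured in years since study entry.
# Subjects who stay disease free by the end of followup get Inf.
study.data$time.of.onset <- ifelse(
  study.data$observed.outcome == 1,
  study.data$study.exit.age - study.data$study.entry.age,
  Inf
)

# Only relevant for a nested case-control sample:
# each subject's probability of inclusion into the sample.
study.data$sampling.weights <- c(1, 0.25, 0.25, 1)

str(study.data)
```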

total.followup.validation

logical; TRUE if risk validation is performed over the total followup, FALSE in all other cases (e.g., 5-year or 10-year risk validation); default is FALSE

predicted.risk

vector of predicted risks; should be supplied if risk prediction is done by some method other than that implemented in iCARE; default is NULL

predicted.risk.interval

scalar or vector denoting the number of years after entering the study over which risk validation is desired (e.g., 5 for validating a model for 5-year risk); used when total.followup.validation = FALSE, and can be set to NULL when total.followup.validation = TRUE

linear.predictor

vector of risk scores for each subject, i.e., x*beta, where x is the vector of risk factors and beta is the vector of log relative risks. In the current version, if both predicted.risk and linear.predictor are supplied, the function uses the supplied estimates to perform model validation; otherwise, it computes these estimates using the computeAbsoluteRisk function
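The x*beta risk score described above can be sketched in a few lines of base R. The design matrix, the column names, and the log relative risk values below are hypothetical and chosen only to make the arithmetic visible; they are not taken from any iCARE model.

```r
# Hypothetical design matrix: one row per subject, one column per risk factor.
x <- matrix(c(1, 0,
              0, 1,
              1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("famhist", "parity1")))

# Hypothetical log relative risks, in the same column order as x.
beta <- c(famhist = 0.5, parity1 = -0.2)

# Risk score for each subject: matrix product x %*% beta,
# flattened from a one-column matrix to a plain numeric vector.
linear.predictor <- as.vector(x %*% beta)
linear.predictor
```

A vector built this way has one entry per row of x, which is the shape the linear.predictor argument expects.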

iCARE.model.object

A named list containing the input arguments to the function computeAbsoluteRisk. The names in this list must match the argument names of that function. See computeAbsoluteRisk.

number.of.percentiles

the number of percentile groups of the risk score, which determines the number of strata over which the risk prediction model is validated; default is 10
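To illustrate what stratifying by percentiles of the risk score means, the sketch below splits simulated risk scores into 10 groups with base R's quantile and cut. It is an illustration under assumed data, and not necessarily how ModelValidation forms its strata internally.

```r
set.seed(1)
linear.predictor <- rnorm(200)   # made-up risk scores, for illustration only
number.of.percentiles <- 10

# Cut points at the 0%, 10%, ..., 100% quantiles of the risk score.
cutoffs <- quantile(linear.predictor,
                    probs = seq(0, 1, length.out = number.of.percentiles + 1))

# Assign each subject to a stratum (1 = lowest scores, 10 = highest).
# include.lowest = TRUE keeps the subject with the minimum score in stratum 1.
stratum <- cut(linear.predictor, breaks = cutoffs,
               include.lowest = TRUE, labels = FALSE)

# Number of subjects per stratum.
table(stratum)
```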

reference.entry.age

age of entry to be specified for computing absolute risk of the reference population

reference.exit.age

age of exit to be specified for computing absolute risk of the reference population

predicted.risk.ref

predicted absolute risk in the reference population assuming the entry age to be as specified in reference.entry.age and exit age to be as specified in reference.exit.age

linear.predictor.ref

vector of risk scores for the reference population

linear.predictor.cutoffs

user specified cut-points for the linear predictor to define categories for absolute risk calibration and relative risk calibration

dataset

name and type of dataset to be displayed in the output, e.g., "PLCO Full Cohort" or "Full Cohort Simulation"

model.name

name of the model to be displayed in output, e.g., "Synthetic Model" or "Simulation Setting"

Value

This function returns a list of the following objects:

See Also

computeAbsoluteRisk

Examples

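# Load the example breast cancer data shipped with iCARE
data(bc_data, package="iCARE")

# Flag the cohort members who were also sampled into the nested case-control study
validation.cohort.data$inclusion = 0
subjects_included = intersect(validation.cohort.data$id, 
                              validation.nested.case.control.data$id)
validation.cohort.data$inclusion[subjects_included] = 1

# Length of followup observed for each cohort member
validation.cohort.data$observed.followup = validation.cohort.data$study.exit.age - 
  validation.cohort.data$study.entry.age

# Model the probability of inclusion into the nested case-control sample
selection.model = glm(inclusion ~ observed.outcome 
                      * (study.entry.age + observed.followup), 
                      data = validation.cohort.data, 
                      family = binomial(link = "logit"))

# Use the fitted inclusion probabilities as sampling weights
validation.nested.case.control.data$sampling.weights =
  selection.model$fitted.values[validation.cohort.data$inclusion == 1]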

set.seed(50)

data = validation.nested.case.control.data

snpDat     = bc_72_snps
form       = diagnosis ~ famhist + as.factor(parity)
info       = list(bc_model_cov_info[[1]], bc_model_cov_info[[3]])
vars       = all.vars(form)[-1]
risk.model = list(model.formula = form,
                  model.cov.info = info,
                  model.snp.info = snpDat,
                  model.log.RR = bc_model_log_or[c(1, 8:11)],
                  model.ref.dataset = ref_cov_dat[, vars],
                  model.ref.dataset.weights = NULL,
                  model.disease.incidence.rates = bc_inc,
                  model.competing.incidence.rates = mort_inc,
                  model.bin.fh.name = "famhist",
                  apply.cov.profile = data[,vars],
                  apply.snp.profile = data[,snpDat$snp.name],
                  n.imp = 5, use.c.code = 1, return.lp = TRUE,
                  return.refs.risk = TRUE)

# Not run since it can take a few minutes
# output = ModelValidation(study.data = data, total.followup.validation = TRUE,
#      predicted.risk.interval = NULL, iCARE.model.object = risk.model,
#      number.of.percentiles = 10)
# output

iCARE documentation built on Nov. 8, 2020, 5:25 p.m.