knitr::opts_chunk$set(echo = TRUE)

Introduction

We discuss the design of sRACIPE in this article. As our primary aim is to develop a robust framework for simulating gene expression profiles that can capture the experimental observations, we utilized the well established data structures from BioConductor to store the modeling results and parameters. We extend the SummarizedExperiment S4 class to create a new class RacipeSE to store and access the circuit/network, simulated gene expressions, parameters, intial conditions and other meta information. As for any S4 class objects, the slots should not be accessed directly and the getter and accessor functions should be used instead.

Configuration

When a RacipeSE object is constructed, a SummarizedExperiment object is constructed and the default configuration is added as metadata. The default configuration is stored as a configData data object. These configuration parameters can also be passed as argumnets to sracipeSimulate function. The getter/accessor function for configuration is sracipeConfig.

suppressWarnings(suppressPackageStartupMessages(library(sRACIPE)))

racipe <- RacipeSE() # Construct an empty RacipeSE object
racipe  # View the object. Notice the config in metadata. 
sracipeConfig(racipe) # access the config

The config list includes other lists or vectors like simParams, stochParams, hyperParams, options, thresholds, LCParams etc. The list simParams contains values for parameters like the number of models (numModels), simulation time(simulationTime), number of convergence iterations (numConvergenceIter), step size for simulations (integrateStepSize), when to start recording the gene expressions (printStart), time interval between recordings (printInterval), number of initial conditions (nIC), output precision (outputPrecision), tolerance for adaptive runge kutta method (rkTolerance), parametric variation (paramRange). The list stochParams contains the parameters for stochastic simulations like the number of noise levels to be simulated (nNoise), the ratio of subsequent noise levels (noiseScalingFactor), maximum noise (initialNoise), whether to use same noise for all genes or to scale it as per the median expression of the genes (scaledNoise), ratio of shot noise to additive noise (shotNoise). The list hyperParams contains the parameters like the minimum and maximum production and degration of the genes, fold change, hill coefficient etc. The list options includes logical values like annealing (anneal), scaling of noise (scaledNoise), generation of new initial conditions (genIC), parameters (genParams) and whether to integrate or not (integrate). The list 'LCParams' contains all the parameters used by the limit cycle detection algorithm, such as the number of sampled periods (NumSampledPeriods) or the number of iterations to simulate (LCIter).

Metadata information also includes annotation, nInteraction (number of interactions in the circuit), normalized (whether the data is normalized or not), data analysis lists like pca, umap, cluster assignment of the models etc. These are added to the object in subsequent steps in the pipeline.

Circuit

SummarizedExperiment slot rowData is used to store the circuit topology. It is a square matrix with dimension equal to the number of genes in the circuit. The values of the matrix represent the type of interaction in the gene pair given by row and column. 1 represents TF activation, 2 TF inhibition, 3 activation by degradation inhibition, 4 inhibition by degradation activation, 5 signaling activation, 6 signaling inihibition, and 0 for no interaction. The sracipeCircuit getter and accessor should be used to access this information. The annotation getter and setter set/get the name. Setting the circuit also adds nInteractions and annotation to the RacipeSE object.

data("demoCircuit")
sracipeCircuit(racipe) <- demoCircuit
annotation(racipe) <- "demoCircuit"
annotation(racipe)

Parameters, initial conditions, and convergence data

SummarizedExperiment slot colData contains the parameters and initial conditions for each model. Each gene in the circuit has two parameters, namely, its production rate and its degradation rate. Each interaction has three parameters, namely, threshold of activation, the hill coefficient, and the fold change. Each gene has one or more initial gene expression values as specified by nIC. The corresponding getter and setter are sracipeParams and sracipeIC. When convergence testing is done, the convergence status of an expression is recorded, as well as the number of iterations it took to converge, if it did. A convergence value of 0 means no convergence, 1 convergence, 2 it converges to a limit cycle, and 3 means the expression is invalid due to being either NaN or negative. The convergence data for each expression is also stored in the colData slot, and the getter for it is sracipeConverge.

Simulated gene expression

As such, sRACIPE can be used to simulate the trajectory for a single model with specific kinetic parameters or a large large number of models with randomized parameters. These sRACIPE models can be considered as separate samples/cells differing from each other in terms of the parameters reflecting the cellular heterogeneity. The SummarizedExperiment slot assays is used for storing simulated gene expressions. The rows of these matrix-like elements correspond to various genes in the circuit and columns correspond to models. The first element of the List is used for unperturbed deterministic simulations. The subsequent elements are used for stochastic simulations at different noise levels and/or knockout simulations. The name for each assay contains the information on the noise level (for stochastic simulations), knocked out gene (knockout simulations), or time (temporal simulations). The assay or assays accessor and getters from the SummarizedExperiment class should be used to access or modify these slots.

Time Series

If one is interested in trajectories of a single model, then sRACIPE provides an argument timeSeries in the sracipeSimulate funcion. If enabled, sRACIPE will generate the trajectory for a single model and the corresponding plotting functions will also show time series plots. In such case, the trajectories are stored in metadata instead of assays. The use of sracipeGetTS function is recommended in this case.

Usually, sRACIPE only records the steady states or the simulated gene expression at the end of the simulation. But the model trajectories can also be recorded and analyzed using sRACIPE using the variables printInterval and printStart. These variables define how often the simulated gene expressions are recorded. Thus, one can record multiple time points during the model simulations. If intermediate time points are also recorded, then the first element will contain the last (steady-state) solutions and the other slots will contain the values at the intermediate time points. For more details on temporal simulations of large number of models, please refer to the corresponding article.

Limit Cycle Data

Limit cycle data is stored in the metadata of the resulting RacipeSE object from sracipeSimulate, primarily in the LCData element. The LCData takes is a data frame which records a complete oscillation in expression for each limit cycle found. The data frame collects the expression of each gene along the oscillation, which model the limit cycle comes from, which limit cycle within that model is being referred to, as well as the period of the limit cycle. sRACIPE defines the period of a limit cycle as the number of integration steps needed to complete the period. Note that this means the period recorded is dependent on both the step size and integration method used for the limit cycle detection simulation. The metadata also holds the total number of limit cycles detected in the simulation as an integer in the totalNumofLCs element.



lusystemsbio/sRACIPE documentation built on Jan. 29, 2025, 9:59 a.m.