This vignette will cover the basics of using the rsyncrosim package within the SyncroSim software framework.

Overview of SyncroSim

SyncroSim is a software platform that helps you turn your data into forecasts. At the core of SyncroSim is an engine that automatically structures your existing data, regardless of its original format. SyncroSim transforms this structured data into forecasts by running it through a pipeline of calculations (i.e. a suite of models). Finally, SyncroSim provides a rich interface to interact with your data and models, allowing you to explore and track the consequences of alternative "what-if" forecasting scenarios. Within this software framework is the ability to use and create SyncroSim packages.

For more details consult the SyncroSim online documentation.

Overview of rsyncrosim

rsyncrosim is an R package designed to facilitate the development of modeling workflows for the SyncroSim software framework. Using the rsyncrosim interface, simulation models can be added and run through SyncroSim to transform scenario-based datasets into model forecasts. This R package takes advantage of general features of SyncroSim, such as defining scenarios with spatial or non-spatial inputs, running Monte Carlo simulations, and summarizing model outputs. rsyncrosim requires SyncroSim 2.2.13 or higher.

For more details consult the rsyncrosim CRAN documentation.

SyncroSim package: helloworldTime

To demonstrate the utility of the rsyncrosim interface, we will be using the helloworldTime SyncroSim package. helloworldTime was designed to be a simple package to introduce timesteps to SyncroSim modeling workflows.

The package takes two inputs from the user, m and b, representing a slope and an intercept. It then runs these input values through a linear model, y = mt + b, where t is time, and returns the y value as output.
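As a rough illustration of the calculation described above, the linear model can be sketched in base R. This is a hypothetical stand-in, not the package's actual transformer code; the function name linearModel and the column names Timestep and y are assumptions made for the example.

```r
# Hypothetical stand-in for the helloworldTime calculation (not the package source)
linearModel <- function(m, b, timesteps) {
  # Apply y = m * t + b to every timestep
  data.frame(Timestep = timesteps, y = m * timesteps + b)
}

# Slope of 3 and intercept of 10, evaluated over timesteps 1 to 10
result <- linearModel(m = 3, b = 10, timesteps = 1:10)
head(result, 3)
```

With these values, y increases by 3 at every timestep, starting from 13 at timestep 1.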

[Figure: Infographic of the helloworldTime package]

For more details on the different features of the helloworldTime SyncroSim package, consult the SyncroSim Enhancing a Package: Adding Timesteps tutorial.

Setup

Install SyncroSim

Before using rsyncrosim you will first need to download and install the SyncroSim software. Versions of SyncroSim exist for both Windows and Linux.

Note: this tutorial was developed using rsyncrosim version 2.0. To use rsyncrosim version 2.0 or greater, SyncroSim version 3.0 or greater is required.

Installing and loading R packages

You will need to install the rsyncrosim R package, either using CRAN or from the rsyncrosim GitHub repository. Versions of rsyncrosim are available for both Windows and Linux.

In a new R script, load the rsyncrosim package.

# Load R package for working with SyncroSim
library(rsyncrosim)

Connecting R to SyncroSim using session()

The next step in setting up the R environment for the rsyncrosim workflow is to create a SyncroSim session object in R that provides the connection to your installed copy of the SyncroSim software. A new session is created using the session() function, in which the first argument is a path to the folder on your computer where SyncroSim has been installed. If the first argument is left blank, then the default install folder is used (Windows only).

mySession <- session("path/to/install_folder")      # Create a session based on SyncroSim install folder
mySession <- session()                              # Using default install folder (Windows only)
mySession                                           # Displays the session object

You can check to see which version of SyncroSim your R script is connected to by running the version() function.

version(mySession)
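Because rsyncrosim 2.0 or greater needs SyncroSim 3.0 or greater, it can be worth comparing the reported version against that minimum before going further. The sketch below uses only base R and a hypothetical version string; it assumes that the value returned by version(mySession) can be treated as a plain dotted version string, which is not confirmed here.

```r
# Hypothetical version string; in practice this might be version(mySession)
reportedVersion <- "3.0.9"
minimumVersion <- "3.0"

# numeric_version() compares component by component rather than as text
meetsRequirement <- numeric_version(reportedVersion) >= numeric_version(minimumVersion)
meetsRequirement
```

Comparing through numeric_version() avoids the pitfalls of plain string comparison, where "3.0.10" would sort before "3.0.9".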

Installing SyncroSim packages using installPackage()

Finally, check if the helloworldTime package is already installed. Use the packages() function from rsyncrosim to first get a list of all currently installed packages in SyncroSim.

# Get list of installed packages
packages()

Currently we do not have any packages installed! To see which packages are available from the SyncroSim package server, you can use the installed = FALSE argument in the packages() function.

availablePackages <- packages(installed = FALSE)
head(availablePackages)

Install helloworldTime using the rsyncrosim function installPackage(). This function takes a package name as input and then queries the SyncroSim package server for the specified package.

installedPackages <- packages()
# Uninstall any existing copy so the install below starts clean
if ("helloworldTime" %in% installedPackages$name) {
  uninstallPackage("helloworldTime")
}
# Install helloworldTime
installPackage("helloworldTime")
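If reinstalling on every run is not wanted, the check can be turned around so that installation happens only when the package is missing. The sketch below shows the idea with a hypothetical helper, isInstalled(), and a mock table in place of the real output of packages(); only the name column seen earlier is assumed.

```r
# Hypothetical helper: TRUE when pkgName appears in the name column
isInstalled <- function(pkgTable, pkgName) {
  pkgName %in% pkgTable$name
}

# Mock table standing in for the result of packages()
mockPackages <- data.frame(name = "examplePackage", stringsAsFactors = FALSE)

isInstalled(mockPackages, "helloworldTime")
isInstalled(mockPackages, "examplePackage")

# With rsyncrosim loaded, the same check could guard the install:
# if (!isInstalled(packages(), "helloworldTime")) installPackage("helloworldTime")
```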

To install the package from a .ssimpkg file on your local computer rather than directly from the server, pass the file path to the .ssimpkg file to the installPackage() function instead of the package name.

# Install helloworldTime using file path to ssimpkg file
installPackage("path/to/helloworldTime.ssimpkg")

Now helloworldTime should be included in the package list:

# Get list of installed packages
packages()

Note: you can install multiple versions of the same package using the installPackage() function and specifying the version argument. You can also uninstall packages using the uninstallPackage() function in rsyncrosim.

Create a modeling workflow

When creating a new modeling workflow from scratch, we need to create objects of the following scopes: library, project, and scenario.

These objects are hierarchical, such that a library can contain many projects, and each project can contain many scenarios. All parameters or configurations set in a library are inherited by all projects within the library, and all parameters or configurations set in a project are inherited by all scenarios within that project. See below for further information on these SyncroSim objects.

Create a new library using ssimLibrary()

A SyncroSim library is a file (with .ssim extension) that stores all of your model inputs and outputs. The format of each SyncroSim library is unique to the SyncroSim package with which it is associated. We use the ssimLibrary() function to create a new SsimLibrary object in R that is connected (through your session) to a SyncroSim library file.

# Create a new library
myLibrary <- ssimLibrary(name = "helloworldLibrary.ssim",
                         session = mySession,
                         packages = "helloworldTime")

# Check library information
myLibrary

Note: if you have SyncroSim installed in the default location, you do not need to specify the session argument when creating or loading a library. However, if you have SyncroSim installed in a non-default location, then you must include the session argument when creating or loading a library and making any subsequent calls to your library.

We can also use the ssimLibrary() function to open an existing library. For instance, now that we have created a library called "helloworldLibrary.ssim", we would simply specify that we want to open this library using the name argument. The name argument takes the path to the SyncroSim library .ssim file that you would like to open. Since "helloworldLibrary" is in our working directory we do not need to specify the full path to this library.

# Open existing library
myLibrary <- ssimLibrary(name = "helloworldLibrary.ssim")

Note: if you want to create a new library file with an existing library name rather than opening the existing library, you can use overwrite=TRUE in the ssimLibrary() function.

Open a project using project()

Each SyncroSim library contains one or more SyncroSim projects, each represented by a Project object in R. Projects typically store model inputs that are common to all your scenarios. In most situations you will need only a single project for your library; by default each new library starts with a single project named "Definitions" (with a unique projectId = 1). The project() function is used to both create and retrieve projects. Note that the ssimObject here can be a library or a scenario.

# Open existing project
myProject <- project(ssimObject = myLibrary, project = "Definitions")  # Using name for project
myProject <- project(ssimObject = myLibrary, project = 1)              # Using projectId for project

# Check project information
myProject

Create a new scenario using scenario()

Finally, each SyncroSim project contains one or more scenarios, each represented by a Scenario object in R.

Scenarios store the specific inputs and outputs associated with each transformer in SyncroSim. SyncroSim models can be broken down into one or more of these transformers. Each transformer essentially runs a series of calculations on the input data to transform it into the output data. Scenarios can contain multiple transformers connected by a series of pipelines, such that the output of one transformer becomes the input of the next.

Each scenario can be identified by its unique scenarioId. The scenario() function is used to both create and retrieve scenarios. Note that the ssimObject here can be a library or a project.

# Create a new scenario (associated with the default project)
myScenario <- scenario(ssimObject = myProject, scenario = "My first scenario")

# Check scenario information
myScenario

View model inputs using datasheet()

Each SyncroSim library contains multiple SyncroSim datasheets. A SyncroSim datasheet is simply a table of data stored in the library; together, datasheets hold the input and output data for transformers. Each datasheet has a scope: library, project, or scenario. Datasheets with a library scope hold data that is specified only once for the entire library, such as the location of the backup folder. Datasheets with a project scope hold data that is shared across all scenarios within a project. Datasheets with a scenario scope hold data that must be specified for each scenario. We can view datasheets of varying scopes using the datasheet() function from rsyncrosim.

# View all Datasheets associated with a library, project, or scenario
datasheet(myScenario)

If we want to see more information about each datasheet, such as the scope of the datasheet or if it only accepts a single row of data, we can set the optional argument to TRUE.

datasheet(myScenario, optional = TRUE)

From this output we can see that the Run Control and Inputs datasheets only accept a single row of data (i.e. isSingle = TRUE). This is something to consider when we configure our model inputs.
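One way to keep the single-row restriction in mind is to check a data frame before saving it to such a datasheet. The sketch below is a minimal base R check; fitsSingleRow() is a hypothetical helper, not an rsyncrosim function, and how SyncroSim itself reacts to extra rows is not covered here.

```r
# Hypothetical helper: does this data frame fit a single-row datasheet?
fitsSingleRow <- function(df) {
  nrow(df) <= 1
}

oneRow <- data.frame(m = 3, b = 10)
twoRows <- data.frame(m = c(3, 4), b = c(10, 10))

fitsSingleRow(oneRow)
fitsSingleRow(twoRows)
```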

To view a specific datasheet rather than just a data frame of available datasheets, set the name parameter in the datasheet() function to the name of the datasheet you want to view. The general syntax of the name is: "<name of package>_<name of datasheet>". From the list of datasheets above, we can see that there are 3 datasheets specific to the helloworldTime package.

# View the Inputs datasheet for the scenario
datasheet(myScenario, name = "helloworldTime_InputDatasheet")
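The naming convention can be illustrated with plain string handling in base R. This sketch only builds and splits names; it does not query SyncroSim, and the variable names are chosen for the example.

```r
# Build a datasheet name from its package prefix and datasheet name
datasheetName <- paste("helloworldTime", "InputDatasheet", sep = "_")
datasheetName

# Recover the prefix (everything before the first underscore)
packagePrefix <- sub("_.*$", "", "core_Pipeline")
packagePrefix
```

The prefix shows at a glance whether a datasheet belongs to a package such as helloworldTime or to the built-in SyncroSim core.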

Here, we are viewing the contents of a SyncroSim datasheet as an R data frame. Although SyncroSim datasheets and R data frames are both represented as tables of data with predefined columns and an unlimited number of rows, the underlying structure of these tables differs.

Configure model inputs using datasheet() and addRow()

Currently our Inputs scenario datasheet is empty! We will need to add some values to the Inputs datasheet (InputDatasheet) so we can run our model.

First, assign the Inputs datasheet to a new data frame variable.

# Assign contents of the Inputs datasheet to an R data frame
myInputDataframe <- datasheet(myScenario,
                              name = "helloworldTime_InputDatasheet")

Check the columns that need input values and the type of values these columns require (e.g. string, numeric, logical) using the str() base R function. This function will also let us know if certain columns are factors with specific acceptable values.

# Check the columns of the input data frame
str(myInputDataframe)

The Inputs datasheet requires 2 values: m (the slope) and b (the intercept).

Now, we will update the Inputs data frame. This can be done in many ways (e.g. using the dplyr package), but rsyncrosim also provides a helper function called addRow() for easily adding new rows to R data frames. The addRow() function takes the targetDataframe as the first value (in this case, our Inputs data frame that we want to update), and the data frame of new rows to append to the input data frame as the second value.

# Create input data and add it to the input data frame
myInputRow <- data.frame(m = 3, b = 10)
myInputDataframe <- addRow(myInputDataframe, myInputRow)

# Check values
myInputDataframe
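For comparison, the same append can be sketched in base R with rbind(). The empty data frame below stands in for the Inputs datasheet, and its column types are assumptions; addRow() may do more than this, for example when a datasheet has factor columns.

```r
# Empty data frame standing in for the Inputs datasheet (column types assumed)
emptyInputs <- data.frame(m = numeric(0), b = numeric(0))

# New row with the same column names
newRow <- data.frame(m = 3, b = 10)

# Append the row; rbind() matches data frame columns by name
filledInputs <- rbind(emptyInputs, newRow)
filledInputs
```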

Saving modifications to datasheets using saveDatasheet()

Now that we have a complete data frame of the Inputs, we will save this data frame to its respective SyncroSim datasheet using the saveDatasheet() function. Since this datasheet is scenario-scoped, we will save it at the scenario level by setting ssimObject = myScenario.

# Save Inputs R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario, data = myInputDataframe,
              name = "helloworldTime_InputDatasheet")

Configuring the Pipeline datasheet

Next, we need to add data to the Pipeline datasheet. The Pipeline datasheet is a built-in SyncroSim datasheet, meaning that it comes with every SyncroSim library regardless of which packages that library uses. The Pipeline datasheet determines which transformer stages the scenario will run and in which order. We use the term "transformers" because these are scripts that transform input data into output data. Use the code below to assign the Pipeline datasheet to a new data frame variable and check the values required by the datasheet.

# Assign contents of the Pipeline datasheet to an R data frame
myPipeline <- datasheet(myScenario,
                        name = "core_Pipeline")

# Check the columns of the Pipeline data frame
str(myPipeline)

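The Pipeline datasheet requires 2 values: StageNameId (the transformer to run) and RunOrder (the position of that transformer in the run sequence).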

Below, we use the addRow() and saveDatasheet() functions to update the Pipeline datasheet with the transformer(s) we want to run and the order in which we want to run them. In this case, there is only a single transformer available from the helloworldTime package, called "Hello World Time (R)", so we will add this transformer to the data frame and set the RunOrder to 1.

# Create Pipeline data and add it to the Pipeline data frame
myPipelineRow <- data.frame(StageNameId = "Hello World Time (R)", RunOrder = 1)
myPipeline <- addRow(myPipeline, myPipelineRow)

# Check values
myPipeline

# Save Pipeline R data frame to a SyncroSim Datasheet
saveDatasheet(ssimObject = myScenario, 
              data = myPipeline,
              name = "core_Pipeline")

Configuring the Run Control datasheet

There is one other datasheet that we need to configure for our package to run. The Run Control datasheet provides information about how many timesteps to use in the model. Here, we set the minimum and maximum timesteps for our model. We'll add this information to an R data frame and then add it to the Run Control datasheet using addRow(). We need to specify data for the following 2 columns: MinimumTimestep and MaximumTimestep.

# Assign contents of the run control datasheet to an R data frame
runSettings <- datasheet(myScenario, name = "helloworldTime_RunControl")

# Check the columns of the run control data frame
str(runSettings)

# Create run control data and add it to the run control data frame
runSettingsRow <- data.frame(MinimumTimestep = 1,
                             MaximumTimestep = 10)
runSettings <- addRow(runSettings, runSettingsRow)

# Check values
runSettings

# Save run control R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario, data = runSettings,
              name = "helloworldTime_RunControl")

Run scenarios

Setting run parameters with run()

We will now run our scenarios using the run() function in rsyncrosim, starting with the first scenario we created ("My first scenario").

# Run the first scenario we created
myResultScenario <- run(myScenario)

Checking the run log with runLog()

For more information about the run, use the runLog() function, whose only argument is the result scenario.

# Get run details for the first result scenario
runLog(myResultScenario)

Note: if your scenario fails to run, it will still produce a result scenario that you can use the runLog() function on to see more information about why the run failed.

View results

Result scenarios

A result scenario is generated when a scenario is run, and is an exact copy of the original scenario (i.e. it contains the original scenario's values for all Inputs datasheets). The result scenario is passed to the transformer in order to generate model output, with the results of the transformer's calculations then being added to the result scenario as output datasheets. In this way the result scenario contains both the output of the run and a snapshot record of all the model inputs.

Check out the current scenarios in your library using the scenario() function.

# Check scenarios that currently exist in your library
scenario(myLibrary)

The first scenario is our original scenario, and the second is the result scenario with a time and date stamp of when it was run. We can also see some other information about these scenarios, such as whether the scenario is a result scenario (i.e. the isResult column).

We can also look at how the datasheets differ between the result scenario and the original scenario using the datasheet() function.

# Take a look at original scenario datasheets
datasheet(myScenario, optional = TRUE)

# Take a look at result scenario datasheets
datasheet(myResultScenario, optional = TRUE)

Looking at the data column, the Outputs datasheet does not contain any data in the original scenario, but does in the result scenario.

Viewing results with datasheet()

The next step is to view the Outputs datasheet in the result scenario that was populated from running the original scenario. We can load the result table using the datasheet() function and setting the name parameter to the Outputs datasheet.

# Results of first scenario
myOutputDataframe <- datasheet(myResultScenario,
                               name = "helloworldTime_OutputDatasheet")

# View results table
head(myOutputDataframe)

Working with multiple scenarios

You may want to compare multiple alternative scenarios that have slightly different inputs. To save time, you can copy a scenario that you've already made, give it a different name, and modify the inputs. To copy a completed scenario, use the scenario() function with the sourceScenario argument set to the name of the scenario you want to copy.

# Check which scenarios you currently have in your library
scenario(myLibrary)['Name']

# Create a new scenario as a copy of an existing scenario
myNewScenario <- scenario(ssimObject = myProject,
                          scenario = "My second scenario",
                          sourceScenario = myScenario)

# Make sure this new scenario has been added to the library
scenario(myLibrary)['Name']

To edit the new scenario, we must first load the contents of the Inputs datasheet and assign it to a new R data frame using the datasheet() function. We will set the empty argument to TRUE so that instead of getting the values from the existing scenario, we can start with an empty data frame again.

# Load empty Inputs datasheets as an R data frame
myNewInputDataframe <- datasheet(myNewScenario,
                                 name = "helloworldTime_InputDatasheet",
                                 empty=TRUE)

# Check that we have an empty data frame
str(myNewInputDataframe)

Now, all we need to do is add our data frame of values the same way we did before, using the addRow() function.

# Create input data and add it to the input data frame
newInputRow <- data.frame(m = 4, b = 10)
myNewInputDataframe <- addRow(myNewInputDataframe, newInputRow)

# View the new inputs
myNewInputDataframe

Finally, we will save the updated data frame to a SyncroSim datasheet using saveDatasheet().

# Save R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myNewScenario, 
              data = myNewInputDataframe,
              name = "helloworldTime_InputDatasheet")

We will keep the Run Control datasheet the same as the first scenario.

Run scenarios

We now have two SyncroSim scenarios. We can run all the scenarios in our project at once by telling run() which project to use and including a vector of scenarios in the scenario argument.

# Run all scenarios
myResultScenarioAll <- run(myProject,
                           scenario = c("My first scenario",
                                        "My second scenario"))

View results

The output returned from running many scenarios at once is a list of result scenario objects. To view the results, we can still use the datasheet() function; we just need to index the list for the result scenario we are interested in.

datasheet(myResultScenarioAll[2], name = "helloworldTime_OutputDatasheet")
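Once the output datasheets of two result scenarios are loaded as data frames, they can be compared side by side with base R. The sketch below uses mock data frames in place of real output, generated from y = mt + b with the slopes used in this vignette (3 and 4) and an intercept of 10; the column names Timestep and y are assumptions about the Outputs datasheet.

```r
# Mock outputs standing in for the two result scenarios (column names assumed)
firstOutput <- data.frame(Timestep = 1:10, y = 3 * (1:10) + 10)
secondOutput <- data.frame(Timestep = 1:10, y = 4 * (1:10) + 10)

# Join the two tables on Timestep, keeping one y column per scenario
comparison <- merge(firstOutput, secondOutput,
                    by = "Timestep",
                    suffixes = c("_first", "_second"))

# Difference between the scenarios at each timestep
comparison$difference <- comparison$y_second - comparison$y_first
head(comparison, 3)
```

Because the slopes differ by 1 and the intercepts match, the difference between the two mock scenarios equals the timestep itself.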

Identifying the parent scenario of a result scenario using parentId()

If you have many alternative scenarios and many result scenarios, you can always find the parent scenario that was run to generate a given result scenario using the rsyncrosim function parentId().

parentId(myResultScenarioAll[[1]])
parentId(myResultScenarioAll[[2]])

Access model metadata

Getting library information using info()

Retrieve library information:

info(myLibrary)

Getting information of any ssimObject

The following functions can be used to get useful information about a library, project, or scenario: name(), owner(), dateModified(), readOnly(), filepath(), and description().

You can also find identification numbers of projects or scenarios using the following functions: projectId() and scenarioId().

Backup your library

Once you have finished running your models, you may want to back up the inputs and results into a zipped .backup subfolder. First, we want to modify the library Backup datasheet to allow the backup of external model data. Since this datasheet is part of the built-in SyncroSim core, the name of the datasheet has the prefix "core". We can get a list of all the datasheets with a library scope using the datasheet() function on an SsimLibrary object.

# Find all library-scoped datasheets
datasheet(myLibrary)

# Get the current values for the library's Backup datasheet
myDataframe <- datasheet(myLibrary, name = "core_Backup")   

# View current values for the library's Backup datasheet
myDataframe

# Add output to the library's Backup datasheet and save
myDataframe$IncludeData <- TRUE 
saveDatasheet(myLibrary, data = myDataframe, name = "core_Backup")

# Check to make sure IncludeData is now TRUE
datasheet(myLibrary, "core_Backup")

Now, you can use the backup() function from rsyncrosim to back up a library, project, or scenario.

backup(myLibrary)

rsyncrosim and SyncroSim Studio

It can be useful to work in both rsyncrosim and SyncroSim Studio at the same time. You can easily modify datasheets and run scenarios in rsyncrosim, while simultaneously refreshing the library and plotting outputs in SyncroSim Studio as you go. To sync the library in SyncroSim Studio with the latest changes from the rsyncrosim code, click the refresh icon (circled in red below) in the upper tool bar of SyncroSim Studio.

[Figure: Using rsyncrosim with SyncroSim Studio]



syncrosim/rsyncrosim documentation built on Oct. 18, 2024, 1:29 a.m.