The following is a guide for development of wrapper classes for single-cell data structures. For this guide, we will focus on writing a wrapper class for Seurat objects which supports the cells, cell-sets, and expression-matrix Vitessce data types.

To begin, we can write a skeleton for the class, which contains functions that we will fill in. Here, we start by overriding the convert_and_save function of the parent AbstractWrapper class.

The initialize constructor function takes the parameter obj which will be the Seurat of interest (i.e., the object that we are wrapping).

The convert_and_save function performs any required data conversion steps, creates the web server routes, and creates the corresponding file definition creator functions.

Complete file definitions cannot be finalized at this stage because the file definitions depend on the base_url (the URL on which the files will ultimately be served, typically "http://localhost:8000").

SeuratWrapper <- R6::R6Class("SeuratWrapper",
  inherit = AbstractWrapper,
  public = list(
    obj = NULL,
    initialize = function(obj, ...) {
      super$initialize(...)
      self$obj <- obj
    },
    convert_and_save = function(dataset_uid, obj_i) {
      super$convert_and_save(dataset_uid, obj_i)
      # TODO
    }
  )
)

Cells

We can begin to create the output files and web server route details for the cells data type. Files with the cells data type can contain cell-level observations, such as dimensionality reduction coordinates for each cell.

For now, we can create a new function called create_cells_list which we will fill in later.

The make_cells_file_def_creator is a function which creates and returns a new "file definition creator" function. All file definition creator functions must take the base_url parameter and return a complete file definition. File definitions should be lists with named values:

In convert_and_save we append the new file definition creator to the list self$file_def_creators and we append a new web server route to self$routes.

SeuratWrapper <- R6::R6Class("SeuratWrapper",
  inherit = AbstractWrapper,
  public = list(
    obj = NULL,
    initialize = function(obj, ...) {
      super$initialize(...)
      self$obj <- obj
    },
    create_cells_list = function() {
        # TODO
    },
    make_cells_file_def_creator = function(dataset_uid, obj_i) {
      get_cells <- function(base_url) {
        file_def <- list(
          type = DataType$CELLS,
          fileType = FileType$CELLS_JSON,
          url = super$get_url(base_url, dataset_uid, obj_i, "cells.json")
        )
        return(file_def)
      }
      return(get_cells)
    },
    convert_and_save = function(dataset_uid, obj_i) {
      super$convert_and_save(dataset_uid, obj_i)

      # Get list representations of the data.
      cells_list <- self$create_cells_list()

      # Convert the lists to JSON strings.
      cells_json <- jsonlite::toJSON(cells_list)

      # Save the JSON strings to JSON files.
      write(cells_json, file = self$get_out_dir(dataset_uid, obj_i, "cells.json"))

      # Get the file definition creator functions.
      cells_file_creator <- self$make_cells_file_def_creator(dataset_uid, obj_i)

      # Append the new file definition creator function to the main list.
      self$file_def_creators <- append(self$file_def_creators, cells_file_creator)

      # Append a new web server route object which corresponds to the directory of JSON files to be served.
      self$routes <- append(self$routes, self$get_out_dir_route(dataset_uid, obj_i))
    }
  )
)

Next, we want to fill in the create_cells_list function. This function should return an R list which will be automatically converted to a JSON object by jsonlite.

For reference:

We know that we need to obtain the following from the Seurat object:

When we inspect a Seurat object in the R environment, we can see that it has the type S4 object of class Seurat.

To access the values in an S4 object, we can use slot(obj, "key") where "key" is replaced by the key for the part of the object that we want to access.

Inspecting the object further, we can see that:

To generalize our function, we can get a list of names of each dimensionality reduction available with names(slot(obj, "reductions")).

We can get a list of cell IDs with names(slot(obj, "active.ident")).

Then we can iterate over the cell IDs and set up a new empty object with obj_list(). Note obj_list() returns an empty R list that is always translated to a JSON object (rather than the base R list() which is translated to a JSON array when empty).

Then we can iterate over each available dimensionality reduction and cell. We obtain the cell's (x,y) coordinates with embedding_matrix[cell_id, 1:2] where embedding_matrix is the dimensionality reduction matrix. For example, if the dimensionality reduction is "pca" then the matrix can be accessed at slot(slot(obj, "reductions")[["pca"]], "cell.embeddings").

Finally, we return the R list we created.

SeuratWrapper <- R6::R6Class("SeuratWrapper",
  inherit = AbstractWrapper,
  public = list(
    obj = NULL,
    initialize = function(obj, ...) {
      super$initialize(...)
      self$obj <- obj
    },
    create_cells_list = function() {
        obj <- self$obj
        embeddings <- slot(obj, "reductions")
        available_embeddings <- names(embeddings)

        cell_ids <- names(slot(obj, "active.ident"))
        cells_list <- obj_list()
        for(cell_id in cell_ids) {
            cells_list[[cell_id]] <- list(
                mappings = obj_list()
            )
        }
        for(embedding_name in available_embeddings) {
            embedding <- embeddings[[embedding_name]]
            embedding_matrix <- slot(embedding, "cell.embeddings")
            for(cell_id in cell_ids) {
                cells_list[[cell_id]]$mappings[[embedding_name]] <- unname(embedding_matrix[cell_id, 1:2])
            }
        }

        return(cells_list)
    },
    make_cells_file_def_creator = function(dataset_uid, obj_i) {
      get_cells <- function(base_url) {
        file_def <- list(
          type = DataType$CELLS,
          fileType = FileType$CELLS_JSON,
          url = super$get_url(base_url, dataset_uid, obj_i, "cells.json")
        )
        return(file_def)
      }
      return(get_cells)
    },
    convert_and_save = function(dataset_uid, obj_i) {
      super$convert_and_save(dataset_uid, obj_i)

      # Get list representations of the data.
      cells_list <- self$create_cells_list()

      # Convert the lists to JSON strings.
      cells_json <- jsonlite::toJSON(cells_list)

      # Save the JSON strings to JSON files.
      write(cells_json, file = self$get_out_dir(dataset_uid, obj_i, "cells.json"))

      # Get the file definition creator functions.
      cells_file_creator <- self$make_cells_file_def_creator(dataset_uid, obj_i)

      # Append the new file definition creator function to the main list.
      self$file_def_creators <- append(self$file_def_creators, cells_file_creator)

      # Append a new web server route object which corresponds to the directory of JSON files to be served.
      self$routes <- append(self$routes, self$get_out_dir_route(dataset_uid, obj_i))
    }
  )
)


vitessce/vitessceR documentation built on Oct. 12, 2024, 12:24 a.m.