This vignette explores the design of the transformation workflow and functions in hermes.

Problems to solve

Plan

Control function

Here we add parameters that are used for the 2 new methods.

control_normalize <- function(log = TRUE,
                              lib_sizes = NULL,
                              prior_count = 1,
                              fit_type = "parametric") {
  assert_that(
    is.flag(log),
    is.null(lib_sizes) || is_counts_vector(lib_sizes),
    is.number(prior_count) && prior_count >= 0,
    is.string(fit_type)
  )
  list(
    log = log,
    lib_sizes = lib_sizes,
    prior_count = prior_count,
    fit_type = fit_type
  )
}

Method

  normalize_update = function(object,
                        methods = c("vst", "rlog"),
                        control = control_normalize(),
                        ...) {
    method_choices <- c( "vst", "rlog")
    assert_that(all(methods %in% method_choices))
    methods <- match.arg(methods, choices = method_choices, several.ok = TRUE)
    for (method in methods) {
      fun_name <- paste0("h_", method)
      method_result <- eval(utils.nest::call_with_colon(
        fun_name,
        object = object,
        control = control
      ))
      assay(object, method) <- method_result
    }
    metadata(object) <- c(metadata(object), list(control_normalize = control))
    object
  }

Helper functions

h_vst <- function(object, 
                  control = control_normalize()) {
  assert_that(
    is_hermes_data(object),
    is_list_with(control, "fit_type")
  )
  DESeq2::varianceStabilizingTransformation(
    counts(object), 
    fitType = control$fit_type
  )
} 

h_rlog <- function(object, 
                   control = control_normalize()) {
  assert_that(
    is_hermes_data(object),
    is_list_with(control, "fit_type")
  )
  DESeq2::rlog(
    counts(object),
    fitType = control$fit_type
  )
} 

Test

library(hermes)
object <- hermes_data
h_vst(object)
h_rlog(object)
aa <- normalize_update(object)

assay(aa, "vst")
assay(aa, "rlog")

The results look good but a little bit slow.

See also the help page for rlog:

If rlog is run on data with number of samples in [30-49] it will print a message that it may take a few minutes, if the number of samples is 50 or larger, it will print a message that it may take a "long time", and in both cases, it will mention that the vst is a much faster transformation.

Therefore this rlog should not be in the default for normalize(), only vst.

In addition, there are quite a few -Inf values for rlog. But it seems they are only shown in printing, but are not there:

any(!is.finite(assay(aa, "rlog")))

does not show any infinite values. So seems ok.



insightsengineering/hermes documentation built on Dec. 15, 2024, 8:07 a.m.