This vignette explores the introduction of mapping arguments to the HermesData
constructor.
The HermesData
constructor currently just takes a SummarizedExperiment
object and no
other arguments. First it checks for missing required columns in the rowData
and colData
and fills in missing ones with NA
s. Second it derives and checks the gene ID prefix, and then
hands over to the internal default constructors which don't do anything besides validating the
object. The validation includes checks for the counts assay (e.g. it needs to be called counts
),
the required columns and the names and gene ID prefix.
Often the user will have a SummarizedExperiment
object with counts data, but there can be
different names used:
counts
but e.g. count
.SYMBOL
instead of symbol
,
LowDepthFlag
instead of low_depth_flag
etc.Now the user could manually map the names to be aligned with what the constructor needs or to
avoid duplication of columns internally (e.g. filling a symbol
column with NA
s just because
the SYMBOL
column cannot be used). But that would be tedious. Therefore we want to make it easier
for the user to have this mapping been done as part of the constructor work.
So the interface could look like this:
HermesData(object, map_names = list( row_data = c(symbol = "HGNC"), col_data = c(low_depth_flag = "LowDepthFlag"), assay = c(counts = "count") ))
This would be one map_names
argument taking a list of three character vectors
row_data
, col_data
and assay
, where the left-hand (name) side is the target
and the right-hand (value) side is the source.
If one of these vectors is NULL
that is not specified in the list then nothing
will be renamed there.
We would not need to be strict about which targets are used, it would not need
to only be required names in HermesData
objects but also renaming to other targets would
be ok.
We could also work with three separate arguments:
HermesData(object, row_data_map = c(symbol = "HGNC"), col_data_map = c(low_depth_flag = "LowDepthFlag"), assay_map = c(counts = "count") )
But that looks a bit too clunky.
In the constructor, we would do the renaming as the first action, before checking for required columns. Something like this:
library(hermes) library(checkmate) HermesData2 <- function(object, map_names = list()) { assert_that( is_class(object, "SummarizedExperiment"), not_empty(assays(object)) ) # Rename in separate function. object <- rename(object, map_names = map_names) # Other actions here. # Then return. .HermesData(object) }
Actually, looking at it like this - we just want a specific rename
method:
h_map_pos <- function(names, map) { assert_character(map, min.chars = 1L, any.missing = FALSE, unique = TRUE, names = "unique") assert_subset(map, names) match(map, names) } setMethod( "rename", signature = "SummarizedExperiment", definition = function(x, mapping) { assert_list(mapping, types = "character") assert_subset(names(mapping), c("row_data", "col_data", "assay")) if (!is.null(row_map <- mapping$row_data)) { row_data <- rowData(x) col_names <- names(row_data) col_pos <- h_map_pos(names = col_names, map = row_map) names(rowData(x))[col_pos] <- names(row_map) } if (!is.null(col_map <- mapping$col_data)) { col_data <- colData(x) col_names <- names(col_data) col_pos <- h_map_pos(names = col_names, map = col_map) names(colData(x))[col_pos] <- names(col_map) } if (!is.null(assay_map <- mapping$assay)) { assay_names <- assayNames(x) assay_pos <- h_map_pos(names = assay_names, map = assay_map) assayNames(x)[assay_pos] <- names(assay_map) } x } )
Let's try this out:
object <- summarized_experiment assayNames(object) <- "count" object2 <- rename(object, mapping = list( row_data = c(symbol = "HGNC"), col_data = c(low_depth_flag = "LowDepthFlag"), assay = c(counts = "count") )) names(rowData(object2)) names(colData(object2)) assayNames(object2)
This is nice. The user can use this in the pipeline from the original input to the final
HermesData
object in a very transparent manner. So we will not make this a constructor argument
but a separate rename method instead.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.