This is a mouse liver single-cell dataset comprising 4 different experiments with 3 different protocols.
Integration challenge
After scMerge integration, the cell trajectory was estimated with a higher degree of biological interpretability.

| Name  | ID       | Author | DOI or URL                | Protocol   | Organism | Tissue | # of cell types | # of cells | # of batches |
|-------|----------|--------|---------------------------|------------|----------|--------|-----------------|------------|--------------|
| Liver | GSE87795 | Su     | 10.1186/s12864-017-4342-x | SMARTer/C1 | Mouse    | Liver  | 6               | 1236       |              |
|       | GSE90047 | Yang   | 10.1002/hep.29353         | Smart-Seq2 |          |        | 2               |            |              |
|       | GSE87038 | Dong   | 10.1186/s13059-018-1416-2 | STRT-seq   |          |        | 3               |            |              |
|       | GSE96981 | Camp   | 10.1038/nature22796       | SMARTer/C1 |          |        | 3               |            |              |
scMerge article: Main Figure 3b and Supplementary Figure 8.
Some remarks:
To combine the four liver scRNA-seq datasets, we used a semi-supervised approach that treats the developmental stages as the wanted variation when identifying pseudo-replicates. Note that the cells of the four liver datasets come from different fetal mouse liver developmental stages (E9.5-E17.5). To identify the pseudo-replicates corresponding to hepatoblasts, hepatocytes and cholangiocytes, we used three known markers of hepatoblasts and cholangiocytes, Alb, Afp and Epcam, to guide the scMerge algorithm. The sets of pseudo-replicates that highly express the markers are further split according to developmental stage.
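The marker-guided grouping described above can be illustrated with a small toy sketch. This is not the scMerge implementation: the function name `pseudo_replicate_groups`, the mean-expression `threshold`, the rule that a stage must be seen in at least two batches, and all cell values below are assumptions made up for illustration. Only the marker names (Alb, Afp, Epcam) and the idea of splitting marker-high cells by developmental stage come from the text.

```python
from collections import defaultdict

# Marker genes named in the text.
MARKERS = ("Alb", "Afp", "Epcam")


def pseudo_replicate_groups(cells, threshold=1.0):
    """Group marker-high cells by developmental stage (toy illustration).

    cells: list of dicts with keys 'id', 'batch', 'stage' and 'expr'
           (a dict mapping gene name to an expression value).
    threshold: hypothetical cutoff on the mean marker expression.

    Returns {stage: {batch: [cell ids]}}, keeping only stages that are
    represented in at least two batches (an assumption of this sketch).
    """
    groups = defaultdict(lambda: defaultdict(list))
    for cell in cells:
        # Average expression over the three markers; missing genes count as 0.
        mean_marker = sum(cell["expr"].get(g, 0.0) for g in MARKERS) / len(MARKERS)
        if mean_marker >= threshold:
            groups[cell["stage"]][cell["batch"]].append(cell["id"])
    return {
        stage: dict(by_batch)
        for stage, by_batch in groups.items()
        if len(by_batch) >= 2
    }


# Made-up cells; batch names and expression values are purely illustrative.
toy_cells = [
    {"id": "c1", "batch": "batch_A", "stage": "E11.5",
     "expr": {"Alb": 3.0, "Afp": 2.5, "Epcam": 0.5}},
    {"id": "c2", "batch": "batch_B", "stage": "E11.5",
     "expr": {"Alb": 2.0, "Afp": 2.0, "Epcam": 1.0}},
    {"id": "c3", "batch": "batch_B", "stage": "E11.5",
     "expr": {"Alb": 0.1, "Afp": 0.0, "Epcam": 0.0}},
    {"id": "c4", "batch": "batch_C", "stage": "E13.5",
     "expr": {"Alb": 4.0, "Afp": 3.0, "Epcam": 0.2}},
]

toy_groups = pseudo_replicate_groups(toy_cells)
```

In this toy input, `c3` is excluded because its marker expression is low, and `c4` is dropped because its stage appears in only one batch, so it cannot be matched to cells from another batch.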
In addition to integrating the 4 scRNA-seq datasets, scMerge revealed a novel result. We constructed the cell trajectories with the cells from the E17.5 time point of GSE90047 removed. The trajectory obtained with scMerge was the most consistent with that of the full Liver data collection and agrees with current literature, while other methods tended to generate extraneous branches on this subset of the Liver data collection.
We further ran SC3 on the scMerge integrated data (k=9). The SC3 clustering results have high concordance with the original cell types.
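The text does not say which measure was used to assess concordance. One common choice for comparing cluster assignments against reference cell types is the adjusted Rand index (ARI). The sketch below assumes ARI and uses made-up labels; the function name `adjusted_rand_index` is defined here for illustration and is not part of SC3 or scMerge.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as perfect agreement

    # Contingency counts and the marginal cluster sizes.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of cell pairs placed together in both, or in each, labeling.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_pairs - expected) / (max_index - expected)


# Made-up labels: the clustering recovers the cell types up to renaming.
cell_types = ["hepatoblast", "hepatoblast", "cholangiocyte", "cholangiocyte"]
clusters = [1, 1, 2, 2]
ari = adjusted_rand_index(cell_types, clusters)
```

An ARI of 1 means the two labelings agree exactly up to a renaming of clusters, while values near 0 indicate agreement no better than chance.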
scMerge data
Data availability: Mouse Liver Data (in RData format)
scMerge parameters for integration: