
Single-cell RNA sequencing has become a common approach for tracing the developmental processes of cells. However, exogenous barcodes give a more direct readout than predictions from expression profiles, and as gene-editing technology matures, combining it with exogenous barcodes can generate richer dynamic information for single cells. In this application note, we introduce LinTInd, an R package that reconstructs a tree from alleles generated by the CRISPR genome-editing tool over a moderate time period, based on the order in which editing occurs. For scRNA-seq data, LinTInd can also quantify the similarity between clusters in three ways.


Via GitHub


Via Bioconductor

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("LinTInd")


Import data

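The input for LinTInd consists of three required files:

- a sequence file (`CB_UMI` in the example data)
- a reference sequence in FASTA format (`V3.fasta`)
- a cut-site position file (`V3.cutSites`)

and an optional file:

- a cell type annotation file (`celltype.tsv`)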

```{R,message=FALSE}
library(LinTInd)

# Locate the example files shipped with the package
extdata  <- system.file("extdata", package = "LinTInd")
data     <- file.path(extdata, "CB_UMI")
fafile   <- file.path(extdata, "V3.fasta")
cutsite  <- file.path(extdata, "V3.cutSites")
celltype <- file.path(extdata, "celltype.tsv")

# Read the sequence table, reference, cut sites and cell type annotation
data     <- read.table(data, sep = "\t", header = TRUE)
ref      <- ReadFasta(fafile)
cutsite  <- read.table(cutsite, col.names = c("indx", "start", "end"))
celltype <- read.table(celltype, header = TRUE, stringsAsFactors = FALSE)
```

For the sequence file, only the column containing the read strings is required; the cell barcodes and UMIs are both optional.


Array identification and indel visualization

In the first step, we use `FindIndel()` to align the reads and find indels; the function `IndelForm()` then generates an array-form string for each read.

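```{R find indels and generate array-form strings, message=FALSE}
# Align the reads to the reference and find indels
scarinfo <- FindIndel(data = data, scarfull = ref, scar = cutsite,
                      indel.coverage = "All", type = "test", cln = 1)
# Generate an array-form string for each read
scarinfo <- IndelForm(scarinfo, cln = 1)
```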

Then, for single-cell sequencing, we define a final array-form string for each cell using `IndelIdents()`. Three methods are provided:

- *"reads.num"* (default): find the array-form string supported by the most reads in a cell
- *"umi.num"*: find the array-form string supported by the most UMIs in a cell
- *"consensus"*: find the consensus sequences in each cell, and then generate array-form strings from the new reads
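To make the difference between the first two methods concrete, here is a minimal base-R sketch on a toy table. It does not call LinTInd and is not the package's implementation: the column names (`cell`, `umi`, `array`) and the helper `pick_array()` are assumptions made for illustration, and ties are simply resolved by taking the first maximum.

```r
# Toy reads, one row per read. Column names are hypothetical, not LinTInd's.
reads <- data.frame(
  cell  = c("c1", "c1", "c1", "c1", "c1", "c2", "c2"),
  umi   = c("u1", "u1", "u1", "u2", "u3", "u1", "u2"),
  array = c("A",  "A",  "A",  "B",  "B",  "C",  "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: choose one array-form string per cell.
pick_array <- function(reads, method = c("reads.num", "umi.num")) {
  method <- match.arg(method)
  if (method == "umi.num") {
    # Count each (cell, UMI, array) combination only once.
    reads <- unique(reads[, c("cell", "umi", "array")])
  }
  # For each cell, return the array with the highest support.
  vapply(split(reads$array, reads$cell), function(x) {
    counts <- table(x)
    names(counts)[which.max(counts)]
  }, character(1))
}

by_reads <- pick_array(reads, "reads.num")
by_umis  <- pick_array(reads, "umi.num")
```

In this toy case, cell `c1` is assigned `A` by read count but `B` by UMI count, because three of its five reads share a single UMI.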

For bulk sequencing, this step generates a "cell barcode" for each read.
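One way to picture this is that every read is treated as its own "cell". The base-R sketch below assigns a unique placeholder barcode to each read of a toy table; the column names and the barcode format are assumptions for illustration, not what LinTInd produces.

```r
# Toy bulk reads; the column name `array` is hypothetical.
bulk <- data.frame(array = c("A", "B", "A"), stringsAsFactors = FALSE)

# Give every read its own placeholder barcode so it can be handled like a cell.
bulk$barcode <- sprintf("read%04d", seq_len(nrow(bulk)))
```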


After defining the indels for each cell, we can use `IndelPlot()` to visualise them.

IndelPlot(cellsinfo = cellsinfo)

Indel extraction and similarity calculation

We can use the function `TagProcess()` to extract indels for cells/reads. The parameter `Cells` is optional.


If an annotation is provided for each cell, we can also use `TagDist()` to calculate the relationship between groups in one of three ways; the call below uses `method = "Jaccard"`.

A heatmap of the result will be saved as a PDF file.

tag_dist=TagDist(tag,method = "Jaccard")
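For intuition about the `"Jaccard"` option, the sketch below computes a Jaccard similarity between groups from the sets of indels observed in each group. It is plain base R on made-up data and is not LinTInd's implementation; the names `indels_by_group` and `jaccard()`, the group labels, and the indel tag strings are all assumptions.

```r
# Hypothetical input: the set of indel tags observed in each cell group.
indels_by_group <- list(
  neuron = c("2D+10", "5I+31", "3D+50"),
  glia   = c("2D+10", "3D+50", "7D+80"),
  muscle = c("9I+12")
)

# Jaccard similarity: shared elements divided by all distinct elements.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Pairwise similarity matrix over all groups.
groups <- names(indels_by_group)
sim <- outer(groups, groups, Vectorize(function(g1, g2) {
  jaccard(indels_by_group[[g1]], indels_by_group[[g2]])
}))
dimnames(sim) <- list(groups, groups)
```

Here `neuron` and `glia` share two of four distinct indels, so their similarity is 0.5, while `muscle` shares none with either group.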

Tree reconstruction

In the last part, we can use `BuildTree()` to generate a tree describing how the arrays were generated.


Finally, we can use the function `PlotTree()` to visualise the tree created above.

plotinfo<-PlotTree(treeinfo = treeinfo,data.extract = "TRUE",annotation = "TRUE")

Session Info


mana-W/LinTInd documentation built on Feb. 14, 2022, 10:13 a.m.