In this example, we will be analyzing two scATAC-seq datasets (5K and 10K) and one scRNA-seq dataset from PBMC. All three datasets are freely available from 10X genomics. All the data used in this study can be downloaded here.
In detail, we will be performing the following analysis:
Step 0. Data Download
We will start from quality control file singlecell.csv
generated by cell-ranger ATAC pipeline and snap
file generated using snaptools. See here about how to create a snap file. We will download the snap file(See here).
Step 1. Barcode selection First, we select the high-quality barcodes based on two major criteria: 1) number of unique fragments; 2) fragments in promoter ratio;
> library(SnapATAC);
> snap.files = c(
"atac_pbmc_5k_nextgem.snap",
"atac_pbmc_10k_nextgem.snap"
);
> sample.names = c(
"PBMC 5K",
"PBMC 10K"
);
> barcode.files = c(
"atac_pbmc_5k_nextgem_singlecell.csv",
"atac_pbmc_10k_nextgem_singlecell.csv"
);
> x.sp.ls = lapply(seq(snap.files), function(i){
createSnap(
file=snap.files[i],
sample=sample.names[i]
);
})
> names(x.sp.ls) = sample.names;
> barcode.ls = lapply(seq(snap.files), function(i){
barcodes = read.csv(
barcode.files[i],
head=TRUE
);
# remove NO BAROCDE line
barcodes = barcodes[2:nrow(barcodes),];
barcodes$logUMI = log10(barcodes$passed_filters + 1);
barcodes$promoter_ratio = (barcodes$promoter_region_fragments+1) / (barcodes$passed_filters + 1);
barcodes
})
> plots = lapply(seq(snap.files), function(i){
p1 = ggplot(
barcode.ls[[i]],
aes(x=logUMI, y=promoter_ratio)) +
geom_point(size=0.3, col="grey") +
theme_classic() +
ggtitle(sample.names[[i]]) +
ylim(0, 1) + xlim(0, 6) +
labs(x = "log10(UMI)", y="promoter ratio")
p1
})
> plots
> x.sp.ls
## $`PBMC 5K`
## number of barcodes: 20000
## number of bins: 0
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
##
## $`PBMC 10K`
## number of barcodes: 20000
## number of bins: 0
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
# for both datasets, we identify usable barcodes using [3.5-5] for log10(UMI) and [0.4-0.8] for promoter ratio as cutoff.
> cutoff.logUMI.low = c(3.5, 3.5);
> cutoff.logUMI.high = c(5, 5);
> cutoff.FRIP.low = c(0.4, 0.4);
> cutoff.FRIP.high = c(0.8, 0.8);
> barcode.ls = lapply(seq(snap.files), function(i){
barcodes = barcode.ls[[i]];
idx = which(
barcodes$logUMI >= cutoff.logUMI.low[i] &
barcodes$logUMI <= cutoff.logUMI.high[i] &
barcodes$promoter_ratio >= cutoff.FRIP.low[i] &
barcodes$promoter_ratio <= cutoff.FRIP.high[i]
);
barcodes[idx,]
});
> x.sp.ls = lapply(seq(snap.files), function(i){
barcodes = barcode.ls[[i]];
x.sp = x.sp.ls[[i]];
barcode.shared = intersect(x.sp@barcode, barcodes$barcode);
x.sp = x.sp[match(barcode.shared, x.sp@barcode),];
barcodes = barcodes[match(barcode.shared, barcodes$barcode),];
x.sp@metaData = barcodes;
x.sp
})
> names(x.sp.ls) = sample.names;
> x.sp.ls
## $`PBMC 5K`
## number of barcodes: 4526
## number of bins: 0
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
##
## $`PBMC 10K`
## number of barcodes: 9039
## number of bins: 0
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
# combine two snap object
> x.sp = Reduce(snapRbind, x.sp.ls);
> x.sp@metaData["sample"] = x.sp@sample;
> x.sp
## number of barcodes: 13565
## number of bins: 0
## number of genes: 0
## number of peaks: 0
> table(x.sp@sample);
## PBMC 10K PBMC 5K
## 9039 4526
Step 2. Add cell-by-bin matrix
Next, we add the cell-by-bin matrix of 5kb resolution to the snap object. This function will automatically read the cell-by-bin matrix from two snap files and add the concatenated matrix to bmat
attribute of snap object.
> x.sp = addBmatToSnap(x.sp, bin.size=5000);
Step 3. Matrix binarization We will convert the cell-by-bin count matrix to a binary matrix. Some items in the count matrix have abnormally high coverage perhaps due to the alignment errors. Therefore, we next remove top 0.1% items in the count matrix and then convert the remaining non-zero values to 1.
> x.sp = makeBinary(x.sp, mat="bmat");
Step 4. Bin filtering First, we filter out any bins overlapping with the ENCODE blacklist to prevent from potential artifacts.
> library(GenomicRanges);
> black_list = read.table("hg19.blacklist.bed.gz");
> black_list.gr = GRanges(
black_list[,1],
IRanges(black_list[,2], black_list[,3])
);
> idy = queryHits(
findOverlaps(x.sp@feature, black_list.gr)
);
> if(length(idy) > 0){
x.sp = x.sp[,-idy, mat="bmat"];
};
> x.sp
## number of barcodes: 13565
## number of bins: 624794
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
Second, we remove unwanted chromosomes.
> chr.exclude = seqlevels(x.sp@feature)[grep("random|chrM", seqlevels(x.sp@feature))];
> idy = grep(paste(chr.exclude, collapse="|"), x.sp@feature);
> if(length(idy) > 0){
x.sp = x.sp[,-idy, mat="bmat"]
};
> x.sp
## number of barcodes: 13565
## number of bins: 624297
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
Third, the coverage of bins roughly obeys a log normal distribution. We remove the top 5% bins that overlap with invariant features such as the house keeping gene promoters.
> bin.cov = log10(Matrix::colSums(x.sp@bmat)+1);
> hist(
bin.cov[bin.cov > 0],
xlab="log10(bin cov)",
main="log10(Bin Cov)",
col="lightblue",
xlim=c(0, 5)
);
> bin.cutoff = quantile(bin.cov[bin.cov > 0], 0.95);
> idy = which(bin.cov <= bin.cutoff & bin.cov > 0);
> x.sp = x.sp[, idy, mat="bmat"];
> x.sp
## number of barcodes: 13565
## number of bins: 534985
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
Next, we will further remove any cells of bin coverage less than 1,000. The rational behind this is that some cells may have high number of unique fragments but end up with low bin coverage after filtering. This step is optional but highly recommanded.
> idx = which(Matrix::rowSums(x.sp@bmat) > 1000);
> x.sp = x.sp[idx,];
> x.sp
## number of barcodes: 13434
## number of bins: 534985
## number of genes: 0
## number of peaks: 0
## number of motifs: 0
Step 5. Dimentionality reduction SnapATAC applies diffusion maps algorithm, a nonlinear dimensionality reduction technique that discovers low-dimension manifold by performing random walk on the data and is highly robust to noise and perturbation.
However, the computational cost of the diffusion maps algorithm scales exponentially with the increase of number of cells. To overcome this limitation, here we combine the Nyström method (a sampling technique) and diffusion maps to present Nyström Landmark diffusion map to generate the low-dimentional embedding for large-scale dataset.
A Nyström landmark diffusion maps algorithm includes three major steps:
In this example, we will sample 10,000 cells as landmarks and project the remaining query cells onto the diffusion maps embedding of landmarks.
> row.covs.dens <- density(
x = x.sp@metaData[,"logUMI"],
bw = 'nrd', adjust = 1
);
> sampling_prob <- 1 / (approx(x = row.covs.dens$x, y = row.covs.dens$y, xout = x.sp@metaData[,"logUMI"])$y + .Machine$double.eps);
> set.seed(1);
> idx.landmark.ds <- sort(sample(x = seq(nrow(x.sp)), size = 10000, prob = sampling_prob));
Split the x.sp
into landmark (x.landmark.sp
) and query (x.query.sp
) cells.
> x.landmark.sp = x.sp[idx.landmark.ds,];
> x.query.sp = x.sp[-idx.landmark.ds,];
Run diffusion maps on the landmark cells.
> x.landmark.sp = runDiffusionMaps(
obj= x.landmark.sp,
input.mat="bmat",
num.eigs=50
);
> x.landmark.sp@metaData$landmark = 1;
Porject query cells to landmarks.
> x.query.sp = runDiffusionMapsExtension(
obj1=x.landmark.sp,
obj2=x.query.sp,
input.mat="bmat"
);
> x.query.sp@metaData$landmark = 0;
Combine landmark and query cells. Note: To merge snap objects, all the matrix (bmat, gmat, pmat) and metaData must be of the same number of columns between snap objects.
> x.sp = snapRbind(x.landmark.sp, x.query.sp);
> x.sp = x.sp[order(x.sp@metaData[,"sample"])]; #IMPORTANT
Step 6. Determine significant components We next determine the number of eigen-vectors to include for downstream analysis. We use an ad hoc method by simply looking at a pairwise plot and select the number of eigen vectors that the scatter plot starts looking like a blob. In the below example, we choose the first 15 eigen vectors.
> plotDimReductPW(
obj=x.sp,
eigs.dims=1:50,
point.size=0.3,
point.color="grey",
point.shape=19,
point.alpha=0.6,
down.sample=5000,
pdf.file.name=NULL,
pdf.height=7,
pdf.width=7
);
Step 7. Graph-based clustering Using the selected significant components, we next construct a K Nearest Neighbor (KNN) Graph. Each cell is a node and the k-nearest neighbors of each cell are identified according to the Euclidian distance and edges are draw between neighbors in the graph.
> x.sp = runKNN(
obj=x.sp,
eigs.dims=1:20,
k=15
);
> library(leiden);
> x.sp=runCluster(
obj=x.sp,
tmp.folder=tempdir(),
louvain.lib="leiden",
seed.use=10,
resolution=0.7
);
Step 8. Visualization SnapATAC visualizes and explores the data using tSNE (FI-tsne) and UMAP. In this example, we compute the UMAP embedding.
> library(umap);
> x.sp = runViz(
obj=x.sp,
tmp.folder=tempdir(),
dims=2,
eigs.dims=1:20,
method="umap",
seed.use=10
);
> par(mfrow = c(2, 2));
> plotViz(
obj= x.sp,
method="umap",
main="Cluster",
point.color=x.sp@cluster,
point.size=0.2,
point.shape=19,
text.add=TRUE,
text.size=1,
text.color="black",
down.sample=10000,
legend.add=FALSE
);
> plotFeatureSingle(
obj=x.sp,
feature.value=x.sp@metaData[,"logUMI"],
method="umap",
main="Read Depth",
point.size=0.2,
point.shape=19,
down.sample=10000,
quantiles=c(0.01, 0.99)
);
> plotViz(
obj= x.sp,
method="umap",
main="Sample",
point.size=0.2,
point.shape=19,
point.color=x.sp@sample,
text.add=FALSE,
text.size=1.5,
text.color="black",
down.sample=10000,
legend.add=TRUE
);
> plotViz(
obj= x.sp,
method="umap",
main="Landmark",
point.size=0.2,
point.shape=19,
point.color=x.sp@metaData[,"landmark"],
text.add=FALSE,
text.size=1.5,
text.color="black",
down.sample=10000,
legend.add=TRUE
);
Step 9. scRNA-seq based annotation
In this example, we will annotate the single cell ATAC-seq clusters based on corresponding scRNA-seq dataset. Seurat object for 10X PBMC single cell RNA-seq (pbmc_10k_v3.rds
) can be downloaded here.
> library(Seurat);
> pbmc.rna = readRDS("pbmc_10k_v3.rds");
> pbmc.rna$tech = "rna";
> variable.genes = VariableFeatures(object = pbmc.rna);
> genes.df = read.table("gencode.v19.annotation.gene.bed");
> genes.gr = GRanges(genes.df[,1], IRanges(genes.df[,2], genes.df[,3]), name=genes.df[,4]);
> genes.sel.gr = genes.gr[which(genes.gr$name %in% variable.genes)];
## reload the bmat, this is optional but highly recommanded
> x.sp = addBmatToSnap(x.sp);
> x.sp = createGmatFromMat(
obj=x.sp,
input.mat="bmat",
genes=genes.sel.gr,
do.par=TRUE,
num.cores=10
);
We next convert the snap object to Seurat object in preparation of integration.
> pbmc.atac <- snapToSeurat(
obj=x.sp,
eigs.dims=1:20,
norm=TRUE,
scale=TRUE
);
> transfer.anchors <- FindTransferAnchors(
reference = pbmc.rna,
query = pbmc.atac,
features = variable.genes,
reference.assay = "RNA",
query.assay = "ACTIVITY",
reduction = "cca"
);
> celltype.predictions <- TransferData(
anchorset = transfer.anchors,
refdata = pbmc.rna$celltype,
weight.reduction = pbmc.atac[["SnapATAC"]],
dims = 1:20
);
> x.sp@metaData$predicted.id = celltype.predictions$predicted.id;
> x.sp@metaData$predict.max.score = apply(celltype.predictions[,-1], 1, max);
> x.sp@cluster = as.factor(x.sp@metaData$predicted.id);
Step 10. Create psudo multiomics cells
Now each single cell in the snap object x.sp
contains information of both chromatin accessibility @bmat
and gene expression @gmat
.
> refdata <- GetAssayData(
object = pbmc.rna,
assay = "RNA",
slot = "data"
);
> imputation <- TransferData(
anchorset = transfer.anchors,
refdata = refdata,
weight.reduction = pbmc.atac[["SnapATAC"]],
dims = 1:20
);
> x.sp@gmat = t(imputation@data);
> rm(imputation); # free memory
> rm(refdata); # free memory
> rm(pbmc.rna); # free memory
> rm(pbmc.atac); # free memory
Step 11. Remove cells of low prediction score
> hist(
x.sp@metaData$predict.max.score,
xlab="prediction score",
col="lightblue",
xlim=c(0, 1),
main="PBMC 10X"
);
> abline(v=0.5, col="red", lwd=2, lty=2);
> table(x.sp@metaData$predict.max.score > 0.5);
## FALSE TRUE
## 331 13103
> x.sp = x.sp[x.sp@metaData$predict.max.score > 0.5,];
> x.sp
## number of barcodes: 13045
## number of bins: 627478
## number of genes: 19089
## number of peaks: 0
> plotViz(
obj=x.sp,
method="umap",
main="PBMC 10X",
point.color=x.sp@metaData[,"predicted.id"],
point.size=0.5,
point.shape=19,
text.add=TRUE,
text.size=1,
text.color="black",
down.sample=10000,
legend.add=FALSE
);
Step 12. Gene expression projected onto UMAP We next project the expression level of marker genes onto the the UMAP embedding.
> marker.genes = c(
"IL32", "LTB", "CD3D",
"IL7R", "LDHB", "FCGR3A",
"CD68", "MS4A1", "GNLY",
"CD3E", "CD14", "CD14",
"FCGR3A", "LYZ", "PPBP",
"CD8A", "PPBP", "CST3",
"NKG7", "MS4A7", "MS4A1",
"CD8A"
);
> par(mfrow = c(3, 3));
> for(i in 1:9){
j = which(colnames(x.sp@gmat) == marker.genes[i])
plotFeatureSingle(
obj=x.sp,
feature.value=x.sp@gmat[,j],
method="umap",
main=marker.genes[i],
point.size=0.1,
point.shape=19,
down.sample=10000,
quantiles=c(0.01, 0.99)
)};
Step 13. Identify peaks
Next we aggregate reads from the each cluster to create an ensemble track for peak calling and visualization. This step will generate a narrowPeak
that contains the identified peak and .bedGraph
file for visualization. To obtain the most robust result, we don't recommend to perform this step for clusters with cell number less than 100. In the below example, SnapATAC creates PBMC.CD4_Naive_peaks.narrowPeak
and PBMC.CD4_Naive_treat_pileup.bdg
. bdg
file can be compressed to bigWig
file using bedGraphToBigWig
for IGV or Genome Browser visulization.
> system("which snaptools");
/home/r3fang/anaconda2/bin/snaptools
> system("which macs2");
/home/r3fang/anaconda2/bin/macs2
> peaks = runMACS(
obj=x.sp[which(x.sp@cluster=="CD4 Naive"),],
output.prefix="PBMC.CD4_Naive",
path.to.snaptools="/home/r3fang/anaconda2/bin/snaptools",
path.to.macs="/home/r3fang/anaconda2/bin/macs2",
gsize="hs", # mm, hs, etc
buffer.size=500,
num.cores=10,
macs.options="--nomodel --shift 100 --ext 200 --qval 5e-2 -B --SPMR",
tmp.folder=tempdir()
);
Instead of performing peak calling for only one cluster, we provide a short script that perform this step for all clusters.
# call peaks for all cluster with more than 100 cells
> clusters.sel = names(table(x.sp@cluster))[which(table(x.sp@cluster) > 100)];
> peaks.ls = mclapply(seq(clusters.sel), function(i){
print(clusters.sel[i]);
peaks = runMACS(
obj=x.sp[which(x.sp@cluster==clusters.sel[i]),],
output.prefix=paste0("PBMC.", gsub(" ", "_", clusters.sel)[i]),
path.to.snaptools="/home/r3fang/anaconda2/bin/snaptools",
path.to.macs="/home/r3fang/anaconda2/bin/macs2",
gsize="hs", # mm, hs, etc
buffer.size=500,
num.cores=1,
macs.options="--nomodel --shift 100 --ext 200 --qval 5e-2 -B --SPMR",
tmp.folder=tempdir()
);
peaks
}, mc.cores=5);
> peaks.names = system("ls | grep narrowPeak", intern=TRUE);
> peak.gr.ls = lapply(peaks.names, function(x){
peak.df = read.table(x)
GRanges(peak.df[,1], IRanges(peak.df[,2], peak.df[,3]))
})
> peak.gr = reduce(Reduce(c, peak.gr.ls));
This will create a bdg
file for each cluster for visilizations using IGV or other genomic browsers such as UW genome browser. Below is a screenshot of regions flanking FOXJ2 gene from UW genome browser.
Step 14. Create a cell-by-peak matrix
Using merged peaks as a reference, we next create the cell-by-peak matrix and add it to the snap object. We will first write down combined peak list as peaks.combined.bed
.
> peaks.df = as.data.frame(peak.gr)[,1:3];
> write.table(peaks.df,file = "peaks.combined.bed",append=FALSE,
quote= FALSE,sep="\t", eol = "\n", na = "NA", dec = ".",
row.names = FALSE, col.names = FALSE, qmethod = c("escape", "double"),
fileEncoding = "")
Next we create cell-by-peak matrix and add to the snap file. This step will take a while.
$ snaptools snap-add-pmat \
--snap-file atac_pbmc_10k_nextgem.snap \
--peak-file peaks.combined.bed &
$ snaptools snap-add-pmat \
--snap-file atac_pbmc_5k_nextgem.snap \
--peak-file peaks.combined.bed
We then add the cell-by-peak matrix to the existing snap object in R.
> x.sp = addPmatToSnap(x.sp);
Step 15. Identify differentially accessible peaks For a given group of cells Ci, we first look for their neighboring cells Cj (|Ci|=|Cj|) in the diffusion component space as “background” cells to compare to. If Ci accounts for more than half of the total cells, we use the remaining cells as local background. Next, we aggregate Ci and Cj to create two raw-count vectors as Vci and Vcj. We then perform differential analysis between Vci and Vcj using exact test as implemented in R package edgeR (v3.18.1) with BCV=0.1 for mouse and 0.4 for human. P-value is then adjusted into False Discovery Rate (FDR) using Benjamini-Hochberg correction. Peaks with FDR less than 0.05 are selected as significant DARs.
We recognize the limitation of this approach is that the statically significance may be under power for small clusters. For clusters that failed to identify significant differential elements, we rank the elements based on the enrichment pvalue and pick the top 2,000 peaks a representative elements for motif analysis.
> DARs = findDAR(
obj=x.sp,
input.mat="pmat",
cluster.pos="CD14+ Monocytes",
cluster.neg.method="knn",
test.method="exactTest",
bcv=0.4, #0.4 for human, 0.1 for mouse
seed.use=10
);
> DARs$FDR = p.adjust(DARs$PValue, method="BH");
> idy = which(DARs$FDR < 5e-2 & DARs$logFC > 0);
> par(mfrow = c(1, 2));
> plot(DARs$logCPM, DARs$logFC,
pch=19, cex=0.1, col="grey",
ylab="logFC", xlab="logCPM",
main="CD14+ Monocytes"
);
> points(DARs$logCPM[idy],
DARs$logFC[idy],
pch=19,
cex=0.5,
col="red"
);
> abline(h = 0, lwd=1, lty=2);
> covs = Matrix::rowSums(x.sp@pmat);
> vals = Matrix::rowSums(x.sp@pmat[,idy]) / covs;
> vals.zscore = (vals - mean(vals)) / sd(vals);
> plotFeatureSingle(
obj=x.sp,
feature.value=vals.zscore,
method="umap",
main="CD14+ Monocytes",
point.size=0.1,
point.shape=19,
down.sample=5000,
quantiles=c(0.01, 0.99)
);
Step 16. Motif variability analysis SnapATAC incorporates chromVAR (Schep et al) for motif variability analysis.
> library(chromVAR);
> library(motifmatchr);
> library(SummarizedExperiment);
> library(BSgenome.Hsapiens.UCSC.hg19);
> x.sp = makeBinary(x.sp, "pmat");
> x.sp@mmat = runChromVAR(
obj=x.sp,
input.mat="pmat",
genome=BSgenome.Hsapiens.UCSC.hg19,
min.count=10,
species="Homo sapiens"
);
> x.sp;
## number of barcodes: 13103
## number of bins: 627478
## number of genes: 19089
## number of peaks: 157750
## number of motifs: 271
> motif_i = "MA0071.1_RORA";
> dat = data.frame(x=x.sp@metaData$predicted.id, y=x.sp@mmat[,motif_i]);
> p <- ggplot(dat, aes(x=x, y=y, fill=x)) +
theme_classic() +
geom_violin() +
xlab("cluster") +
ylab("motif enrichment") +
ggtitle("MA0071.1_RORA") +
theme(
plot.margin = margin(5,1,5,1, "cm"),
axis.text.x = element_text(angle = 90, hjust = 1),
axis.ticks.x=element_blank(),
legend.position = "none"
);
Step 17. De novo motif discovery
SnapATAC can help identify master regulators that are enriched in the differentially accessible regions (DARs). This will creates a homer motif report knownResults.html
in the folder ./homer/C2
.
> system("which findMotifsGenome.pl");
/projects/ps-renlab/r3fang/public_html/softwares/homer/bin/findMotifsGenome.pl
> DARs = findDAR(
obj=x.sp,
input.mat="pmat",
cluster.pos="Double negative T cell",
cluster.neg.method="knn",
test.method="exactTest",
bcv=0.4, #0.4 for human, 0.1 for mouse
seed.use=10
);
> DARs$FDR = p.adjust(DARs$PValue, method="BH");
> idy = which(DARs$FDR < 5e-2 & DARs$logFC > 0);
> motifs = runHomer(
x.sp[,idy,"pmat"],
mat = "pmat",
path.to.homer = "/projects/ps-renlab/r3fang/public_html/softwares/homer/bin/findMotifsGenome.pl",
result.dir = "./homer/DoubleNegativeTcell",
num.cores=5,
genome = 'hg19',
motif.length = 10,
scan.size = 300,
optimize.count = 2,
background = 'automatic',
local.background = FALSE,
only.known = TRUE,
only.denovo = FALSE,
fdr.num = 5,
cache = 100,
overwrite = TRUE,
keep.minimal = FALSE
);
Step 18. Predict gene-enhancer pairs Finally, using the "pseudo" cells, we next develop a method to link the distal regulatory elements to the target genes based on the association between expression of a gene and chromatin accessibility at its distal elements in single cells. For a given marker gene, we first identify peaks within a flanking window the target gene. For each flanking peak, we perform logistic regression using gene expression as input varaible to predict the binarized chromatin state. The resulting model estimates the significance of the association between chromatin accessibility and gene expression.
> TSS.loci = GRanges("chr12", IRanges(8219067, 8219068));
> pairs = predictGenePeakPair(
x.sp,
input.mat="pmat",
gene.name="C3AR1",
gene.loci=resize(TSS.loci, width=500000, fix="center"),
do.par=FALSE
);
# convert the pair to genome browser arc plot format
> pairs.df = as.data.frame(pairs);
> pairs.df = data.frame(
chr1=pairs.df[,"seqnames"],
start1=pairs.df[,"start"],
end1=pairs.df[,"end"],
chr2="chr2",
start2=8219067,
end2=8219068,
Pval=pairs.df[,"logPval"]
);
> head(pairs.df)
## chr1 start1 end1 chr2 start2 end2 Pval
## 1 chr12 7984734 7985229 chr2 8219067 8219068 14.6075918
## 2 chr12 7987561 7988085 chr2 8219067 8219068 5.6718381
## 3 chr12 7989776 7990567 chr2 8219067 8219068 24.2564608
## 4 chr12 7996454 7996667 chr2 8219067 8219068 0.6411017
## 5 chr12 8000059 8000667 chr2 8219067 8219068 2.0324922
## 6 chr12 8012404 8013040 chr2 8219067 8219068 0.0000000
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.