estimateHellingerDiv | R Documentation |
Given the methylation levels of two individuals, the function computes the information (Hellinger) divergence between those methylation levels.
estimateHellingerDiv(p, n = NULL)
p |
A numerical vector of the methylation levels p = c(p1, p2) of individuals 1 and 2. |
n |
If supplied, a vector of integers n = c(n1, n2) giving the coverages used to estimate the methylation levels of individuals 1 and 2. |
The methylation level p_ij for an individual i at cytosine site j corresponds to a probability vector p^ij = (p_ij, 1 - p_ij). Then, the information divergence between methylation levels p^1j and p^2j from individuals 1 and 2 at site j is the divergence between the vectors p^1j = (p_1j, 1 - p_1j) and p^2j = (p_2j, 1 - p_2j). If the vector of coverage is supplied, then the information divergence is estimated according to the formula:
hdiv = 2*(n_1 + 1)*(n_2 + 1)*((sqrt(p_1j) - sqrt(p_2j))^2 + (sqrt(1 - p_1j) - sqrt(1 - p_2j))^2)/(n_1 + n_2 + 2)
This formula corresponds to the Hellinger divergence given by the first formula of Theorem 1 in reference 1. If no coverage vector is supplied, the divergence is computed as:
hdiv = (sqrt(p_1j) - sqrt(p_2j))^2 + (sqrt(1 - p_1j) - sqrt(1 - p_2j))^2
Missing methylation levels, reported as NA or NaN, are replaced with zero.
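The two formulas above can be sketched in a few lines of base R. This is only an illustration written from the formulas as stated here, not the package's own source: the name `hellinger_sketch` is hypothetical, and the sketch assumes that `p` and `n` each hold exactly two values.

```r
# Hypothetical helper written from the formulas in this page;
# it is NOT the implementation of estimateHellingerDiv().
hellinger_sketch <- function(p, n = NULL) {
  stopifnot(length(p) == 2)
  # Missing levels are replaced with zero; is.na() is TRUE for NA and NaN.
  p[is.na(p)] <- 0
  # Divergence between (p1, 1 - p1) and (p2, 1 - p2) without coverage.
  d <- (sqrt(p[1]) - sqrt(p[2]))^2 + (sqrt(1 - p[1]) - sqrt(1 - p[2]))^2
  if (is.null(n)) {
    return(d)
  }
  stopifnot(length(n) == 2)
  # Coverage-weighted version: 2*(n1 + 1)*(n2 + 1)*d/(n1 + n2 + 2).
  2 * (n[1] + 1) * (n[2] + 1) * d / (n[1] + n[2] + 2)
}

hellinger_sketch(c(0.5, 0.5))                 # identical levels: 0
hellinger_sketch(c(0, 1))                     # fully opposite levels: 2
hellinger_sketch(c(0.2, 0.8), n = c(10, 20))  # weighted by coverage
```

Without coverage the value is bounded between 0 and 2. With coverage, the same quantity is multiplied by 2*(n_1 + 1)*(n_2 + 1)/(n_1 + n_2 + 2), so sites with deeper coverage yield larger divergence values for the same pair of methylation levels.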
The Hellinger divergence value for the given methylation levels is returned.
1. Basu A., Mandal A., Pardo L. (2010) Hypothesis testing for two discrete populations based on the Hellinger distance. Stat Probab Lett 80: 206-214.
p <- c(0.5, 0.5)
estimateHellingerDiv(p)