library(knitr) knitr::opts_chunk$set( error = FALSE, tidy = FALSE, message = FALSE, warning = FALSE, fig.width = 8, fig.height = 8, fig.align = "center" )
There is a nice visualization of CPAN (Perl) modules with the Hilbert curve on http://mapofcpan.org/. In this article, I will demonstrate how to make the plot with the HilbertCurve package.
We first read the list of CPAN modules. The information is in the 02packages.details.txt
file which
can be directly accessed with the following link:
df = read.table(url("https://www.cpan.org/modules/02packages.details.txt"), skip = 9) head(df)
The modules are listed in the first column. We also sort it alphabetically.
all_modules = sort(df[, 1])
We take the "namespace" of the module which is the string before the first "::":
ns = gsub("::.*$", "", all_modules)
Next we will put every value in ns
in a Hilbert curve. First let's convert it into an Rle
object:
library(IRanges) r = Rle(ns)
Now in r
, each element corresponds to an unique namespace in ns
, and the length or the width of the namespace in ns
is also calculated. Let's get these values:
s = start(r) # start position of each ns in r e = end(r) # end position of each ns in r w = width(r) # width of each ns in r labels = runValue(r) # corresponding labels
Now we can visualize it via the HilbertCurve package. We wil highlight the top 30 namespaces with the largest number of modules.
library(HilbertCurve) ind = order(w, decreasing = TRUE)[1:30] w_cutoff = w[ ind[30] ] set.seed(123) # because we have randomm colors hc = HilbertCurve(0, length(all_modules), level = 8, mode = "pixel") hc_layer(hc, x1 = s, x2 = e, col = circlize::rand_color(nrun(r), transparency = ifelse(w >= w_cutoff, 0, 0.8))) hc_text(hc, x1 = s[ind], x2 = e[ind], labels = labels[ind], gp = gpar(fontsize = 9))
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.