periodogram | R Documentation |
\chi^{2}
periodogramperiodogram()
returns the Sokolove & Bushell's \chi^{2}
periodogram plot and measures for a tsibble
object.
The \chi^{2}
periodogram is a technique to identify
periodic patterns in a time series, being used in chronobiology to identify
the presence/absence of circadian periodicities in rest-activity data. It was
proposed by Sokolove and Bushell (1978) as an adaptation of Enright's
periodogram (1965), adding the peak significance test to the procedure.
periodogram(
data,
col,
p_unit = "minutes",
p_min = 1000,
p_max = 2500,
p_step = 1,
alpha = 0.05,
print = TRUE
)
data |
A |
col |
A string indicating which column of |
p_unit |
(optional) a string indicating at which time unit the index
must be aggregated. By aggregating the index, this will change the time
series interval and, consequently, its |
p_min |
(optional) an integer number indicating the minimum period
( |
p_max |
(optional) an integer number indicating the maximum period
( |
p_step |
(optional) an integer number indicating the range of values
that must be skipped between computing one test and the next (e.g., when
|
alpha |
(optional) a number, from |
print |
(optional) a |
Sokolove and Bushell's periodogram assumes that the time series is regular,
i.e., that it has regular intervals/epochs. periodogram()
will still work
with slightly irregular time series, but only if the latter have at least a
90% prevalence of the same periodicity. Warning messages will be issued
telling the user if any data irregularities were found.
If there's more than one data for each p_unit
(e.g., when the data was
recorded per minute and the user wants to compute the periodogram per hour),
periodogram()
will aggregate the values by averaging them or by assigning
the most frequent value (mode) (for non-numeric or single integer variables).
periodogram()
will work even with missing values. As is the case for
any analysis with missing data, the results may diverge. You may want to
interpolate these data points.
There are few articles that deals with interpolation in actigraphy. Tonon et
al. (2022) recommends not using interpolation (i.e., maintain NA
values)
whenever is possible. The same authors also recommends using the weekly mean method of interpolation when the parameters
cannot be computed in the presence of NA
values.
A list
object with the following elements:
p_unit
: a string indicating the time unit in which the data was
aggregated.
p_seq
: a numeric
object with the the sequence of the
tested periods.
alpha
: a number indicating the significant level used.
a_p
: a numeric
object with the root mean square amplitude
(A_{p}
) for each period.
a_p_plot
: a ggplot
object with a line chart
showing a_p
(y) by p_seq
(x).
q_p
: a numeric
object with the peak significant test
(Q_{p}
) for each period .
q_p_critical
: a numeric
object with the
\chi^{2}
critical values for each q_p
, based on the
alpha
parameter.
q_p_pvalue
: a numeric
object with the p-value for each
q_p
, based on the alpha
parameter.
q_p_peaks
: a tibble
object listing each period that
peaked above the critical value along with its q_p
, q_p_critical
,
relative q_p
(q_p_rel = q_p_critical - q_p
), and q_p_pvalue
.
q_p_plot
: a ggplot
object with a line chart
showing q_p
(y1) and q_p_critical
(y2) by p_seq
(x).
Enright (1965) and Sokolove & Bushell (1978) guidelines for the
\chi^{2}
periodogram computation are as follows.
If you are visualizing this documentation in plain text, you may have some trouble understanding the equations. You can see this documentation on the package website.
A_{p}
: The statistic adopted to express "importance"periodogram()
compute the A_{p}
stat for each test period
p
, between the minimum determined by p_min
argument and the maximum
p_max
argument, at a step determined by p_step
argument (e.g., when
p_min == 1
, p_max == 5
, and p_step == 1
, the test periods will be
1
, 2
, 3
, 4
and 5
).
The A_{p}
measure of Enright's periodogram is the standard
deviation of column means of a Buys-Ballot table, or, as Enright puts it,
"the root mean square amplitude". This Buys-Ballot table has P
columns and m
rows, where m
is a number that maximizes the amount
of values that a time series of N
values can have represented in a
P
columns table (thus, m
is N / p
rounded down). m
can be seen as a cycle, and the greater the similarity between the cycles and
the difference between the columns, more intense will be the standard
deviation (A_{p}
).
Buys-Ballot table for an integral period p
:
P (count) |------------------------------------------------| - | Row 1 X_1 X_2 ... X_P | Row 2 X_P+1 X_P+2 ... X_2P | m (count) ... ... ... ... ... | Row m X_P(m-1)+1 X_P(m-1)+2 ... X_Pm | -------------------------------------------------- - Totals U_P,1 U_P,2 ... U_P,P Averages Y_P,1 Y_P,2 ... Y_P,P
As the table above shows, P
is the number of columns in the array
(matrix/table), while m
is the number of rows. When p
(test
period) is an integer, p = P
. That way, A_{p}
can be computed
as:
A_{p} = \sqrt{\frac{\sum^{P}_{h = 1} (Y_{p, h} -
\overline{Y}_{p})^{2}}{P}}
In which:
\overline{Y}_{p} = \frac{\sum^{P}_{h = 1} Y_{p, h}}{P}
Q_{p}
: Sokolove & Bushell's peak significance testPlotting the A_{p}
values allows you to identify the magnitude of
the standard deviations for different p
s. The higher the standard
deviation the more it will tend to a peak. As these values are susceptible to
high and instantaneous fluctuations, Sokolove & Bushell proposed adding a
peak significance test, reducing the intensity of peaks by weighting the
total variance of the data and the period in which it appears. In this
process, peak moments will be highlighted among its neighbors, and smaller
changes will lose prominence, tending to a constant variation from one period
to another.
The significance test of the peaks given by A_{p}
leads to a
\chi^{2}
distribution for each p
, with the test for a
given p
being consistent with a distribution of P - 1
degrees of
freedom.
The formula for calculating the test is:
Q_{p} = \frac{P \times Ap^{2}}{\sigma^{2}_{\overline{X}}}
In which:
\sigma^{2}_{\overline{X}} = \frac{\sigma^{2}_{X}}{m}
Where:
P
= Number of columns of the test Buys-Ballot table.
m
= Number of rows of the test Buys-Ballot table.
A_{p}
= Standard deviation of the column averages from the test
Buys-Ballot table.
\sigma^{2}_{\overline{X}}
= Variance of the test data
(X
).
Enright, J. T. (1965). The search for rhythmicity in biological time-series. Journal of Theoretical Biology, 8(3), 426-468. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1016/0022-5193(65)90021-4")}.
Sokolove, P. G., & Bushell, W. N. (1978). The chi square periodogram: its utility for analysis of circadian rhythms. Journal of Theoretical Biology, 72(1), 131-160. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1016/0022-5193(78)90022-x")}.
Tonon, A. C. et al. (2022). Handling missing data in rest-activity time series measured by actimetry. Chronobiology International, 39(7). \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/07420528.2022.2051714")}.
Other period analysis functions:
spectrogram()
## Not run:
if (requireNamespace("curl", quietly = TRUE) &&
requireNamespace("jsonlite", quietly = TRUE) &&
requireNamespace("tools", quietly = TRUE)) {
if (curl::has_internet()) {
file <- get_from_zenodo(
doi = "10.5281/zenodo.4898822", path = tempdir(),
file = "processed.txt"
)
data <- read_acttrust(file, tz = "America/Sao_Paulo")
per <- periodogram(data, "pim")
}
}
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.