    In the essay shown immediately below, a concise review of the long evolution of the study of cycles in time-series data is provided as a basis for explaining the relationship between the half century of work on cycles by the original WCM between 1972 and 2024 and the classical work of mathematicians and scientists throughout the preceding century, including especially Norbert Wiener (generalized harmonic analysis, optimum filtering, nonlinear system identification), and also D. Brennan (Fraction-of-Time Probability), Ronald Fisher (cumulants), Lars Hansen (generalized method of moments), E.M. Hofstetter (Fraction-of-Time Probability), Andrei Kolmogorov (stochastic processes), Karl Pearson (method of moments), Arthur Schuster (periodogram), Thorvald Thiele (cumulants), John Wishart (cumulants), Herman Wold (hidden periodicity and disturbed harmonics), and many others who contributed to the theory of stationary stochastic processes and various topics in statistical signal processing based on the stationary process model.


    To complement this concise review, a survey paper on hidden periodicity in science data is available here:

    Abstract

    A concise review of the long evolution of the study of cycles in time-series data is provided as a basis for explaining the relationship between the half century of work on cycles by William A. Gardner between 1972 and 2024 and the classical work of mathematicians and scientists throughout the preceding century, including especially Norbert Wiener (generalized harmonic analysis, optimum filtering, nonlinear system identification), and also E.M. Hofstetter (Fraction-of-Time Probability), Ronald Fisher (cumulants), Lars Hansen (generalized method of moments), Andrei Kolmogorov (stochastic processes), Karl Pearson (method of moments), Arthur Schuster (periodogram), Thorvald Thiele (cumulants), John Wishart (cumulants), Herman Wold (hidden periodicity and disturbed harmonics), and many others who contributed to the theory of stationary stochastic processes and various topics in statistical signal processing based on the stationary process model.

    Introduction to Cycles

    The following introduction to the topic of the present essay was written by Herman O. A. Wold in 1968, as the opening paragraph of his survey contribution on the topic “Cycles” in the International Encyclopedia of the Social Sciences [1].

    Cycles, waves, pulsations, rhythmic phenomena, regularity in return, periodicity—these notions reflect a broad category of natural, human, and social phenomena where cycles are the dominating feature. The daily and yearly cycles in sunlight, temperature, and other geophysical phenomena are among the simplest and most obvious instances. Regular periodicity provides a basis for prediction and for extracting other useful information about the observed phenomena. Nautical almanacs with their tidal forecasts are a typical example. Medical examples are pulse rate as an indicator of cardiovascular status and the electrocardiograph as a basis for analysis of the condition of the heart. The study of cyclic phenomena dates from prehistoric times, and so does the experience that the area has dangerous pitfalls. From the dawn of Chinese history comes the story that the astronomers Hi and Ho lost their heads because they failed to forecast a solar eclipse (perhaps 2137 b.c.). In 1929, after some twelve years of promising existence, the Harvard Business Barometer (or Business Index) disappeared because it failed to predict the precipitous drop in the New York stock market.

    The purpose of this brief essay is to put into perspective the breakthrough made in the mid-1980s in the modeling of, and statistical inference based on, time-series data exhibiting cyclic behavior. Up until this breakthrough, statistical models for cycles—as a complement to nonstatistical cycles modeled, for example, by differential equations—had been studied analytically using crude mathematical models for more than a century but had not moved beyond the following two models: 1) the sum of one or more periodic time series and a featureless (randomly fluctuating, erratic, unpredictable, stationary) time series, often referred to as noise, which sum is amenable to more than just temporally local prediction; and 2) the response of a linear time-invariant resonant dynamical system, mathematically modeled as a convolution, driven by a featureless time series, which response is amenable to only local prediction because the apparent cycles are not true cycles. In a hypothesis-testing setting, the null hypothesis (the alternative to models 1) and 2)) is an unpredictable nonstationary time series that may appear from time to time to exhibit cyclicity but that, upon closer inspection, is found to exhibit no true cycles and no substantive predictability. However, model 2) can be considered to be included in the null hypothesis, since the disturbed harmonics produced by this model do not represent true cycles and predictability is relatively limited. For an illustrative discussion of the general problem of cycles from a historical perspective, the reader is referred to Appendices 1–3, which consist of excerpts from Wold’s article, “Cycles” [1].

    The first method that emerged for analysis of data according to model 1), at the turn of the 19th Century, is the periodogram (the squared magnitude of the Fourier transform of a finite-length time series of data, normalized by the length of the data segment), which was followed by a variety of what were termed high-resolution and super-resolution model-fitting methods beginning around the mid-20th Century. The periodogram was proven to provide the set of sufficient statistics for Maximum-Likelihood (ML) estimation of the period of a cycle due to a single sinewave in additive white Gaussian noise (AWGN), and the amplitude and phase of the Fourier component at the detected period are the ML estimates of the amplitude and phase of a sinusoid with that period. The complexity of the generalization to ML estimation for multiple sinusoids in AWGN, especially those with cycle periods that are not substantially different, led to a wide variety of alternative model-fitting methods, which are surveyed in [3, chap 9], where Gardner introduces the use of the FOT probability model to circumvent the unnecessary abstraction of the stochastic process model (cf. [4]), which dominated the literature on this topic essentially to the extent of complete exclusion of the FOT probability model once the stochastic process had been introduced.

    Data following model 2) were referred to as disturbed harmonics and were analyzed primarily by methods developed specifically for Autoregressive (AR) and AR-Moving-Average (ARMA) models. These models were initially implicitly based on the FOT model (i.e., on time averages of lag products, not probabilistic expected values) but soon transitioned to the stochastic process model.
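
    To make the periodogram statistic concrete, here is a minimal Python sketch (the signal parameters and the helper name periodogram are invented for illustration, not taken from the essay): the squared magnitude of the Fourier transform of a finite data segment, normalized by segment length, whose largest value locates the hidden periodicity of a single sinewave in AWGN.

```python
import numpy as np

def periodogram(x):
    """Squared magnitude of the Fourier transform of a finite data
    segment, normalized by the segment length."""
    N = len(x)
    X = np.fft.rfft(x)
    return np.fft.rfftfreq(N), np.abs(X) ** 2 / N

# Single sinewave in additive white Gaussian noise (AWGN):
# the periodogram peak is the ML estimate of the sinewave frequency.
rng = np.random.default_rng(0)
N, f0 = 4096, 0.12                  # segment length, cycles per sample
n = np.arange(N)
x = np.sin(2 * np.pi * f0 * n) + 0.5 * rng.standard_normal(N)

freqs, P = periodogram(x)
f_hat = freqs[np.argmax(P)]         # peak location estimates f0
```

At this signal-to-noise ratio the peak bin falls within one frequency bin of the true frequency, so the estimate is accurate to about 1/N cycles per sample.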

    Review of Cyclostationarity

    In 1985 and 1987, two analytical books by William A. Gardner [2], [3] appeared and introduced the first comprehensive theoretical investigations of two new classes of models, which he termed 4) cyclostationary time series, exhibiting a single periodicity, and its generalization 5) almost cyclostationary time series, exhibiting multiple incommensurate periodicities, that is, multiple incommensurate periods of statistical cyclicity. (For a finite number of incommensurate cycles, Gardner later introduced the more specific term poly-cyclostationary.) Book [2] introduced these models in terms of stochastic processes and briefly explained their duals defined in terms of time averages instead of expected values, and [3] maintained close ties to empirical data by developing a comprehensive theory based on time averages alone or, equivalently, Fraction-of-Time (FOT) probabilities. The term statistical cyclicity means that precise cycles appear only in averages performed on the data, not in the raw data itself, which may or may not exhibit imprecise cycles. For the stochastic process model, these averages are expected values of functions of the data, which can be approximated with averages over statistical samples from a population of data sets. For the alternative non-stochastic model, these averages are ideally infinitely long time averages of functions of the data, which can be approximated by finite-time averages. The two models are dual and, in addition, they are essentially equivalent for a special subclass of stochastic processes that satisfy the ergodic hypothesis, generalized to the cycloergodic hypothesis, which can, in principle, be tested using the necessary and sufficient condition provided by the cycloergodicity theorem derived by the original WCM [JP70].

    The original models 1) and 2) were first described prior to the advent of the concept of a stochastic process and later were replaced with stochastic-process alternatives. The two new models 4) and 5), which generalize models 1) and 2), were first treated comprehensively almost simultaneously in both forms, stochastic and non-stochastic, in [2], and the non-stochastic alternative was greatly expanded on in [3] because of its parsimony and more direct relevance to most applications—those for which only a single time series of measurements is available instead of a set of multiple statistical samples of time series from a population, which is the situation originally motivating the stochastic process model. There were a few isolated journal papers prior to (and cited in) [2], [3], [5] that briefly treated what were called periodically correlated stochastic processes, but there had been no attempt to develop a comprehensive theory of these stochastic processes, and not even a mention of the alternative theory of non-stochastic models for non-population time series first proposed in [2], [3] (cf. [4], [6]-[8]), let alone non-stochastic models for periodically and almost periodically time-varying higher-than-2nd-order moments, cumulants, and probability density functions. There also were a few isolated papers on stochastic cyclostationarity in the Russian literature that are cited in [5]. The fundamental concept underlying (almost) cyclostationarity does not require the concept or mathematical model of a population of time series and a corresponding stochastic process. Rather, (almost) cyclostationarity can be defined in terms of time-series models consisting of (almost) periodically time-varying FOT probability density functions defined independently of the probability-space notion upon which the stochastic process is defined.
The reader is referred to [7] for a discussion of the underlying measure theory foundation for FOT probability, and to [4], [8] for discussions of the key mathematical differences between FOT probability, which is constructed from a single time series, and Kolmogorov’s abstract axiomatically defined probability theory, which is defined in terms of what is called a probability space. Periodically (and almost periodically) time-varying moments and cumulants can be characterized in terms of FOT probability. The breadth of this class of models and the phenomena to which they apply dwarfs the earlier models of cycles of type 1) referred to above. In fact, model 1) is the most elementary example of a cyclostationary time series—so elementary that it does not need the mathematical machinery of FOT probability to analyze.

    More specifically, in the model of type 1) a true cycle corresponds to a periodic mean and, in the model of type 2), an apparent but not true cycle corresponds to a damped oscillation of the autocorrelation function of the process. In cyclostationary (or almost cyclostationary) processes, moments or cumulants of any order can be periodic (or almost periodic with multiple incommensurate periods). For example, a cyclostationary process or time series can have a constant (time-invariant) mean and constant variance, but a periodic covariance producing cycles in coherence time; or it can have 1st- and 2nd-order moments all of which are constant, but periodic higher-order moments or cumulants. In general, (almost) cyclostationary processes have (almost) periodic joint probability density functions.

    Gardner’s more general FOT probability model of cycles does not rely on a hypothetical deterministic model (a periodic function or a convolution) mixed with or driven by a featureless noise. Rather, it constructs the model from time averages of functions of the time series. This model can consist of FOT probability density functions, joint moments of multiple time samples with any time separations, corresponding joint cumulants, etc. Nevertheless, the FOT probability model can be derived from a mathematical model of deterministic dynamics driven by featureless noise, in terms of the FOT model of such noise, which is typically chosen to be a series of statistically independent identically distributed (in the FOT probability sense) variables. Several examples are listed below:

    1. A piecewise constant time series with transitions once every period and with constants given by a stationary sequence. This model has constant mean and variance, but periodic covariance. The stationary sequence can be, for example, a featureless noise with known FOT probability, such as independent identically distributed variables.
    2. A product of a deterministic periodic sequence and a stationary sequence (e.g., featureless noise) with known FOT probability. If this stationary noise is white, so too is the cyclostationary time series.
    3. A marked and filtered Poisson point process (e.g., a detected photon stream) with an average rate of occurrence of points that is a deterministic periodic function. The average-rate parameter also can be a stochastic process, producing what is called a doubly stochastic process, which is a useful model in optical communication systems. (The FOT-probability counterpart of a Poisson process is conceptually relatively straightforward for constant, periodic, and almost periodic rate parameters, but the counterpart for the doubly stochastic process, while apparently viable, is admittedly more conceptually challenging.)
    4. A resonant dynamic system, with periodically time-varying resonant frequency and/or damping factor, driven by featureless noise.
    5. A pulse stream with random amplitudes (e.g., featureless noise) and periodically time varying pace and/or duration.   
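
    As a numerical sketch of the first model in the list above (all parameters and the helper name cyclic_lag_product_average are invented for illustration, not taken from the essay), a piecewise-constant series built from featureless noise has constant FOT mean and variance, yet its lag products carry a cyclic component at cycle frequency 1/period that a plain time average, an FOT rather than an ensemble statistic, exposes:

```python
import numpy as np

def cyclic_lag_product_average(x, alpha, lag):
    """FOT estimate of the cyclic mean of a lag product:
    time average of x[n + lag] * x[n] * exp(-j 2 pi alpha n)."""
    n = np.arange(len(x) - lag)
    return np.mean(x[n + lag] * x[n] * np.exp(-2j * np.pi * alpha * n))

rng = np.random.default_rng(1)
P = 8                          # period of the statistical cyclicity
K = 50_000                     # number of constant segments
w = rng.standard_normal(K)     # featureless noise (iid values)
x = np.repeat(w, P)            # piecewise constant, one value per period

# Lag 0 (the variance) is constant in time: no cyclic component.
var_cyclic = cyclic_lag_product_average(x, 1 / P, 0)
# Lag 1 (the covariance) is periodic: within-segment pairs are fully
# correlated, across-segment pairs are not, so a cyclic component of
# magnitude about 1/P appears at cycle frequency 1/P.
cov_cyclic = cyclic_lag_product_average(x, 1 / P, 1)
```

The cyclic component at lag 1 has theoretical magnitude 1/P for this model, while the lag-0 component cancels exactly, consistent with a constant variance but periodic covariance.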

    The books [2] and [3], and a lifetime of follow-on work by Gardner, comprehensively reviewed in [6], substantially extend and generalize the following work by many:

    • Herman Wold’s and George Yule’s work on hidden periodicities and disturbed harmonics [1], [6, p. 4.1] (incorporated into the more general theory of cyclostationarity);
    • Norbert Wiener’s work on Generalized Harmonic Analysis of stationary time-series [9] (generalized to spectral correlation analysis of cyclostationary and almost cyclostationary time series [6, p. 2.1] and further generalized from 2nd-order joint moments to higher order moments and joint probability densities [6, p. 2.1]);
    • Wiener’s minimum time-averaged-squared error (MTASE) linear time-invariant filtering of stationary time series [10] (generalized to periodically and almost periodically time-variant linear filtering of cyclostationary time series [6, p.2.5.1]);
    • the body of work by Wiener and his team of PhD students at M.I.T. (cf. Zadeh’s integrative formulation [11] and Mattera’s comprehensive review [6, p. 11.7]) on nonlinear system identification (reformulated in terms of fraction-of-time-probability and generalized from time invariant to (almost) periodically time variant nonlinear systems [6, p. 2.5.3]);
    • the work of many on blind channel identification/equalization (generalized from stationary to cyclostationary channel inputs, thereby enabling measurement of phase as well as magnitude of the transfer function [6, p. 2.5.3]);
    • Andrei Kolmogorov’s theory of Stochastic Processes [12], for which Gardner’s work creates a parsimonious alternative, based on his concept of (almost) periodic fraction-of-time probability [6, p. 3], which is based on Gardner’s non-population alternative to relative frequency for cyclostationary time series (a generalization of Hofstetter’s early work on stationary time series [13],[14]);
    • Thorvald Thiele’s introduction in the late 19th Century of semi-invariants, later termed cumulants by Ronald Fisher and John Wishart [15] (complemented with Gardner’s FOT-probability definition of the cumulant, which introduces an entirely innovative meaning of the cumulant of non-population cyclostationary and almost cyclostationary time series [6, p. 2.1]);
    • Arthur Schuster’s introduction of the periodogram [16] (extended, together with the work of many on periodogram-based methods for power spectral density estimation, and generalized to cyclic-periodogram-based methods for spectral correlation density estimation [2, p. 331], [3, p. 385]);
    • Karl Pearson’s classic Method of Moments (MoM) [18] from the turn of the 19th Century (1894), and Lars Hansen’s 1982 Generalized MoM [19] (complemented with a radically new MoM for model parameter estimation [17], [6, p. 11.4]);
    • George Birkhoff’s ergodicity theorem for discrete-time stationary processes (generalized to the cycloergodicity theorem for discrete- and continuous-time processes exhibiting almost cyclostationarity and its special cases and, more generally, asymptotically-mean almost cyclostationary processes and their special cases).

    The only other comprehensive treatment of Gardner’s theory of cyclostationarity appeared over three decades after publication of his two books: an unusually scholarly and encyclopedic 2019 book by Antonio Napolitano [20], who cites Gardner’s founding work over 580 times.

    The treatment of statistical time-series analysis in Gardner’s two mid-1980s books is the first to argue at length that cyclostationarity modeling of time-series data was missing from the preceding century of work on hidden periodicities and a half century of work on disturbed harmonics; and that, from the mid-20th Century on, there is no apparent reason for this shortcoming in the development of time-series models and analysis other than the convenience of the availability of a mathematical theory of stationary stochastic processes and the fact that a seemingly harmless technique promoted by Blackman and Tukey in 1958 [2, page 357] can be used to render stationary a stochastic process otherwise exhibiting what became known as cyclostationarity—a property that had been intentionally avoided following Kolmogorov’s introduction of stochastic processes three decades earlier. This “harmless” technique, called phase randomization [21], can in fact be quite harmful: it yields higher-than-minimum-Bayes-risk statistical inferences based on the time series and its stationarized model [2], [3], and it masks key properties, such as spectral correlation, the separability of spectral correlation among additive mixtures of time series (often referred to as signals), and the separability of such signals themselves, as well as more general insight into statistical inference involving cycles [22]-[24].
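
    A small simulation can make the masking effect of phase randomization concrete (an illustrative sketch with invented parameters and helper names, not an analysis from [2], [3], or [21]): a noise series with periodically varying amplitude has a clearly nonzero cyclic feature, but averaging that feature over realizations whose cycle phase is randomized uniformly over the period cancels it exactly, stationarizing the model and hiding its spectral correlation.

```python
import numpy as np

rng = np.random.default_rng(2)
P, N = 8, 8192
n = np.arange(N)

def cyclic_feature(x, alpha):
    """Time-averaged lag product at lag 0, demodulated at cycle frequency alpha."""
    return np.mean(x ** 2 * np.exp(-2j * np.pi * alpha * n))

def make_series(shift):
    """Cyclostationary realization: stationary white noise with a
    periodically time-varying amplitude, cycle phase offset by `shift`."""
    amplitude = 1.0 + 0.5 * np.cos(2 * np.pi * (n - shift) / P)
    return amplitude * rng.standard_normal(N)

# Fixed cycle phase: the cyclic feature at alpha = 1/P is clearly nonzero
# (its theoretical value is 0.5 for these parameters).
fixed = cyclic_feature(make_series(0), 1 / P)

# Phase randomization: averaging over all shifts of the period cancels
# the cyclic feature exactly, leaving only estimation noise.
randomized = np.mean([cyclic_feature(make_series(s), 1 / P) for s in range(P)])
```

The single fixed-phase realization retains a strong cyclic feature, while the phase-randomized ensemble average is near zero, which is precisely the property masking the essay describes.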

    In Wold’s 1968 encyclopedia article [1], it is acknowledged that interest in the study of cyclicity waned following the transition from classical time-series analysis to the stationary stochastic process framework. This was an unfortunate setback in time series analysis of cycles that Gardner attributes to what Professor James Massey [6, p. 9.1] referred to as “the stochastic process bandwagon” in his review of the book [3] (cf. [6, p. 4]).

    Almost thirty years after writing the treatise [3] on Regular Cyclostationarity (including Regular Almost Cyclostationarity), Gardner gave consideration to the complementary class, which he termed Irregular Cyclostationarity (and Irregular Almost Cyclostationarity). The original 2015 unpublished version of the 2018 publication [25], as well as the 2018 article itself, revealed how to extend the cyclostationarity paradigm from regular to irregular cyclostationary time series. Irregular cyclostationarity is predominant in scientific data of natural origin, in contrast to engineering data, where cyclostationarity is often “manufactured” and intentionally made regular. Examples arise in communication systems design and analysis, where cyclicity in otherwise stationary data is intentionally introduced at the transmitter so that it can be used to advantage at the receiver for extracting the information content of the transmitted signal. Other examples arise in rotating-machine monitoring and fault diagnosis, where cyclicity is unavoidably introduced by motions of machine components such as rotating crankshafts, reciprocating pistons, and revolving bearings in internal combustion engines and electric motors and generators, including hydroelectric and wind turbines. This field of application of Gardner’s theory was first proposed by Gardner in [3] and has since become a major field of study based on his theory.

    Irregular cyclicity is specifically defined in [25] to be regular cyclicity after it has been subjected to time warping. This excludes other forms of departure from exact cyclicity that cannot be so modeled (e.g., the pace-irregular pulsed time series described in [25], which arise in rotating machinery with time-varying but non-periodic rpm). Irregular cyclicity is the more tractable departure of the two because it is amenable to mathematical modeling in terms of regular cyclostationarity and because irregular cyclostationary time series can be exactly or approximately converted to regular cyclostationarity. This work applies also to irregular almost cyclostationarity, provided that all cycles are subjected to the same time warping. The model for irregular cyclostationarity generalizes the concept of a cycle, which by definition is an exact periodicity, to a special type of irregular cycle, a time-warped periodicity. This significantly broadens the types of phenomena that can be advantageously modeled and predicted.

    Not available in the open academic literature are the decades of work by Gardner and his research team applying his theory of cyclostationarity to the development of signal processing algorithms for signals intelligence for purposes of national security (cf. [26], [6, p. 12]). This work, reported in numerous treatises prepared for the government, revolutionized this field of study, resulting in significant improvements in signals-intelligence capability [6, quotation by Nelson Blackman on page 9.1]. Gardner’s theory is also a cornerstone of today’s work on spectrum sensing and management by spectral correlation analysis (cyclic spectrum analysis), as well as power spectral density estimation, for cognitive radio systems [6, p. 11.1], [27].

    Most recently, in 2024, Gardner, in collaboration with Napolitano, applied the FOT-probability model—as an alternative to the stochastic process model—to revisit the pros and cons of a spectrum estimation technique known as the multi-taper method (MTM), originally introduced by D. J. Thomson in 1982 [28], in comparison with classical methods (CMs) based on time averaging and/or frequency smoothing of periodograms. The results of this work contradict the literature on this subject, where superiority of the MTM over CMs is claimed [27]. This work illustrates the benefits achievable through simplified conceptualization by replacing the unnecessarily abstract stochastic process model with Gardner’s parsimonious FOT probability model.
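
    As a hedged illustration of the classical time-averaging methods referred to above (a generic Bartlett-style sketch with invented parameters and helper names, not the specific estimators compared in the 2024 work), averaging periodograms over segments trades frequency resolution for a large reduction in estimator variance:

```python
import numpy as np

def raw_periodogram(x):
    """Single periodogram: unbiased for white noise, but its variance
    is on the order of the squared PSD (it does not average down)."""
    return np.abs(np.fft.rfft(x)) ** 2 / len(x)

def averaged_periodogram(x, L):
    """Classical method (CM): split the record into length-L segments and
    average their periodograms; variance drops roughly as 1/(#segments)."""
    K = len(x) // L
    segments = x[: K * L].reshape(K, L)
    return (np.abs(np.fft.rfft(segments, axis=1)) ** 2 / L).mean(axis=0)

rng = np.random.default_rng(3)
x = rng.standard_normal(1 << 15)    # white noise; true PSD = 1 everywhere

raw = raw_periodogram(x)            # fluctuates wildly about 1
avg = averaged_periodogram(x, 256)  # 128 segments: much smoother estimate
```

For white noise the raw periodogram bins fluctuate with a standard deviation comparable to the PSD itself, whereas the 128-segment average concentrates tightly around the true flat spectrum.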

    Conclusion

    Many fields of application of the theory and methodology of (almost) cyclostationarity, other than those mentioned above, are listed in [6, pages 1, 4] and [20, chaps 7 (sec. 6), 9, 10]. A concise list of Gardner’s specific mathematical contributions to the theory and methodology based on his FOT-probability approach to non-population time series exhibiting cyclostationarity is provided in Appendix 4, which also addresses applications. Other unifying contributions to time series analysis made by Gardner also are outlined in Appendix 4. A detailed list of applications of cyclostationarity to a variety of fields of science and engineering is presented in Appendix 5.

    The concise review provided by this essay illustrates Gardner’s unusual approach to furthering our understanding of theory and methodology for statistical time-series analysis. To quote the late Enders A. Robinson, past Professor of Geophysics at Columbia University, past Member of the National Academy of Engineering, and one of the most highly honored scientists in the field of geophysics, from his letter of reference on behalf of Professor Gardner to a Department Chairperson at the University of California, Davis [6, p. 9.1]:

    From time to time it is good to look back and see in perspective the work of those people who have made a difference in the engineering profession.  One of the important members of this group is William A. Gardner.

    Professor Gardner has the ability to impart a fresh approach to many difficult problems. William is one of those few people who can effectively do both the analytic and the practical work required for the introduction and acceptance of a new engineering method. His general approach is to go back to the basic foundations and lay a new framework. This gives him a way to circumvent many of the stumbling blocks confronted by other workers . . .

    I am particularly impressed by the fundamental work in spectral analysis done by Professor Gardner. Whereas most theoretical developments make use of ensemble averages, he has gone back and reformulated the whole problem in terms of time-averages. In so doing he has discovered many avenues of approach which were either not known or neglected in the past. In this way his work more resembles some of the outstanding mathematicians and engineers of the past. This approach took some courage, because generally people tend to assume that the basic work has been done, and that no new results can come from re-examining avenues that had been tried in the past and then dropped. William’s success in the approach shows the strength of his engineering insight. He has been able to solve problems that others have left as being too difficult. It is this quality that he so well imparts to his students, who have gone forth and solved important and far-reaching problems in their own right.

    To provide a more concrete perspective on the substantial work on cycles that preceded the breakthrough in the mid-1980s, the reader is referred to [1]; also, a concise summary of the treatise [1] is provided in Appendices 1 – 3.

    This essay is concluded here with a coarse timeline of the progression of recorded thought about cycles and corresponding data models from basic interest in cycles to the most sophisticated mathematical models yet to be devised:

      • 2000 BC: Interest in the General Notion of Cycles (see excerpt from Wold [1] in the first paragraph of the present essay)
      • 1700s AD: Hidden Periodicities (Euler, Lagrange; see [3, p. 2])
      • 1927: Disturbed Harmonics (Yule; see [3, p. 14])
      • 1975 – 1978: Precursor to Regular (Almost) Cyclostationarity (Gardner; see [5], [21])
      • 1985 – 1987: Regular (Almost) Cyclostationarity (first in-depth treatises: Gardner; see [2, Chap 13], [3, Part II])
      • 2015 – 2018: Irregular Cyclostationarity (Gardner; see [25] and, for follow-on work, Napolitano [29])
      • 2025: The Cycloergodicity Theorem (Gardner; see [30])

    APPENDIX 1. H.O.A. Wold on Cycles