Table Of Contents

11. Contributions to Assorted Topics in Time-Series Analysis and Signal Processing

  • 11.1 Optimization of Statistical Performance of Spectral Correlation Analyzers and Spectrum Analyzers

    As an example suggesting that the unnecessary use of abstract stochastic process models can interfere with conceptualization and with progress in developing methodology and even performance analysis, a theoretical development in the non-stochastic probabilistic analysis of methods of statistical spectral analysis is reviewed here. The source of this non-stochastic characterization of the problem of designing quadratic signal processors for estimation of ideal statistical spectral densities (and, more generally, cross-spectral densities and spectral correlation densities) is Section C of Chapter 5 and Section B of Chapter 15 of the 1987 book [Bk2].

    As shown in [Bk2], essentially all traditional and commonly used methods of direct statistical spectral analysis, which excludes indirect methods based on data modeling reviewed on Page 11.2, can be characterized as quadratic functionals of the observed data which, in the most general case, slide along the time series of data. These functionals are, in turn, characterized by the weighting kernels of quadratic forms in both cases of discrete-time and continuous-time data. It is further shown that the spectral resolution, temporal resolution, and spectral leakage properties of all these individual spectrum estimators (collectively causing estimator bias) and the reliability (variance and coefficient of variation) of these estimators are characterized in terms of properties of these kernels. These kernels are explicitly specified by the window functions used by the specific spectrum estimators, including data tapering windows, autocorrelation tapering windows, time-smoothing and frequency-smoothing windows, and variations on these. With the use of the simple tabular collection of these kernels, Table 5-1 in [Bk2], all the direct spectrum estimators are easily qualitatively and quantitatively compared. With this general approach, the previously often-confusing (judging from the literature) comparison of spectrum estimators is rendered transparent. 
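    The characterization of a spectrum estimator as a quadratic functional of the data can be illustrated with the simplest case. The following is a minimal sketch, assuming discrete-time real data and a Hann data-tapering window chosen only for illustration; the function names are hypothetical and are not taken from [Bk2]. It checks numerically that a tapered periodogram at one frequency equals a quadratic form in the data whose kernel is fixed entirely by the window.

```python
import numpy as np

def tapered_periodogram(x, taper, f):
    """Tapered periodogram of x at normalized frequency f (cycles/sample)."""
    n = np.arange(len(x))
    X = np.sum(taper * x * np.exp(-2j * np.pi * f * n))
    return np.abs(X) ** 2 / np.sum(taper ** 2)

def periodogram_kernel(taper, f):
    """Kernel K for which the same estimate is the quadratic form x^T K x."""
    n = np.arange(len(taper))
    v = taper * np.exp(-2j * np.pi * f * n)   # tapered complex sinusoid
    return np.outer(v.conj(), v) / np.sum(taper ** 2)

rng = np.random.default_rng(0)
x = rng.standard_normal(64)                   # illustrative data record
taper = np.hanning(64)                        # illustrative data-tapering window
f = 0.1

direct = tapered_periodogram(x, taper, f)
K = periodogram_kernel(taper, f)
quadratic = np.real(x @ K @ x)                # the quadratic form is real for Hermitian K
print(direct, quadratic)
```

    Changing the window changes only the kernel, which is why properties such as resolution and leakage can be read off from the kernel rather than from the particular estimator.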

    That being said, the question that arises in my mind is: If the standard formulation of spectrum estimation based on the foundation of stochastic processes is so wonderful as to have been universally adopted, why was this reduction of the estimator design and performance analysis problems to a trivial exercise never achieved within the stochastic process framework? This achievement within the fraction-of-time probability framework (or, simply, the time-average framework) was reached in 1987, after the theory and so-called methodology within the stochastic process framework had apparently matured . . . and left most users perplexed about the true role, interaction, and quantitative impact of operations such as time-smoothing, frequency smoothing, data tapering, autocorrelation tapering, direct Fourier transformation of data, indirect Fourier transformation of autocorrelation functions, time-hopped time averaging vs continuously sliding time averaging, and more.

    As best I can tell, this transparent theory and methodology remains unknown 35 years later to most users, most likely because it is based on the non-traditional (but I claim superior in many ways) non-stochastic framework of data modeling.

    Given Table 5-2 and the results in the above-cited chapter sections of [Bk2], which quantitatively compare all the direct methods of spectrum analysis, one can set about optimizing the design of a spectrum analyzer so as to achieve a desired level of temporal resolution, spectral resolution, spectral leakage, and reliability within the limits of this entire class of estimators. One can easily see the tradeoffs among all these performance parameters and select what one considers the optimal tradeoff for the particular application at hand. This is done by selecting, from among existing catalogs of window functions, particular windows for the operations of time-smoothing, frequency smoothing, data tapering, and autocorrelation tapering. Yet, evidence of this achievement in [Bk2] being used in practice has never been brought to my attention.

    It should be mentioned before concluding this overview that spectrum estimators, and especially spectral correlation analyzers, which typically require the computation of many cross-spectra for a single record of data, have another performance characteristic: computational cost and data storage requirements. These performance parameters are not characterized in Table 5-1 mentioned above. Evaluating them is generally more challenging than the performance evaluation discussed above, especially given Table 5-1. Nevertheless, thorough analysis of the competing algorithms for spectral correlation analysis (also called cyclic spectrum analysis) has been reported in the literature on cyclic spectrum analysis, dating back to the seminal paper by my colleagues R. Roberts, W. Brown, and H. Loomis in the 1991 special issue of the IEEE Signal Processing Magazine [JP36], pp. 38 to 49.

  • 11.2 FOT-Probability Theory of Parametric Spectrum Analysis

    The theory and methodology of parametric spectrum analysis developed over a long period of time, with increased emphasis during later periods on observations of data containing multiple sine waves with similar frequencies, i.e., frequencies whose differences are comparable to or smaller than the reciprocal of the observation time, particularly when only one time record of data is available. After early work in time-series analysis on what was called the “problem of hidden periodicities” (see Page 4.1) using non-probabilistic models, a concerted effort based on the use of stochastic process models led to a substantial variety of methods, particularly for high-resolution spectral analysis (resolving spectral peaks associated with additive sine waves with closely spaced frequencies). This effort is another example of methodology development based on unnecessarily abstract data models; that is, stochastic process models that mask the fact that ensembles of sample paths and abstract probability measures on these ensembles (sample spaces) are completely unnecessary in the formulation and solution of the problems addressed (cf. Page 3).

    The first, and evidently still the only, comprehensive treatment of this methodology within the non-stochastic framework of Fraction-of-Time Probability Theory is presented in Chapter 9 of the book [Bk2]. The treatment provided covers the following topics:

    • Autoregressive Modeling Theory
      • Yule-Walker Equations
      • Levinson-Durbin Algorithm
      • Linear Prediction
      • Wold-Cramer Decomposition
      • Maximum-Entropy Model
      • Lattice Filters
      • Cholesky Factorization
    • Autoregressive Methods
      • Least-Squares Procedures
      • Model-Order Determination
      • Singular-Value Decomposition
      • Maximum-Likelihood
    • Autoregressive Moving Average Methods
      • Modified Yule-Walker Equations
      • Estimation of the AR parameters
      • Estimation of the MA parameters
    • Experimental Study
      • Periodogram Methods
      • Minimum-Leakage Method
      • Yule-Walker, Burg, and Forward-Backward Least-Squares AR Methods
      • Overdetermined-Normal-Equations AR Method
      • Singular-Value-Decomposition AR Method
      • Hybrid Method

    The extensive comparison of methods in the experimental study leads to the conclusion that, in general, for data having spectral densities including both smooth parts and impulsive parts (spectral lines), the best performing methods are hybrids of direct methods based on processed periodograms and indirect methods based on model fitting. A well-designed hybrid method can take advantage of the complementary strengths of both direct and indirect methods.
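    To make the autoregressive topics in the list above concrete, the following is a minimal sketch of a Yule-Walker AR fit solved with the Levinson-Durbin recursion. In keeping with the non-stochastic framework, the autocorrelation is computed as a time average over a single data record. The function names, the test signal, and the model order are assumptions made for illustration; this is not code from [Bk2].

```python
import numpy as np

def time_average_autocorr(x, max_lag):
    """Biased time-average autocorrelation of one record, lags 0..max_lag."""
    N = len(x)
    return np.array([np.dot(x[:N - k], x[k:]) / N for k in range(max_lag + 1)])

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations recursively.

    Returns (a, err) with a[0] = 1, for the prediction-error model
    x[n] + a[1] x[n-1] + ... + a[p] x[n-p] = e[n].
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + np.dot(a[1:m], r[m - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for i in range(1, m):
            a[i] = a_prev[i] + k * a_prev[m - i]
        a[m] = k
        err *= (1.0 - k * k)                # updated prediction-error power
    return a, err

# Illustrative record: x[n] = 1.5 x[n-1] - 0.8 x[n-2] + e[n]
rng = np.random.default_rng(1)
N = 20000
e = rng.standard_normal(N)
x = np.zeros(N)
for n in range(2, N):
    x[n] = 1.5 * x[n - 1] - 0.8 * x[n - 2] + e[n]

r = time_average_autocorr(x, 2)
a, err = levinson_durbin(r, 2)
print(a, err)                               # a should be close to [1, -1.5, 0.8]
```

    The recursion yields all lower-order models along the way, which is one reason it pairs naturally with model-order determination.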

  • 11.3 Bayesian Imaging of RF Sources

    Content in preparation, 10 July 2020.

    • 11.3.1 Star Ranging by Interplanetary-Baseline Interferometry

      In the 1980s, I (WCM) proposed applying cyclostationarity to astronomy and astrophysics in cases where the star’s RF emission itself exhibits cyclostationarity, and during that period I also introduced the original methods for reduction and excision of cyclostationary RFI (see page 2.5). My latest application to astronomical data processing technology, however, is focused on my exploratory concept of Interplanetary Baseline Interferometry (IBI), which is not (presently) based on any form of exploitation of cyclostationarity. Earth-based VLBI (Very-Long-Baseline Interferometry) is presently used at the Jet Propulsion Laboratory, together with the Deep Space Network, to obtain the most accurate possible position and velocity measurements on space probes, but interplanetary interferometry has evidently not yet (to my knowledge) been investigated. The two primary technological challenges presented by IBI may be 1) the required accuracy in knowledge of the position and velocity of a space probe with an antenna pointing away from the Solar System (possibly a satellite orbiting Mars), with the other IBI antenna being located on Earth (or an Earth satellite, to avoid terrestrial RFI); and 2) the objective of using two antennas with the largest possible apertures, synthesized or not. Although the size and the stability of location and orientation of Earth-based antennas probably outstrip those of satellite-based antennas today, it is conceivable that existing orbit determination and orbit reconstruction (performed after the data has been received and stored) may hold promise for rendering IBI practically feasible. Today, NASA reports that solar system navigation accuracies for space probes as fine as hundredths of a millimeter per second in velocity coordinates and tens of nanoradians in LOB (Line of Bearing) coordinates are being achieved by using VLBI and quasar reference stars whose LOBs are known to great accuracy.

      (Note: one advantage of placing the space probe that carries one of the two IBI antennas on a prescribed flight path other than a Mars orbit is that a greater diversity of baseline orientations can be achieved over time. The primary limitation of the space probe seems to come from the challenges presented by the distance of the probe from Earth: the greater this distance is, the higher the attainable IBI accuracy is, but the more difficult the telemetry challenge becomes, both for relaying the star signal to Earth and for maintaining accurate knowledge of probe location and velocity.)

      While JPL’s VLBI is envisioned as likely being required to meet Challenge 1), without which IBI would not be possible, the signal processing software of this VLBI capability can also potentially be repurposed for IBI. The challenge of IBI is perhaps the greatest yet faced for any form of RF interferometry because of the immense distance of the radio sources of primary interest; the range is the most salient challenge because of the accuracy requirements it imposes on the LOBs (even if the LOBs are not explicitly calculated). The presently attainable LOB accuracy of measurements using optical telescopes and the parallax method quickly becomes inadequate as star range increases, even when the two snapshots from the antenna on Earth are taken half a year apart, making the length of the baseline between the two antenna positions twice the distance between Earth and Sun, 300 million km! For example, if the longest range for which LOB accuracies are adequate could be extended to 16 light years, there would be only 50 stars to investigate out of one trillion stars in the Milky Way Galaxy. However, 16 light years is 1.5 x 10^14 km, making the LOB about 500,000 times longer than the 300 million km baseline! This makes the point of intersection of the two LOBs very sensitive to the tiny angle between them.
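      The arithmetic behind these figures can be checked in a few lines. This is a rough sketch under a simple assumption that is not part of the text above: the star and the two antenna positions form a long thin triangle, so that range is approximately baseline divided by the angle between the two LOBs. The 10-nanoradian angle error is a hypothetical value used only to show the sensitivity.

```python
KM_PER_LIGHT_YEAR = 9.4607e12          # kilometres in one light year
baseline_km = 3.0e8                    # twice the Earth-Sun distance, as in the text
range_km = 16 * KM_PER_LIGHT_YEAR      # range of 16 light years

ratio = range_km / baseline_km         # how much longer the LOB is than the baseline

# Thin-triangle (small-angle) model: range ~ baseline / angle.
angle_rad = baseline_km / range_km     # angle between the two LOBs

# Differentiating range = baseline / angle gives a relative range error of
# about (range / baseline) * angle_error for a small angle error.
angle_error_rad = 10e-9                # hypothetical 10 nanoradian error
relative_range_error = ratio * angle_error_rad

print(f"range = {range_km:.3e} km, ratio = {ratio:.0f}")
print(f"angle between LOBs = {angle_rad:.3e} rad")
print(f"relative range error = {relative_range_error:.2%}")
```

      Under this model the relative range error grows in proportion to the range-to-baseline ratio, which is the sense in which the intersection point is sensitive to the angle between the LOBs.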

      Without getting into the somewhat complicated details of IBI signal processing, the essence of the method can be fairly easily understood as follows. Interferometry means 1) receiving propagating waves (typically plane waves but possibly spherical waves from nearby sources) from a source of radiating energy at two sensors separated in space by a known distance, with a known orientation of the line between these sensors (baseline length and orientation), and (for long enough wavelengths) converting these waves to voltage signals in an electric circuit; 2) forming the difference of these signals while inserting an adjustable relative time delay (and possibly a relative frequency shift or time dilation) between them; and 3) minimizing the time-averaged power of this difference signal with respect to the adjustable parameters. From the values of the parameters at which a minimum is reached, and knowledge of the baseline length and orientation, the LOB from the source to each of the sensors and the velocity of the source can be calculated, even if the baseline is moving in a known manner. The propagating energy can be visible light, ultraviolet light, infrared light, radio-frequency waves, sound waves, etc. However, the technology used for interferometry depends on the wavelength of the propagating waves. The parameter adjustments needed are far more technically viable for RF than they are for visible light, because of the possibility of conversion to voltage signals in an electric circuit, after which digital signal processing technology can be used, implemented in either hardware or software. Interestingly, it is easily shown by expanding the square in the time-averaged power measurement on the difference signal that minimizing the time-averaged power is the same as maximizing the cross-correlation of the two signals.

      This cross-correlation is a function of the inserted relative delay and the inserted relative frequency shift or time dilation, and it is called the cross-ambiguity function (CAF). When these two parameters are replaced with mathematical models of the dependence of the negative of the actual time difference and frequency difference (or the reciprocal of the time-dilation ratio) on the source location and on the baseline position and orientation (and possible velocities), the CAF is said to be spatially registered. If the source is known to be confined to some surface or line in space, the name of that surface or line is used as a designation. For example, if the source is on Earth’s surface, we have geo-registration; if it is on a known LOB, it is called LOB-registration. The primary challenge of IBI, besides the hardware (the antennas, the space probe(s), and the rockets required to launch the probes), is attaining the needed signal processing accuracies in the spatial registration function as determined by measurements of the actual antenna positions and velocities and measurements of the time delays and frequency shifts (or dilations) induced between the antennas and the central data processing station during data transfer, including telemetry.
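      The equivalence between minimizing the difference power and maximizing the cross-correlation can be written out as follows. The notation is an assumption introduced here for illustration: \( x(t) \) and \( y(t) \) are the two (possibly complex-valued) sensor signals, \( \tau \) and \( \nu \) are the inserted delay and frequency shift, \( y_{\tau, \nu}(t)=y(t-\tau) e^{i 2 \pi \nu t} \), and \( \langle\cdot\rangle \) denotes a time average over a long record. Expanding the square gives

        \[ \left\langle\left|x(t)-y_{\tau, \nu}(t)\right|^{2}\right\rangle=\left\langle|x(t)|^{2}\right\rangle+\left\langle\left|y_{\tau, \nu}(t)\right|^{2}\right\rangle-2 \operatorname{Re}\left\langle x(t) y_{\tau, \nu}^{*}(t)\right\rangle . \]

      For a sufficiently long averaging time, the first two terms do not depend on \( \tau \) or \( \nu \), because neither a delay nor a frequency shift changes the average power of a signal. Minimizing the left-hand side over \( (\tau, \nu) \) is therefore the same as maximizing the real part of the cross-correlation \( \left\langle x(t) y_{\tau, \nu}^{*}(t)\right\rangle \). This sketch covers only the delay and frequency-shift case; the time-dilation case is not treated here.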

      The IBI method described above can be concisely identified by simply providing it with a sufficiently descriptive name: Interplanetary-Baseline Interferometric Synthetic-Aperture Passive-RADAR (IBI-SAPR).

      Coming Next on This Page: Approximations of and bounds on attainable accuracy for star ranging using IBI-SAPR, as a function of star range.

    • 11.3.2 Bayesian Theoretical Basis for Source Location

      (Includes derivation of statistically optimum solution described on page 11.3.1). In preparation.

    • 11.3.3 Bayesian Containment Regions

      Content in preparation, 10 July 2020.

  • 11.4 A Radically Different Method of Moments

    This page presents a way to use the Bayesian Minimum-Risk Inference methodology subject to structural constraints on the functionals of available time-series data to be used for making inferences.

    Because the approach of minimizing risk subject to such a constraint is not tractable and, in fact, is even less tractable than unconstrained minimum-risk inference, an alternative suboptimum method is developed. This method produces minimum-risk (i.e., minimum-mean-squared-error) structurally constrained estimates of the required posterior probabilities or PDFs, and then uses these estimates as if they were exact in the standard Bayesian methodology for hypothesis testing and parameter estimation. Since all the computational complexity in the Bayesian methodology is contained in the computation of the posterior probabilities or PDFs, this approach to constraining the complexity of computation is appropriate and it is tractable. It requires only inversion of linear operators, regardless of the nonlinearities allowed by the structural constraint. The dimension of the linear operators does however increase as the polynomial order of the allowed nonlinearities is increased.
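    One way to see why only a linear inversion is needed is the following minimal sketch. It is an illustration under stated assumptions, not the method presented on the linked page: a binary hypothesis test on a scalar observation, a structural constraint that allows polynomials in the data up to a chosen order, and sample averages standing in for the required moments. The minimum-mean-squared-error polynomial fit to the hypothesis indicator serves as the constrained estimate of the posterior probability, and it is then used as if it were exact. All names are hypothetical.

```python
import numpy as np

def poly_features(x, order):
    """Columns 1, x, x^2, ..., x^order (the allowed nonlinearities)."""
    return np.vander(x, order + 1, increasing=True)

def fit_posterior(x, labels, order):
    """Weights of the MMSE polynomial estimate of P(H1 | x)."""
    Phi = poly_features(x, order)
    R = Phi.T @ Phi / len(x)          # moment matrix, (order+1) x (order+1)
    p = Phi.T @ labels / len(x)       # cross-moments with the H1 indicator
    return np.linalg.solve(R, p)      # the only inversion required

# Illustrative data: unit-variance noise with mean -1 under H0 and +1 under H1.
rng = np.random.default_rng(2)
n = 50000
labels = (rng.random(n) < 0.5).astype(float)
x = rng.standard_normal(n) + 2.0 * labels - 1.0

w = fit_posterior(x, labels, order=3)
posterior_hat = poly_features(x, 3) @ w

# Use the estimate as if it were the exact posterior: decide H1 when it exceeds 1/2.
decisions = posterior_hat > 0.5
error_rate = np.mean(decisions != (labels > 0.5))
print(w, error_rate)
```

    The linear system has one unknown per polynomial term, so its dimension grows with the allowed polynomial order, while the solve itself stays linear no matter how nonlinear the resulting function of the data is.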

    A full presentation of this different method of moments can be viewed here.

  • 11.5 Cyclic Point Processes, Marked and Filtered

    Content in preparation, 10 July 2020.

  • 11.6 Exploiting Spectral Redundancy of Frequency Modulated Signals

    The initial concept discussed here is to use the approximation of narrowband FM by DSB-AM plus quadrature carrier, and then exploit the 100% spectral redundancy of DSB-AM, due to its cyclostationarity, to suppress co-channel interference.

    The simplest version of the problem addressed by this approach is that for which interference exists only on one side of the carrier. Nevertheless, it is possible to correct for interference on both sides of the carrier, provided that the set of corrupted sub-bands on one side of the carrier has a frequency-support set with a mirror image about the center frequency that does not intersect the set of corrupted sub-bands on the other side of the center frequency.
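    The mirror-image condition just stated can be checked mechanically. The following is a small sketch, assuming each corrupted sub-band is given as a (low, high) frequency interval; the function names and the example band edges are hypothetical.

```python
def mirror(bands, center):
    """Mirror image of frequency intervals about the center frequency."""
    return [(2 * center - hi, 2 * center - lo) for lo, hi in bands]

def overlaps(bands_a, bands_b):
    """True if any interval in bands_a intersects any interval in bands_b."""
    return any(lo_a < hi_b and lo_b < hi_a
               for lo_a, hi_a in bands_a
               for lo_b, hi_b in bands_b)

def correctable(lower_corrupted, upper_corrupted, center):
    """Condition from the text: the mirrored lower-side corruption must not
    intersect the upper-side corruption."""
    return not overlaps(mirror(lower_corrupted, center), upper_corrupted)

center = 100.0                                   # carrier frequency, arbitrary units
print(correctable([(90.0, 93.0)], [(103.0, 106.0)], center))  # complementary bands
print(correctable([(90.0, 93.0)], [(108.0, 112.0)], center))  # mirrored bands collide
```

    When the condition holds, every corrupted sub-band has an uncorrupted mirror-image sub-band on the other side of the carrier from which it can, in principle, be replaced.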

    For WBFM signals, in order to meet the conditions under which FM is approximately equal to DSB-AM plus a quadrature carrier, we must first pass the signal through a frequency divider, which divides the instantaneous frequency of the FM signal by some integer that is large enough to ensure that the phase deviation due to modulation is much smaller than \( 2 \pi \). That is, \( s(t)=\sin \left[\int \omega(t) d t\right] \) must be converted to \( \tilde{s}(t)=\sin \left[\frac{1}{n} \int \omega(t) d t\right] \) (or a cosine) for some positive integer \( n \). Here \( \omega(t) \) is the instantaneous frequency; that is, it is the time derivative of the time-varying angle of the sinusoid.

    If \( \omega(t)=2 \pi f_{o}+b(t) \), where \( \int b(t) d t=a(t) \), then

        \[ \tilde{s}(t)=\sin \left[\frac{1}{n} \int \omega(t) d t\right]=\sin \left[\frac{2 \pi f_{o}}{n} t+\frac{1}{n} a(t)\right] \]

    and, by the angle-sum identity for the sine,

        \[ \begin{aligned} &\sin \left[\frac{2 \pi f_{o}}{n} t+\frac{1}{n} a(t)\right]=\cos \left[\frac{1}{n} a(t)\right] \sin \left[\frac{2 \pi f_{o}}{n} t\right]+\sin \left[\frac{1}{n} a(t)\right] \cos \left[\frac{2 \pi f_{o}}{n} t\right] \\ &\cong \sin \left[\frac{2 \pi f_{o}}{n} t\right]+\frac{1}{n} a(t) \cos \left[\frac{2 \pi f_{o}}{n} t\right] \end{aligned} \]

    where the approximation is close if

        \[ \frac{1}{n}|a(t)| \ll 2 \pi . \]
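    How quickly the approximation improves with the division ratio can be checked numerically. The sketch below uses the first-order form \( \sin (\theta+\varphi) \cong \sin \theta+\varphi \cos \theta \), stated here without any overall scale factor, with \( \theta=2 \pi f_{o} t / n \) and \( \varphi=a(t) / n \). The peak phase deviation of 5 radians is a hypothetical value chosen only for illustration.

```python
import numpy as np

def approximation_error(peak_phase, n, num_points=2000):
    """Largest error of sin(theta + a/n) ~ sin(theta) + (a/n) cos(theta)
    over one carrier cycle, with a(t) held at its peak value (worst case)."""
    theta = np.linspace(0.0, 2.0 * np.pi, num_points)
    phi = peak_phase / n
    exact = np.sin(theta + phi)
    approx = np.sin(theta) + phi * np.cos(theta)
    return np.max(np.abs(exact - approx))

errors = {n: approximation_error(peak_phase=5.0, n=n) for n in (1, 10, 100)}
for n, err in errors.items():
    print(f"n = {n:3d}: max error = {err:.2e}")
```

    For small \( \varphi \) the error is close to \( \varphi^{2} / 2 \), so each tenfold increase in \( n \) reduces it by roughly a factor of one hundred.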

    If we started out with the signal plus corruption \( x(t)=s(t)+c(t) \) then, at the output of the frequency divider, we have \( y(t)=\tilde{s}(t)+\tilde{c}(t)+e(t) \), where \( \tilde{c}(t) \) is the frequency-divided version of \( c(t) \) and \( e(t)=y(t)-\tilde{s}(t)-\tilde{c}(t) \) is the residual error.

    Unfortunately, \( e(t) \) is not easy to characterize, because a frequency divider is not linear. There are various methods for dividing frequency, one of which is described as follows: at every up-crossing (a zero crossing with a positive slope) of an input FM signal, the divider outputs a fixed signal level and holds that output level until the \( m \)-th up-crossing, then switches the polarity of that level and holds it until the \( 2m \)-th up-crossing, switches the polarity again, and so on. This produces a square wave with fundamental frequency \( f_{o} / n \), where \( f_{o} \) is the frequency of the input FM signal during those up-crossings and \( n=2 m \). Alternatively, the polarity of the output square wave can be switched at every \( n \)-th zero crossing (regardless of whether these are up-crossings or down-crossings). The final step for the divider is to pass this square wave through a bandpass filter that passes the fundamental-frequency component and rejects all higher-order harmonics. Yet another alternative is to select the bandpass filter to pass only the 3rd harmonic (symmetrical square waves contain only odd-order harmonics). Then \( n \) will be only \( 1 / 3 \) of the \( n \) for the fundamental-frequency component.
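    The up-crossing divider described above can be simulated in a few lines. The following is a minimal sketch under simplifying assumptions: an unmodulated, noise-free input tone, a hypothetical sample rate and carrier frequency, and no bandpass filter (the output spectrum is simply inspected instead). The function name is hypothetical.

```python
import numpy as np

def divide_frequency(x, m):
    """Square wave whose polarity toggles at every m-th up-crossing of x,
    so that the division ratio is n = 2m."""
    up = (x[:-1] < 0.0) & (x[1:] >= 0.0)     # up-crossing between samples k and k+1
    count = np.cumsum(up)                     # number of up-crossings seen so far
    level = np.where((count // m) % 2 == 0, 1.0, -1.0)
    return np.concatenate(([1.0], level))     # output starts at the +1 level

fs = 100_000.0                                # sample rate in Hz (hypothetical)
f_o = 5_000.0                                 # input frequency in Hz (hypothetical)
t = np.arange(int(fs)) / fs                   # one second of samples
x = np.sin(2 * np.pi * f_o * t + 0.3)         # unmodulated input tone
m = 4

square = divide_frequency(x, m)
spectrum = np.abs(np.fft.rfft(square))
freqs = np.fft.rfftfreq(len(square), d=1 / fs)
peak = freqs[np.argmax(spectrum[1:]) + 1]     # strongest component, skipping DC
print(peak, f_o / (2 * m))                    # fundamental expected at f_o / (2 m)
```

    With a one-second record the frequency bins are 1 Hz apart, so the odd harmonics of the square wave (for example at three times the fundamental) can be read directly from `spectrum`, which is what the 3rd-harmonic variant of the divider would select.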

    Sources of distortion to the frequency-divided FM signal in the output of such a frequency divider include any changes in the input FM frequency during the hold period and any changes to the FM signal’s zero crossings resulting from additive noise, additive interfering signals, or frequency-selective fading. Because of this, one can expect performance, relative to a perfect frequency divider, to degrade as the strengths of any and all these sources of input corruption grow. As with essentially all nonlinear signal processors, a threshold phenomenon can be expected. That is, performance can remain good as sources of corruption grow in strength, until they reach a level at which the divider fails catastrophically. In this regard, the fact that the frequency-divided sine wave at the divider output is attenuated by the factor 1/n may exacerbate the impact on the output due to the presence of corruption at the input.

    Ignoring the error \( e(t) \) in the divider output, as long as \( \tilde{c}(t) \) resides in complementary sub-bands on each side of the center frequency \( f_{o} / n \), FRESH filtering can suppress it. But the best suppression requires knowledge of the bands in which \( \tilde{c}(t) \) resides. These bands can, however, be determined from knowledge of the bands in which \( c(t) \) resides, which may or (unfortunately) may not be known.

    In conclusion, if the average power level of e(t) at the output of the frequency divider is low enough, co-channel interference can potentially be suppressed for WBFM signals. For NBFM, for which frequency division is not needed, e(t)=0 and FRESH filtering can indeed suppress co-channel interference.

    Although this is of practical interest as is, it would be more interesting if the conditions under which e(t) has low enough average power to remain below threshold could be determined. This challenge is somewhat related to the problem of threshold characterization for FM demodulators, a topic that has received significant attention in the communications systems literature of the past.

    As is, the above concept will be of little practical use until the threshold phenomenon is quantified. This quantification might well depend on the nature of the additive corruption, and this could indicate that the threshold value of the input SCR (signal-to-corruption ratio) must be determined independently for each application. Consequently, experimental determination of the threshold SCR is probably the most pragmatic approach. 

    This material has not been published as of its posting here (June 2022).

  • 11.7 A Historical Perspective on Nonlinear Systems Identification

    Content in preparation.