The increase in data acquisition by uncalibrated, heterogeneous digital sensor systems such as smartphones presents new challenges. Binary metrics are proposed for the quantification of cyber-physical signal characteristics and features, and a standardized constant-Q variation of the Gabor atom is developed for use with wavelet transforms. Two different continuous wavelet transform (CWT) reconstruction formulas are presented and tested under different signal-to-noise ratio (SNR) conditions. A sparse superposition of Nth order Gabor atoms worked well against a synthetic blast transient using the wavelet entropy and an entropy-like parametrization of the SNR as the CWT coefficient-weighting functions. The proposed methods should be well suited for sparse feature extraction and dictionary-based machine learning across multiple sensor modalities.
This paper applies the constant-Q standardized Infrasonic Energy, Nth Octave (Inferno) framework [
The transformation of diverse digital measurements into robust, scalable, and transportable representations is a prerequisite for signal detection, source localization, and machine learning applications for signature classification. The challenge at hand is to construct sparse signal representations that contain sufficient information for classification. Unambiguous classification can be elusive; measurement artifacts, unexpected signal variability, and nonstationary noise often conspire to add uncertainty to our classifiers. As will be discussed in this paper, information and uncertainty quantification can be substantially simplified when using standardized wavelets and binary metrics.
Oscillatory processes often exhibit spatial and temporal scalability and self-similarity. Although some physical processes scale linearly, many exhibit recurrent patterns that scale logarithmically and are well represented by power laws. Both linear and logarithmic scales can coexist. For example, overtones in harmonic acoustic systems are often linearly spaced in frequency, yet our sense of tone similarity is close to base 2 logarithmic (binary) octave scales. The term octave comes from the eight major notes in 12-tone musical notation, where every note frequency closely repeats with factors of two. This paper uses the terms octave and binary interchangeably to denote the base 2 geometric scaling of frequency and time. The mapping between frequency (or pitch) and time (period) is direct for continuous tones, such as musical notes, or statistically stationary oscillations like the orbits of planets. Discrete Fourier transform methods are exceptionally well suited for the interpretation of steady tonal signals with linearly spaced harmonics. The Fourier transform deconstructs oscillations with distinct recurrent time periods into a
Stable oscillators can be even more succinctly represented by a fundamental frequency
The plot thickens when temporal variability is introduced in the signal or the noise. In the first class of CW problems, temporal variability is due to nonstationary broadband or bandlimited noise. This is a chronic condition in infrasonic signal processing, where ambient noise can be coherent or incoherent across a dense sensor network [
In the second class of CW problems, temporal variability is introduced by a change in the temporal, spectral, and/or statistical properties of the signal. These changes can be due to aging, failure, motion, communication, or any other change in state. In a simple two-state problem, one may quantify the properties of the first state, the transition period between states, and the properties of the final state. In a multiple-state problem, such as with communication systems, speech, or music, the Short-Time Fourier Transform (STFT) is often used to characterize spectral variability.
If the transition period between states is faster than the characteristic time scale of the initial state, the STFT does not always provide an accurate representation of this
The zeroth class of transient problems consists of delta functions with their integrals and derivatives. Such instantaneous spikes do not exist in the natural world but can be readily constructed digitally to evaluate the impulse response of a system or represent a neuromorphic network [
The concept of a windowed sinusoid to represent a transient signal was introduced by Gabor [
The second class of transient problems overlaps with the second class of CW problems. It corresponds to transients of significant duration, which could be addressed with STFTs, wavelets, or their combination. Very often a transient is embedded in a noise field with band-limited harmonic structure, or the transient itself is a sweep, characterized by a substantial change in the fundamental frequency and its harmonic structure.
The primary differences between STFTs and wavelet transform approaches are that the STFT uses a linear period mapping and a constant time window duration, while wavelets use a geometric pseudoperiod mapping and time window durations that scale with the pseudoperiod. Whereas in the Fourier framework there is a one-to-one mapping between time and frequency, the wavelet mapping between time scale and frequency can be less evident and depends on the selected wavelet.
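The contrast between the two mappings can be sketched numerically. The snippet below is an illustration only: the sample rate, the STFT window length, the band order, the reference frequency, and the number of cycles per window are all assumed values, not parameters taken from this paper.

```python
# Assumed illustration values (not from the paper)
sample_rate_hz = 200.0
stft_window_points = 256

# STFT: linearly spaced frequency bins, one window duration shared by every bin
stft_freqs_hz = [k * sample_rate_hz / stft_window_points
                 for k in range(1, stft_window_points // 2 + 1)]
stft_window_s = stft_window_points / sample_rate_hz

# Wavelet-style analysis: base 2 geometric spacing of center frequencies,
# with a window duration proportional to the pseudoperiod 1/f
order = 3                  # assumed number of bands per octave
cycles_per_window = 8.0    # hypothetical constant number of cycles per window
ref_freq_hz = 1.0          # assumed reference frequency
wavelet_freqs_hz = [ref_freq_hz * 2.0 ** (n / order) for n in range(6 * order + 1)]
wavelet_windows_s = [cycles_per_window / f for f in wavelet_freqs_hz]

# Linear spacing has a constant difference; geometric spacing has a constant ratio
linear_steps = {round(b - a, 9) for a, b in zip(stft_freqs_hz, stft_freqs_hz[1:])}
geometric_ratios = {round(b / a, 9) for a, b in zip(wavelet_freqs_hz, wavelet_freqs_hz[1:])}
```

With these assumed values the STFT window is 1.28 s at every frequency, whereas the wavelet-style window shrinks from 8 s at 1 Hz to 0.125 s at 64 Hz.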
This paper concentrates on developing highly standardized Gabor atoms [
A Cyber-Physical System (CPS) is an algorithm-controlled computer system with physical inputs and outputs. A typical example of a mobile CPS is a smartphone with a microphone input (sound activation) that outputs a response (speech, music, or signal recognition) to a screen. Cyber-physical Measurement and Signature Intelligence (MASINT) is an emerging discipline that concentrates on phenomena transmitted through cyber-physical devices and their interconnected data networks. For smartphones and other multi-sensor mobile platforms connected to wireless networks, this includes digital noise, bit errors, and latencies internal to the device and its communication channels [
Data processed by the cyber part of CPSs are digital and represented as binary digits (bits). Although the precision of the data is initially defined by their allocated integer word size (16 bit, 24 bit, etc.), the original data may be converted into floating point equivalents when an algorithm acts on them. For example, consider sound recorded by a smartphone at the standard rate of 48,000 samples per second. A typical sound record may have 16-bit resolution, so that its dynamic range in bits is −2^{15} to 2^{15} − 1. However, one may only be interested in the lower frequency components of the raw data, so one would implement a low-pass anti-aliasing filter before decimation. Such filters often require floating point arithmetic in double precision (52 bit mantissa re IEEE 754 at the time of this writing) to reduce instability. Therefore, the precision of the resulting low-pass filtered data would exceed the specification of the original 16-bit integer input. However, the theoretical dynamic range of the system would not exceed the specification of the 16-bit integer physical input. Furthermore, data compression can be more efficient on floats than integers, which leads us to the topic of fractional bits as a measure of CPS amplitude, power, and information.
Many of the metrics we use in traditional physical and geophysical systems are inherited from the analog era. The base 10 decibel scale is a measure of power relative to a reference level, and is used extensively in telecommunications, acoustics, and electrical engineering. Let us estimate the hypothetical dynamic range of a 16-bit microphone record of a sinusoid at full scale. The peak rms amplitude would be
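The distinction between integer word size and floating point precision can be illustrated with a minimal sketch. The helper name `to_unit_float` and the two-point average used as a stand-in for a low-pass step are assumptions made for illustration; they are not the filter used in this paper.

```python
import sys

word_size_bits = 16
int_min = -(2 ** (word_size_bits - 1))     # lowest signed 16-bit value
int_max = 2 ** (word_size_bits - 1) - 1    # highest signed 16-bit value

# IEEE 754 double precision stores 52 mantissa bits (53 with the implicit leading bit)
stored_mantissa_bits = sys.float_info.mant_dig - 1

def to_unit_float(sample: int) -> float:
    """Hypothetical helper: scale a 16-bit sample to [-1, 1) as a double.
    The division is exact because the divisor is a power of two."""
    return sample / 2.0 ** (word_size_bits - 1)

# Every 16-bit integer survives the round trip through double precision
round_trip_ok = all(int(to_unit_float(s) * 2 ** 15) == s
                    for s in range(int_min, int_max + 1))

# A crude two-point average (stand-in for a low-pass step) lands between
# integer levels, so the filtered data carry fractional bits
filtered = (to_unit_float(1) + to_unit_float(2)) / 2.0
fractional_level = filtered * 2 ** 15
```

The filtered value sits at level 1.5, which cannot be stored in the original integer format even though the input range is unchanged.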
All systems have quantization and system noise, and the noise can have a positive or negative bias. This is not a noise paper; for the sake of illustration, I model the system noise as oscillating around a mean of zero and alternating between −1 and 1,
The theoretical dynamic range of the system in dB for a sinusoid recorded with a 16-bit microphone and sound card combination with a one-bit noise floor could be characterized by the ratio of the power
Another unit that is often specified is the ½ power point of the frequency response of a filter, which defines the quality factor of that filter. This is often referred to as the −3 dB point, since
Consider the communication channel capacity introduced by Shannon [
The effective SNR and therefore the detectability of a compressed pulse (such as a wavelet) is the product of the bandwidth, the signal to noise ratio, and the time duration of a signal [
Energy and Shannon entropies using the binary log are constructed for both the wavelet coefficients and SNR in
This is an algorithmic paper providing foundational methods to construct standardized Gabor wavelets within a binary framework. No materials are included or required; all the algorithms required to reproduce the results are presented, with recommendations for specific existing functions in open-source software frameworks.
Although the methods are intended to be sensor-agnostic and transportable across diverse domains, the selection of the Gabor mother wavelet does define the optimal applicability of the algorithms: the methods in this paper will work best with a transient, or a portion of a transient, that can be well represented by a superposition of Gabor wavelets. Fortunately, this covers a fairly wide range of transient signature types. The fundamental principles in this work are expandable to other wavelets as well as to four-dimensional spatiotemporal representations.
A digital time series is constructed by collecting digital measurements at discrete times separated by a nominal sample interval
In many scientific domains, such as astronomy and climatology, the sample interval may be greater than one second. Domains where the phenomena of interest change more rapidly use the equivalent metric of samples per second, referred to as the sample rate and often expressed in units of Hertz. The relationship between the sample interval
Although time is the primary discrete sampling parameter, system requirements are often provided as frequency specifications within the context of Fourier transforms. The nominal sample rate sets the maximum upper edge of the bandpass of the system; there should be negligible energy at the Nyquist frequency, which is half of the sample rate. The actual bandpass of a system is set by the low and high frequency cutoffs of a cyber-physical system, which may include the sensor response, hardware specifications, firmware and software modifications (such as anti-aliasing filtering), and data compression.
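The relations among sample interval, sample rate, and Nyquist frequency are simple enough to state in a few lines. The sketch below uses the 48,000 samples per second smartphone rate mentioned earlier; the decimation factor and the helper `is_resolvable` are assumptions introduced for illustration.

```python
sample_rate_hz = 48000.0                  # smartphone audio rate quoted in the text
sample_interval_s = 1.0 / sample_rate_hz  # sample interval is the inverse of the rate
nyquist_hz = sample_rate_hz / 2.0         # Nyquist frequency is half of the sample rate

decimation_factor = 240                   # assumed, chosen to give a 200 Hz output rate
decimated_rate_hz = sample_rate_hz / decimation_factor
decimated_nyquist_hz = decimated_rate_hz / 2.0

def is_resolvable(freq_hz: float, rate_hz: float) -> bool:
    """Hypothetical helper: True when freq_hz lies strictly below Nyquist for rate_hz."""
    return 0.0 <= freq_hz < rate_hz / 2.0
```

After the assumed decimation the Nyquist frequency drops from 24 kHz to 100 Hz, so any energy above 100 Hz must be removed by the anti-aliasing filter before the samples are discarded.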
The mapping between frequency and period is simple for a continuous wave tone; the tone period is the inverse of the tone frequency. It is not so clear for transients. Following [
Constant quality factor (
From [
Gabor [
Consider the translation and dilation of the familiar Gabor-Morlet mother wavelet
The constant-Q Gabor atoms are constrained to the discrete set of values
By quantizing constant-Q bands and the resulting wavelet scales it is possible to also discretize the uncertainty in time and frequency of the resulting analyses. Gaussian pulses in general [
Converting to physical time with
The recommended quanta for the Gabor atoms are positive integer band numbers
It may be useful to think of the binary (base 2) order
This paper recommends atom quantization using the well-established fixed order
These relations are seldom made explicit for constant-Q wavelet representations, which often leads to inadvertently creative interpretations and implementations. In traditional fractional octave bands,
The estimate for
Consider the curious case of a single oscillation in the window, where
It is possible to estimate the smallest possible universal binary scale from the Planck time, the smallest measurable time scale
Since the Planck time would be the smallest possible sample interval, the smallest oscillation that could be observed would be at the universal Nyquist period
At the other end of the timeline, the age of the universe is estimated to be 13.8 billion years, or
A third order representation (
Many software packages readily produce a Gabor-Morlet wavelet with default parameters (
Because none of these specifications correspond to standard orders, the resulting wavelets will tend to either overestimate (due to spectral leakage) or underestimate (due to spectral gaps between bands) the energy within adjacent constant-Q bands if binary center frequencies are forced, or will produce nonstandard center frequencies.
Although it is possible to quantize the constant-Q Gabor atoms using the order
The continuous wavelet transform (CWT) of a function
One advantage of the constant-Q wavelet representation is that it is possible to estimate the information content and detectability of a signal in a band by applying the same set of wavelet transforms to the signal and comparing them to the transform of a noise segment or model. Consider the definition for Shannon’s channel capacity [
The effective
Shannon’s definition of the channel capacity was intended to represent the highest theoretical transfer rate of information through an analog line. Since SNR is given in power, which is typically the square of the signal amplitude, an unscaled binary log is off by a factor of two from the original data in bits. To reconcile this definition with the original collection of a time series signal in floating point bits (fbits), I define the binary SNR to match the signal rms amplitude as well as Shannon’s units for the information rate per band
The entropy of a signal of interest can be estimated by the wavelet coefficients. A practical approach is described in [
The total energy in a given record can be estimated from
The complex probability of
The log energy entropy (lee) per coefficient can be defined by the binary logarithm
The methods presented in this paper are foundational: the intention is to use the Gabor atoms as fundamental building blocks with minimal time-frequency uncertainty and high information density. These methods are illustrated and discussed in the context of a blast pressure pulse. Consider a normalized transient wave function characteristic of an explosion. Suppose one wanted to construct a sparse wavelet representation of a blast pulse with peak energy at 6.3 Hz, corresponding to the detonation of one metric ton of TNT observed at 1 km. It is known [
The form of the amplitude-normalized source pressure function for an explosive blast [
This pulse has an associated analytic function
Note that the amplitude is not used in this exercise because in some cyber-physical systems, such as smartphones, the amplitude response of onboard sensors may not be known. However, sensor dynamic range is usually specified and available (e.g., int16, float32) and can be used for signal scaling relative to the full range or the noise.
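As a sketch of scaling relative to the full range, the snippet below expresses a signed integer sample in bits relative to full scale. The function name `bits_re_full_scale` and the convention of zero bits at full scale are assumptions made for illustration rather than definitions from this paper.

```python
import math

def bits_re_full_scale(sample: int, word_size_bits: int = 16) -> float:
    """Hypothetical helper: binary log of |sample| relative to full scale.
    Returns 0 at full scale, -1 at half scale, and -inf for a zero sample."""
    full_scale = 2.0 ** (word_size_bits - 1)
    if sample == 0:
        return float("-inf")
    return math.log2(abs(sample) / full_scale)

levels = [bits_re_full_scale(s) for s in (-32768, 16384, 1)]
```

Under this assumed convention the smallest nonzero int16 sample sits 15 bits below full scale, without any reference to the physical units of the sensor.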
The normalized pulse has zero mean (conservation of momentum) and its theoretical variance is
The complex Fourier transform
The power spectra of real digital signals are usually expressed using only the positive frequencies up to the Nyquist frequency, where the unilateral spectral density
Since the target signature corresponds to a one tonne (1000 kg) detonation, the analysis concentrates on a target frequency of 6.3 Hz [
This reindexing is much easier to do numerically than to describe algorithmically. For the purposes of illustration and demonstration, let us choose a signal frequency that exactly matches the target frequency; if this example fails there is no purpose in continuing. A sample rate of 200 Hz will be more than sufficient for this example. Gaussian noise with a standard deviation that is one bit below the signal standard deviation (factor of 1/2) is superposed, and then anti-alias filtered for all frequencies below Nyquist. The analytic function is computed numerically from the real pulse for later comparisons with the wavelet-reconstructed signal.
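The noise scaling described above can be sketched with the standard library alone. The transient below is a Gaussian-windowed cosine used as a stand-in, not the blast pulse of this paper; the record length, envelope width, and random seed are assumed, and the anti-alias filtering step is omitted for brevity.

```python
import math
import random
import statistics

sample_rate_hz = 200.0     # sample rate used in the text
target_freq_hz = 6.3       # target frequency used in the text
duration_s = 4.0           # assumed record length
num_points = int(duration_s * sample_rate_hz)
times_s = [(i - num_points // 2) / sample_rate_hz for i in range(num_points)]

# Stand-in transient (assumption): Gaussian-windowed cosine, NOT the blast pulse
envelope_s = 1.0 / target_freq_hz   # assumed envelope width
signal = [math.exp(-0.5 * (t / envelope_s) ** 2)
          * math.cos(2.0 * math.pi * target_freq_hz * t) for t in times_s]

# Noise standard deviation one bit (factor of 1/2) below the signal standard deviation
signal_std = statistics.pstdev(signal)
noise_std = signal_std / 2.0

rng = random.Random(0)     # fixed seed so the sketch is repeatable
noise = [rng.gauss(0.0, noise_std) for _ in times_s]
noisy = [s + n for s, n in zip(signal, noise)]

# Realized amplitude ratio in bits; close to 1 for this noise model
snr_bits = math.log2(signal_std / statistics.pstdev(noise))
```

Doubling the noise standard deviation in this sketch lowers `snr_bits` by one, which is the sense in which the text refers to a change of one bit.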
The CWT scalogram is computed using the complex nondimensional mother quantum wavelet of order
The only free variables are the order
After minor conditioning, the SciPy CWT function [
The wavelet-filtered reconstructed complex analytical signal can be approximated from
The reconstruction process recovers the original dimensionality of the time series but returns its Hilbert transform, so the total dimensionality may be doubled (2Mp sample points). If only the original real signal is desired, then the dimensionality is unchanged.
The next steps estimate entropy and SNR, and consider sparse signal representation. Although binary bands are adequate for characterizing this signal, and are routinely used in the discrete wavelet transform, I take advantage of the flexibility offered by the CWT and use third order bands (
The energy probability distribution is constructed from the wavelet coefficients to estimate entropy, as discussed in the previous section. The log energy entropy looks like any other scalogram and does not add much value, but the Shannon entropy plot is interesting and well scaled (
Next a noise model is constructed to build the SNR and to establish criteria for standardized and reproducible sparse signal representation. Many are the ways to characterize noise, and few of them accurately characterize nonstationary noise over brief observation windows. An incorrect noise model can penalize the signal passband and degrade the signal SNR. For the white noise model with variance that is one bit below the signal variance, the CWT of the noise (
As anticipated, the binary SNR appears much like the log energy entropy since they are both scaled by a constant value, with the former over the bandaveraged noise and the latter over the total energy. The SNR RbR, as described in the previous section, should also look very much like the entropy, except it would be zero for SNR of unity and positive for SNR > 1. The SNR RbR is shown in
One may use the CWT coefficient energy, the Shannon entropy, or the SNR RbR to test the feasibility of the sparse Gabor atom superposition. Suppose we use any of these Np scales x Mpoint time matrices to identify the peak contributions over the record, and define the complex time indexes as
Increasing the noise standard deviation by a factor of two (one bit) still permits reconstruction from superposition (
There is no end to the number of sensitivity studies that can be performed; in addition to other SNR tests, shifting the peak blast frequency away from the nominal target frequency still returned a stable reconstruction. Increasing the order beyond N = 6 only worsened the fit to the target waveform, increasing dimensionality and computational cost while decreasing reconstruction fidelity. This is expected from using a wavelet that does not match the target signature.
This paper proposes a transition to binary metrics for digital data and introduces a standardized, quantized variation of the Gabor atoms with binary bases, optimal time-frequency resolution, and clear spectral energy containment. A binary entropy-like metric for the SNR is proposed and used to extract the peak coefficients to evaluate the performance of the superposition of Gabor atoms against the more traditional CWT reconstruction. Although the immediate application is the analysis of time series data collected with cyber-physical systems such as smartphones, the methods presented in this paper should be transportable to other types of digital records and can be extended to other wavelet families.
I used a synthetic pressure pulse corresponding to the detonation of one metric ton of TNT in Gaussian noise as an example, and did not include the blast amplitude as a key parameter in order to concentrate on the entropy and SNR, which are both dimensionless scaled quantities. Observations collected close to an explosion should have brief durations and a high SNR; for short pulses it is advisable to use Gabor atoms of small order (
The methods developed have the goal of providing a tunable, standardized framework for signature feature extraction that can be used for signal classification and which should be well suited for dictionary learning [
This paper summarizes over five years of applied research and is based upon work supported in part by the Department of Energy National Nuclear Security Administration under Award Numbers DE-NA0002534 (CVT), DE-AC07-05ID14517 (MINOS), DE-NA0003920 (MTV), DE-NA0003921 (ETI), and the Air Force Research Laboratory under Agreement FA8750-18-2-0113.
The author is grateful to two anonymous reviewers who helped improve the manuscript, and extends warm aloha to A. Christe, B. Williams, S. Takazawa, and J. Tobin for their patience and perseverance in reviewing various drafts of the document. Many thanks to S. Pozzi, D. Chichester, A. Ericson, E. Lam, S. Leung, J. Carlo, A. Smith, A. Rangarajan, and J. Zeineddine for providing invaluable context.
The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
This report was prepared as an account of work sponsored by agencies of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof. The United States Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon.
This work builds on the Infrasonic Energy, Nth Octave (Inferno) framework [
In this section we generalize the constant-Q framework to the logarithmic discretization of evaluation intervals relative to a given reference scale and base. For a given reference scale
The natural base for both contemporary and quantum computers is base 2, and analysis windows with powers of two are recommended for complex computations at large scales. Many efficient algorithms are based on binary (base two) filter banks. Selecting
Note that center and band edge scales attached to a given band
The next step substantially simplifies the estimation of constant-Q bands with a minimal introduction of a 2% computational error. To the author’s knowledge, this is the first time this expression is presented (and he would be most grateful to be informed otherwise). Numerical evaluation shows that
The center frequencies and band edges, and thus the quality factor, of traditional fractional octave bands are well known and can be readily computed for all the standard bands. The primary value of the expression for
Although the center frequency is traditionally defined as the geometric mean of the band edges, the ½ power spectral points at the band edges are only symmetric around the arithmetic mean of the center frequency. The relation between the arithmetic mean
As an extension of the Inferno framework [
Different disciplines call the same things different names; many of the challenges in present-day data science are often due to divergent lexicon and the diversity of applications specific to each field. The idea of using a windowed sinusoid as a basis function for signal representation was developed in detail in Gabor’s [
The Gabor wavelet is a special case of a wavelet-modulated window ([
The Fourier transform of the mother wavelet is
The Inferno framework was developed with the introduction of multiresolution array processing in the field of infrasound. The time duration of an analysis window at a specific period is represented as
This time window generally sets the temporal resolution of the resulting data products. In the case of the STFT, the fixed-duration analysis window can be referred to as the window of integration. In other words, the integration window
The upper bandwidth of the analysis window can be set by the Nyquist frequency, which is half of the sampling frequency of the digital time series. In practice the upper bandwidth is close to one quarter of Nyquist. Although this representation is simple and tidy, it is not particularly informative. A more useful representation of window duration is the number of wavelet oscillations in the window, which can be estimated from the quality factor
The wavelet admissibility condition for this wavelet is equivalent to the zero mean, or
The canonical form for computational evaluation is:
The second btype form has a different structure
The power spectral density of the Gabor wavelet is:
Consider the decay of the spectrum relative with distance
The loss in dBs and binary bits can be expressed as
There is a loss of 3 dB, 12 dB, 27 dB, and 48 dB, and a binary power loss of ½, 2, 4.5, and 8 fbits, for integer multiples of the band-edge
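The quoted losses can be checked with a short sketch. It assumes a Gaussian power spectrum that falls to half power at one band-edge offset from the center frequency, and it takes the fbit loss as one half of the binary log of the power ratio; the function name `band_edge_loss` is hypothetical.

```python
import math

def band_edge_loss(multiple: int) -> tuple:
    """Hypothetical helper: loss at `multiple` band-edge offsets from center.
    Assumes a Gaussian power spectrum with half power at one band-edge offset,
    so the power ratio is 2**(-multiple**2)."""
    power_ratio = 2.0 ** (-(multiple ** 2))
    loss_db = -10.0 * math.log10(power_ratio)      # base 10 decibels
    loss_fbits = -0.5 * math.log2(power_ratio)     # assumed: half the binary log of power
    return loss_db, loss_fbits

losses = [band_edge_loss(m) for m in (1, 2, 3, 4)]
```

Under these assumptions the decibel loss grows as roughly 3.01 times the square of the multiple, and the fbit loss is exactly half the square of the multiple.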
It is worth considering an alternate definition for the quality factor of an oscillator. Consider the time required for the amplitude to drop to 1/e of its peak value. In the case of the Quantum wavelet this is set by the Gaussian envelope, and this particular definition is best suited for the real part of the wavelet which is symmetric about the origin. By applying this definition,
Since the wavelet is symmetric, this states that the portion of the wavelet contained within 2
Practical implementations of Gabor wavelets and their variants often have to make some compromises in the application of the wavelet duration
In other words,
Gabor introduced the time-frequency uncertainty principle in his landmark paper [
This section follows the generalized mathematical formalism of ([
The Gabor uncertainty principle constrains uncertainty to a Gabor box defined by the variance in time and frequency. It is equivalent to the Heisenberg uncertainty principle for position and momentum extended to time and frequency, or space and wavenumber. Let a one-dimensional signal of interest be represented by a wave function
The variance in the time localization of the signal as
Reference [
In the special case of the Gabor-Morlet wavelet and its Quantum spawn, where the wave function is symmetric and centered around the time shift
Consider the standard deviation for time integrated over the scaled window
For
For
Next, consider the standard deviation for time integrated over the scaled window
For
A few variations of the Gabor-Morlet wavelet are available in present-day computing environments. One of the more familiar forms of the mother wavelet used in modern computations [
This form is found in the Matlab “cmor” function as well as the Python PyWavelets [
Foster [
The reconstruction coefficients of the complex Morlet CWT return the imaginary part of the analytic signal. The complex analytic signal corresponding to the real signal
Let
The Hilbert transform of the canonical GT blast pulse is rather unwieldy, but can be evaluated from
The complex terms are awkward; fortunately, multiplication and division by zero can be readily avoided numerically by adding the smallest floating point value (float epsilon) to arguments in logarithmic computations so it is possible to evaluate the real part of the solution. Another inconvenience is the discontinuity in
Evaluating the second term yields
These deficiencies are not altogether surprising, given that integrability was not designed into the GT pulse [
Analytic signal from mathematical equation, computation with SciPy Hilbert, and the continuous wavelet transform (CWT) reconstruction. (
Wavelet reconstruction with binary bands. (
Wavelet reconstruction with 1/3 octave bands. (
Wavelet decomposition with 1/3 octave bands, with CWT amplitudes scaled by the reconstruction coefficients. (
Wavelet decomposition in order 3 binary bands, raw CWT amplitudes. (
Shannon entropy in order 3 bands from raw CWT amplitudes. (
Raw CWT of noise in 1/3 octave bands. (
SNR RbR in 1/3 octave bands. (
Superposition of largest SNR entropy coefficients per band using all twenty 1/3 octave bands. (
Superposition of largest coefficients per band within 4 bits of the peak SNR entropy. (
Quality factor




1  1.4142  2.3548 
3  4.3185  7.1907 
6  8.6514  14.4055 
12  17.3099  28.8229 
24  34.6235  57.6519 
48  69.2488  115.3067 
96  138.4984  230.6150 
^{1} Dyadic base, G = 2.
Exact and approximate quality factor
N  Q (exact)  Q (approximate)
1  1.4142  1.4142 
3  4.3185  4.2426 
6  8.6514  8.4853 
12  17.3099  16.9706 
24  34.6235  33.9411 
48  69.2488  67.8823 
96  138.4984  135.7645 
^{1} Dyadic base, G = 2.
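The exact and approximate values tabulated above can be reproduced with the sketch below. It assumes the traditional fractional octave definition, in which the band edges sit at the center frequency multiplied by G^(±1/(2N)) and the quality factor is the center frequency divided by the bandwidth. The approximate values are consistent with √2·N; that reading is inferred from the tabulated numbers, and both function names are hypothetical.

```python
import math

def quality_factor_exact(order: int, base: float = 2.0) -> float:
    """Hypothetical helper: Q for band order N, assuming band edges at
    f_c * base**(+-1/(2N)) and Q = f_c / (f_high - f_low)."""
    half_band = 1.0 / (2.0 * order)
    return 1.0 / (base ** half_band - base ** (-half_band))

def quality_factor_approx(order: int) -> float:
    """Hypothetical helper: approximation inferred from the tabulated values."""
    return math.sqrt(2.0) * order

orders = [1, 3, 6, 12, 24, 48, 96]
rows = [(n, quality_factor_exact(n), quality_factor_approx(n)) for n in orders]

# Largest relative difference between the two estimates over these orders
max_rel_error = max(abs(q_exact - q_approx) / q_exact for _, q_exact, q_approx in rows)
```

For these orders the relative difference stays just under 2%, and the two estimates coincide at N = 1.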
Approximate quality factor Q and




1  0.7071  1.6651 
2  1.4142  3.3302 
4  2.8284  6.6604 
8  5.6569  13.3209 
16  11.3137  26.6417 
32  22.6274  53.2835 
64  45.2548  106.5670 
128  90.5097  213.1340 
^{1} Dyadic base, G = 2.
Approximate quality factor




1  0.600561204  0.4246609 
2  1.201122409  0.8493218 
4  2.402244818  1.698643601 
5  3.002806022  2.123304501 
6  3.603367226  2.547965401 
8  4.804489635  3.397287201 
^{1} Dyadic base, G = 2.