The '''method of reassignment''' is a technique for
sharpening a [[time-frequency representation]] by mapping
the data to time-frequency coordinates that are nearer to
the true [[Support (mathematics)|region of support]] of the
analyzed signal. The method has been independently
introduced by several parties under various names, including
''method of reassignment'', ''remapping'', ''time-frequency reassignment'',
and ''modified moving-window method''.<ref name="hainsworth">{{Cite thesis |type=PhD |chapter=Chapter 3: Reassignment methods |title=Techniques for the Automated Analysis of Musical Audio |url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.5.9579 |last=Hainsworth |first=Stephen |year=2003 |publisher=University of Cambridge |accessdate= |docket= |oclc= }}</ref> In
the case of the [[spectrogram]] or the [[short-time Fourier transform]],
the method of reassignment sharpens blurry
time-frequency data by relocating the data according to
local estimates of instantaneous frequency and group delay.
This mapping to reassigned time-frequency coordinates is
very precise for signals that are separable in time and
frequency with respect to the analysis window.

== Introduction ==

[[Image:Reassigned spectrogral surface of bass pluck.png|thumb|400px|
Reassigned spectral surface for the onset of an acoustic bass tone
having a sharp pluck and a fundamental frequency of approximately 73.4 Hz.
Sharp spectral ridges representing the harmonics are evident, as is the
abrupt onset of the tone.
The spectrogram was computed using a 65.7 ms Kaiser window with a shaping
parameter of 12.]]

Many signals of interest have a distribution of energy that
varies in time and frequency. For example, any sound signal
having a beginning or an end has an energy distribution that
varies in time, and most sounds exhibit considerable
variation in both time and frequency over their duration.
Time-frequency representations are commonly used to analyze
or characterize such signals. They map the one-dimensional
time-domain signal into a two-dimensional function of time
and frequency. A time-frequency representation describes the
variation of spectral energy distribution over time, much as
a musical score describes the variation of musical pitch
over time.

In audio signal analysis, the spectrogram is the most
commonly used time-frequency representation, probably
because it is well understood and immune to so-called
"cross-terms" that sometimes make other time-frequency
representations difficult to interpret. But the windowing
operation required in spectrogram computation introduces an
inherent tradeoff between time resolution and frequency
resolution, so spectrograms provide a time-frequency
representation that is blurred in time, in frequency, or in
both dimensions. The method of time-frequency reassignment
is a technique for refocusing time-frequency data in a
blurred representation like the spectrogram by mapping the
data to time-frequency coordinates that are nearer to the
true region of support of the analyzed signal.

== The spectrogram as a time-frequency representation ==

One of the best-known time-frequency representations is the
spectrogram, defined as the squared magnitude of the
short-time Fourier transform. Though the short-time phase
spectrum is known to contain important temporal information
about the signal, this information is difficult to
interpret, so typically, only the short-time magnitude
spectrum is considered in short-time spectral analysis.

As a time-frequency representation, the spectrogram has
relatively poor resolution. Time and frequency resolution
are governed by the choice of analysis window, and greater
concentration in one domain is accompanied by greater
smearing in the other.

A time-frequency representation having improved resolution,
relative to the spectrogram, is the [[Wigner–Ville distribution]],
which may be interpreted as a short-time
Fourier transform with a window function that is perfectly
matched to the signal. The Wigner–Ville distribution is
highly concentrated in time and frequency, but it is also
highly nonlinear and non-local. Consequently, this
distribution is very sensitive to noise, and generates
cross-components that often mask the components of interest,
making it difficult to extract useful information concerning
the distribution of energy in multi-component signals.

[[Cohen's class distribution function|Cohen's class]] of
bilinear time-frequency representations is a class of
"smoothed" Wigner–Ville distributions, employing a smoothing
kernel that can reduce the sensitivity of the distribution to
noise and suppress cross-components, at the expense of
smearing the distribution in time and frequency. This
smearing causes the distribution to be non-zero in regions
where the true Wigner–Ville distribution shows no energy.

The spectrogram is a member of Cohen's class. It is a
smoothed Wigner–Ville distribution with the smoothing kernel
equal to the Wigner–Ville distribution of the analysis
window. The method of reassignment smooths the Wigner–Ville
distribution, but then refocuses the distribution back to
the true regions of support of the signal components. The
method has been shown to reduce time and frequency smearing
of any member of Cohen's class.<ref name = "improving">{{cite journal |author=F. Auger and P. Flandrin |date=May 1995 |title=Improving the readability of time-frequency and time-scale representations by the reassignment method |journal=IEEE Transactions on Signal Processing |volume=43 |issue=5 |pages=1068–1089 |publisher= |doi=10.1109/78.382394 |url= |accessdate= }}</ref><ref>P. Flandrin, F. Auger, and E. Chassande-Mottin, ''Time-frequency reassignment: From principles to algorithms'', in Applications in Time-Frequency Signal Processing (A. Papandreou-Suppappola, ed.), ch. 5, pp. 179–203, CRC Press, 2003.</ref>
In the case of the reassigned
spectrogram, the short-time phase spectrum is used to
correct the nominal time and frequency coordinates of the
spectral data, and map it back nearer to the true regions of
support of the analyzed signal.

== The method of reassignment ==

Pioneering work on the method of reassignment was
published by Kodera, Gendrin, and de Villedary under the
name of ''Modified Moving Window Method''.<ref>{{cite journal |author=K. Kodera, R. Gendrin, and C. de Villedary |date=Feb 1978 |title=Analysis of time-varying signals with small BT values |journal=IEEE Transactions on Acoustics, Speech and Signal Processing |volume=26 |issue=1 |pages=64–76 | publisher= |doi=10.1109/TASSP.1978.1163047 |url= |accessdate= }}</ref>
Their technique enhances the resolution in time and
frequency of the classical Moving Window Method (equivalent
to the spectrogram) by assigning to each data point a new
time-frequency coordinate that better reflects the
distribution of energy in the analyzed signal.

In the classical moving window method, a time-domain
signal, <math>x(t)</math>, is decomposed into a set of
coefficients, <math>\epsilon( t, \omega )</math>, based on a set of elementary signals, <math>h_{\omega}(t)</math>,
defined by

<center><math>
h_{\omega}(t) = h(t) e^{j \omega t}
</math></center>

where <math>h(t)</math> is a (real-valued) lowpass kernel
function, like the window function in the short-time Fourier
transform. The coefficients in this decomposition are defined by

<center><math>\begin{align}
\epsilon( t, \omega )
 &= \int x(\tau) h( t - \tau ) e^{ -j \omega \left[ \tau - t \right]} d\tau \\
 &= e^{ j \omega t} \int x(\tau) h( t - \tau ) e^{ -j \omega \tau } d\tau \\
 &= e^{ j \omega t} X(t, \omega) \\
 &= X_{t}(\omega) = M_{t}(\omega) e^{j \phi_{t}(\omega)}
\end{align}</math></center>

where <math>X(t, \omega)</math> is the short-time Fourier transform of <math>x(t)</math>,
<math>M_{t}(\omega)</math> is the magnitude, and
<math>\phi_{t}(\omega)</math> the phase, of
<math>X_{t}(\omega)</math>, the Fourier transform of the
signal <math>x(t)</math> shifted in time by <math>t</math>
and windowed by <math>h(t)</math>.

<math>x(t)</math> can be reconstructed from the moving window coefficients by

<center><math>\begin{align}
x(t) & = \iint X_{\tau}(\omega) h^{*}_{\omega}(\tau - t) d\omega d\tau \\
 & = \iint X_{\tau}(\omega) h( \tau - t ) e^{ -j \omega \left[ \tau - t \right]} d\omega d\tau \\
 &= \iint M_{\tau}(\omega) e^{j \phi_{\tau}(\omega)} h( \tau - t ) e^{ -j \omega \left[ \tau - t \right]} d\omega d\tau \\
 &= \iint M_{\tau}(\omega) h( \tau - t ) e^{ j \left[ \phi_{\tau}(\omega) - \omega \tau+ \omega t \right] } d\omega d\tau
\end{align}</math></center>

For signals having magnitude spectra,
<math>M_{\tau}(\omega)</math>, whose time variation is slow
relative to the phase variation, the maximum contribution to
the reconstruction integral comes from the vicinity of the
point <math>\tau,\omega</math> satisfying the phase
stationarity condition

<center><math>\begin{matrix}
\frac{\partial}{\partial \omega} \left[ \phi_{\tau}(\omega) - \omega \tau + \omega t\right] & = 0 \\
\frac{\partial}{\partial \tau} \left[ \phi_{\tau}(\omega) - \omega \tau + \omega t \right] & = 0
\end{matrix}</math></center>

or equivalently, around the point <math>\hat{t}, \hat{\omega}</math> defined by

<center><math>\begin{align}
\hat{t}(\tau, \omega) & = \tau - \frac{\partial \phi_{\tau}(\omega)}{\partial \omega} = - \frac{\partial \phi(\tau, \omega)}{\partial \omega} \\
\hat{\omega}(\tau, \omega) & = \frac{\partial \phi_{\tau}(\omega)}{\partial \tau} = \omega + \frac{\partial \phi(\tau, \omega)}{\partial \tau} ,
\end{align}</math></center>

where <math>\phi(\tau, \omega)</math> is the phase of the short-time Fourier transform <math>X(\tau, \omega)</math>, so that <math>\phi_{\tau}(\omega) = \omega \tau + \phi(\tau, \omega)</math>.

This phenomenon is known in such fields as optics as the
[[stationary phase approximation|principle of stationary phase]],
which states that for periodic or quasi-periodic
signals, the variation of the Fourier phase spectrum not
attributable to periodic oscillation is slow with respect to
time in the vicinity of the frequency of oscillation, and in
surrounding regions the variation is relatively rapid.
Analogously, for impulsive signals, which are concentrated in
time, the variation of the phase spectrum is slow with
respect to frequency near the time of the impulse, and in
surrounding regions the variation is relatively rapid.

In reconstruction, positive and negative contributions to
the synthesized waveform cancel, due to destructive
interference, in frequency regions of rapid phase variation.
Only regions of slow phase variation (stationary phase) will
contribute significantly to the reconstruction, and the
maximum contribution (center of gravity) occurs at the point
where the phase is changing most slowly with respect to time
and frequency.

The time-frequency coordinates thus computed are equal to
the local group delay, <math>\hat{t}_{g}(t,\omega)</math>,
and local instantaneous frequency, <math>\hat{\omega}_{i}(t,\omega)</math>, and are computed from the phase of
the short-time Fourier transform, which is normally ignored
when constructing the spectrogram. These quantities are
''local'' in the sense that they represent a windowed
and filtered signal that is localized in time and frequency,
and are not global properties of the signal under analysis.

The modified moving window method, or method of
reassignment, changes (reassigns) the point of attribution
of <math>\epsilon(t,\omega)</math> to this point of maximum
contribution <math>\hat{t}(t,\omega), \hat{\omega}(t,\omega)</math>, rather than to the point
<math>t,\omega</math> at which it is computed. This point is
sometimes called the ''center of gravity'' of the
distribution, by way of analogy to a mass distribution. This
analogy is a useful reminder that the attribution of
spectral energy to the center of gravity of its distribution
only makes sense when there is energy to attribute, so the
method of reassignment has no meaning at points where the
spectrogram is zero-valued.

== Efficient computation of reassigned times and frequencies ==

In digital signal processing, it is most common to sample
the time and frequency domains. The discrete Fourier
transform is used to compute samples <math>X(k)</math> of
the Fourier transform from samples <math>x(n)</math> of a
time domain signal. The reassignment operations proposed by
Kodera ''et al.'' cannot be applied directly to the
discrete short-time Fourier transform data, because partial
derivatives cannot be computed directly on data that is
discrete in time and frequency, and it has been suggested
that this difficulty has been the primary barrier to wider
use of the method of reassignment.

It is possible to approximate the partial derivatives using
finite differences. For example, the phase spectrum can be
evaluated at two nearby times, and the partial derivative
with respect to time approximated as the difference
between the two values divided by the time difference, as in

<center><math>\begin{matrix}
\frac{\partial \phi(t, \omega)}{\partial t} & \approx
\frac{1}{\Delta t} \left[ \phi(t + \frac{\Delta t}{2}, \omega) - \phi(t - \frac{\Delta t}{2}, \omega) \right] \\
\frac{\partial \phi(t, \omega)}{\partial \omega} & \approx
\frac{1}{\Delta \omega}
\left[ \phi(t, \omega+ \frac{\Delta \omega}{2}) - \phi(t, \omega-\frac{\Delta \omega}{2}) \right]
\end{matrix}</math></center>

For sufficiently small values of <math>\Delta t</math> and
<math>\Delta \omega</math>, and provided that the phase
difference is appropriately "unwrapped", this
finite-difference method yields good approximations to the
partial derivatives of phase, because in regions of the
spectrum in which the evolution of the phase is dominated by
rotation due to sinusoidal oscillation of a single, nearby
component, the phase is a linear function of time.

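As a rough illustration of the finite-difference approach, the following Python sketch evaluates the short-time Fourier transform at a single time-frequency point by direct summation and estimates the two phase derivatives with centered differences. The function names (<code>hann</code>, <code>stft_point</code>, <code>reassign_finite_difference</code>), the Hann window, the two-sample time step and the frequency step <code>d_omega</code> are illustrative assumptions and do not come from any published implementation. The phase difference is taken as the argument of a product of neighbouring transform values, which unwraps it correctly only when the true difference is smaller than <math>\pi</math> in magnitude.

```python
import cmath
import math


def hann(u, half):
    """Hann window h(u), nonzero for |u| <= half (an assumed choice of h)."""
    if abs(u) > half:
        return 0.0
    return 0.5 + 0.5 * math.cos(math.pi * u / (half + 1))


def stft_point(x, t, omega, half):
    """X(t, omega) = sum over tau of x[tau] * h(t - tau) * exp(-j*omega*tau)."""
    total = 0j
    for tau in range(max(0, t - half), min(len(x), t + half + 1)):
        total += x[tau] * hann(t - tau, half) * cmath.exp(-1j * omega * tau)
    return total


def phase_difference(a, b):
    """Principal value of arg(a) - arg(b), computed without explicit unwrapping."""
    return cmath.phase(a * b.conjugate())


def reassign_finite_difference(x, t, omega, half, d_omega=1e-3):
    """Estimate (t_hat, omega_hat) from centered phase differences.

    t is in samples and omega in radians per sample.
    """
    # Centered difference in time, using the frames at t - 1 and t + 1.
    dphi_dt = phase_difference(
        stft_point(x, t + 1, omega, half),
        stft_point(x, t - 1, omega, half),
    ) / 2.0
    # Centered difference in frequency, with a step of d_omega.
    dphi_domega = phase_difference(
        stft_point(x, t, omega + d_omega / 2.0, half),
        stft_point(x, t, omega - d_omega / 2.0, half),
    ) / d_omega
    t_hat = -dphi_domega          # local group delay
    omega_hat = omega + dphi_dt   # local instantaneous frequency
    return t_hat, omega_hat


# Example: a complex sinusoid at 0.5 rad/sample, analyzed slightly off-frequency.
signal = [cmath.exp(1j * 0.5 * n) for n in range(400)]
t_hat, omega_hat = reassign_finite_difference(signal, t=200, omega=0.52, half=64)
```

For this stationary sinusoid the frequency estimate should fall very close to 0.5 rad/sample, and the time estimate close to the frame center. Direct summation is used only to keep the sketch short; a practical implementation would normally evaluate the transforms with a fast Fourier transform.
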
Independently of Kodera ''et al.'', Nelson arrived at a similar method for
improving the time-frequency precision of short-time
spectral data from partial derivatives of the short-time phase
spectrum.<ref name = "crossspectral">{{cite journal |author=D. J. Nelson |date=Nov 2001 |title=Cross-spectral methods for processing speech |journal=Journal of the Acoustical Society of America |volume=110 |issue=5 |pages=2575–2592 |publisher= |doi=10.1121/1.1402616 |url= |accessdate= }}</ref>
It can be shown that Nelson's
''cross spectral surfaces'' compute an approximation of the derivatives that
is equivalent to the finite differences method.

Auger and Flandrin showed that the method of reassignment, proposed
in the context of the spectrogram by Kodera ''et al.'', could be extended to
any member of [[Cohen's class]] of time-frequency representations by generalizing the
reassignment operations to

<center><math>\begin{matrix}
\hat{t} (t,\omega) & = t -
\frac{\iint \tau \cdot W_{x}(t-\tau,\omega -\nu) \cdot \Phi(\tau,\nu) d\tau d\nu}
{\iint W_{x}(t-\tau,\omega -\nu) \cdot \Phi(\tau,\nu) d\tau d\nu } \\
\hat{\omega} (t,\omega) & = \omega -
\frac{\iint \nu \cdot W_{x}(t-\tau,\omega -\nu) \cdot \Phi(\tau,\nu) d\tau d\nu}
{\iint W_{x}(t-\tau,\omega -\nu) \cdot \Phi(\tau,\nu) d\tau d\nu}
\end{matrix}</math></center>

where <math>W_{x}(t,\omega)</math> is the Wigner–Ville
distribution of <math>x(t)</math>, and
<math>\Phi(t,\omega)</math> is the kernel function that
defines the distribution. They further described an
efficient and accurate method for computing the times and
frequencies for the reassigned spectrogram
without explicitly computing the partial derivatives of
phase.<ref name = "improving" />

In the case of the spectrogram, the reassignment operations
can be computed by

<center><math>\begin{matrix}
\hat{t} (t,\omega) & = t - \Re \Bigg\{ \frac{ X_{\mathcal{T}h}(t,\omega) \cdot X^*(t,\omega) }
{ | X(t,\omega) |^2 } \Bigg\} \\
\hat{\omega}(t,\omega) & = \omega + \Im \Bigg\{ \frac{ X_{\mathcal{D}h}(t,\omega) \cdot X^*(t,\omega) }
{ | X(t,\omega) |^2 } \Bigg\}
\end{matrix}</math></center>

where <math>X(t,\omega)</math> is the short-time Fourier
transform computed using an analysis window
<math>h(t)</math>, <math>X_{\mathcal{T}h}(t,\omega)</math>
is the short-time Fourier transform computed using a
time-weighted analysis window <math>h_{\mathcal{T}}(t) = t \cdot h(t)</math>, and
<math>X_{\mathcal{D}h}(t,\omega)</math> is the short-time
Fourier transform computed using a time-derivative analysis
window <math>h_{\mathcal{D}}(t) = \frac{d}{dt}h(t)</math>.

Using the auxiliary window functions
<math>h_{\mathcal{T}}(t)</math> and
<math>h_{\mathcal{D}}(t)</math>, the reassignment operations
can be computed at any time-frequency coordinate
<math>t,\omega</math> from an algebraic combination of three
Fourier transforms evaluated at <math>t,\omega</math>. Since
these algorithms operate only on short-time spectral
data evaluated at a single time and frequency, and do not
explicitly compute any derivatives, this gives an efficient
method of computing the reassigned discrete short-time
Fourier transform.

One constraint in this method of computation is that <math>| X(t,\omega) |^2</math> must be non-zero. This is not much of a restriction,
since the reassignment operation itself implies that there
is some energy to reassign, and has no meaning when the
distribution is zero-valued.

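The auxiliary-window computation can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: the Hann window, its analytic derivative, the direct summation in place of a fast Fourier transform, and the names <code>hann</code>, <code>hann_derivative</code> and <code>reassign_point</code> are all choices made for illustration. Time is measured in samples and frequency in radians per sample, and the function returns <code>None</code> where the spectrogram is zero-valued, since reassignment has no meaning there.

```python
import cmath
import math


def hann(u, half):
    """Assumed analysis window h(u): a Hann window, nonzero for |u| <= half."""
    if abs(u) > half:
        return 0.0
    return 0.5 + 0.5 * math.cos(math.pi * u / (half + 1))


def hann_derivative(u, half):
    """Analytic derivative dh/du of the Hann window above."""
    if abs(u) > half:
        return 0.0
    scale = math.pi / (half + 1)
    return -0.5 * scale * math.sin(scale * u)


def reassign_point(x, t, omega, half):
    """Reassigned (t_hat, omega_hat) at one point, from three transforms."""
    X = 0j      # transform with window h
    X_T = 0j    # transform with time-weighted window u * h(u)
    X_D = 0j    # transform with derivative window dh/du
    for tau in range(max(0, t - half), min(len(x), t + half + 1)):
        u = t - tau
        carrier = x[tau] * cmath.exp(-1j * omega * tau)
        X += carrier * hann(u, half)
        X_T += carrier * u * hann(u, half)
        X_D += carrier * hann_derivative(u, half)
    power = abs(X) ** 2
    if power == 0.0:
        return None  # no energy to reassign at this point
    t_hat = t - (X_T * X.conjugate()).real / power
    omega_hat = omega + (X_D * X.conjugate()).imag / power
    return t_hat, omega_hat


# Example: an impulse at sample 210, analyzed by a frame centered at sample 200.
impulse = [0.0] * 400
impulse[210] = 1.0
t_hat, omega_hat = reassign_point(impulse, t=200, omega=0.5, half=64)
```

In this example the time estimate should move from the frame center to the location of the impulse. For a sampled window the frequency correction is only an approximation to the continuous-time expression, although for a smooth window of this length the discrepancy is expected to be small.
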
==Separability==
The short-time Fourier transform can often be used to
estimate the amplitudes and phases of the individual
components in a ''multi-component'' signal, such as a
quasi-harmonic musical instrument tone. Moreover, the time
and frequency reassignment operations can be used to sharpen
the representation by attributing the spectral energy
reported by the short-time Fourier transform to the point
that is the local center of gravity of the complex energy
distribution.

For a signal consisting of a single component, the
instantaneous frequency can be estimated from the partial
derivatives of phase of any short-time Fourier transform
channel that passes the component. If the signal is to be
decomposed into many components,

<center><math>
x(t) = \sum_{n} A_{n}(t) e^{j \theta_{n}(t)}
</math></center>

and the instantaneous frequency of each component
is defined as the derivative of its phase with respect to time,
that is,

<center><math>
\omega_{n}(t) = \frac{d \theta_{n}(t)}{d t},
</math></center>

then the instantaneous frequency of each individual component
can be computed from the phase of the response of a filter that passes
that component, provided that no more than
one component lies in the passband of the filter.

This is the property, in the frequency domain, that Nelson
called ''separability''<ref name = "crossspectral" />
and is required of all signals so analyzed. If this property is not met, then
the desired multi-component decomposition cannot be achieved,
because the parameters of individual components cannot be
estimated from the short-time Fourier transform. In such
cases, a different analysis window must be chosen so that
the separability criterion is satisfied.

If the components of a signal are separable in frequency
with respect to a particular short-time spectral analysis
window, then the output of each short-time Fourier transform
filter is a filtered version of, at most, a single
dominant (having significant energy) component, and so the
derivative, with respect to time, of the phase of
<math>X(t,\omega_{0})</math> is equal to the derivative, with
respect to time, of the phase of the dominant component at
<math>\omega_{0}</math>. Therefore, if a component,
<math>x_{n}(t)</math>, having instantaneous frequency
<math>\omega_{n}(t)</math> is the dominant component in the
vicinity of <math>\omega_{0}</math>, then the instantaneous
frequency of that component can be computed from the phase
of the short-time Fourier transform evaluated at
<math>\omega_{0}</math>. That is,

<center><math>\begin{align}
\omega_{n}(t)
 &= \frac{\partial}{\partial t} \arg\{ x_{n}(t) \} \\
 &= \frac{\partial }{\partial t} \arg\{ X(t,\omega_{0}) \}
\end{align}</math></center>

[[Image:Long-window reassigned spectrogram of speech.png|thumb|400px|
Long-window reassigned spectrogram of the word "open",
computed using a 54.4 ms Kaiser window with a shaping
parameter of 9, emphasizing harmonics.]]

[[Image:Short-window reassigned spectrogram of speech.png|thumb|400px|
Short-window reassigned spectrogram of the word "open",
computed using a 13.6 ms Kaiser window with a shaping
parameter of 9, emphasizing formants and glottal pulses.]]

Just as each bandpass filter in the short-time Fourier
transform filterbank may pass at most a single complex
exponential component, two temporal events must be
sufficiently separated in time that they do not lie in the
same windowed segment of the input signal. This is the
property of separability in the time domain, and is
equivalent to requiring that the time between two events be
greater than the length of the impulse response of the
short-time Fourier transform filters, that is, the span of non-zero
samples in <math>h(t)</math>.

In general, there is an infinite number of equally valid
decompositions for a multi-component signal.
The separability property must be considered in the context of the
desired decomposition. For example, in the analysis of a speech signal,
an analysis window that is long relative to the time between glottal pulses
is sufficient to separate harmonics, but the individual
glottal pulses will be smeared, because
many pulses are covered by each window
(that is, the individual pulses are not separable, in time,
by the chosen analysis window).
An analysis window that is much shorter than the
time between glottal pulses may resolve the glottal pulses,
because no window spans
more than one pulse, but the harmonic frequencies
are smeared together, because the main lobe of the analysis window
spectrum is wider than the spacing between the harmonics
(that is, the harmonics are not separable, in frequency,
by the chosen analysis window).

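The effect of window length on separability in frequency can be illustrated with a small Python experiment. This is only a sketch under assumed parameters: two equal-amplitude complex sinusoids at 0.10 and 0.11 cycles per sample, a Hann analysis window, and the hypothetical helper <code>reassigned_frequency</code>, which applies the auxiliary-window frequency correction at a single point by direct summation. The two window half-lengths (400 and 40 samples) are chosen so that the main lobe of the window spectrum is narrower than the component spacing in one case and wider in the other.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi


def hann(u, half):
    """Assumed analysis window h(u): a Hann window, nonzero for |u| <= half."""
    return 0.5 + 0.5 * math.cos(math.pi * u / (half + 1))


def hann_derivative(u, half):
    """Analytic derivative dh/du of the Hann window above."""
    scale = math.pi / (half + 1)
    return -0.5 * scale * math.sin(scale * u)


def reassigned_frequency(x, t, omega, half):
    """omega_hat = omega + Im{X_Dh X* / |X|^2}, in radians per sample."""
    X = 0j
    X_D = 0j
    for tau in range(t - half, t + half + 1):
        carrier = x[tau] * cmath.exp(-1j * omega * tau)
        X += carrier * hann(t - tau, half)
        X_D += carrier * hann_derivative(t - tau, half)
    return omega + (X_D * X.conjugate()).imag / abs(X) ** 2


# Two components that are in phase at the frame center (sample 500).
center = 500
f1, f2 = 0.10, 0.11  # cycles per sample
signal = [
    cmath.exp(1j * TWO_PI * f1 * (n - center))
    + cmath.exp(1j * TWO_PI * f2 * (n - center))
    for n in range(1001)
]

# Reassigned frequency (cycles per sample) at the nominal frequency f1.
long_estimate = reassigned_frequency(signal, center, TWO_PI * f1, half=400) / TWO_PI
short_estimate = reassigned_frequency(signal, center, TWO_PI * f1, half=40) / TWO_PI
```

With the long window only the component at 0.10 contributes appreciably to the analyzed channel, so <code>long_estimate</code> should lie very close to 0.10. With the short window both components fall within the main lobe, and <code>short_estimate</code> should be pulled toward the second component, describing neither one.
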
== References ==
<references/>

== Further reading ==
* S. A. Fulop and K. Fitz, ''A spectrogram for the twenty-first century'', Acoustics Today, vol. 2, no. 3, pp. 26–33, 2006.
* S. A. Fulop and K. Fitz, ''Algorithms for computing the time-corrected instantaneous frequency (reassigned) spectrogram, with applications'', Journal of the Acoustical Society of America, vol. 119, pp. 360–371, Jan 2006.

== External links ==
* [http://tftb.nongnu.org/ TFTB - Time-Frequency ToolBox]
* [http://www.klingbeil.com/spear/ SPEAR - Sinusoidal Partial Editing Analysis and Resynthesis]
* [http://www.cerlsoundgroup.org/Loris/ Loris - Open-source software for sound modeling and morphing]
* [http://musicalgorithms.ewu.edu/algorithms/roughness.html SRA - A web-based research tool for spectral and roughness analysis of sound signals] (supported by a Northwest Academic Computing Consortium grant to J. Middleton, Eastern Washington University)

[[Category:Time–frequency analysis]]
[[Category:Transforms]]