AREVA

dBSONIC PX

PerceptualXplorer

The dBSONIC PerceptualXplorer is a revolutionary, state-of-the-art method for visual evaluation of noise signals and for sound design. It is a powerful tool for visual exploration, editing and resynthesis of auditory representations. With its auditory models, complicated relationships between physical quantities and perceptual quantities can be visualised ("See what you hear").
The PerceptualXplorer was developed in close cooperation with universities. It employs pioneering hearing-based visualization technology to display the "GESTALT" of a sound signal with high resolution in the time/frequency domain. To achieve this, "time and frequency contours" are extracted from the signal. The tonal components are then detected as "tracks" and the aurally redundant (and thus inaudible) information is removed. In contrast to conventional methods such as FFT or third-octave analysis, the dBSONIC auditory spectrogram provides high time resolution and high frequency resolution simultaneously.
With the Spectral Editor of the dBSONIC PerceptualXplorer the results can be intuitively edited "millisecond by millisecond" or "Hertz by Hertz". The modified signal can be re-synthesized and played back for verification and control purposes.

Explore and simulate auditory signals by means of

  • Aurally related analysis
  • Contour extraction
  • Tracking of tonal components
  • Editing and resynthesis of auditory representations

Applications

  • Research & Development in Psychoacoustics
  • Speech Analysis & Synthesis
  • Sound Quality
  • Musical Acoustics
  • Teaching & Training

Besides the time function, sounds are normally visualized with spectrograms or waterfall diagrams. A spectrogram displays the power of a sound signal, color coded, as a function of frequency and time. Instead of color, a waterfall diagram displays the power of a sound on a third axis as a function of frequency and time. For most sounds a waterfall diagram becomes quite intricate, so in dBSONIC PX sounds are visualized in spectrograms. Slices of a waterfall diagram are additionally visualized in a coupled spectrum/slice display. In contrast to conventional spectrograms, the frequency scaling in dBSONIC PX is not linear and the analysis bandwidth is not constant. Frequency scaling and analysis bandwidth are adapted to the frequency and time selectivity of the human ear, forming an aurally adequate signal representation.

In a waterfall diagram the aurally adequate signal representation of a pure sine would form a mountain range, resembling the excitation of the basilar membrane of the ear. But a pure sine signal is heard as a pure sine. A visual analogue is a single line, the ridge of the mountain range! Thus the ear performs some kind of contouring of the representation of a sound found at the stage of the basilar membrane.

In dBSONIC PX this is modeled by the extraction of maxima in each spectrum of the spectrogram, forming frequency contours, and the extraction of maxima in each filter channel of the auditory spectrogram, forming time contours. Frequency contours include the tonal components of a sound, like vowels in speech; time contours represent impulsive components, like the plosives in speech. Frequency contours and time contours can be manipulated and resynthesized, either separately or overlaid. They are excellent tools for the visual exploration and analysis of sounds, as demonstrated in many studies in the fields of sound quality, musical acoustics, speech processing and auditory scene analysis. Resynthesis of the complete contour set results in sounds nearly indistinguishable from the original. Thus the contours contain all relevant acoustical information of the original time signal.

The dBSONIC PerceptualXplorer

  • makes the aural sensations visible in auditory spectrograms (ASP)
  • models human information processing using time and frequency contours as basic elements of sensory information acquisition
  • thus forms an ideal basis for modeling higher stages of auditory processing
  • removes aurally redundant (inaudible) information
  • detects tonal components in the form of tracks
  • resynthesizes contours to confirm that only aurally relevant information was extracted
  • is Windows(TM)-based software

Display modes

  • High resolution spectrograms (256 colors or gray scale)
  • Time function overview display
  • Overlay of time and frequency contours on the auditory spectrogram
  • Coupled spectrum/slice display
  • Temporal average and spectral sum
  • Frequency in Hz and Bark
  • High resolution and flexible zoom functions

Features

  • Auditory Spectrogram (ASP) according to the human ear
  • Time Contour, Frequency Contour, Nonlinear masking, Tracks
  • Resynthesis of contours, tracks, ASP
  • Spectral Editor with undo/redo function

Advanced Auditory Analysis with the Auditory Spectrogram

The auditory analysis implemented in dBSONIC is based on a customized STFT (Short-Time Fourier Transform). The analysis bandwidth is chosen proportional to the critical bandwidth of the human ear.

Filters of 4th, 3rd, 2nd or 1st order may be selected for different applications. For example, 1st order filters result in the time windows of Terhardt's Fourier-Time-Transformation, while 4th order filters significantly improve the separation of transient events from stationary parts as well as the resynthesis quality.

Auditory Spectrogram (ASP) of the sentence "bring your problems"
FFT analysis of the sentence "bring your problems"

Smoothing

It is possible to smooth the resulting auditory spectrogram (ASP) by filtering it with a first order low-pass filter before contouring. The bandwidth of this low-pass filter can be adjusted.
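As a rough illustration of this smoothing step, the sketch below applies a first-order (one-pole) recursive low-pass filter along the time axis of a single ASP channel. The function name, the frame-rate parameter and the mapping from cutoff frequency to filter coefficient are assumptions made for this example; the filter actually used in dBSONIC PX is not specified here.

```python
import math

def smooth_channel(levels, cutoff_hz, frame_rate_hz):
    """First-order low-pass along time for one spectrogram channel.

    levels:        values of one frequency channel over time
    cutoff_hz:     adjustable bandwidth of the low-pass (assumed parameter)
    frame_rate_hz: spectrogram frames per second (assumed parameter)
    """
    # Common one-pole design: coefficient derived from the cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / frame_rate_hz)
    smoothed = []
    state = levels[0] if levels else 0.0
    for value in levels:
        # Move the filter state a fraction alpha towards the new input value.
        state += alpha * (value - state)
        smoothed.append(state)
    return smoothed

channel = [0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
print(smooth_channel(channel, cutoff_hz=50.0, frame_rate_hz=1000.0))
```

A lower cutoff spreads an isolated peak over more frames and reduces its height, which in turn reduces the number of small maxima that a later contouring stage would pick up.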

Group Delay Compensation

Like the basilar membrane, the auditory filters applied in the calculation of the auditory spectrogram (ASP) cause a delay. The auditory system compensates for this delay, so a listener perceives events at different frequencies as simultaneous although they occur at different times at the stage of the basilar membrane. The dBSONIC PerceptualXplorer likewise performs an exact delay compensation.
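The idea of delay compensation can be sketched as a per-channel time shift. The example below assumes that the delay of every channel is already known as a whole number of frames; how dBSONIC PX derives the exact delays from its filters is not described here, and the function name and data layout are hypothetical.

```python
def compensate_delay(spectrogram, delays_in_frames, fill=0.0):
    """Advance each channel by its own delay so that events line up in time.

    spectrogram:      list of channels, each a list of values over time
    delays_in_frames: non-negative integer delay per channel (assumed known)
    """
    aligned = []
    for channel, delay in zip(spectrogram, delays_in_frames):
        # Drop the first `delay` frames and pad the end to keep the length.
        shifted = list(channel[delay:]) + [fill] * min(delay, len(channel))
        aligned.append(shifted)
    return aligned

# A click that appears two frames later in the first (narrower) channel.
asp = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # delayed channel
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],  # undelayed channel
]
for row in compensate_delay(asp, [2, 0]):
    print(row)
```

After the shift the click occupies the same frame in both channels, which corresponds to the simultaneous percept described above.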

Phases

By default, the phases of the auditory spectrogram (ASP) are stored in addition to the level spectrogram. This makes it possible to resynthesize the sound from the ASP with the original phases.

Time Contours of the sentence "bring your problems"

Time and Frequency Contours

Time Contours

A maximum is detected as a time contour point if the rise in level preceding the maximum in a frequency channel exceeds a certain threshold value. The phase information of the time contour points can be saved during calculation of the auditory spectrogram.
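One possible reading of this criterion is sketched below: within a single frequency channel, a local maximum counts as a time contour point if the level has risen by more than a threshold since the lowest value after the previous maximum. The function name and this exact interpretation of "change in level" are assumptions for illustration, not the documented dBSONIC PX algorithm.

```python
def time_contour_points(levels_db, rise_threshold_db):
    """Return frame indices of local maxima preceded by a large enough rise."""
    points = []
    if len(levels_db) < 3:
        return points
    valley = levels_db[0]  # lowest level seen since the last maximum
    for i in range(1, len(levels_db) - 1):
        cur = levels_db[i]
        valley = min(valley, cur)
        is_peak = levels_db[i - 1] < cur >= levels_db[i + 1]
        if is_peak:
            if cur - valley > rise_threshold_db:
                points.append(i)
            valley = cur  # start looking for the next valley after this peak
    return points

channel_db = [20, 21, 20, 20, 45, 60, 50, 48, 49, 47]
print(time_contour_points(channel_db, rise_threshold_db=10))  # -> [5]
```

The small ripples at frames 1 and 8 are local maxima as well, but their preceding rise of 1 dB stays below the threshold, so only the onset peaking at frame 5 is kept.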

Frequency Contours of the sentence "bring your problems"

Frequency Contours

In order to be detected as a frequency contour point, the level of a spectral maximum has to exceed the neighboring levels by a certain threshold value. The phase information of the frequency contour points can be saved during calculation of the auditory spectrogram.
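A minimal sketch of such a peak-picking rule for one spectrum is given below. It assumes that "neighboring levels" means the levels a fixed number of channels (`span`, a hypothetical parameter) to either side of the maximum; the product may define the neighborhood differently.

```python
def frequency_contour_points(spectrum_db, threshold_db, span=1):
    """Return channel indices of spectral maxima that stand out from both sides."""
    points = []
    n = len(spectrum_db)
    for i in range(1, n - 1):
        cur = spectrum_db[i]
        # Only local maxima are candidates.
        if not (spectrum_db[i - 1] < cur >= spectrum_db[i + 1]):
            continue
        left = spectrum_db[max(i - span, 0)]
        right = spectrum_db[min(i + span, n - 1)]
        # The maximum must exceed both neighbors by more than the threshold.
        if cur - left > threshold_db and cur - right > threshold_db:
            points.append(i)
    return points

spectrum_db = [30, 32, 50, 33, 31, 32, 31, 40, 58, 41, 30]
print(frequency_contour_points(spectrum_db, threshold_db=6))  # -> [2, 8]
```

The shallow maximum at channel 5 is rejected because it rises only 1 dB above its neighbors, while the two pronounced peaks are kept as contour points.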

Time / Frequency Contours of the sentence "bring your problems"

Time and/or frequency contours can be combined and/or overlaid on the auditory spectrogram.

Nonlinear masking applied onto Time / Frequency Contours

Masking

Masking, including the level dependence of the upper masking slope and the threshold of hearing, can be applied to the time and frequency contours based on Terhardt's approach.

 

Frequency Tracks of the sentence "bring your problems"

Tracks

To build "tracks", a search algorithm is applied. A sound component is included in a track if it meets the following requirements:

  • The minimal length of the tonal components has to exceed a certain threshold in order to separate the signal into voiced (tonal) and unvoiced parts.
  • The frequency search range determines up to which frequency spacing neighboring contour points are linked together.
  • Temporally spaced values differing too much in level are detected as single impulsive events and not grouped together. Contour points falling into the frequency search range but differing in level by more than the level search range are therefore not concatenated.
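The three requirements above can be sketched as a simple greedy linking procedure. The code below is an illustration under assumptions: contour points are given per frame as (frequency, level) pairs, each track is extended by the nearest point in frequency, and all names are hypothetical. The search algorithm actually used by dBSONIC PX is not documented here.

```python
def build_tracks(frames, freq_range, level_range, min_length):
    """Link contour points of successive frames into tracks.

    frames: list over time; each frame is a list of (frequency, level) points.
    Returns tracks as lists of (frame_index, frequency, level).
    """
    finished, active = [], []
    for t, points in enumerate(frames):
        unused = list(points)
        still_active = []
        for track in active:
            _, last_f, last_l = track[-1]
            # Candidates must lie inside the frequency AND level search range.
            candidates = [p for p in unused
                          if abs(p[0] - last_f) <= freq_range
                          and abs(p[1] - last_l) <= level_range]
            if candidates:
                best = min(candidates, key=lambda p: abs(p[0] - last_f))
                unused.remove(best)
                track.append((t, best[0], best[1]))
                still_active.append(track)
            else:
                finished.append(track)  # the track ends here
        # Points that extend no existing track start new ones.
        still_active.extend([[(t, f, l)] for f, l in unused])
        active = still_active
    finished.extend(active)
    # Short fragments are treated as non-tonal and discarded.
    return [tr for tr in finished if len(tr) >= min_length]

frames = [
    [(5.00, 60.0)],
    [(5.05, 61.0), (12.0, 40.0)],
    [(5.10, 59.0)],
    [(5.10, 30.0)],  # same frequency, but the level jump is too large
]
for track in build_tracks(frames, freq_range=0.2, level_range=10.0, min_length=3):
    print(track)
```

With these example values only the component near 5 Bark survives: the isolated point at 12 Bark is too short, and the last point is not linked because its level differs by more than the level search range.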

Resynthesis

  • from (edited) frequency contours only
  • from (edited) time contours only
  • from (edited) frequency and time contours
  • from (edited) auditory spectrogram
  • from (edited) tracks

The minimum level above which components are included in the resynthesis can be specified, so that background noise can be eliminated. With a higher threshold, the resynthesis can be concentrated on the main components. Either the original phases derived from the auditory spectrogram are used or, alternatively, the phases are estimated by a phase heuristic. If nonlinear masking was applied, only the contours found not to be masked are used for the resynthesis.
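To make the role of the level threshold and of a phase heuristic concrete, the following sketch resynthesizes tracks by simple additive synthesis. It is an assumption-laden illustration: tracks are lists of (frame, frequency in Hz, level in dB re full scale), each frame covers `hop` samples, and the phase is simply accumulated from frame to frame instead of being taken from stored ASP phases. It does not reproduce the filterbank-based resynthesis of dBSONIC PX.

```python
import math

def resynthesize(tracks, min_level_db, hop, sample_rate, n_frames):
    """Additive resynthesis of tracks; components below min_level_db are skipped."""
    out = [0.0] * (hop * n_frames)
    for track in tracks:
        phase = 0.0  # phase heuristic: start at zero and keep the phase continuous
        for frame, freq_hz, level_db in track:
            step = 2.0 * math.pi * freq_hz / sample_rate
            if level_db < min_level_db:
                phase += step * hop  # stay continuous across the skipped frame
                continue
            amp = 10.0 ** (level_db / 20.0)
            start = frame * hop
            for n in range(hop):
                out[start + n] += amp * math.sin(phase)
                phase += step
    return out

track = [(0, 1000.0, -6.0), (1, 1000.0, -6.0), (2, 1000.0, -60.0)]
signal = resynthesize([track], min_level_db=-40.0, hop=48,
                      sample_rate=48000, n_frames=3)
print(len(signal), max(abs(v) for v in signal))
```

Raising `min_level_db` removes weak components such as the last frame of the example track, which mirrors the use of the threshold for suppressing background noise.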

Additional features that can be applied:

  • Automatic fade out of the resynthesized signal (for click suppression)
  • Compensation of the overall delay introduced by the filterbank used for the resynthesis

Graphical Editor

With the mouse an area on the screen can be marked. This selected area can be amplified or attenuated.

New sections can be added. Sections can be overlaid and removed one after another with the undo function. The edited parts affect the resynthesis.

Alternatively, single lines can be moved along the time and/or frequency axis within the spectrum.
Spectrum/slice display of an ASP

 

Display Modes and Tools

Spectrum/slice display

With the mouse or the arrow keys, points in time and frequency can be selected. The selection is marked with a crosshair in the spectrogram. The horizontal line of the crosshair corresponds to a time slice of the corresponding waterfall diagram and is displayed in the middle window, which shows level over time for the selected frequency. The vertical line of the crosshair corresponds to a single spectrum of the corresponding waterfall diagram and is displayed in the lower window, which shows level over frequency for the selected point in time.

 

 

Spectral sum and mean spectra

The mean of the whole spectrogram or of a selected portion can be displayed. The mean spectrum then replaces the single spectrum in the spectrum/slice display, and at the same time the spectral sum versus time replaces the single frequency channel versus time.
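The two derived curves can be sketched as follows. The example assumes that the spectrogram is stored as a list of frames with channel levels in dB and that averaging and summation are carried out on power values; whether dBSONIC PX averages in the power or in the level domain is not stated, so this choice and the function names are assumptions.

```python
import math

def mean_spectrum_db(frames_db):
    """Temporal average per channel, computed on power and returned in dB."""
    n_frames = len(frames_db)
    n_channels = len(frames_db[0])
    result = []
    for c in range(n_channels):
        power = sum(10.0 ** (frame[c] / 10.0) for frame in frames_db) / n_frames
        result.append(10.0 * math.log10(power))
    return result

def spectral_sum_db(frames_db):
    """Sum over all channels for each frame, computed on power and returned in dB."""
    return [10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in frame))
            for frame in frames_db]

selection = [[60.0, 40.0], [60.0, 40.0], [54.0, 40.0]]
print(mean_spectrum_db(selection))  # one value per frequency channel
print(spectral_sum_db(selection))   # one value per frame
```

Passing only the frames of a marked region gives the mean over a selected portion instead of the whole spectrogram.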

Mean spectrum and spectral sum

Zoom

Every display offers flexible and easy-to-use zoom in and out functions.

Left: ASP spectrogram of complete signal showing the two kinds of zoom functions.

Critical-Band Rate Scale

The auditory analysis uses a critical-band rate scale given in Bark whereas a linear frequency scale in Hz is used in conventional Fourier analysis. The Bark scale reflects the nonlinear frequency transformation of the human ear.

Table 1: Critical-band rate z as a function of frequency

z / Bark   1     2     3     4     5     6     7     8     9     10    11    12    13
f / Hz     100   200   300   400   510   630   770   920   1080  1270  1480  1720  2000

z / Bark   14    15    16    17    18    19    20    21    22    23     24
f / Hz     2320  2700  3150  3700  4400  5300  6400  7700  9500  12000  15500

In order to transform frequencies given in Hz into Bark the following approximation given in [1] is commonly used.

z/Bark = 13 arctan(0.76 f/kHz) + 3.5 arctan((f/7.5 kHz)^2)

However, as shown in [2], deviations of up to 0.2 Bark may occur between the transformed values and the values tabulated in [1]. For the default frequency interval of 0.05 Bark, a difference of 0.2 Bark amounts to 4 frequency channels.
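The size of this deviation can be checked numerically. The short sketch below evaluates the arctan approximation (with the argument of the second arctan squared) at the tabulated frequencies of Table 1 and expresses the largest deviation in channels of 0.05 Bark. The function and variable names are chosen for this example only.

```python
import math

# Tabulated values from Table 1: frequency in Hz at z = 1 ... 24 Bark.
TABLE_HZ = [100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
            2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500,
            12000, 15500]

def bark_arctan(f_hz):
    """Commonly used arctan approximation of the critical-band rate."""
    f_khz = f_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

# Deviation between the formula and the tabulated Bark value at each frequency.
deviations = [bark_arctan(f) - z for z, f in enumerate(TABLE_HZ, start=1)]
worst = max(abs(d) for d in deviations)
channel_spacing = 0.05  # default frequency interval in Bark

print(f"largest deviation: {worst:.2f} Bark "
      f"= {worst / channel_spacing:.1f} channels")
```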

Therefore dBSONIC PerceptualXplorer uses a more precise approximation as proposed by [8]:

z1/Bark = 26.81 * f / (1960 + f) - 0.53     (f in Hz)

for z1 < 2.0:   z = z1 * 2/2.53 + 1.06/2.53

for z1 > 20.1:  z = z1 * 1.22 - 4.422
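Written as code, the approximation with its two corrections reads as follows. This is a direct transcription of the formulas above, with f taken in Hz; the assumption made here is that z equals z1 in the range where neither correction applies, and the function name is hypothetical.

```python
def hz_to_bark(f_hz):
    """Critical-band rate in Bark for a frequency in Hz (piecewise approximation)."""
    z1 = 26.81 * f_hz / (1960.0 + f_hz) - 0.53
    if z1 < 2.0:
        # Low-frequency correction; maps f = 0 Hz to z = 0 Bark.
        return z1 * 2.0 / 2.53 + 1.06 / 2.53
    if z1 > 20.1:
        # High-frequency correction.
        return z1 * 1.22 - 4.422
    return z1  # assumed: no correction between 2.0 and 20.1 Bark

for f in (0.0, 100.0, 1000.0, 4400.0, 15500.0):
    print(f, round(hz_to_bark(f), 2))
```

Both corrections join the uncorrected curve continuously at z1 = 2.0 and z1 = 20.1, so the mapping has no jumps.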

The analysis bandwidth of the hearing system (the critical bandwidth) as a function of frequency in Hz is evaluated by the following formula:

Delta fG/Hz = 25 + 75 [1 + 1.4 (f/kHz)^2]^0.69
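As a quick worked example, the critical bandwidth formula (with the frequency term squared and the bracket raised to the power 0.69) can be evaluated like this; the function name is chosen for illustration.

```python
def critical_bandwidth_hz(f_hz):
    """Critical bandwidth in Hz at the frequency f_hz (given in Hz)."""
    f_khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69

for f in (0.0, 500.0, 1000.0, 4000.0):
    print(f, round(critical_bandwidth_hz(f), 1))
```

At low frequencies the bandwidth stays close to 100 Hz, and around 1 kHz it is roughly 160 Hz, after which it grows quickly with frequency.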

Literature

[1] Zwicker, E., Fastl, H.: Psychoacoustics - Facts and Models. Springer Verlag Berlin. 1990.

[2] Zwicker, E. and Terhardt, E.: Analytical expressions for critical-band rate and critical bandwidth as a function of frequency. J. Acoust. Soc. Am. 68, 1523, 1980.

[3] Terhardt, E.: Fourier transformation of time signals: conceptual revision, Acustica, 57: 242-256, 1985.

[4] Heinbach, W.: Aurally adequate signal representation: The Part-Tone-Time-Pattern. Acustica, 67, pp. 113-121, 1988.

[5] Terhardt, E.: Psychophysics of audio signal processing and the role of pitch in speech. In: Schouten, M. E. H., Editor, The Psychophysics of Speech Perception, pp. 271-283. M. Nijhoff Publ., Dordrecht, 1987.

[6] Terhardt, E.: From speech to language: On auditory information processing. In: Schouten, M. E. H., Editor, The Auditory Processing of Speech: from Sounds to Words, pp. 363-380. Mouton de Gruyter, Berlin, 1992.

[7] Terhardt, E.: Akustische Kommunikation. Grundlagen mit Hörbeispielen. Springer Verlag Berlin – Heidelberg 1998.

[8] Baumann, U.: Ein Verfahren zur Erkennung und Trennung multipler akustischer Objekte. Herbert Utz Verlag Wissenschaft, Dissertation, Munich, 1995.

[9] Heldmann, K.: Wahrnehmung, gehörgerechte Analyse und Merkmalsextraktion technischer Schalle. Dissertation, Munich 1994.

[10] Wartini, S.: Zur Rolle der Spektraltonhöhen und ihrer Akzentuierung bei der Wahrnehmung von Sprache. Fortschr.-Ber. VDI Reihe 10, VDI-Verlag Düsseldorf. Dissertation, Munich, 1996.

[11] Schlang, M., Mummert, M.: Die Bedeutung der Fensterfunktion für die Fourier Transformation als gehörgerechte Spektralanalyse. Fortschritte der Akustik, DAGA'90, 1043-1047, 1990.

[12] Mummert, M.: Sprachcodierung durch Konturierung eines gehörangepaßten Spektrogramms und ihre Anwendung zur Datenreduktion. Fortschr.-Ber. VDI Reihe 10, VDI-Verlag Düsseldorf. Dissertation, Munich, 1998.

[13] Horn, T.: Image processing of speech with Auditory Magnitude Spectrograms. Acustica Vol. 84, 175-177, 1998.

[14] Terhardt, E., Stoll, G., Seewann, M.: Algorithm for extraction of pitch and pitch salience from complex tonal signals. J. Acoust. Soc. Am., Vol. 71, 679-688, 1982.

[15] Daniel, P., Ellermeier, W., Leclerc, P.: Tonalness and Unpleasantness of tire sounds: methods of assessment and psychoacoustical modeling. Euro-noise 98, 627-632, 1998.

[16] Vormann, M., Weber, R.: Gehörgerechte Darstellung von instationären Umweltgeräuschen mittels Fourier-Time-Transformation (FTT). Fortschritte der Akustik, DAGA 95, 1191-1194, 1995.

[17] Heldmann, K., Keiper, W.: Analyse von instationären technischen Geräuschen. Fortschritte der Akustik, DAGA 91, 761-764, 1991.

[18] Valenzuela, M.N.: Untersuchungen und Berechnungsverfahren zur Klangqualität von Klaviertönen. Dissertation, Herbert Utz Verlag, Munich 1998.

[19] Fleischer, H.: Schwingung und Schall von Glocken. Fortschritte der Akustik, DAGA 2000, 2000.

For more information, contact us: comercial@01db.com.br

© Copyright 2007 - 01dB Brasil
