
Page 1: Digital Sound Synthesis by Physical · Digital Sound Synthesis by Physical Modelling ... Subtractive Synthesis

Digital Sound Synthesis by Physical Modelling

Rudolf Rabenstein and Lutz Trautmann Telecommunications Laboratory University of Erlangen-Nurnberg D-91058 Erlangen, Cauerstr. 7

{rabe, traut} (


After recent advances in the coding of natural speech and audio signals, the synthetic creation of musical sounds is also gaining importance. Various methods for waveform synthesis are currently used in digital instruments and software synthesizers. A family of new synthesis methods is based on physical models of vibrating structures (string, drum, etc.) rather than on descriptions of the resulting waveforms. This article describes various approaches to digital sound synthesis in general and discusses physical modelling methods in particular. Physical models in the form of partial differential equations are presented. Then it is shown how to derive discrete-time models which are suitable for real-time DSP implementation. Applications to computer music are given as examples.

1. Introduction

The last 150 years have seen tremendous advances in electrical, electronic, and digital information transmission and processing. From the very beginning, the available technology has been used not only to send written or spoken messages but also for more entertaining purposes: to make music! An early example is the Musical Telegraph of Elisha Gray in 1876, based on the telephone technology of that time. Later examples used vacuum tube oscillators throughout the first half of the last century, transistorized analog synthesizers in the 1960s, and the first digital instruments in the 1970s. By the end of the last century, digital soundcards with various methods for sound reproduction and generation were commonplace in any personal computer.

Development continues rapidly. One driving force is certainly the availability of ever more powerful hardware. Cheap memory makes it possible to store sound samples in high quality and astonishing variety. The increase in processing power makes it possible to "compute" sounds in real time.

New algorithms and more powerful software also give desktop computers the functionality of stereo equipment or sound studios. One example is the new coding schemes for high quality audio. Together with rising bitrates for file transmission on the internet, they have made digital music recordings freely available on the world wide web. Another example is the combination of high performance sound cards, high capacity and fast access hard disks, and sophisticated software for audio recording, processing and mixing. A high-end personal computer equipped with these components and programs provides the full functionality of a small home recording studio.

While more powerful hardware and software turn a single computer into a music machine, advances in standardization pave the way to networked solutions. The benefits of audio coding standards have already been mentioned. But the new MPEG-4 video and audio coding standard provides not only natural but also synthetic audio coding. This means that not only compressed samples of recorded music can be transmitted, but also digital scores similar to MIDI together with algorithms for the sound generation. Finally, the concept of Structured Audio allows an acoustic scene to be broken down into its components, which can then be transmitted and manipulated independently.

While natural audio coding is a well researched subject with widespread applications, the creation of synthetic high quality music is a topic of active development. For some time, applications have been confined to the refinement of digital musical instruments and software synthesizers. Recently, digital sound synthesis has found its way into the MPEG-4 video and audio coding standard. The most recent and maybe most interesting family of synthesis algorithms is based on physical models of vibrating structures.

This article will highlight some of the methods for digital sound synthesis with special emphasis on physical modelling. Section 2 presents a survey of synthesis methods. Two algorithms for physical modelling are described in section 3. Applications to computer music are given in section 4.



2. Digital Sound Synthesis

2.1. Overview

Four methods for the synthesis of musical sounds are presented in ascending order of modelling complexity [5]. The first method, wavetable synthesis, is based on samples of recorded sounds with little consideration of their physical nature. Spectral synthesis creates sounds from models of their time-frequency behaviour. The parameters of these models are derived from descriptions of the desired waveforms. Nonlinear synthesis makes it possible to create spectrally rich sounds with very modest complexity of the synthesis algorithms. In contrast to spectral synthesis, the parameters of these nonlinear models are not related to the produced waveforms in a straightforward way. The most advanced method, physical modelling, is based on models of the physical properties of the vibrating structure which produces the sound. Rather than imitating a waveform, these models simulate the physical behaviour of a string, drum, etc. Such simulations are numerically demanding, but modern hardware allows real-time implementations under practical conditions.

2.2. Wavetable Synthesis

The most widespread method for sound generation in digital musical instruments today is wavetable synthesis, also simply called sampling. Here, the term wavetable synthesis will be used, since sampling strictly denotes time discretization of continuous signals in the sense of signal theory.

In wavetable synthesis, recorded or synthesized musical events are stored in the internal memory and are played back on demand. Therefore wavetable synthesis does not require a parameterized sound source model. It only consists of a database of digitized musical events (the wavetable) and a set of playback tools. The musical events are typically temporal parts of single notes recorded from various instruments and at various frequencies. The musical events must be long enough to capture the attack of the real sounds as well as a portion of the sustain. Capturing the attack is necessary to reproduce the typical sound of an instrument. Recording a sufficiently long sustain period avoids a strict periodicity during playback.

The playback tools consist of various techniques for sound variation during reproduction. The most important components of this toolset are pitch shifting, looping, enveloping, and filtering. They are discussed here only briefly. See [3, chapter 8] and [5] for a more detailed treatment.

Pitch shifting makes it possible to play a wavetable at different pitches. Recording notes at all possible frequencies for all instruments of interest would require excessive memory. To avoid this situation, only a subset of the frequency range is recorded. Missing keys are reconstructed from the closest recorded frequency by pitch variation during playback. Pitch shifting is accomplished by sample rate conversion techniques. Pitch variation is only possible within the range of a few semitones without noticeable alteration of the sound characteristics (Mickey Mouse effect).
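As a minimal sketch of pitch shifting by sample rate conversion, the following Python code reads a wavetable at a different speed using linear interpolation. The function name `pitch_shift`, the 8 kHz sampling rate, and the choice of linear interpolation are assumptions made for illustration; practical instruments typically use higher-quality interpolation filters.

```python
import math

def pitch_shift(wavetable, semitones):
    """Resample a wavetable by linear interpolation to shift its pitch."""
    ratio = 2.0 ** (semitones / 12.0)   # read-out speed relative to the recording
    out = []
    pos = 0.0
    while pos < len(wavetable) - 1:
        i = int(pos)
        frac = pos - i
        # interpolate between the two neighbouring stored samples
        out.append((1.0 - frac) * wavetable[i] + frac * wavetable[i + 1])
        pos += ratio
    return out

# One second of a 220 Hz sine at an assumed 8 kHz sampling rate
fs = 8000
table = [math.sin(2 * math.pi * 220 * n / fs) for n in range(fs)]
up_octave = pitch_shift(table, 12)      # played back one octave higher
```

Reading the table faster also shortens the note, which is one reason why looping is needed in addition to pitch shifting.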

Looping stands for recursive read out of the wavetable during playback. It is applied due to memory limitations as well as length variations of the played notes. As mentioned above, only a certain period is recorded, long enough to capture the richness of the sound. This period is extended by looping to produce the required duration of the tone. Care has to be taken to avoid discontinuities at the loop boundaries.

Enveloping denotes the application of a time varying gain function on the looped wavetable. Since the typical attack-decay-sustain-release (ADSR) envelope of an instrument is destroyed by looping, it can be reconstructed or modified by enveloping.
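Looping and enveloping can be sketched together as follows. The helper names `loop_playback` and `adsr`, the piecewise-linear envelope shape, and the tiny eight-sample table are illustrative assumptions, not taken from a particular instrument.

```python
def loop_playback(wavetable, loop_start, loop_end, n_samples):
    """Play the attack once, then repeat wavetable[loop_start:loop_end]."""
    out = []
    pos = 0
    for _ in range(n_samples):
        out.append(wavetable[pos])
        pos += 1
        if pos >= loop_end:
            pos = loop_start            # jump back to the loop start
    return out

def adsr(n_samples, attack, decay, sustain_level, release):
    """Piecewise-linear ADSR gain curve; segment lengths are in samples."""
    env = []
    sustain_end = n_samples - release
    for n in range(n_samples):
        if n < attack:
            g = n / attack
        elif n < attack + decay:
            g = 1.0 - (1.0 - sustain_level) * (n - attack) / decay
        elif n < sustain_end:
            g = sustain_level
        else:
            g = sustain_level * (n_samples - n) / release
        env.append(g)
    return env

table = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
looped = loop_playback(table, loop_start=4, loop_end=8, n_samples=20)
gain = adsr(20, attack=4, decay=4, sustain_level=0.5, release=4)
note = [g * x for g, x in zip(gain, looped)]
```

In this sketch the loop points are chosen by hand; avoiding audible discontinuities at the loop boundaries generally requires matching the waveform at both ends of the loop.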

Filtering modifies the time dependent spectral content of a note as enveloping changes its amplitude. Usually recursive digital filters of low order with adjustable coefficients are used. This allows not only a better sound variability than present in the originally recorded wavetables but also time-varying effects which are not possible with acoustic instruments.

Despite these playback tools for sound alteration (and others not mentioned here), the sound variation of wavetable synthesis is limited by the recorded material. However, with the availability of cheap memory, wavetable synthesis has become popular for two reasons: low computational cost and ease of operation. More advanced synthesis techniques need more processing power and require more skill of the performing musician to fully exploit their advantages.

2.3. Spectral Synthesis

While wavetable synthesis is based on sampled waveforms in the time domain, spectral synthesis produces sounds from frequency domain models. There is a variety of methods based on a common generic signal representation: the superposition of basis functions $\psi_l(t)$ with time-varying amplitudes $F_l(t)$

$$f(t) = \sum_l F_l(t)\,\psi_l(t). \tag{1}$$

Only a short description of the main approaches is given here, based on [3, chapter 9], [5], and [2]. Practical implementations often consist of combinations of these methods.



Additive Synthesis In additive synthesis, (1) describes the superposition of sinusoids

$$f(t) = \sum_l F_l(t)\,\sin(\theta_l(t)) + n(t). \tag{2}$$

Sometimes a noise source $n(t)$ is added to account for the stochastic character which is not modelled well by sinusoids. In the simplest case, each frequency component $\theta_l(t)$ is given by a constant frequency and phase term, $\theta_l(t) = \omega_l t + \varphi_l$. In practical synthesis, the time signals in (2) are represented by samples and the synthesized sound is processed in subsequent frames. The time variation of the amplitude and the frequency of the sinusoids is taken into account by changing the values of $F_l$, $\omega_l$, and possibly $\varphi_l$ from frame to frame.
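The frame-wise evaluation of (2) can be sketched as below. This is only an illustration under simplifying assumptions: the noise term is omitted, amplitudes and frequencies are held constant within each frame, and the function name `additive` and the frame layout are hypothetical.

```python
import math

def additive(frames, frame_len, fs):
    """frames: list of frames, each a list of (amplitude, freq_hz), one pair per partial."""
    n_partials = len(frames[0])
    phase = [0.0] * n_partials          # running phase keeps each partial continuous
    out = []
    for frame in frames:
        for _ in range(frame_len):
            sample = 0.0
            for l, (amp, freq) in enumerate(frame):
                sample += amp * math.sin(phase[l])
                phase[l] += 2.0 * math.pi * freq / fs
            out.append(sample)
    return out

fs = 8000
frames = [
    [(1.0, 440.0), (0.5, 880.0)],       # first frame: two partials
    [(0.6, 440.0), (0.2, 880.0)],       # second frame: both partials decayed
]
signal = additive(frames, frame_len=80, fs=fs)
```

Accumulating the phase instead of recomputing it from the time index avoids phase jumps when a frequency changes between frames. The amplitudes still change abruptly at frame boundaries here; interpolating them across a frame would be a natural refinement.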


Subtractive Synthesis Subtractive synthesis shapes signals by taking away frequency components from a spectrally rich excitation signal. This is achieved by exciting time-varying filters with noise. This approach is closely related to filtering in wavetable synthesis. However, in subtractive synthesis, the filter input is a synthetic signal rather than a wavetable. Since harmonic tones cannot be well approximated by filtered noise, subtractive synthesis is mostly used in conjunction with other synthesis methods.
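A very small sketch of this idea is white noise passed through a time-varying one-pole lowpass filter. The filter type, the linear sweep of its coefficient, and the name `subtractive` are assumptions chosen for brevity; real subtractive synthesizers use more elaborate filters.

```python
import random

def subtractive(n_samples, a_start, a_end, seed=0):
    """Filter white noise with a one-pole lowpass whose coefficient sweeps over time."""
    rng = random.Random(seed)
    noise, out = [], []
    state = 0.0
    for n in range(n_samples):
        # filter coefficient moves linearly from a_start to a_end
        a = a_start + (a_end - a_start) * n / (n_samples - 1)
        x = rng.uniform(-1.0, 1.0)
        state = (1.0 - a) * x + a * state   # larger a removes more high frequencies
        noise.append(x)
        out.append(state)
    return noise, out

noise, shaped = subtractive(4000, a_start=0.5, a_end=0.95)
```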

Granular Synthesis In granular synthesis the basis functions $\psi_l(t)$ in (1) are chosen to be concentrated in time and frequency. These basis functions are called atoms or grains here. Building sounds from such grains is called granular synthesis. Sound grains can be obtained by various means: from windowed sine segments, from wavetables, from Gabor expansions, or with wavelet techniques.

2.4. Nonlinear Synthesis

In the previous sections linear sound synthesis methods have been described. They range from the computationally cheap wavetable synthesis with low variability to the computationally expensive additive synthesis, which gives arbitrary access to the basic components of a sound.

Using nonlinear models for sound synthesis leads to computationally cheap methods with rich spectra. The disadvantage of these methods is that the resulting time functions or spectra cannot be calculated analytically in most cases. Also, the effect of parameter changes on the timbre of the sound cannot be predicted except for very simple schemes. Nevertheless, nonlinear synthesis provides computationally low-cost synthetic sounds with a wide variety of time functions and spectra.

The simplest case of nonlinear synthesis is discussed here. Making the phase term in the sine function time dependent leads to the frequency modulation (FM) method. In its simplest form, the time function $f(t)$ is given by

$$f(t) = F(t)\,\sin(\omega_0 t + \varphi(t)). \tag{3}$$

Figure 1. Frequency Modulation

The implementation consists of at least two coupled oscillators. In (3) the carrier $\sin(\omega_0 t)$ is modulated by the time-dependent modulator $\varphi(t)$ such that the frequency becomes time-dependent with $\omega(t) = \omega_0 + \frac{\partial}{\partial t}\varphi(t)$. If the modulator is also sinusoidal with $\varphi(t) = q \sin(\omega_m t)$ as shown in Fig. 1, the resulting spectrum consists of the carrier frequency $\omega_0$ and side frequencies at $\omega_0 \pm n\omega_m$, $n \in \mathbb{N}$. The relations between the amplitudes of the discrete frequencies can be varied with the modulation index $q$. They are given by the values of the Bessel functions of order $n$ with argument $q$. Four different FM spectra for $\omega_0 = 1$ kHz and different modulator frequencies and different modulation indices $q$ are shown in Fig. 2. The spectrum for $q = 1$ has a simple rational relation between $\omega_0$ and $\omega_m$, resulting in a harmonic spectrum. Increasing the modulation index to $q = 2$ preserves the distance of the frequency lines but increases their number (top right). A slight decrease of $\omega_m$ moves the frequency components closer together and produces a non-harmonic spectrum (bottom left). Spectrally very rich sounds can be produced by combining small values of the modulation frequency $\omega_m$ with high modulation indices, as shown for $q = 8$. However, due to the dependence on only a few parameters, arbitrary spectra as in additive synthesis cannot be produced. Therefore this method cannot faithfully reproduce natural instruments. Nevertheless, FM is frequently used in synthesizers and in sound cards for personal computers, often with more than just two oscillators in a variety of different connections.
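The sideband structure described above can be checked numerically with a short sketch. It generates the FM signal of (3) with constant amplitude and a sinusoidal modulator and measures individual spectral lines with a single-bin DFT. The function names, the 8 kHz sampling rate, and the modulator frequency of 200 Hz are illustrative assumptions; the reference values in the comments are the known Bessel function values $J_0(1) \approx 0.765$ and $J_1(1) \approx 0.440$.

```python
import math

def fm_tone(f0, fm, q, fs, n_samples, amp=1.0):
    """Simple FM per (3): amp * sin(w0 t + q sin(wm t)) with constant amplitude."""
    out = []
    for n in range(n_samples):
        t = n / fs
        out.append(amp * math.sin(2 * math.pi * f0 * t
                                  + q * math.sin(2 * math.pi * fm * t)))
    return out

def line_magnitude(x, freq, fs):
    """Single-bin DFT magnitude at `freq`, scaled so that a unit sine gives 1."""
    re = sum(v * math.cos(2 * math.pi * freq * n / fs) for n, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * n / fs) for n, v in enumerate(x))
    return 2.0 * math.hypot(re, im) / len(x)

fs = 8000
x = fm_tone(f0=1000.0, fm=200.0, q=1.0, fs=fs, n_samples=fs)
carrier = line_magnitude(x, 1000.0, fs)   # close to |J_0(1)|
side1 = line_magnitude(x, 1200.0, fs)     # close to |J_1(1)|
gap = line_magnitude(x, 1100.0, fs)       # between two lines, close to zero
```

Because the carrier is an integer multiple of the modulator frequency in this example, all lines fall on multiples of 200 Hz and the spectrum is harmonic.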

2.5. Physical Modelling

Wavetable synthesis, spectral synthesis, and nonlinear synthesis are based on sound descriptions in the time and frequency domain. A family of methods called physical modelling goes one step further by directly modelling the sound production mechanism instead of the sound. Invoking the laws of acoustics and elasticity theory results in the physical description of the main vibrating structures of musical instruments by partial differential equations. Most methods are based on the wave equation, which describes wave propagation in solids and in air [17].


Figure 2. Typical FM spectra (four panels; horizontal axes: $f$ in kHz)

Finite Difference Methods The most direct approach is the discretization of the wave equation by finite difference approximations of the partial derivatives with respect to time and space. However, a faithful reproduction of the harmonic spectrum of an instrument requires small step sizes in time and space. The resulting numerical expense is considerable. The application of this approach to piano strings has, for example, been shown in [1]. A physical motivation of the space discretization is given by the mass-spring models described in [4].
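A minimal sketch of such a scheme for the undamped wave equation with fixed ends is given below. It uses the standard explicit central-difference (leapfrog) update; the function name `simulate_string`, the triangular plucked initial shape, and the grid size are assumptions for illustration and do not reproduce the piano string model of [1]. The parameter `courant` stands for the ratio $cT/h$ of the distance travelled per time step to the spatial step and must not exceed 1 for stability.

```python
def simulate_string(n_points, n_steps, courant, pluck_index):
    """Explicit finite-difference scheme for the undamped wave equation, fixed ends."""
    last = n_points - 1
    # triangular initial deflection, zero initial velocity (plucked string)
    y_prev = [0.0] * n_points
    for m in range(n_points):
        if m <= pluck_index:
            y_prev[m] = m / pluck_index
        else:
            y_prev[m] = (last - m) / (last - pluck_index)
    lam2 = courant ** 2
    # first time step from the initial deflection with zero velocity
    y = y_prev[:]
    for m in range(1, last):
        y[m] = y_prev[m] + 0.5 * lam2 * (y_prev[m + 1] - 2 * y_prev[m] + y_prev[m - 1])
    history = [y_prev, y]
    for _ in range(n_steps - 1):
        y_next = [0.0] * n_points       # end points stay at zero (fixed ends)
        for m in range(1, last):
            y_next[m] = (2 * y[m] - y_prev[m]
                         + lam2 * (y[m + 1] - 2 * y[m] + y[m - 1]))
        y_prev, y = y, y_next
        history.append(y)
    return history

history = simulate_string(n_points=21, n_steps=40, courant=1.0, pluck_index=5)
```

With `courant=1.0` and 21 grid points, the shape repeats after 40 time steps, which is the round-trip time of a wave along the string. Smaller values of `courant` remain stable but introduce numerical dispersion, one reason why fine grids are needed for accurate spectra.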

Modal Synthesis Vibrating structures can also be described in terms of their characteristic frequencies or modes and the associated decay rates. This approach allows the formulation of couplings between different substructures. Except for simple cases, the determination of the eigenmodes can only be conducted by experiments [4].
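In its simplest form, a modal description can be turned into sound by summing exponentially decaying sinusoids, one per mode. The sketch below assumes that mode frequencies, decay rates, and amplitudes are already known; the numbers used are made up for illustration, and couplings between substructures are not modelled.

```python
import math

def modal_tone(modes, fs, n_samples):
    """modes: list of (frequency_hz, decay_rate_per_second, amplitude)."""
    out = []
    for n in range(n_samples):
        t = n / fs
        out.append(sum(a * math.exp(-d * t) * math.sin(2 * math.pi * f * t)
                       for f, d, a in modes))
    return out

# Hypothetical mode data: higher modes decay faster
modes = [(220.0, 3.0, 1.0), (440.0, 6.0, 0.5), (660.0, 12.0, 0.25)]
tone = modal_tone(modes, fs=8000, n_samples=8000)
```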

Digital Waveguides A well known theoretical approach to the solution of the wave equation in one spatial dimension is the d'Alembert solution. It separates the wave propagation process into a pair of waves travelling in opposite directions without dispersion or losses. This separation is the basis of the digital waveguides described in [9], [3, chapter 10], and [5, chapter 7]. The digital model consists of a bidirectional delay line with coupling coefficients between the taps approximating losses and dispersion. The digital waveguide method has been refined by proper adjustment of the delay lines using fractional delay filters [15]. Applications to string instruments are found in [14] and to woodwind instruments in [7].

Couplings between sections with different wave impedances are modelled by scattering junctions. They approximate the partial reflections at discontinuities. Waveguide methods have also been extended to two and three spatial dimensions, however with a considerable increase in computational demand.

Physical modelling by digital waveguides is incorporated into various commercial musical instruments using appropriate models for excitation (e.g. plucked, struck, and bowed strings) and boundary conditions. Furthermore, it provides a sound basis for the creation of artificial instruments like bowed flutes.

Transfer Function Models This relatively new approach starts directly at the partial differential equation (PDE) describing the continuous vibrations in a musical instrument. It transforms the PDE with suitable functional transformations into a multidimensional (MD) transfer function model (TFM). For the time variable the Laplace transformation is used. The spatial transformation depends on the PDE and its boundary conditions. This leads to a generalized Sturm-Liouville type problem whose solutions are the eigenfunctions $K(x, \beta_\mu)$ and the eigenvalues $\beta_\mu$. They are used in the spatial transformation as transformation kernel and as spatial frequency variable [12].

The physical effects modelled by the PDE, like longitudinal and transversal oscillations, loss and dispersion, are treated analytically with this method. Moreover, the TFM explicitly takes initial and boundary conditions, as well as linear and nonlinear excitation functions, into account. The discretization of this continuous model for computer implementation, based on analog-to-discrete transformations, preserves not only the inherent stability but also the natural frequencies of the oscillating body.

All parameters of this method are strictly based on physical parameters (dimensions as well as material parameters) and the output signal is calculated analytically from these parameters.

Digital waveguides and multidimensional transfer function models are covered in more detail in section 3.

2.6. Structured Audio

The techniques described above had initially been confined to proprietary hardware and software in musical instruments or dedicated programs for the generation of digital music. Although there have been tremendous efforts in the standardization of multimedia services, they were mostly directed to the compression of natural audio and video material. Synthetic sounds were of no concern until the emergence of MPEG-4 standardization. While still advancing the coded representation of natural audiovisual scenes, MPEG-4 introduced a tool for digital sound synthesis under the name of structured audio (SA) [16, 8, 13]. The idea is not to transmit coded sounds as in natural audio but a highly parametric description of music (such as a musical score) from which the sound is synthesized at the decoder. Among other tools, structured audio provides score languages to encode the musical parameters (pitch, duration, etc.) as well as methods for sound synthesis.

To describe musical scores, the very popular MIDI standard has been included in MPEG-4 structured audio. In addition, a more advanced structured audio score language (SASL) has been created to provide enhanced control of algorithmic and wavetable synthesis.

For sound synthesis, two different methods exist as well: a programming language for musical synthesis algorithms, the structured audio orchestra language (SAOL), and a standard for the storage and transmission of wavetables, the structured audio sample bank format (SASBF).

SAOL is an object oriented programming language with special commands and variable types for real-time sound synthesis. It differs from conventional programming languages by providing three different time scales for the generation of synthetic waveforms. Each time scale has its own data type, such that variables of that type are automatically evaluated at the corresponding rate. The fastest time scale is the a-rate, which is performed at the sampling rate. A medium time scale is the control rate (k-rate) for updating envelopes and other control signals. Typical values for the k-rate are a few cycles per second. An even slower rate is the instrument rate (i-rate) for the initialization of timbre parameters. They may be updated asynchronously, e.g. at the beginning of each note. Furthermore, SAOL provides special high level sound processing commands like signal and envelope generation, parametric filtering, evaluation of MIDI data, and access to wavetables.

Since SAOL is a general purpose language, it can be used to realize any of the sound synthesis algorithms described above. Since SAOL code and sample banks are transmitted together with the score data, the synthesized sound at the decoder will be exactly the same as intended at the encoder side. This is in contrast to the reproduction of MIDI files, where the sound quality is determined by the individual instrument or wavetable of the listener's audio equipment.

Although there is no real-time structured audio encoder to date, it can be expected that synthetic audio reproduction will become an alternative to coded natural audio in the near future [lS]. It has already been demonstrated that structured audio provides the highest sound quality at very low bitrates, not attainable by natural audio coders. Of course, synthetic audio is restricted to music which can be completely described by musical scores of some kind. It cannot reproduce sounds without an underlying model, e.g. from microphone recordings. Furthermore, the available synthesis methods and sound production models have to be refined. Advances can be expected in the field of physical models, which are discussed in the following section.

3. Physical Modelling

Sound synthesis by physical modelling requires two essential steps: the description of a vibrating structure by the principles of physics and the transformation into a discrete-time, discrete-space model which is suitable for computer implementation. Each step requires certain simplifications and allows variations. These are discussed in the following sections.

3.1. Vibration Models

Deformable bodies may exhibit vibrations in various frequency ranges. The exact description of such vibrations requires breaking up the body into small volume elements. Setting up a balance between the forces of inertia and deformation for each element leads to a PDE for the deflection from the rest position. The derivation of these PDE descriptions for vibrating strings, reeds, membranes, and other elastic bodies can be found in [6, 10, 17].

Various PDE models for a vibrating string are presented as examples. The time and space coordinates are denoted by $t$ and $x$. Only one space coordinate is considered for simplicity. $y(x,t)$ denotes the deflection of the string from the rest position. Furthermore, a number of material constants and shape parameters are required, such as the Young's modulus $E$ and the density $\rho$ of the material, the cross section area $A$ and the moment of inertia $I$ of the string.

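The simplest model results for undamped longitudinal waves. It takes the form of the well-known wave equation with second order derivatives for time and space

$$-E\,\frac{\partial^2 y}{\partial x^2} + \rho\,\frac{\partial^2 y}{\partial t^2} = 0. \tag{4}$$

For sound generation, transversal waves are more important, since they transmit energy to the resonance body and the surrounding air. They are characterized by a fourth-order spatial derivative

$$EI\,\frac{\partial^4 y}{\partial x^4} + \rho A\,\frac{\partial^2 y}{\partial t^2} = 0. \tag{5}$$

Typically, a string is under strain by a certain force $F$, resulting in an additional second order term

$$EI\,\frac{\partial^4 y}{\partial x^4} - F\,\frac{\partial^2 y}{\partial x^2} + \rho A\,\frac{\partial^2 y}{\partial t^2} = 0. \tag{6}$$

Further refinement of the model by inclusion of rotational vibrations and shear effects finally leads to the Timoshenko equation from elasticity theory.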

Rather than refining the model in this direction, we extend (6) by an external force per length $f(x,t)$. Furthermore, damping is considered by additional terms with the decay variables $d_1$ and $d_3$

$$EI\,\frac{\partial^4 y}{\partial x^4} - F\,\frac{\partial^2 y}{\partial x^2} + \rho A\,\frac{\partial^2 y}{\partial t^2} + d_1\,\frac{\partial y}{\partial t} + d_3\,\frac{\partial^3 y}{\partial t\,\partial x^2} = f(x,t). \tag{7}$$

Note that for perfectly flexible ($E = 0$) or very thin ($I = 0$) strings with no damping ($d_1 = d_3 = 0$), (7) has the same structure as the wave equation (4), however with different coefficients. Its solution can be written as the superposition of a forward and a backward travelling wave (d'Alembert solution)

$$y(x,t) = y_l(x + ct) + y_r(x - ct) \tag{8}$$

where $c = \sqrt{F/(\rho A)}$ is the propagation speed of the waves. If the first term in (7) does not vanish, then the travelling waves are subject to dispersion. If the decay variables $d_1$ and $d_3$ are nonzero, then the term with $d_1$ introduces frequency independent damping and the term with $d_3$ introduces frequency dependent losses.
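The value of the propagation speed can be verified by a short calculation. The following derivation is a sketch that assumes the simplified, unforced form of (7) with $EI = 0$, $d_1 = d_3 = 0$ and $f(x,t) = 0$, and twice differentiable wave shapes $y_l$ and $y_r$; primes denote derivatives with respect to the argument.

```latex
\begin{align*}
  -F\,\frac{\partial^2 y}{\partial x^2} + \rho A\,\frac{\partial^2 y}{\partial t^2} &= 0,
  \qquad y(x,t) = y_l(x + ct) + y_r(x - ct), \\
  \frac{\partial^2 y}{\partial x^2} &= y_l''(x + ct) + y_r''(x - ct), \\
  \frac{\partial^2 y}{\partial t^2} &= c^2\,\bigl(y_l''(x + ct) + y_r''(x - ct)\bigr), \\
  \bigl(\rho A\,c^2 - F\bigr)\bigl(y_l'' + y_r''\bigr) &= 0
  \quad\Longrightarrow\quad c = \sqrt{\frac{F}{\rho A}} .
\end{align*}
```

The result holds for arbitrary wave shapes, which is why the two travelling waves propagate without changing their form in this idealized case.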

The vibration modes of a string are determined not only by the PDE but also by the boundary conditions at the ends $x = x_0$ and $x = x_1$. To solve (7) we need four boundary conditions because the highest order of spatial derivatives is four. In most musical instruments the string is fixed at the ends, as shown in Fig. 3.

Figure 3. Mechanical fixing of a string (ends at $x_0$ and $x_1$ on the $x$ axis)

The boundary conditions for this situation require that the deflection (9) and the skewness (10) at these points are zero [6] (see Fig. 4)

$$y(x_0,t) = 0, \qquad y(x_1,t) = 0, \tag{9}$$

$$y''(x_0,t) = 0, \qquad y''(x_1,t) = 0. \tag{10}$$

The double prime denotes the second order spatial derivative $y'' = \partial^2 y/\partial x^2$.

Elastic fixing at the ends of the string or interface conditions to the sound board are described in the same way, e.g. by prescribing a certain linear combination of $y(x_0,t)$ and $y''(x_0,t)$. The boundary conditions can also include an excitation function at the boundary, as occurs in woodwind instruments.

Typical excitation modes for musical instruments are plucking or striking the string. These modes are expressed in mathematical terms as initial conditions of the PDE. Because the highest time derivative in (7) is two, we need two initial conditions: one for the initial value of the deflection and one for its time derivative.

Figure 4. Boundary conditions for a string fixed at both ends

The plucked string is characterized by a given deflection profile for $t = 0$, while the time derivative is zero [6]. The struck string is given by the first order time derivative, while the deflection is zero at $t = 0$. The corresponding initial conditions are

$$y(x,0) = y_{i0}(x), \qquad \dot{y}(x,0) = 0 \qquad \text{(plucked string)},$$

$$y(x,0) = 0, \qquad \dot{y}(x,0) = y_{i1}(x) \qquad \text{(struck string)}.$$

The dot denotes the time derivative $\dot{y}(x,t) = \partial y/\partial t$. The initial profile of a string plucked close to $x_1$ is shown in Fig. 5, while Fig. 6 shows the initial velocity of a string struck by a hammer at a given position. In general, both $y_{i0}(x)$ and $y_{i1}(x)$ can be specified independently of each other.

Figure 5. Initial conditions for a plucked string

Figure 6. Initial conditions for a struck string

The PDE descriptions presented in this section constitute highly accurate physical models of strings and other vibrating structures. Extensions to two and three space dimensions are straightforward with a more general definition of the spatial differentiation operators. However, a computer implementation of these models requires suitable discretization schemes for the time and space coordinates. Two approaches of practical importance are the digital waveguide method and the functional transformation method. They are described below in detail.



3.2. Digital Waveguide Method

The digital waveguide method is based on the analogy between wave propagation mechanisms in elastic bodies, reeds and electromagnetic waveguides and their counterparts in digital delay lines. Its application to computer music has been presented e.g. in [9], [3, chapter 10], [5, chapter 7] and [15, 14, 7].

The simplest vibration model is the wave equation, according to (4) for longitudinal waves or by simplification of (7) for transversal waves. A representation of the corresponding travelling wave solution (8) by a continuous waveguide is shown in Fig. 7.

Figure 7. Continuous waveguide

It can be transformed into an equivalent discrete structure by sampling the solution $y(x,t)$ on a space-time grid with $x = mh$ and $t = kT$, where $m$ and $k$ are the discrete space and time coordinates and $h$ and $T$ are the corresponding step sizes. When the spatial step size $h$ is set equal to the distance that a wave with propagation speed $c$ travels during the time step size $T$, i.e. $h = cT$, then the sampled waves $y[m,k] = y(mh,kT)$ are related by

$$y[m,k] = y_l[m,k] + y_r[m,k], \qquad y_l[m,k] = y_l[m+1,k-1], \qquad y_r[m,k] = y_r[m-1,k-1],$$

where $y_l[m,k] = y_l(mh + ckT)$ and $y_r[m,k] = y_r(mh - ckT)$ follow from (8).

The spatial shifts by the distance $h$ are realized by sampling the continuous waveguide in $x$-direction and the time shifts are realized by delay elements ($z^{-1}$). The resulting dual delay line structure of a digital waveguide is shown in Fig. 8.

Figure 8. Digital waveguide

It is capable of reproducing the travelling wave solution of the wave equation, but it does not consider loss or dispersion from more detailed vibration models such as (7). These effects can be approximated by including additional filter elements $H(z)$ in the delay lines (see Fig. 9). These elements may consist of

• a multiplication to model frequency independent damping,

• a selective filter to model frequency dependent damping, or

• an allpass to model frequency dependent delay.

Figure 9. Digital waveguide with loss and dispersion filters

Boundary conditions are considered by a proper termination of the dual delay line waveguide. These are also realized by digital filters, as shown in Fig. 10. Similar to the filters $H(z)$ for loss and dispersion, the boundary reflection filters $R_l(z)$ and $R_r(z)$ represent

• a phase shift for an open or closed line termination,

• a real constant $0 < r < 1$ as frequency independent reflection factor, or

• a digital filter for frequency dependent reflections.

Figure 10. Digital waveguide with boundary reflection filters

The double delay line waveguide structure in Fig. 10 gives a complete picture of how wave propagation, loss, dispersion, and reflection at the boundaries can be approximated. In addition, initial conditions are realized by the initial values of the delay elements for $k = 0$.
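The dual delay line structure can be sketched in a few lines of Python. This is an illustrative simplification, not the structure of Fig. 10 in full: the loss filters $H(z)$ are reduced to a single gain factor per delay element, the reflection filters $R_l(z)$ and $R_r(z)$ to real constants, and dispersion is ignored. The names `waveguide_step` and `simulate` are hypothetical. The terminations sit half a spatial step beyond the first and last element.

```python
def waveguide_step(right, left, r_left, r_right, gain):
    """Advance both delay lines by one sample; returns the new (right, left) lines."""
    out_right = right[-1]       # wave arriving at the right termination
    out_left = left[0]          # wave arriving at the left termination
    # right-going wave moves towards larger indices, fed by the left reflection
    new_right = [r_left * out_left] + [gain * v for v in right[:-1]]
    # left-going wave moves towards smaller indices, fed by the right reflection
    new_left = [gain * v for v in left[1:]] + [r_right * out_right]
    return new_right, new_left

def simulate(initial, n_steps, r_left=-1.0, r_right=-1.0, gain=1.0):
    """Initial deflection with zero velocity: split equally between both directions."""
    right = [0.5 * v for v in initial]
    left = [0.5 * v for v in initial]
    shapes = []
    for _ in range(n_steps):
        shapes.append([a + b for a, b in zip(right, left)])   # physical deflection
        right, left = waveguide_step(right, left, r_left, r_right, gain)
    return shapes

initial = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
ideal = simulate(initial, 17)                 # lossless, inverting reflections
lossy = simulate(initial, 17, gain=0.99)      # frequency independent damping
```

With inverting reflections at both ends and no loss, the deflection repeats after 16 steps, twice the length of the delay lines, which corresponds to one period of the fundamental.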

On the other hand, a practical implementation under real-time constraints calls for some simplifications. First, all delay elements can be combined into a single delay line represented by a multiple delay element of $2M$ samples. Then the various filters $H(z)$ at each delay element, as well as the boundary reflection filters $R_l(z)$ and $R_r(z)$, are combined into a smaller number of different filters. In practical implementations, their transfer functions are not derived directly from $H(z)$, $R_l(z)$, and $R_r(z)$. Instead they are designed to produce a certain waveform. It has turned out that three different filters are adequate to model the correct pitch, dispersion and frequency dependent damping [14]. Fig. 11 shows the resulting arrangement of a single delay line and three digital filters.

Figure 11. Efficient realization of a digital waveguide with excitation function $f(t)$ and output $y(t)$

The functions of these filters are

$H_{fd}(z)$: fractional delay filter to produce the exact pitch,

$H_{disp}(z)$: dispersion filter for deviations from pure wave propagation,

$H_{TP}(z)$: lowpass filter to model frequency dependent damping.

They are assumed to be orthogonal in the sense that they can be designed independently from each other.
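The single-delay-line arrangement of Fig. 11 can be sketched in a few lines. The following is only a minimal illustration under stated assumptions, not the filter designs of [14]: the lowpass is a two-point average scaled by a loss factor, the fractional delay filter is a first-order allpass, the dispersion filter is omitted (treated as identity), and the excitation is a noise burst stored as the initial state of the delay line. The function name `waveguide_string` and all parameter names are hypothetical.

```python
import math
import random


def waveguide_string(f0, fs=44100.0, n_samples=44100, loss=0.996, seed=0):
    """Single delay line closed by a lowpass and a fractional delay allpass."""
    loop_delay = fs / f0          # total loop delay in samples for pitch f0
    lp_delay = 0.5                # low-frequency phase delay of the two-point average
    # Integer part goes into the delay line, the rest into the allpass.
    n_int = int(math.floor(loop_delay - lp_delay - 0.1))
    frac = loop_delay - lp_delay - n_int      # lies in [0.1, 1.1)
    c = (1.0 - frac) / (1.0 + frac)           # first-order allpass coefficient

    rng = random.Random(seed)
    # Assumption: the excitation is a noise burst used as initial delay line state.
    delay = [rng.uniform(-1.0, 1.0) for _ in range(n_int)]

    lp_prev = 0.0      # previous lowpass input
    ap_x_prev = 0.0    # previous allpass input
    ap_y_prev = 0.0    # previous allpass output
    out = []
    idx = 0
    for _ in range(n_samples):
        x = delay[idx]                           # sample leaving the delay line
        lp = loss * 0.5 * (x + lp_prev)          # lowpass: frequency dependent damping
        lp_prev = x
        ap = c * lp + ap_x_prev - c * ap_y_prev  # allpass: fractional delay for pitch
        ap_x_prev, ap_y_prev = lp, ap
        delay[idx] = ap                          # feed back into the delay line
        idx = (idx + 1) % n_int
        out.append(x)
    return out


if __name__ == "__main__":
    samples = waveguide_string(220.0, n_samples=30000)
    print(len(samples), "samples generated")
```

In this sketch the pitch is set by the delay line length together with the allpass, while the decay is set by `loss` and the averaging lowpass, which reflects the idea that the filters can be tuned independently.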

3.3. Functional Transformation Method

The functional transformation method derives a discrete model of the vibrating body from its multidimensional transfer function description. Such an approach is well-known from the design of digital filters in the one-dimensional case. It starts from the description of a continuous-time electrical network by an ordinary differential equation. Application of the Laplace transformation turns the differential equation into a transfer function. Suitable discretization schemes, such as the impulse-invariant transformation, convert the transfer function of the continuous-time network into the transfer function of a discrete-time system, which is suitable for computer implementation. It is worthwhile to review the reasons why the Laplace transformation is well-suited for the derivation of transfer functions from ordinary differential equations:

1. Laplace transformation turns the time derivatives into multiplications with algebraic functions of the complex frequency variable.

2. Laplace transformation turns the initial conditions of a differential equation into additive terms.

By virtue of these properties, the differential equation with initial conditions is converted into an algebraic equation. Solving this equation for the Laplace transform of the output quantity yields the transfer function of the network.
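As a brief illustration of these two properties, consider an undamped second-order system. This example is chosen here for clarity; the symbols $\omega_0$, $y_0$, and $v_0$ are introduced only for this purpose.

```latex
\begin{align*}
\ddot{y}(t) + \omega_0^2\, y(t) &= 0, \qquad y(0) = y_0, \quad \dot{y}(0) = v_0 \\
\mathcal{L}\{\ddot{y}(t)\} &= s^2 Y(s) - s\, y_0 - v_0 \\
s^2 Y(s) + \omega_0^2\, Y(s) &= s\, y_0 + v_0 \\
Y(s) &= \frac{s\, y_0 + v_0}{s^2 + \omega_0^2}
\end{align*}
```

The second-order time derivative appears as the factor $s^2$, and the initial values appear as additive terms on the right-hand side.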

The same approach can also be applied to multidimensional systems described by PDEs. Again, the Laplace transformation can be applied with respect to the time variable. However, the result will still contain derivatives with respect to space. Now assume that a transformation exists for the space variable with properties similar to those of the Laplace transformation for the time variable. Then application of this transformation would turn the boundary-value problem into an algebraic equation.

This approach relies on the existence of a transformation with respect to space with differentiation properties similar to the Laplace transformation. It has been shown how such transformations can be obtained for the PDEs presented in section 3.1 and many others [12, 11].

Then the following four-step procedure can be applied to derive a discrete model for vibrating bodies from a PDE model:

1. Application of the Laplace transformation with respect to time removes the time derivatives and turns the initial-boundary-value problem into a boundary-value problem for the space variable.

2. Application of a suitable transformation for the space variable removes the spatial derivatives and turns the boundary-value problem into an algebraic equation.

3. Solution of the algebraic equation for the transform of the solution of the PDE. The resulting MD transfer function is the frequency domain equivalent of the initial continuous-time, continuous-space PDE description.

4. Discretization of the MD transfer function to obtain a discrete-time, discrete-space model of the vibrating body.

This procedure is now demonstrated by a simple PDE model already considered in section 3.2. Then extensions to other PDE models and other types of boundary conditions will be discussed.

The derivation of a transfer function model of a vibrating string is presented by a simple example. We consider the wave equation for a string with fixed ends (boundary conditions (9)) and with initial conditions according to (11, 12).

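$$
\begin{aligned}
\ddot{y}(x,t) - c^2 y''(x,t) &= 0, & x_0 < x < x_1, \\
y(x,0) &= y_{i0}(x), & x_0 < x < x_1, \\
\dot{y}(x,0) &= y_{i1}(x), & x_0 < x < x_1, \\
y(x_0,t) &= 0, & x = x_0, \\
y(x_1,t) &= 0, & x = x_1.
\end{aligned}
\tag{14}
$$

$\ddot{y}(x,t)$ and $y''(x,t)$ denote second order derivatives with respect to time and space, respectively.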

Laplace transformation with respect to the time variable, $Y(x,s) = \mathcal{L}\{y(x,t)\}$, turns this initial-boundary-value problem into a boundary-value problem for the space variable $x$

$$
\begin{aligned}
s^2 Y(x,s) - c^2 Y''(x,s) &= s\, y_{i0}(x) + y_{i1}(x), \\
Y(x_0,s) &= 0, \\
Y(x_1,s) &= 0.
\end{aligned}
\tag{15}
$$

Note that the second order time derivative has turned into a multiplication with $s^2$ and that the initial values from (14) appear as additive terms on the right hand side.

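To remove the spatial derivative as well and to account for the boundary conditions, we apply the spatial transformation

$$
\mathcal{T}\{Y(x,s)\} = \bar{Y}(\beta_\mu, s) = \int_{x_0}^{x_1} Y(x,s)\, K(\beta_\mu, x)\, dx \tag{16}
$$

with the transformation kernel

$$
K(\beta_\mu, x) = \sin\big(\beta_\mu (x - x_0)\big) \tag{17}
$$

and the discrete spatial frequency

$$
\beta_\mu = \mu\, \frac{\pi}{x_1 - x_0}, \qquad \mu \in \mathbb{N}. \tag{18}
$$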

This special form of the spatial transformation (finite sine transformation) has been chosen because the transformation kernel from (17) fulfills the same boundary conditions as the deflection $y(x,t)$ of the string (compare (14)):

$$
K(\beta_\mu, x_0) = 0, \tag{19}
$$

$$
K(\beta_\mu, x_1) = 0. \tag{20}
$$

The transformation kernel $K(\beta_\mu, x)$ represents the spatial eigenfunction of the string. In other words, the frequency domain quantities $\bar{Y}(\beta_\mu)$ represent the amplitudes of the corresponding eigenfunctions. Conversely, the inverse transformation constitutes an expansion of the deflection $y(x)$ in terms of the eigenfunctions $K(\beta_\mu, x)$

$$
y(x) = \mathcal{T}^{-1}\{\bar{Y}(\beta_\mu)\} = \sum_{\mu} \frac{1}{N_\mu}\, \bar{Y}(\beta_\mu)\, K(\beta_\mu, x) \tag{21}
$$

with suitable normalization factors $N_\mu$. The shape of the first three eigenfunctions is shown in Fig. 12.

Figure 12. Shape of the eigenfunctions $K(\beta_\mu, x)$, $\mu = 1, 2, 3$

Using the conditions (19) and (20) and integration by parts, we can show the differentiation property of the transformation $\mathcal{T}$

$$
\mathcal{T}\{Y''(x)\} = \int_{x_0}^{x_1} Y''(x)\, K(\beta_\mu, x)\, dx = -\beta_\mu^2\, \bar{Y}(\beta_\mu). \tag{22}
$$
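The integration by parts behind this property can be written out as follows. This is a sketch that assumes the sine kernel from (17), so that $K'' = -\beta_\mu^2 K$, and that uses the primes to denote derivatives with respect to $x$. The boundary terms vanish because the kernel is zero at both ends by (19) and (20) and $Y$ is zero at both ends by (15).

```latex
\begin{align*}
\int_{x_0}^{x_1} Y''(x)\, K(\beta_\mu, x)\, dx
  &= \Big[\, Y'(x)\, K(\beta_\mu, x) - Y(x)\, K'(\beta_\mu, x) \Big]_{x_0}^{x_1}
     + \int_{x_0}^{x_1} Y(x)\, K''(\beta_\mu, x)\, dx \\
  &= 0 - \beta_\mu^2 \int_{x_0}^{x_1} Y(x)\, K(\beta_\mu, x)\, dx
   = -\beta_\mu^2\, \bar{Y}(\beta_\mu)
\end{align*}
```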

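Application of (16) and (22) now turns the boundary-value problem (15) into an algebraic equation

$$
s^2\, \bar{Y}(\beta_\mu, s) + c^2 \beta_\mu^2\, \bar{Y}(\beta_\mu, s) = s\, \bar{y}_{i0}(\beta_\mu) + \bar{y}_{i1}(\beta_\mu). \tag{23}
$$

It is straightforward to solve (23) for the transform of the solution $\bar{Y}(\beta_\mu, s) = \mathcal{T}\{\mathcal{L}\{y(x,t)\}\}$:

$$
\bar{Y}(\beta_\mu, s) = \frac{s\, \bar{y}_{i0}(\beta_\mu) + \bar{y}_{i1}(\beta_\mu)}{s^2 + c^2 \beta_\mu^2}. \tag{24}
$$

This result is the desired transfer function model. It can also be written in the form

$$
\bar{Y}(\beta_\mu, s) = \frac{s}{s^2 + c^2 \beta_\mu^2}\, \bar{y}_{i0}(\beta_\mu) + \frac{1}{s^2 + c^2 \beta_\mu^2}\, \bar{y}_{i1}(\beta_\mu). \tag{25}
$$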

These transfer functions describe the string in the same way as the initial-boundary-value problem (14). However, in contrast to the original PDE model, they provide a convenient transition to a discrete-time, discrete-space model.

At first, we note that the spatial frequency $\beta_\mu$ is a discrete variable. Thus it is sufficient to discretize the time variable. This is accomplished by any analog-to-discrete transformation, e.g. the impulse-, step-, or ramp-invariant transformation or the bilinear transformation. Since the initial values may be seen as the result of an impulse function, the impulse-invariant transformation provides optimal results. It turns the second order transfer functions into

$$
\frac{s}{s^2 + c^2 \beta_\mu^2} \;\longrightarrow\; \frac{z^2 - z \cos(\omega_\mu T)}{z^2 - 2z \cos(\omega_\mu T) + 1} \tag{26}
$$

$$
\frac{1}{s^2 + c^2 \beta_\mu^2} \;\longrightarrow\; \frac{z \sin(\omega_\mu T)/\omega_\mu}{z^2 - 2z \cos(\omega_\mu T) + 1} \tag{27}
$$

with $\omega_\mu = c \beta_\mu$. Then the discrete-time transfer function model takes the form

$$
\bar{Y}_d(\beta_\mu, z) = \frac{\big(z^2 - z \cos(\omega_\mu T)\big)\, \bar{y}_{i0}(\beta_\mu) + \big(z \sin(\omega_\mu T)/\omega_\mu\big)\, \bar{y}_{i1}(\beta_\mu)}{z^2 - 2z \cos(\omega_\mu T) + 1}. \tag{28}
$$

Inverse z-transformation finally gives, for each value of the discrete spatial frequency index $\mu$, one difference equation for the discrete time variable $k$

$$
\begin{aligned}
\bar{y}_d(\beta_\mu, k) ={}& 2 \cos(\omega_\mu T)\, \bar{y}_d(\beta_\mu, k-1) - \bar{y}_d(\beta_\mu, k-2) + \bar{y}_{i0}(\beta_\mu)\, \gamma_0(k) \\
&+ \left( \frac{\sin(\omega_\mu T)}{\omega_\mu}\, \bar{y}_{i1}(\beta_\mu) - \cos(\omega_\mu T)\, \bar{y}_{i0}(\beta_\mu) \right) \gamma_0(k-1)
\end{aligned}
\tag{29}
$$

where $\gamma_0(k)$ denotes the discrete-time impulse sequence. The structure of this difference equation is shown in Fig. 13. The spatial transforms of the initial value profiles act as inputs for the first time step $k = 0$. The second order recursive system computes the time history of the eigenfunction with frequency $\omega_\mu$.

Figure 13. Second order difference equation

Since the simplest model, the lossless wave equation, has been assumed, the second order system in Fig. 13 shows no decay and would ring forever. Furthermore, there exists such a second order system for each value of $\mu$, and the final output has to be recovered by the inverse spatial transformation (21) from all partial results $\bar{y}_d(\beta_\mu, k)$ in the audible frequency range. Fig. 14 shows this situation for the more general string model which contains loss terms. Different from Fig. 13, each second order system now exhibits a decaying oscillation. The individual decay rate for each frequency is determined by the coefficient $c_\mu$. The partial results from each recursive system are weighted with the values of the eigenfunctions $K(\beta_\mu, x_a)$ at a certain listening position $x_a$ along the string. The final result $y_d(x_a, k)$ represents the sampled oscillation of a string element at position $x_a$.

Figure 14. Parallel arrangement of recursive systems

Although based on many simplifications, this example has shown that the functional transformation method (FTM) provides an exact and systematic way from the PDE description of a vibrating string to a discrete model suitable for computer implementation. The coefficients of the discrete model are expressed directly by the parameters of the physical model.

Extensions of the FTM in many directions have been presented in [12, 11] and other literature cited there. The main topics are briefly discussed:

• The PDEs in section 3.1 exhibit more complex differentiation operators in time and space than the wave equation presented above. Higher order differential operators with respect to time simply introduce higher order polynomials in $s$ into the transfer functions, resulting in recursive systems of higher order.

• Higher order spatial operators require a careful construction of the spatial transformation $\mathcal{T}$. The suitable theoretical framework for this task is the theory of special boundary-value problems of the Sturm-Liouville type.

• A more realistic treatment of the fixing of a string or other vibrating bodies also requires boundary conditions of the second or third kind. This is also possible in the context of the Sturm-Liouville theory mentioned above.

• So far, only problems with one spatial dimension have been presented for simplicity. The extension to two or three dimensions poses no fundamental difficulty. An example for two spatial dimensions is given in [11].
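The lossless string model of this section, i.e. one second order recursion (29) per spatial frequency followed by the weighted summation of the inverse transformation (21), can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the finite sine transform of the initial profiles is evaluated numerically with the midpoint rule, the normalization factor is taken as $N_\mu = (x_1 - x_0)/2$ (the integral of the squared sine kernel over the string), and the names `sine_transform` and `ftm_string` are hypothetical. The number of modes should be chosen so that all $\omega_\mu$ stay below half the sampling frequency.

```python
import math


def sine_transform(profile, x0, x1, mu, n_grid=2000):
    """Finite sine transform of profile(x) for index mu (midpoint rule)."""
    length = x1 - x0
    beta = mu * math.pi / length
    dx = length / n_grid
    total = 0.0
    for i in range(n_grid):
        x = x0 + (i + 0.5) * dx
        total += profile(x) * math.sin(beta * (x - x0))
    return total * dx


def ftm_string(y_i0, y_i1, c, x0, x1, x_a, fs, n_samples, n_modes):
    """Sum of second order recursions (29), observed at position x_a."""
    T = 1.0 / fs
    length = x1 - x0
    norm = length / 2.0            # assumed normalization factor N_mu
    out = [0.0] * n_samples
    for mu in range(1, n_modes + 1):
        beta = mu * math.pi / length
        omega = c * beta
        a0 = sine_transform(y_i0, x0, x1, mu)   # transformed initial deflection
        a1 = sine_transform(y_i1, x0, x1, mu)   # transformed initial velocity
        cos_w = math.cos(omega * T)
        in0 = a0                                             # weight of gamma_0(k)
        in1 = math.sin(omega * T) / omega * a1 - cos_w * a0  # weight of gamma_0(k-1)
        weight = math.sin(beta * (x_a - x0)) / norm          # K(beta_mu, x_a) / N_mu
        y1 = y2 = 0.0
        for k in range(n_samples):
            y = 2.0 * cos_w * y1 - y2
            if k == 0:
                y += in0
            elif k == 1:
                y += in1
            out[k] += weight * y
            y2, y1 = y1, y
    return out


if __name__ == "__main__":
    L, h = 0.65, 0.005             # string length and pluck height (illustrative)
    p = 0.25 * L                   # pluck position

    def pluck(x):
        return h * x / p if x < p else h * (L - x) / (L - p)

    def rest(x):
        return 0.0

    wave_speed = 2.0 * L * 110.0   # gives a fundamental of 110 Hz
    y = ftm_string(pluck, rest, wave_speed, 0.0, L, 0.3 * L, 44000.0, 401, 60)
    print(y[0], y[400])
```

Because every coefficient is computed directly from $c$, $x_0$, $x_1$, and the initial profiles, changing a physical parameter changes the sound without any separate filter design step.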

4. Applications to Computer Music

After a presentation of sound synthesis methods by physical modelling, we show some applications to computer music. The focus is on the functional transformation method (FTM) because it is the most flexible and most accurate physical modelling method.

4.1. Modelling of Musical Instruments

The parallel arrangement of recursive systems shown in Fig. 14 is the core of a number of musical instrument models. They all share the same physical model of a vibrating string, but they differ in the kind of excitation. The simplest one is a string plucked with a certain force profile. The other models, for a bowed string and for a string struck by a hammer, employ different nonlinear excitation models. Finally, an FTM model of a drum is presented.

Plucked String. The simplest way to model a plucked string is to choose the initial value $y_{i0}(x)$ according to Fig. 5 and to set $y_{i1}(x)$ to zero. A more advanced model uses a certain time and space profile for the excitation by a suitable force per length $f(x,t)$ as in (7). Including this excitation model in the discrete-time system from Fig. 14 results in an excitation of the inputs $a(\mu)$, while the inputs $b(\mu)$ remain unaffected. This simple but versatile excitation model is shown in Fig. 15.

Figure 15. FTM model of a plucked string

Bowed String. The action of the bow on a string is not only time dependent but also depends on the velocity of the string. It can be described as a nonlinear stick and slip action between bow and string. It can be realized with the feedback structure shown in Fig. 16. The input variable is the bow velocity.

Figure 16. FTM model of a bowed string

Hammered String. To model a real hammer-string interaction, the dynamics of the hammer have to be taken into account. The hammer deflection can be modeled by one second order recursive system. The input force for this recursive system is the negative input force for the recursive systems of the string. The hammer interacts nonlinearly with the string because of the nonlinearity of the force-deflection law of the hammer felt. The input variable is here the initial hammer velocity $v_h$. The algorithm is shown in Fig. 17. The nonlinear operation includes a delay for computability.

Figure 17. FTM model of a hammered string

Vibrations of a Drum. The extension of the above string models to membranes leads to a spatial transformation with two-dimensional eigenfunctions. A result from [11] shows the vibrations of a circular drum excited with a drum stick at different points. As is well-known among drummers, an excitation closer to the boundary produces a more interesting sound than an excitation in the center.

Figure 18. Vibrations of a circular drum with excitation in the center and close to the boundary

4.2. Musical Instrument Morphing

So far, we have assumed that the physical properties of the instrument models do not change with time. This is a reasonable constraint for most real instruments. However, for a virtual instrument, the model parameters are also at the disposal of the player. The FTM described above also permits sound variations of the following kind: during operation of an instrument, its physical parameters are slowly changed from one set of parameters to another. As a consequence, the timbre of the instrument changes gradually, e.g. from a guitar string to a xylophone. Of course, unusual combinations of material parameters that cannot appear in real instruments are also possible. This deliberate change of the sound characteristic of a virtual instrument is called instrument morphing. It requires close control over the physical parameters of the model, as provided by the FTM.
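The idea of slowly changing parameters during operation can be illustrated with a single decaying second order recursive system whose coefficients are recomputed for every sample. This is only a sketch under simplifying assumptions: the frequency and decay rate are interpolated linearly and stand in for the mapping from physical parameters to coefficients, which is not reproduced here. The function name `morphing_resonator` and its parameters are hypothetical.

```python
import math


def morphing_resonator(f_start, f_end, sigma_start, sigma_end, fs, n_samples):
    """Impulse response of one resonator while its parameters are morphed."""
    T = 1.0 / fs
    y1 = y2 = 0.0
    out = []
    for k in range(n_samples):
        # Morphing position, running from 0 (start set) to 1 (end set).
        m = k / (n_samples - 1) if n_samples > 1 else 0.0
        f = (1.0 - m) * f_start + m * f_end               # frequency in Hz
        sigma = (1.0 - m) * sigma_start + m * sigma_end   # decay rate in 1/s
        r = math.exp(-sigma * T)                          # pole radius, below one
        y = 2.0 * r * math.cos(2.0 * math.pi * f * T) * y1 - r * r * y2
        if k == 0:
            y += 1.0                                      # unit impulse excitation
        out.append(y)
        y2, y1 = y1, y
    return out


if __name__ == "__main__":
    tone = morphing_resonator(220.0, 330.0, 2.0, 8.0, 44100.0, 44100)
    print(len(tone), "samples generated")
```

A complete morphing instrument would apply the same per-sample update to every recursive system of the parallel arrangement in Fig. 14.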

5. Conclusions

Digital sound synthesis is an emerging application for multimedia processing. With ever increasing computing power, real-time implementation of demanding physical models has become feasible. The advantage of physical modelling over conventional sound reproduction or synthesis methods lies in the combination of highly flexible and at the same time physically correct models. The high flexibility allows the player of a virtual instrument to control all parameters of the model during operation, while the physical correctness ensures stable operation and meaningful results with all parameter variations.

Future developments are expected in different directions. The complexity of the models for strings, membranes, bells, tubes, and other objects will certainly increase. Furthermore, the interactions between different kinds of models for different components of an instrument have to be established and implemented. Finally, the control of the player over the virtual instrument will be extended by new interfaces based on human gestures.



References

[1] A. Chaigne and A. Askenfelt. Numerical simulations of piano strings. I. A physical model for a struck string using finite difference methods. J. Acoust. Soc. Am., 95(2):1112-1118, 1994.

[2] M. Goodwin and M. Vetterli. Time-frequency signal models for music analysis, transformation, and synthesis. In Proc. IEEE Int. Symp. on Time-Frequency and Time-Scale Analysis, pages 133-136, 1996.

[3] M. Kahrs and K. Brandenburg, editors. Applications of Digital Signal Processing to Audio and Acoustics. Kluwer Academic Publishers, Boston, 1998.

[4] G. D. Poli, A. Piccialli, and C. Roads. Representation of Musical Signals. MIT Press, Cambridge, Mass., 1991.

[5] C. Roads, S. Pope, A. Piccialli, and G. D. Poli, editors. Musical Signal Processing. Swets & Zeitlinger, Lisse, 1997.

[6] T. Rossing and N. Fletcher. Principles of Vibration and Sound. Springer, New York, 1995.

[7] G. P. Scavone. Digital waveguide modeling of the non-linear excitation of single reed woodwind instruments. In Proc. Int. Computer Music Conference, 1995.

[8] E. D. Scheirer, R. Väänänen, and J. Huopaniemi. AudioBIFS: Describing audio scenes with the MPEG-4 multimedia standard. IEEE Transactions on Multimedia, 1(3):237-250, September 1999.

[9] J. O. Smith. Physical modeling using digital waveguides. Computer Music Journal, 16(4):74-91, 1992.

[10] M. Tohyama, H. Suzuki, and Y. Ando. The Nature and Technology of Acoustic Space. Academic Press, London, 1995.

[11] L. Trautmann, S. Petrausch, and R. Rabenstein. Physical modeling of drums by transfer function methods. In Proc. Int. Conf. Acoustics, Speech, and Signal Proc. (ICASSP'01). IEEE, 2001.

[12] L. Trautmann and R. Rabenstein. Digital sound synthesis based on transfer function models. In Proc. Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). IEEE, 1999.

[13] R. Väänänen. Synthetic audio tools in MPEG-4 standard. In Proc. 108th AES Convention. Audio Engineering Society, February 2000. Preprint 5080.

[14] V. Välimäki, J. Huopaniemi, and M. Karjalainen. Physical modeling of plucked string instruments with application to real-time sound synthesis. Journal Audio Engineering Soc., 44(5):331-353, 1996.

[15] V. Välimäki and T. Takala. Virtual musical instruments - natural sound using physical models. Organised Sound, 1(2):75-86, 1996.

[16] B. L. Vercoe, W. G. Gardner, and E. D. Scheirer. Structured audio: Creation, transmission, and rendering of parametric sound representations. Proc. of the IEEE, 86(5):922-940, 1998.

[17] L. J. Ziomek. Fundamentals of Acoustic Field Theory and Space-Time Signal Processing. CRC Press, Boca Raton, 1995.

[18] G. Zoia and C. Alberti. An audio virtual DSP for multimedia frameworks. In Proc. Int. Conf. Acoustics, Speech, and Signal Proc. (ICASSP'01), 2001.

