Digital Signal Processing through Speech, Hearing, and Python
Mel Chua
PyCon 2013

This tutorial was designed to be run on a free pythonanywhere.com Python 2.7 terminal. If you want to run the code directly on your machine, you'll need Python 2.7.x, numpy, scipy, and matplotlib. Either way, you'll need a .wav file to play with (preferably 1-2 seconds long).
Digital signal processing through speech, hearing, and Python
Slides from PyCon 2013 tutorial reformatted for self-study. Code at https://github.com/mchua/pycon-sigproc, original description follows: Why do pianos sound different from guitars? How can we visualize how deafness affects a child's speech? These are signal processing questions, traditionally tackled only by upper-level engineering students with MATLAB and differential equations; we're going to do it with algebra and basic Python skills. Based on a signal processing class for audiology graduate students, taught by a deaf musician.
Agenda
● Introduction
● Fourier transforms, spectrums, and spectrograms
● Playtime!
● SANITY BREAK
● Nyquist, sampling and aliasing
● Noise and filtering it
● (if time permits) formants, vocoding, shifting, etc.
● Recap: so, what did we do?
What's signal processing?
● Usually an upper-level undergraduate engineering class
Your turn: Challenge 1
● That first signal we made? Make a wav of it.
● Hint: you may need to generate more samples.
● Bonus: the flute played a B (494Hz) – generate a single sinusoid of that.
● Megabonus: add the flute and sinusoid signals and play them together
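A minimal sketch of the bonus with plain numpy (the sample rate, duration, and amplitude here are assumptions, not from the workshop code):

```python
import numpy as np

samplerate = 44100            # samples per second
duration = 2.0                # seconds -- more samples than one period!
t = np.arange(int(samplerate * duration)) / float(samplerate)

# the flute's B: a 494Hz pure tone, scaled up so it is audible as int16
b_note = np.int16(20000 * np.sin(2 * np.pi * 494 * t))
```

The scaling by 20000 matters; a bare sine wave has amplitude 1, which comes up again after the break.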
Your turn: Challenge 2
● Record some sounds on your computer
● Do an FFT on it
● Plot the spectrum
● Plot the spectrogram
● Bonus: add the flute and your sinusoid and plot their spectrum and spectrogram together – what's the x scale?
● Bonus: what's the difference between fft/rfft?
● Bonus: numpy vs scipy fft libraries?
● Bonus: try the same sound at different frequencies (example: vowels)
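The FFT/spectrum part of this challenge can be sketched like so, using a synthetic tone as a stand-in for a recording (the 440Hz tone and sample rate are assumptions; for the spectrogram, matplotlib's plt.specgram(audio, Fs=samplerate) is the usual tool):

```python
import numpy as np

samplerate = 44100
t = np.arange(samplerate) / float(samplerate)   # one second of sample times
audio = np.sin(2 * np.pi * 440 * t)             # stand-in for a recording

# rfft returns only the non-negative half of the spectrum, which is all
# you need for real-valued input; fft returns the mirrored full spectrum
spectrum = np.fft.rfft(audio)
freqs = np.fft.rfftfreq(len(audio), d=1.0 / samplerate)

peak = freqs[np.argmax(np.abs(spectrum))]
print(peak)  # 440.0 -- the spectrum peaks at the tone's frequency
```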
Sanity break?
Come back in 20 minutes, OR: stay for a demo of the wave library (aka “why we're using scipy”)
note: wavlibraryexample.py contains the wave library demo (which we didn't get to in the actual workshop)
Things people found during break
Problem #1: When trying to generate a pure-tone (sine wave) .wav file, the sound is not audible.
Underlying reason: The amplitude of a sine wave is 1, which is really, really tiny. Compare that to the amplitude of the data you get when you read in the flute.wav file – over 20,000.
Solution: Amplify your sine wave by multiplying it by a large number (20,000 is good) before writing it to the .wav file.
More things people found
Problem #2: The sine wave is audible in the .wav file, but sounds like white noise rather than a pure tone.
Underlying reason: scipy.io.wavfile.write() expects an int16 datatype, and you may be giving it a float instead.
Solution: Coerce your data to int16 (see next slide).
Coercing to int16
# option 1: rewrite the makewav function
# so it includes type coercion
def savewav(data, outfile, samplerate):
    out_data = array(data, dtype=int16)
    scipy.io.wavfile.write(outfile, samplerate, out_data)
# option 2: generate the sine wave as int16
# which allows you to use the original makewav function
def makesinwav(freq, amplitude, sampling_freq, \
        num_samples):
    return array(amplitude * \
        sin(2 * pi * freq * arange(num_samples) / sampling_freq), \
        dtype=int16)
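Putting the two fixes together, a self-contained usage sketch (scipy.io.wavfile.write is the real API; the filename and tone parameters are placeholders):

```python
import scipy.io.wavfile
from numpy import arange, array, int16, pi, sin

def makesinwav(freq, amplitude, sampling_freq, num_samples):
    # generate directly as int16 so wavfile.write gets the type it expects
    return array(amplitude *
                 sin(2 * pi * freq * arange(num_samples) / sampling_freq),
                 dtype=int16)

# one second of an audible (amplitude 20000) 440Hz tone
data = makesinwav(440, 20000, 44100, 44100)
scipy.io.wavfile.write('sine440.wav', 44100, data)
```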
Agenda
● Introduction
● Fourier transforms, spectrums, and spectrograms
● Playtime!
● SANITY BREAK
● Nyquist, sampling and aliasing
● Noise and filtering it
● (if time permits) formants, vocoding, shifting, etc.
● Recap: so, what did we do?
Nyquist: sampling and aliasing
● The sample rate matters.
● Higher is better.
● There is a tradeoff.
If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.
Nyquist-Shannon sampling theorem (haiku version)
lowest sample rate
for sound with highest freq F
equals 2 times F
Let's explore the effects of sample rate. When you listen to these .wav files, note that doubling/halving the sample rate moves the sound up/down an octave, respectively.
audio = getwavdata('flute.wav')
makewav(audio, 'fluteagain44100.wav', 44100)
makewav(audio, 'fluteagain22000.wav', 22000)
makewav(audio, 'fluteagain88200.wav', 88200)
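getwavdata and makewav are helpers from the workshop repo; a minimal sketch of what they might look like on top of scipy.io.wavfile:

```python
import scipy.io.wavfile

def getwavdata(infile):
    """Read a .wav file and return just its samples (dropping the rate)."""
    samplerate, data = scipy.io.wavfile.read(infile)
    return data

def makewav(data, outfile, samplerate):
    """Write samples out as a .wav at the given sample rate."""
    scipy.io.wavfile.write(outfile, samplerate, data)
```

Writing the same samples at a different rate doesn't change the data at all; it just tells the player to step through it faster or slower, which is exactly the octave shift you hear.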
Your turn
● Take some of your signals from earlier
● Try out different sample rates and see what happens
● Hint: this is easier with simple sinusoids at first
● Hint: determine the highest frequency in your signal, double it (that's your minimum sampling rate, the Nyquist rate) and try sampling above, below, and at that sampling frequency
● What do you find?
What do aliases alias at?
● They reflect around the Nyquist frequency (half the sampling frequency)
● Example: 40kHz sampling frequency
● Implies 20kHz Nyquist frequency
● So if we try to play a 23kHz frequency...
● ...it'll sound like 17kHz.
Your turn: make this happen with pure sinusoids
Bonus: with non-pure sinusoids
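A sketch of the pure-sinusoid case with numpy (the 40kHz/23kHz numbers come from the slide; the one-second length is an assumption):

```python
import numpy as np

fs = 40000                                          # 40kHz sampling frequency
n = np.arange(fs)                                   # one second of samples
tone = np.sin(2 * np.pi * 23000 * n / float(fs))    # try to make a 23kHz tone

# where did the energy actually land?
spectrum = np.abs(np.fft.rfft(tone))
freqs = np.fft.rfftfreq(len(tone), d=1.0 / fs)
print(freqs[np.argmax(spectrum)])  # 17000.0 -- reflected around 20kHz Nyquist
```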
Agenda
● Introduction
● Fourier transforms, spectrums, and spectrograms
● Playtime!
● SANITY BREAK
● Nyquist, sampling and aliasing
● Noise and filtering it
● (if time permits) formants, vocoding, shifting, etc.
● Recap: so, what did we do?
Remember this?
Well, these are filters.
Noise and filtering it
● High pass
● Low pass
● Band pass
● Band stop
● Notch
● (there are many more, but these are the basics)
Notice that all these filters work in the frequency domain.
We went from the time to the frequency domain using an FFT.
# get audio (again) in the time domain
audio = getwavdata('flute.wav')

# convert to frequency domain
flutefft = fft.rfft(audio)
We can go back from the frequency to the time domain using an inverse FFT (IFFT).
reflute.wav should sound identical to flute.wav.
reflute = fft.irfft(flutefft, len(audio))
reflute_coerced = array(reflute, \
    dtype=int16)  # coerce to int16
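A brute-force low-pass filter in this style just zeroes the high-frequency rfft bins before the IFFT; the lowpass name and the cutoff/test frequencies below are assumptions for illustration:

```python
import numpy as np

def lowpass(data, samplerate, cutoff):
    """Zero every frequency bin above cutoff (Hz), then invert the FFT."""
    spectrum = np.fft.rfft(data)
    freqs = np.fft.rfftfreq(len(data), d=1.0 / samplerate)
    spectrum[freqs > cutoff] = 0        # brick-wall: no rolloff at all
    return np.fft.irfft(spectrum, len(data))

# a 440Hz tone mixed with a 5000Hz one...
n = np.arange(44100)
noisy = (np.sin(2 * np.pi * 440 * n / 44100.0) +
         np.sin(2 * np.pi * 5000 * n / 44100.0))
clean = lowpass(noisy, 44100, 1000)     # ...only the 440Hz part survives
```

Zeroing bins outright gives infinitely sharp rolloff; gentler filters scale the bins down gradually around the cutoff instead.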
Your turn
● Take some of your .wav files from earlier, and try making...
● Low-pass or high-pass filters
● Band-pass, band-stop, or notch filters
● Filters with varying amounts of rolloff
Agenda
● Introduction
● Fourier transforms, spectrums, and spectrograms
● Playtime!
● SANITY BREAK
● Nyquist, sampling and aliasing
● Noise and filtering it
● (if time permits) formants, vocoding, shifting, etc.
● Recap: so, what did we do?