Integrating PyAudio & PySimpleGUI
As part of my research on Signal Processing and AI, I wanted to visually represent real-time audio data from my microphone. This simple request turned out to be not so simple at all, for a number of reasons we will cover. Along the way we will see how to integrate PyAudio with a GUI (I use PySimpleGUI, but you can modify the code for your UI of choice) and talk a bit about blocking and non-blocking operations in Python.
Your time is valuable, I get it. If you just want to copy pasta 🍝 some code to integrate PyAudio/PySimpleGUI, you are looking for the last code example. There's also a repo with the code samples: https://github.com/KenoLeon/PySimpleGUI-PyAudio
As mentioned all I want is a little widget box that allows me to capture the sound from my microphone and display it in a window, something like this:
I am using PyAudio because at this time it seems to be the library of choice for dealing with audio in Python. The documentation is a bit sparse though:
PyAudio Documentation - PyAudio 0.2.11 documentation
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With PyAudio, you can easily use…
You can read it in a couple of minutes and the 2 lonely examples they provide are the basis for further code, so if you understand them you are golden and might wish to jump to the integration part.
To start things off here’s a crude blocking stream that won’t quit until you terminate the script, usually with
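As a rough idea of what such a script can look like, here is a minimal sketch rather than the exact code from the repo. It assumes a 44,100 Hz sample rate (the text does not state one) and uses the standard library `array` module in place of numpy so the `peak_level` helper has no dependencies. The names `peak_level` and `listen_forever` are made up for illustration. The cleanup calls at the bottom are never reached, which is part of what makes it crude.

```python
import array

CHUNK = 1024   # samples per read, as in the article
RATE = 44100   # samples per second (assumed; the text does not state it)
INTERVAL = 5   # seconds per pass of the inner loop, as in the article


def peak_level(data: bytes) -> int:
    """Return the largest absolute sample in a chunk of paInt16 audio."""
    samples = array.array("h", data)  # "h" = signed 16-bit integers
    return max((abs(s) for s in samples), default=0)


def listen_forever() -> None:
    """Print one peak per chunk from the default microphone, forever."""
    import pyaudio  # imported here so peak_level() can be tried without it

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                     input=True, frames_per_buffer=CHUNK)
    while True:
        for _ in range(int(INTERVAL * RATE / CHUNK)):
            data = stream.read(CHUNK)  # blocks until CHUNK samples arrive
            print(peak_level(data))

    # Not reached: the loop above only ends when the script is interrupted.
    stream.stop_stream()
    stream.close()
    pa.terminate()
```

Call `listen_forever()` to start printing levels. `peak_level` takes the largest absolute value, which is one reasonable reading of "peak"; a plain `max` would ignore the negative half of the wave.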
We can see the sound levels in the terminal (I am using VS Code), so for instance a louder sound surrounded by quieter noise looks like this:
Before moving on I feel we need to parse a bit more what PyAudio is doing so here’s a little flowchart I made that might help you:
So you have a chunk of numbers in a numpy array (here I am only showing 5 chunks): 1,024 positive and negative numbers between −32,768 and 32,767, because the format is paInt16. These numbers represent the overall sound wave or vibrations (the little charts), and if you were to plot them (we won't) you would get a changing waveform. Here we instead just take the maximum value, or peak level. We loop the operation in 5-second intervals, and each interval has about 215 chunks (from INTERVAL * RATE / CHUNK). Don't worry if it takes a bit to get comfortable with all the moving parts and terminology; sound is a complex signal to process.
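The numbers above can be checked without a microphone. The sketch below builds a synthetic chunk (mostly silence with one loud sample) and assumes RATE is 44,100, which is consistent with the "about 215 chunks" figure: 5 × 44,100 / 1,024 ≈ 215.3. It uses the standard library `array` module rather than numpy to stay dependency-free.

```python
import array

CHUNK = 1024   # samples per chunk
RATE = 44100   # samples per second (assumed)
INTERVAL = 5   # seconds

# How many whole chunks fit in one interval.
chunks_per_interval = int(INTERVAL * RATE / CHUNK)

# A synthetic chunk: silence with a single loud sample in the middle.
samples = array.array("h", [0] * CHUNK)  # "h" = signed 16-bit, like paInt16
samples[512] = 12000

# For mono paInt16, a read of CHUNK samples hands back raw bytes like these.
raw = samples.tobytes()
decoded = array.array("h", raw)

print(chunks_per_interval)  # 215
print(len(raw))             # 2048 (two bytes per sample)
print(max(decoded))         # 12000
```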
Blocking and Non Blocking modes
A blocking operation, in general terms, is something you ask your script to do (in this case stream a chunk of audio from your microphone) that prevents other parts of your code from running. This will make more sense when we integrate a UI that also wants to run continuously. A non-blocking operation, in contrast, works better with other libraries or bits of code because it doesn't stop the rest of your code from running; it can be achieved via callbacks, threads or other patterns.
Here’s the last script in non blocking mode with the use of a callback function:
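A minimal sketch of the callback approach follows, under the same assumptions as before (44,100 Hz, mono, paInt16). The helper `make_callback` and the `latest` dictionary are hypothetical names used here for illustration; the callback signature and the `(data, flag)` return value are what PyAudio expects in callback mode. As in the PyAudio docs example, the main thread just waits while the stream is active.

```python
import array
import time

CHUNK = 1024
RATE = 44100  # assumed sample rate

latest = {"peak": 0}  # hypothetical shared state, written by the callback


def peak_level(data: bytes) -> int:
    """Return the largest absolute sample in a chunk of paInt16 audio."""
    return max((abs(s) for s in array.array("h", data)), default=0)


def make_callback(state: dict, continue_flag: int):
    """Build a PyAudio stream callback that stores each chunk's peak."""
    def callback(in_data, frame_count, time_info, status):
        state["peak"] = peak_level(in_data)
        # Input-only stream: no output data, just ask PyAudio to keep going.
        return (None, continue_flag)
    return callback


def listen_nonblocking() -> None:
    """Open the microphone in callback mode and print the latest peak."""
    import pyaudio  # imported here so the helpers work without PyAudio

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                     input=True, frames_per_buffer=CHUNK,
                     stream_callback=make_callback(latest, pyaudio.paContinue))
    stream.start_stream()

    # The main thread is free: PyAudio calls the callback on its own thread.
    while stream.is_active():
        print(latest["peak"])
        time.sleep(0.1)

    # Only reached if the stream stops on its own.
    stream.stop_stream()
    stream.close()
    pa.terminate()
```

The point to notice is that `print` and `time.sleep` run in the main thread while audio keeps arriving; that free main thread is where a GUI event loop can live.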
Integration with PySimpleGUI
We’ll start by integrating the blocking mode of PyAudio by doing a one shot microphone stream request:
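Here is one way the one-shot version could be sketched, assuming a PySimpleGUI 4.x style API. The layout, the `-PROG-` key, the window title and the bar range of 32,768 are illustrative choices, not taken from the original script. `listen()` opens the stream, reads for INTERVAL seconds, then closes the stream and terminates the PyAudio instance.

```python
import array

CHUNK = 1024     # samples per read, as in the article
RATE = 44100     # assumed sample rate
INTERVAL = 5     # seconds of listening per button press
BAR_MAX = 32768  # assumed bar range: the largest paInt16 magnitude


def peak_level(data: bytes) -> int:
    """Return the largest absolute sample in a chunk of paInt16 audio."""
    return max((abs(s) for s in array.array("h", data)), default=0)


def chunks_per_interval() -> int:
    """Number of whole chunks read during one INTERVAL."""
    return int(INTERVAL * RATE / CHUNK)


def listen(window) -> None:
    """Blocking: stream INTERVAL seconds of audio, then release everything."""
    import pyaudio

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                     input=True, frames_per_buffer=CHUNK)
    for _ in range(chunks_per_interval()):
        data = stream.read(CHUNK)
        window["-PROG-"].update(peak_level(data))
        window.refresh()  # redraw now; the event loop is stuck in here
    stream.stop_stream()
    stream.close()
    pa.terminate()


def main() -> None:
    import PySimpleGUI as sg

    layout = [[sg.ProgressBar(BAR_MAX, orientation="h", size=(20, 20),
                              key="-PROG-")],
              [sg.Button("Listen"), sg.Button("Exit")]]
    window = sg.Window("Mic level", layout, finalize=True)
    while True:
        event, _ = window.read()
        if event in (sg.WIN_CLOSED, "Exit"):
            break
        if event == "Listen":
            listen(window)  # nothing else runs until this returns
    window.close()
```

Call `main()` to open the window.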
The key thing here is that you have to press the button every time you want to listen, so it works but is not continuous:
If you need help getting started with PySimpleGUI, I wrote an overview here:
So at this point we have a working volume meter (VU meter), but it is by no means a good one. The first thing you might have noticed is that you can't exit the program while the listen() function is running. Why? Because it is a blocking operation that prevents the exit method from running while it is busy streaming the microphone. Do note that once the listen() subroutine is done, and if you clicked the exit button, the program terminates. Ideally we would want the program to exit at any time, hence we need to integrate the non-blocking mode of PyAudio with PySimpleGUI.
Two more upgrades to this script are worth mentioning before I give you my solution/code for them:
1. As mentioned, it is not continuous: you click the button and it listens, but it stops after your INTERVAL is done. We would like a toggle-style system where it just runs and listens until told otherwise.
2. We are now closing the stream and terminating the PyAudio instance (the bit of code that wasn't running in the previous examples). This is good because we are releasing resources and avoiding memory bloat and crashes, but we don't really need to terminate the PyAudio instance, just stop the stream.
If you want to try on your own, this is a good point to do so: try changing the previous example to non-blocking mode, make it continuous and keep the PyAudio instance alive until you exit the program. Even if you fail, I believe your learning will be better.
Non blocking PyAudio integration with PySimpleGUI
This is a more complete script, and the solution to integrating non-blocking PyAudio was surprisingly simple… The main takeaway here is that you might want to divide or delegate responsibilities and let your libraries work in the way they like. So, for instance, all the UI logic here consists of 3 events, and the rest is done by functions that interact with both PySimpleGUI and PyAudio via the global references (in _VARS). I'd also recommend you keep your event loop lean and simple.
But you cheated/changed the SPEC by adding a stop button! Guilty as charged, but in my defense we are not done yet, and the SPEC was not really clear as to how to stop listening. Ideally we would want a single button, but in order to implement this we first needed to understand how to stop/start a stream. We'll do a bit of UI polishing next.
UI Improvements and testing
Making a simple script is very different from making a production/robust one, and considerable time needs to go into testing and tweaking your program for performance and usability. To keep things simple and wrap things up, I am just going to focus on a bug that crashes my computer if I quickly press the listen/stop buttons along with the exit. These tweaks will also get us closer to the original spec…
If you want to try on your own, this is the core question: how can you make sure your user doesn't get in trouble and crash your program, even if the user is doing something that you think is irrational (in this case smashing the buttons like a toddler; no offense though, your target audience might actually be toddlers 😁)?
My solution here was to simply toggle the active state of the buttons, making them un-clickable according to what the script is doing. I've also lowered the progress bar range to make the input more noticeable (but this is still not a properly calibrated dB meter), and finally I increased the timeout on PySimpleGUI's loop to make it more responsive and avoid further crashes:
So there you have it: a small but functional script that integrates PySimpleGUI and PyAudio to represent microphone levels and serve as the basis for more complex signal processing scripts. Along the way you hopefully learned about blocking and non-blocking operations in Python and some of the details of integrating libraries into UIs. Where you go from here is up to you and your project needs.
I'd be remiss if I didn't mention a few gotchas... I am sampling here from my laptop's microphone in a single channel, which is conveniently the default choice in PyAudio, but you might need to add some extra UI elements and methods to detect your available devices/channels/formats; a great follow-up project. You can also accomplish much of what the non-blocking mode does by using threads and classes, but this quickly becomes a bigger/more complex proposition.
I hope this post helps you in some way, thanks for reading!