meme mac lover guy

Convenience audio setup (part 2): Hardware Knobs

Last time I talked about setting up multiple virtual audio devices to easily adjust game volume separately from music volume. It's nice and all, but there are two major problems:

  1. You have to manually find the application in the list and move it to the right sink
  2. Besides the vol+/vol- keys on the device/keyboard (not all of them have those), there's no quick way to adjust the volume of a specific device

The solution is to make hotkeys for vol+/vol- on specific devices and write scripts that find the focused window and move its audio to a predetermined sink. However, we can do better than that. Introducing... MIDI.

MIDI Controllers and stuff

So MIDI is an industry standard from the 80s (not the 70s, huh) for "synchronizing electronic musical instruments". However, its design is so simple and good that it can be used for virtually anything; one example I've heard about is stage lighting control.

Note: the following is my understanding of the protocol based on what I remember. It also omits a lot of stuff that's not relevant to me, like timing.

The protocol itself is rather simple: everything is a "message". Anything you do on a MIDI controller, be it a keyboard or a DJ "jog wheel", sends one. A message consists of a status (the channel plus the message type, or "function" as MIDI calls it) and two data fields, one of which is optional. I think it also includes timing information, but that's not useful for our purpose here.

There are basically 4 "functions" that are useful to us: note on, note off, control change and program change. Note on and note off are sent when you press and release a key on a (piano) keyboard. Each has two arguments: which key it was and how fast (or how hard) you pressed or released it.

Control change is more interesting. A lot of MIDI controllers also have knobs and faders for adjusting effects or tuning an instrument on the fly, and moving those controls typically generates a "control change" message. These messages have two attributes: the control ID and the new value of the control (between 0 and 127). You might already see where I'm going with this - we can just map those controls to PulseAudio volumes!

Lastly, there's the program change message, which has just one attribute: the program ID. Unfortunately it's not very useful for us, and I'll get to why later.
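To make the message layout a bit more concrete, here's a minimal Python sketch that decodes the four message types mentioned above from raw bytes. The status values (0x80, 0x90, 0xB0, 0xC0 in the upper four bits, channel in the lower four) come from the MIDI spec; the `MidiMessage` and `decode` names are made up for this example, and it assumes you already have exactly one complete message in hand.

```python
from typing import NamedTuple, Optional

# Upper 4 bits of the status byte -> message type (only the ones we care about).
MESSAGE_TYPES = {
    0x80: "note_off",
    0x90: "note_on",
    0xB0: "control_change",
    0xC0: "program_change",
}


class MidiMessage(NamedTuple):
    kind: str
    channel: int           # 0-15, taken from the lower 4 bits of the status byte
    data1: int             # note number, control ID or program ID
    data2: Optional[int]   # velocity or control value; None for program change


def decode(raw: bytes) -> MidiMessage:
    """Decode one complete channel message from its raw bytes."""
    status = raw[0]
    kind = MESSAGE_TYPES.get(status & 0xF0)
    if kind is None:
        raise ValueError(f"unsupported status byte: {status:#04x}")
    channel = status & 0x0F
    # Program change carries a single data byte, the others carry two.
    expected = 2 if kind == "program_change" else 3
    if len(raw) != expected:
        raise ValueError(f"{kind} needs {expected} bytes, got {len(raw)}")
    data2 = raw[2] if expected == 3 else None
    return MidiMessage(kind, channel, raw[1], data2)


print(decode(bytes([0xB0, 1, 64])))   # knob with control ID 1 turned to 64
print(decode(bytes([0xC0, 5])))       # program change to program 5
```

In practice a MIDI library would do this parsing for you; the point is only that a knob turn boils down to three bytes.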

Hold on, are you really suggesting using a MIDI keyboard, like a piano thing for controlling the volume?

No, because it turns out MIDI controllers come in all shapes and sizes. I personally use an AKAI LPD8.

It's tiny enough, has both pads and knobs, and everything can be reprogrammed (i.e. you can change the IDs and such). Pads can be used as both controls and notes, and can send program change messages as well. There are others, like the Korg nanoKONTROL, which is probably an even better fit for this use case (I actually plan on getting one some day).

The only modifications I did to mine were: adding some rubber feet to tilt the controller and some shitty printed paper labels.

The mapping

So OK, we have our control change message. We can just write a tiny tool that takes that message and maps its value to the volume of a PulseAudio virtual device based on the control ID. Done.

What about moving an application's audio to a different device? Well, as I said, we can write a script for that. Technically we don't even need a MIDI controller for it, but it can help a lot in this case. The script itself is rather simple: use xdotool or similar to find the PID of the currently focused window, query PulseAudio for all streams and find the one with a matching PID, then send a command to move that stream to a specific device.

This works 95% of the time, as the stream usually belongs to the same process as the window. Sometimes there are two streams from the same application (in which case we just move them all), and in rare cases the sound comes from a child process. In Chromium and Firefox, for example, that's the tab process, which isn't the same as the main window's process, and it's much more complicated to find out which tab is focused and what its PID is. That's why I have a separate virtual device just for the browser. Since PulseAudio/PipeWire tends to remember the last device used, you only need to move it once.
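Both halves of the mapping can be sketched in a few lines of Python around `pactl` and `xdotool`. This is a rough sketch, not my actual tool: the sink names and control IDs in `CONTROL_TO_SINK` are hypothetical, the function names are made up, and the parser assumes `pactl list sink-inputs` prints English "Sink Input #N" headers with an `application.process.id` property under each stream (which may differ with other locales or versions).

```python
import re
import subprocess

# Hypothetical mapping of control IDs (knobs) to virtual sink names.
CONTROL_TO_SINK = {1: "games", 2: "music", 3: "browser"}


def cc_to_percent(value: int) -> int:
    """Scale a control change value (0-127) to a volume percentage (0-100)."""
    if not 0 <= value <= 127:
        raise ValueError("control value must be between 0 and 127")
    return round(value * 100 / 127)


def volume_command(control_id: int, value: int) -> list[str]:
    """Build the pactl command that sets the mapped sink's volume."""
    sink = CONTROL_TO_SINK[control_id]
    return ["pactl", "set-sink-volume", sink, f"{cc_to_percent(value)}%"]


def sink_inputs_for_pid(listing: str, pid: int) -> list[int]:
    """Return the IDs of all streams in the listing that belong to pid."""
    matches = []
    current = None
    for line in listing.splitlines():
        header = re.match(r"Sink Input #(\d+)", line)
        if header:
            current = int(header.group(1))
            continue
        prop = re.match(r'\s*application\.process\.id = "(\d+)"', line)
        if prop and current is not None and int(prop.group(1)) == pid:
            matches.append(current)
    return matches


def move_focused_window_audio(sink: str) -> None:
    """Move every stream owned by the focused window's process to sink."""
    pid = int(subprocess.run(
        ["xdotool", "getactivewindow", "getwindowpid"],
        capture_output=True, text=True, check=True,
    ).stdout)
    listing = subprocess.run(
        ["pactl", "list", "sink-inputs"],
        capture_output=True, text=True, check=True,
    ).stdout
    for stream_id in sink_inputs_for_pid(listing, pid):
        subprocess.run(["pactl", "move-sink-input", str(stream_id), sink], check=True)
```

A knob handler would then run something like `subprocess.run(volume_command(1, 64))`, and a pad handler would call `move_focused_window_audio("games")`. Note that this only matches the window's own PID, so the child-process case described above is still not covered.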

So how can MIDI help? Why use it instead of keyboard shortcuts? Well, MIDI is bi-directional. That's right - you can potentially send messages back to the controller to set a control to a specific value OR to press a key (note) on it. I imagine it would be extremely cool on an expensive mixer that actually has motorized faders/knobs (and it could be even more useful on controllers with free-wheeling knobs), but in my case sending control change or note_on/note_off messages controls which pads are lit. It's like a poor man's programmable RGB! That means we can also highlight pads based on which device the current application is using! No need to mess with proprietary formats to set key colors on your keyboard, and no need to mess with OpenRGB either: it's part of the protocol!

Well, technically that's probably just a side effect. Mine only has an orange backlight, but some controllers do have RGB lighting for pads, and I assume that IS proprietary, as MIDI does permit "vendor-specific" commands which are used for exclusive functionality, e.g. reprogramming the controller. Either way, this feedback lets you see whether the app's stream was detected and whether the move was successful.
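As a rough illustration of the feedback idea, here's a small Python sketch that builds the raw note_on/note_off messages for lighting exactly one pad. The pad note numbers in `SINK_TO_PAD_NOTE` are hypothetical (they depend on how your controller is programmed), the function names are made up, and actually writing the bytes to the controller's MIDI input is left to whatever MIDI library you use.

```python
# Hypothetical mapping: which pad (note number) stands for which sink.
SINK_TO_PAD_NOTE = {"games": 36, "music": 37, "browser": 38}


def note_on(channel: int, note: int, velocity: int = 127) -> bytes:
    """Raw note-on message: status 0x90 plus channel, then note and velocity."""
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int) -> bytes:
    """Raw note-off message: status 0x80 plus channel, then note and velocity 0."""
    return bytes([0x80 | channel, note, 0])


def pad_feedback(active_sink: str, channel: int = 0) -> list[bytes]:
    """Messages that light the active sink's pad and turn the other pads off."""
    return [
        note_on(channel, note) if sink == active_sink else note_off(channel, note)
        for sink, note in SINK_TO_PAD_NOTE.items()
    ]


for message in pad_feedback("music"):
    print(message.hex(" "))
```

Passing a sink name that isn't in the mapping simply turns all pads off, which works as a "stream not detected" indicator.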

The Tool

The possibilities here are endless. I wrote a tool for this: it basically turns MIDI messages into events and lets you listen to those. I use it not only for controlling volumes, but also for muting my microphone, switching scenes in OBS, controlling the media player or even home automation! In fact, someone already made a thing that translates MIDI messages into MQTT events.
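To give an idea of what "turning MIDI messages into events" can look like, here's a minimal Python sketch of a dispatcher. To be clear, this is not my tool's actual API: the `MidiEvents` class and its `on`/`dispatch` methods are invented for illustration only.

```python
from collections import defaultdict
from typing import Callable

# A handler receives the control/note ID and the value that came with it.
Handler = Callable[[int, int], None]


class MidiEvents:
    """Tiny event bus keyed by (message kind, control or note ID)."""

    def __init__(self) -> None:
        self._handlers: dict[tuple[str, int], list[Handler]] = defaultdict(list)

    def on(self, kind: str, ident: int, handler: Handler) -> None:
        """Register a handler for e.g. ("control_change", 1)."""
        self._handlers[(kind, ident)].append(handler)

    def dispatch(self, kind: str, ident: int, value: int) -> int:
        """Call every matching handler; return how many were called."""
        handlers = self._handlers.get((kind, ident), [])
        for handler in handlers:
            handler(ident, value)
        return len(handlers)


events = MidiEvents()
events.on("control_change", 1, lambda ident, value: print(f"knob {ident} -> {value}"))
events.on("note_on", 36, lambda ident, value: print(f"pad {ident} pressed"))

events.dispatch("control_change", 1, 64)
events.dispatch("note_on", 36, 127)
```

Each use case (volume, OBS scenes, mic mute) then becomes just another handler registered on the same bus.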

My tool's interface isn't very user-friendly, at least for now, as it's just easier to extend it by writing a new handler. However, it has a solid foundation. MIDI can be chained (i.e. my tool would listen to events and proxy them to a virtual output MIDI device, which is connected to another tool) or shared, so this tool's API could be used as a base for use-case-specific tools (OBS tool, PulseAudio tool, etc).