Update: I just upgraded my primary workstation to the full KXStudio as a test. Details are at the end of the post.
I have long been wanting to replace the Windows operating system on our studio machine with Linux, but haven’t for a number of reasons. Recently the machine has been having fits and garbled a very important interview, so it came time to wipe it and start over.
Our home server took a dump, so our regular studio box took its place. The reconstituted server hardware is now in the studio; however, it is less than ideal. It was my primary desktop machine about 7 years ago, sporting a 1.1 GHz AMD AthlonXP processor and 1 gig of memory (system bus is only like 200 MHz), so it’s limited in what can be expected from it performance-wise. Someday the studio will get a proper hardware upgrade. It would help if we sold the friggin’ house though.
The result is an Ubuntu 10.04 (Lucid) system with a very harmonious audio environment, recording from my firewire mixer at 48kHz and 24 bits. I’m currently using the KX “low latency” (but not realtime) kernel. More on why I went with Lucid below.
JACK is the full-time audio core and everything goes through it with bridges if apps aren’t JACK native. Ardour works, Linux VST plugins work, many Windows VST plugins work too. I didn’t play with any of that much since my primary goal is podcast production, preferably with Skype remote co-hosts, which works (yay!)
I loaded up Skype, then Ardour. In Ardour I mapped the first two mics on my mixer to tracks 1 + 2 (since my wife and I co-host most of our shows together). I created a 3rd track and mapped the PulseAudio sink to it. This feeds all audio output from pulse applications (browser, media player, Skype) to that track. The outputs of my microphones were already mapped from JACK to Pulse, so I didn’t have to do anything there.
I called the sexy Skype Call Testing robot and voila, I could hear her, and she could hear me! Furthermore, Ardour recorded all of her audio onto track 3, which was completely discrete (neither of my mics was on the track), and my tracks 1 + 2 were completely discrete as well – exactly what I wanted, so mission accomplished!
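For anyone who would rather script that routing than click through a patchbay, here is a minimal sketch of the same three connections. Every port name in it is a hypothetical placeholder (run `jack_lsp` to see the real ones on your system), and it only builds `jack_connect` command lines rather than running them.

```python
import shlex

# Hypothetical port names for illustration only; check `jack_lsp` for yours.
MIC_PORTS = ["system:capture_1", "system:capture_2"]
PULSE_SINK_PORT = "PulseAudio JACK Sink:front-left"
ARDOUR_INPUTS = ["ardour:Mic 1/in 1", "ardour:Mic 2/in 1", "ardour:Skype/in 1"]


def routing_plan(mics, pulse_sink, track_inputs):
    """Pair each source port with the Ardour track input that records it.

    The mics go to the first tracks and the PulseAudio sink (which carries
    Skype and other desktop audio) goes to the last one.
    """
    sources = list(mics) + [pulse_sink]
    if len(sources) != len(track_inputs):
        raise ValueError("need exactly one track input per source port")
    return list(zip(sources, track_inputs))


def as_jack_connect_commands(plan):
    """Turn (source, destination) pairs into jack_connect argument lists."""
    return [["jack_connect", src, dst] for src, dst in plan]


if __name__ == "__main__":
    plan = routing_plan(MIC_PORTS, PULSE_SINK_PORT, ARDOUR_INPUTS)
    for command in as_jack_connect_commands(plan):
        # Print a shell-safe line instead of executing anything.
        print(" ".join(shlex.quote(part) for part in command))
```

Keeping the plan as plain data makes it easy to eyeball before anything touches the live JACK graph.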
This was mostly “out of the box” with very little tweaking. The tweaks wouldn’t even have been necessary if I had a USB, PCI, or fully supported firewire interface. KXStudio really does “just work”.
On Twitter today Thomas, Chris and I commented about the new Google+ a little and I think that their “Hangout” feature will be a boon to podcast recording. It allows ten-person video conferencing for free. With this setup I could participate in a multi-person video conference and record its audio (or not), and still have clean tracks of my side of the conversation. If each person recorded their side of the conversation and we pulled the WAV files together, then we’d have pristine sound with the benefit of that facial and body language feedback to help the conversation go more smoothly.
- My latency is pretty abysmal (24 – 46ms depending on how hard I want to push the CPU), but that’s not important for podcast recording. Nothing is noticeable on the Skype call. I will work on latency when I get back to doing some music composition, but I suspect I will need a new rig for that given the slow system bus and other limited resources of this machine. I was able to run at 2.5ms without affecting my Ardour tracks (I only tried three), but every other program bogged down as the CPU spiked. I suspect the Ardour recording works so well because JACK is running in realtime mode with a high priority, so all other programs get very little CPU time.
- You cannot play audio from a pulse source and route that to Skype. For instance, the other people on the call wouldn’t hear a YouTube video if you played it. This is a minor inconvenience. I suspect that you could route a VLC or Audacity instance to feed them audio (thinking about podcast feedback here), but I didn’t have a chance to try it. In a way it’s good because it means that your buddies won’t hear any desktop alerts or other system audio chimes if you forgot to turn them off.
Another benefit of Google+ Hangout is that you can do shared YouTube watching that syncs between all browsers. If anybody pauses, fast forwards or rewinds, it automatically does so on everybody’s YouTube stream. Pretty nifty! If only Netflix or HBOGo would hook into this!
- If you shut your mixer off, it will not come back up in JACK. You need to reboot your whole system before recording again. Also a minor inconvenience, but somewhat annoying since that was something that didn’t seem to bother Windows.
Of course, with the audio issues I’ve had lately with recording through Windows, I was prophylactically rebooting before every session, so it’s a wash. It is likely possible to modprobe the firewire kernel module again after turning the mixer back on and force-restarting JACK, but I didn’t try that.
How I Got There:
When looking at all of the media-centric Linux distributions I decided to go with KXStudio on top of Ubuntu. I found it interesting that they recommend Lucid (10.04) rather than the newest version. There were actually forum comments from FalkTX (the main guy behind KXStudio) essentially saying that 10.04 is still the best platform for audio on Linux due to changes in the newer versions.
I thought briefly of going ahead with 11.04 (KX does support it), but figured I would use what they recommend. The KXStudio team backports all of the kernels, tools and the latest versions of pretty much all audio software to 10.04, so there isn’t much to lose. Also, it is a Long Term Support release for Ubuntu, so it will have security and bug fixes until halfway through 2013. A recording studio is something that you don’t want to mess around with a lot once you have things dialed in.
I downloaded and installed Ubuntu 10.04, then followed the instructions to add the KX repositories and “upgrade” to KXStudio. It went very smoothly with a couple minor question prompts and some waiting for it to download a couple gigs of software.
One of the steps is picking your desktop environment (they support Gnome, KDE and Unity). I’m most familiar with Gnome so that’s what I went with. Years ago I was a KDE user and I briefly considered going back to it, but this project just isn’t the place to do that.
Another step is to pick a kernel. I was going to go with the realtime kernel (2.6.38-8), but that wouldn’t allow the proprietary drivers for the nvidia graphics (built into my motherboard) to work. For some reason, the system will not boot into X with the open source nvidia drivers (Nouveau), so I’m kind of stuck here. I went ahead with the “low latency” kernel, which is a little older but still 2.6 (2.6.33, I believe).
I also had to work through some monitor resolution issues. It was stuck at 640×480, then at 800×600. The highest I’ve been able to get it is 1024×768 which is annoying on the widescreen monitor, but acceptable. I’ll work it out later. X configuration has always been a bit of a black art to me so I need to do some more research.
The first time I brought up the connection tool I didn’t see the Firewire mixer (an Alesis Firewire 8). My friend Thomas has the same mixer and went through an arduous journey getting his to work, which I was hoping to avoid (though I’m thankful for his notes on getting his to work!)
Working through FFADO’s troubleshooting FAQ I found that the issue was simply Ubuntu not loading the kernel module. I loaded the module and it showed right up. I added the module to modprobe.conf so it would auto-load on boot.
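A quick way to confirm whether a module is actually loaded is to look for it in /proc/modules. Here is a minimal sketch of that check. The module name `ohci1394` is the one the system complained about in this post; the sample line (size, address) is made up for illustration, and on a real machine you would read the text from /proc/modules instead.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each line starts with the module name, followed by size, use count,
    dependencies, state and load address.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def is_loaded(module, proc_modules_text):
    """True if the module appears in the given /proc/modules text."""
    # The kernel lists names with underscores even if loaded with dashes.
    return module.replace("-", "_") in loaded_modules(proc_modules_text)


# Made-up sample in the /proc/modules layout, for illustration only.
SAMPLE = (
    "ohci1394 30260 0 - Live 0xf8a1c000\n"
    "ieee1394 94420 1 ohci1394, Live 0xf89e0000\n"
)

if __name__ == "__main__":
    for name in ("ohci1394", "snd_hda_intel"):
        state = "loaded" if is_loaded(name, SAMPLE) else "not loaded"
        print(name, state)
```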
In my playing with kernels, somehow this stopped working after a reboot and I couldn’t figure out why. It kept saying that the ohci1394 kernel module was missing. It turned out that it was in a blacklist file. I removed it from that file and all was well.
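If you hit the same thing, hunting for the blacklist entry by hand gets tedious. Below is a minimal sketch that scans a directory of modprobe config files for a `blacklist <module>` line. It assumes the files live in /etc/modprobe.d and end in .conf, which may not match every distribution or release, and it only reports matches; it changes nothing.

```python
import glob
import os


def find_blacklist_entries(module, config_dir="/etc/modprobe.d"):
    """Return (path, line_number) for every 'blacklist <module>' line."""
    hits = []
    for path in sorted(glob.glob(os.path.join(config_dir, "*.conf"))):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                # Drop trailing comments, then compare the first two words.
                words = line.split("#", 1)[0].split()
                if words[:2] == ["blacklist", module]:
                    hits.append((path, number))
    return hits


if __name__ == "__main__":
    entries = find_blacklist_entries("ohci1394")
    if not entries:
        print("no blacklist entry found for ohci1394")
    for path, number in entries:
        print(f"{path}:{number} blacklists ohci1394")
```

Commented-out entries are ignored on purpose, since those no longer block the module.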
That was it for install and configuration, and it met all of my first goals of recording microphone and Skype tracks. Next I played a little with reducing latency. The default load was about 24ms (1024 buffer with 2 periods for ALSA and 1024/3 for firewire). I dropped this down to 128/2 and was down to 2.5ms, but as mentioned above, the CPU spiked to 100% and Skype audio broke up. The interesting thing is that no xruns were reported and my microphone tracks in Ardour didn’t have any drops at all. I’m curious to see if monitoring tracks while recording causes drops or xruns, but didn’t have a chance to play with it.
I tried a few different settings: 512/2, 512/3, 1024/2 and 2048/2. The default of 1024/2 really was the sweet spot. I don’t know if KXStudio always sets that as default, or if it did it based on my hardware.
I believe if I had a modern machine with dual, quad, or more cores and a faster system bus, everything would work just fine at 5ms, or maybe even 2.5ms. My primary desktop workstation is still no prize winner as it’s almost five years old, but it is at least dual core. I have the KX repositories on it, but have only used them to get the latest builds of Ardour, Audacity and JACK.
Now I’m going to do the full upgrade to KXStudio and see what kind of latency I get on it. The audio interface is either the internal sound card on the motherboard or my Sennheiser USB headphones, though, so I don’t know how much they’ll impact things.
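For a rough sense of how buffer settings translate into time, the nominal figure is just frames divided by sample rate, multiplied by the number of periods for the whole buffer. The sketch below works that out for the frames/periods pairs mentioned in this post at 48 kHz. Treat it as back-of-the-envelope arithmetic: the latency a given tool reports may be the per-period figure, the full-buffer figure, or something measured, so these numbers will not necessarily line up with what your own setup displays.

```python
SAMPLE_RATE = 48000  # Hz, the rate used for recording in this post

# (frames per period, number of periods) pairs mentioned in the post
SETTINGS = [(128, 2), (512, 2), (512, 3), (1024, 2), (1024, 3), (2048, 2)]


def period_latency_ms(frames, sample_rate):
    """Duration of a single period in milliseconds."""
    return 1000.0 * frames / sample_rate


def buffer_latency_ms(frames, periods, sample_rate):
    """Nominal duration of the whole buffer (all periods) in milliseconds."""
    return periods * period_latency_ms(frames, sample_rate)


if __name__ == "__main__":
    for frames, periods in SETTINGS:
        one = period_latency_ms(frames, SAMPLE_RATE)
        total = buffer_latency_ms(frames, periods, SAMPLE_RATE)
        print(f"{frames}/{periods}: {one:.1f} ms per period, {total:.1f} ms buffer")
```

Halving the frame count halves both figures, which is why the CPU has so much less time to fill each period at 128 frames than at 1024.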
If I get some time I may haul the mixer upstairs and try it. The wife and I have been talking about making our normal computer room the studio, thus removing the need for a dedicated studio machine anyway. That is not likely going to happen until we move though, so I don’t know that I want to go to the trouble of connecting things and tearing it down again just for testing.
I hope this at least inspires your experimentation, if not helps you – feel free to ask for assistance if you go this route and get stuck!
Here is an excellent reference document that explains (in simple terms) why audio on Linux is so complex:
Update: I went ahead and did the full KXStudio upgrade on my primary workstation. It’s a 4.5-year-old Dell with the following specs:
- Intel Core2 1.86 GHz CPU
- 2 GB RAM
- 667 MHz bus
- Sennheiser USB headset
- Using the lowlatency kernel (same issue with nvidia driver, so no realtime kernel for me)
JACK is set at 48 kHz (native for the audio chip on the motherboard as well as the Sennheiser headset). I am able to play back and record 16 tracks with 6 effects in Ardour (a couple of reverbs, compressor, 4-band parametric EQ, fast look-ahead limiter) with a latency of 2.7ms, CPU at 80% (spiking to 100%), and DSP at 17%.