Configuring and Using Linux Audio (Running Linux, 4th Edition)

9.5. Configuring and Using Linux Audio

This section covers the configuration of sound cards under Linux and other issues related to Linux sound support.

Sound has historically been one of the most challenging aspects of Linux, and one that did not receive as much attention from Linux distributions as it should have, perhaps because Linux was initially embraced by so many as a server operating system. On the desktop, users have come to take multimedia support and sound for granted. The good news is that, once you're armed with a little knowledge, it's not too hard to get a sound card up and running. In fact, Linux is well suited to audio and other multimedia applications for a number of reasons.

We start off this section with a quick overview of digital audio concepts and terminology. Those familiar with the technology may wish to skip over this overview. If you don't really care about how it all works, or get lost in its first sentence, don't worry: you can get sound on your system without understanding the difference between an MP3 and a WAV file.

We'll then look specifically at how sound is supported under Linux, what hardware is supported, the different device drivers available, and the different approaches to configuring sound taken by Linux distributions.

Next we'll step through the process of configuring sound support, building or locating the necessary kernel drivers and testing and debugging the sound devices. We'll provide some hints for troubleshooting and point out some common pitfalls.

Once you have sound support up and running, you'll want to run some multimedia applications. We'll take a quick look at the types of sound programs available for Linux.

Last, we'll round out this section with some references to more information on Linux audio that will help you get to the next level.

A word of advice: there are minor differences between Linux distributions. The Linux kernel and applications are also undergoing constant change and enhancement. We've made every effort to make the information in this chapter applicable to all Linux systems, and to point out areas where they are likely to differ, but for details you should consult the documentation for your distribution and consult fellow users.

9.5.1. A Whirlwind Tour of Digital Audio

In this section we will give a very quick overview of some concepts relevant to digital audio and sound cards.

Sound is produced when waves of varying pressure travel through a medium, usually air. It is inherently an analog phenomenon, meaning that the changes in air pressure can vary continuously over a range of values.

Modern computers are digital, meaning they operate on discrete values, essentially the binary ones and zeroes that are manipulated by the computer's CPU. In order for a computer to manipulate sound, it converts the analog sound information into digital format.

A hardware device called an analog-to-digital converter converts analog signals, such as the continuously varying electrical signals from a microphone, to digital format for manipulation by the computer. Similarly, a digital-to-analog converter converts digital values into analog form so that they can be sent to an analog output device such as a speaker. Sound cards typically contain several analog-to-digital and digital-to-analog converters.

The process of converting analog signals to digital form consists of taking measurements or samples of the values at regular periods of time, and storing these samples as numbers. The process of analog-to-digital conversion is not perfect, however, and introduces some loss or distortion. Two important factors that affect how accurately the analog signal is represented in digital form are the sample size and sampling rate.

The sample size is the range of values of numbers that are used to represent the digital samples, usually expressed in bits. For example, an 8-bit sample would convert the analog sound values into one of 2⁸ or 256 discrete values. A 16-bit sample size would represent the sound using 2¹⁶ or 65,536 different values. A larger sample size allows the sound to be represented more accurately, reducing the sampling error that occurs when the analog signal is represented as discrete values. The tradeoff with using a larger sample size is that the samples require more storage (and the hardware is typically more complex and therefore more expensive).
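The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the assumption that the analog value is normalized to the range -1.0 to 1.0 is a convenience for the example, not something a sound driver requires.

```python
def quantization_levels(bits):
    """Number of discrete values a sample of this size can hold."""
    return 2 ** bits


def quantize(value, bits):
    """Map an analog value in [-1.0, 1.0] to the nearest unsigned level.

    Assumption: the input is normalized; out-of-range values are clipped.
    """
    value = max(-1.0, min(1.0, value))
    top = quantization_levels(bits) - 1
    return round((value + 1.0) / 2.0 * top)


if __name__ == "__main__":
    for bits in (8, 16):
        print(bits, "bits:", quantization_levels(bits), "levels")
    print("0.5 as an 8-bit sample:", quantize(0.5, 8))
    print("0.5 as a 16-bit sample:", quantize(0.5, 16))
```

The same analog value lands on a much finer grid with 16 bits than with 8, which is the reduction in sampling error described above.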

The sample rate is the speed at which the analog signals are periodically measured over time. It is properly expressed as samples per second, although it is sometimes informally but less accurately expressed in Hertz. A lower sample rate will lose more information about the original analog signal, while a higher sample rate will more accurately represent it. The sampling theorem states that to accurately represent an analog signal, it must be sampled at a rate at least twice the highest frequency present in the original signal.

The range of human hearing is from approximately 20 to 20,000 Hertz under ideal situations. To accurately represent sound for human listening, then, a sample rate of twice 20,000 Hertz should be adequate. CD player technology uses 44,100 samples per second, which is in agreement with this simple calculation. Human speech has little frequency activity above 4,000 Hertz. Digital telephone systems typically use a sample rate of 8,000 samples per second, which is perfectly adequate for conveying speech. The tradeoff involved with using different sample rates is the additional storage requirement and more complex hardware needed as the sample rate increases.
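The "simple calculation" mentioned above is just a factor of two in either direction. A minimal Python sketch, with function names chosen here for illustration:

```python
def minimum_sample_rate(highest_frequency_hz):
    """Lowest sample rate (samples per second) allowed by the sampling theorem."""
    return 2 * highest_frequency_hz


def highest_representable_frequency(sample_rate):
    """Highest frequency (Hertz) that a given sample rate can represent."""
    return sample_rate / 2


if __name__ == "__main__":
    print("Human hearing (20,000 Hz):", minimum_sample_rate(20000))
    print("Speech (4,000 Hz):", minimum_sample_rate(4000))
    print("CD audio (44,100 samples/s):", highest_representable_frequency(44100))
```

The CD rate leaves a small margin above the 20,000 Hertz limit of hearing, and the telephone rate of 8,000 samples per second matches the 4,000 Hertz figure for speech exactly.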

Other issues that arise when storing sound in digital format are the number of channels and the sample encoding format. To support stereo sound, two channels are required. Some audio systems use four or more channels.

The samples themselves can be encoded in different formats. We've already mentioned sample size, with 8-bit and 16-bit samples being the most common. For a given sample size the samples might be encoded using signed or unsigned representation, and when the storage takes more than one byte, the ordering convention must be specified. These issues are important when transferring digital audio between programs or computers to ensure they agree on a common format. File formats, such as WAV, standardize how to represent sound information in a way that can be transferred between different computers and operating systems.
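To make the signed/unsigned and byte-ordering issues concrete, here is a small Python sketch that uses the standard struct module. The helper names are hypothetical. The example relies on the fact that WAV files store 16-bit samples as signed values with the low-order byte first, and 8-bit samples as unsigned values.

```python
import struct


def encode_sample_16(value, byte_order="little"):
    """Pack one signed 16-bit sample using the given byte ordering."""
    fmt = "<h" if byte_order == "little" else ">h"
    return struct.pack(fmt, value)


def signed_to_unsigned_8(value):
    """Convert a signed 8-bit sample (-128..127) to unsigned (0..255)."""
    return value + 128


if __name__ == "__main__":
    # The same sample value produces different bytes under each convention.
    print(encode_sample_16(1000, "little").hex())
    print(encode_sample_16(1000, "big").hex())
    print(signed_to_unsigned_8(0))
```

Two programs that disagree on either convention will still exchange bytes without any error, but the result will sound like noise or heavy distortion, which is why file formats pin these choices down.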

Often, sounds need to be combined or changed in volume. This is the process of mixing, and can be done in analog form (e.g., a volume control) or in digital form by the computer. Conceptually, you can mix two digital samples together simply by adding them, and you can change volume by multiplying the digital samples by a constant value.
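Conceptual mixing and volume changes look like this in Python. The sketch assumes signed 16-bit samples held in ordinary lists, and it adds one detail the description above leaves out: results that no longer fit in 16 bits are clipped to the nearest legal value. The function names are ours.

```python
def clip16(value):
    """Limit a value to the signed 16-bit range."""
    return max(-32768, min(32767, value))


def mix(a, b):
    """Mix two equal-length sequences of samples by adding them."""
    return [clip16(x + y) for x, y in zip(a, b)]


def scale_volume(samples, gain):
    """Change volume by multiplying every sample by a constant."""
    return [clip16(round(s * gain)) for s in samples]


if __name__ == "__main__":
    print(mix([1000, 30000], [2000, 10000]))
    print(scale_volume([1000, -2000], 0.5))
```

Clipping is the simplest way to handle overflow; a real mixer might instead reduce the gain of each input before adding.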

Up to now we've discussed storing audio as digital samples. Other techniques are also commonly used. FM synthesis is an older technique that produces sound using hardware that manipulates different waveforms, such as sine and triangle waves. The hardware to do this is quite simple and was popular with the first generation of computer sound cards for generating music. Many sound cards still support FM synthesis for backward compatibility. Some newer cards use a technique called wavetable synthesis that improves on FM synthesis by generating the sounds using digital samples stored in the sound card itself.

MIDI stands for Musical Instrument Digital Interface. It is a standard protocol for allowing electronic musical instruments to communicate. Typical MIDI devices are music keyboards, synthesizers, and drum machines. MIDI works with events representing such things as a key on a music keyboard being pressed, rather than storing actual sound samples. MIDI events can be stored in a MIDI file, providing a way to represent a song in a very compact format. MIDI is most popular with professional musicians, although many consumer sound cards support the MIDI bus interface.

Earlier we mentioned CD audio, which uses a 16-bit sample size and a rate of 44,100 samples per second, with two channels (stereo). One hour of CD audio represents more than 600 MB of data. In order to make the storage of sound more manageable, various schemes for compressing audio have been devised. One approach is to simply compress the data using the same compression algorithms used for computer data. However, by taking into account the characteristics of human hearing, it is possible to compress audio more efficiently by removing components of the sound that are not audible. This is called lossy compression because information is lost during the compression process, but when properly implemented, data size is reduced greatly, with little noticeable loss in audio quality. This is the approach that is used with MPEG-1 Layer 3 audio (MP3), which can achieve compression ratios of 10:1 over the original digital audio. Another lossy compression algorithm that achieves similar results is Ogg Vorbis, which is popular with many Linux users because it avoids patent issues with MP3 encoding. Other compression algorithms are optimized for human speech, such as the GSM encoding used by some digital telephone systems. The algorithms used for encoding and decoding audio are sometimes referred to as codecs.
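The storage figure for CD audio follows directly from the sample size, sample rate, and channel count. A short Python sketch of the arithmetic, with a function name chosen for this example and the 10:1 ratio taken from the text as a rough estimate:

```python
def pcm_bytes(sample_rate, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed audio with the given parameters."""
    return sample_rate * (bits_per_sample // 8) * channels * seconds


if __name__ == "__main__":
    one_hour = pcm_bytes(44100, 16, 2, 3600)
    print("One hour of CD audio:", one_hour, "bytes")
    print("At roughly 10:1 compression:", one_hour // 10, "bytes")
```

One hour works out to 635,040,000 bytes, which is where the "more than 600 MB" figure comes from.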

For applications in which sound is to be sent live via the Internet, sometimes broadcast to multiple users, sound files are not suitable. Streaming media is the term used to refer to systems that send audio, or other media, and play it back in real time.

Now that we've discussed digital audio concepts, let's look at the hardware used for audio. Sound cards follow a history similar to other peripheral cards for PCs. The first-generation cards used the ISA bus, and most aimed to be compatible with the SoundBlaster series from Creative Labs. With the introduction of the ISA Plug and Play (PnP) standard, many sound cards adopted this format, which simplified configuration by eliminating the need for hardware jumpers. Modern sound cards now typically use the PCI bus, either as separate peripheral cards or as on-board sound hardware that resides on the motherboard but is accessed through the PCI bus. Some USB sound devices are now available, the most popular being loudspeakers that can be controlled through the USB bus.

Some sound cards now support higher-end features such as surround sound using as many as six sound channels, and digital inputs and outputs that can connect to home theater systems. This is beyond the scope of this book, so we will not discuss such sound cards here. Much useful information on 3D sound can be found at http://www.3dsoundsurge.com. Information on the OpenAL 3D audio library can be found at http://www.openal.org/home.

9.5.2. Audio Under Linux

Now that we've covered the concepts and terminology of digital audio in general, it is time to look at some of the specifics of sound on Linux.

The lowest-level software component that talks directly to the sound hardware is the kernel. Early in the development of Linux (i.e., before the 1.0 kernel release), Hannu Savolainen implemented kernel-level sound drivers for a number of popular sound cards. Other developers also contributed to this code, adding new features and support for more cards. These drivers, part of the standard kernel release, are sometimes called OSS/Free, the free version of the Open Sound System.

Hannu later joined 4Front Technologies, a company that sells commercial sound drivers for Linux as well as a number of other Unix-compatible operating systems. These enhanced drivers are sold commercially as OSS/4Front.

In 1998 the Advanced Linux Sound Architecture, or ALSA project, was formed with the goals of writing new Linux sound drivers from scratch and addressing the lack of an active maintainer for the OSS sound drivers. With the benefit of hindsight and the requirements for newer sound card technology, the need was felt for a new design.

Some sound card manufacturers have also written Linux sound drivers for their cards, most notably the Creative Labs Sound Blaster Live! series.

The result is that there are as many as four different sets of kernel sound drivers from which to choose. This causes a dilemma when choosing which sound driver to use. Table 9-1 summarizes some of the advantages and disadvantages of the different drivers, in order to help you make a decision. Another consideration is that your particular Linux distribution will likely come with one driver, and it will be more effort on your part to use a different one.

Table 9-1. Sound driver comparison

Driver	Advantages	Disadvantages
OSS/Free	Free	Not all sound cards supported
	Source code available	Most sound cards not auto detected
	Part of standard kernel	Does not support some newer cards
	Supports most sound cards	No single maintainer
OSS/4Front	Supports many sound cards	Payment required
	Auto-detection of most cards	Closed source
	Commercial support available
	Compatible with OSS
ALSA	Free	Not all sound cards supported
	Source code available	Not part of standard kernel
	Supports many sound cards	Not fully compatible with OSS
	Actively developed/supported
	Clean design
Commercial	May support cards with no other drivers	May be closed source
		May not be supported

In addition to the drivers mentioned in Table 9-1, kernel patches are sometimes available that address problems with specific sound cards.

The vast majority of sound cards are supported under Linux by one driver or another. The devices that are least likely to be supported are very new cards, which may not yet have had drivers developed for them, and some high-end professional sound cards, which are rarely used by consumers. You can find a reasonably up-to-date list of supported cards in the current Linux Sound HOWTO document, but often the best solution is to do some research on the Internet and experiment with drivers that seem likely to match your hardware.

Many sound applications use the kernel sound drivers directly, but this causes a problem: the kernel sound devices can be accessed by only one application at a time. In a graphical desktop environment, a user may want to simultaneously play an MP3 file, associate window manager actions with sounds, be alerted when there is new e-mail, etc. This requires sharing the sound devices between different applications. To address this, modern Linux desktop environments include a sound server that takes exclusive control of the sound devices and accepts requests from desktop applications to play sounds, mixing them together. They may also allow sound to be redirected to another computer, just as the X Window System allows the display to be on a different computer from where the program is running. The KDE desktop environment uses the artsd sound server and GNOME provides esd. As sound servers are a somewhat recent innovation, not all sound applications are written to support them yet.

This section will not cover software development issues, but for those who want to develop multimedia applications, a number of toolkits provide sound support more easily than the low-level kernel API. ALSA includes a sound library, and there are many sound toolkits, such as SDL (intended mainly for games) and OpenAL (for 3D audio). If you are a multimedia developer you should investigate these libraries to avoid reinventing work done by others.

9.5.3. Installation and Configuration

In this section we will discuss how to install and configure a sound card under Linux.

The amount of work you have to do depends on your Linux distribution. As Linux matures, some distributions are now providing automatic detection and configuration of sound cards. The days of manually setting card jumpers and resolving resource conflicts are becoming a thing of the past as sound cards become standardized on the PCI bus. If you are fortunate enough that your sound card is detected and working on your Linux distribution, the material in this section won't be particularly relevant because it has all been done for you automatically.

Some Linux distributions also provide a sound configuration utility, such as sndconfig, which will attempt to detect and configure your sound card, usually with some user intervention. You should consult the documentation for your system, run the supplied sound configuration tool, if any, and see if it works.

If you have an older ISA or ISA PnP card, or if your card is not properly detected, you will need to follow the manual procedure we outline here. These instructions also assume you are using the OSS/Free sound drivers. If you are using ALSA, the process is similar, but if you are using commercial drivers (OSS/4Front or a vendor-supplied driver), you should consult the documentation that comes with the drivers, as the process may be considerably different.

The information here also assumes you are using Linux on an x86 architecture system. There is support for sound on other CPU architectures, but not all drivers are supported and there will likely be some differences in device names, etc.

9.5.3.1. Collecting hardware information

Presumably you already have a sound card installed on your system. If not, you should go ahead and install one. If you have verified that the card works with another operating system on your computer, that will assure you that any problem you encounter on Linux is caused by software at some level.

You should identify what type of card you have, including manufacturer and model. Determine if it is an ISA, ISA PnP, or PCI card. If the card has jumpers you should note the settings. If you know what resources (IRQ, I/O address, DMA channels) the card is currently using, note that information as well.

If you don't have all this information, don't worry. You should be able to get by without it; you just may need to do a little detective work later. On laptops or systems with on-board sound hardware, for example, you won't have the luxury of being able to look at a physical sound card.

9.5.3.2. Configuring ISA Plug and Play (optional)

Modern PCI bus sound cards do not need any configuration. The older ISA bus sound cards were configured by setting jumpers. ISA PnP cards are configured under Linux using the ISA Plug and Play utilities. If you aren't sure if you have an ISA PnP sound card, try running the command pnpdump and examining the output for anything that looks like a sound card. Output should include lines like the following for a typical sound card:

# Card 1: (serial identifier ba 10 03 be 24 25 00 8c 0e)
# Vendor Id CTL0025, Serial Number 379791851, checksum 0xBA.
# Version 1.0, Vendor version 1.0
# ANSI string -->Creative SB16 PnP<--

The general process for configuring ISA PnP devices is:

Save any existing /etc/isapnp.conf file.
Generate a configuration file using the command pnpdump >/etc/isapnp.conf.
Edit the file, uncommenting the lines for the desired device settings.
Run the isapnp command to configure Plug and Play cards (usually on system startup).

Most modern Linux distributions take care of initializing ISA PnP cards. You may already have a suitable /etc/isapnp.conf file, or it may just require some editing.

For more details on configuring ISA PnP cards, see the manpages for isapnp, pnpdump, and isapnp.conf, and read the ISA Plug and Play HOWTO from the Linux Documentation Project.

9.5.3.3. Configuring the kernel (optional)

You may want to compile a new kernel. If the kernel sound driver modules you need are not provided by the kernel you are currently running, you will need to do this. If you prefer to compile the drivers directly into the kernel rather than use loadable kernel modules, a new kernel will be required as well.

In the most common situation where you are running a kernel that was provided during installation of your Linux system, all sound drivers should be included as loadable modules and this step should not be necessary.

See Section 7.4 in Chapter 7 for information on rebuilding your kernel.

9.5.3.4. Configuring kernel modules

In most cases the kernel sound drivers are loadable modules, which the kernel can dynamically load and unload. You need to ensure that the correct drivers are loaded. You do this using a configuration file, such as /etc/conf.modules. A typical entry for a sound card might look like this:

alias sound sb
alias midi opl3
options opl3 io=0x388
options sb io=0x220 irq=5 dma=1 dma16=5 mpu_io=0x330

You need to enter the sound driver to use and the appropriate values for I/O address, IRQ, and DMA channels that you recorded earlier. The latter settings are needed only for ISA and ISA PnP cards because PCI cards can detect them automatically. In the preceding example, which is for a 16-bit SoundBlaster card, we had to specify the driver as sb in the first line, and specify the options for the driver in the last line.
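When checking a module configuration file by hand becomes tedious, a few lines of Python can pull out the alias and options entries for review. This is a sketch only: the function name is hypothetical, it understands just the two directives shown in the example above, and it treats anything after a # character as a comment.

```python
def parse_module_config(text):
    """Return (aliases, options) found in module configuration text.

    aliases maps an alias name to a module name; options maps a module
    name to a dictionary of its option settings, kept as strings.
    """
    aliases, options = {}, {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "alias":
            aliases[fields[1]] = fields[2]
        elif len(fields) >= 2 and fields[0] == "options":
            settings = options.setdefault(fields[1], {})
            for item in fields[2:]:
                if "=" in item:
                    key, value = item.split("=", 1)
                    settings[key] = value
    return aliases, options


if __name__ == "__main__":
    example = """alias sound sb
alias midi opl3
options opl3 io=0x388
options sb io=0x220 irq=5 dma=1 dma16=5 mpu_io=0x330
"""
    aliases, options = parse_module_config(example)
    print("sound driver:", aliases.get("sound"))
    print("sb settings:", options.get("sb"))
```

Comparing the printed settings against the IRQ, I/O address, and DMA values you recorded earlier is a quick way to catch a mistyped entry.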

Some systems use /etc/modules.conf and/or multiple files under the /etc/modutils directory, so you should consult the documentation for your Linux distribution for the details on configuring modules. On Debian systems, you can use the modconf utility for this task.

In practice, usually the only tricky part is determining which driver to use. The output of pnpdump for ISA PnP cards and lspci for PCI cards can help you identify the type of card you have. You can then reference this to documentation available either in the Sound HOWTO or in the kernel source, usually found on Linux systems in the /usr/src/linux/Documentation/sound directory.

For example, a certain laptop system reports this sound hardware in the output of lspci:

00:05.0 Multimedia audio controller: Cirrus Logic CS 4614/22/24 [CrystalClear 
SoundFusion Audio Accelerator] (rev 01)

For this system the appropriate sound driver is "cs46xx". Some experimentation may be required, and it is safe to try loading various kernel modules and see if they detect the sound card.
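Picking the sound hardware out of a long lspci listing can be automated. The Python sketch below takes the text printed by lspci and returns the entries whose device class mentions audio. The function name is ours, and the assumption that each device occupies one line in the form "slot class: description" is based on the sample above with its two printed lines rejoined.

```python
def find_audio_devices(lspci_output):
    """Return (slot, description) pairs for audio devices in lspci output."""
    devices = []
    for line in lspci_output.splitlines():
        slot, _, rest = line.partition(" ")
        device_class, separator, description = rest.partition(": ")
        # Only lines of the form "slot class: description" are considered.
        if separator and "audio" in device_class.lower():
            devices.append((slot, description.strip()))
    return devices


if __name__ == "__main__":
    sample = (
        "00:05.0 Multimedia audio controller: Cirrus Logic CS 4614/22/24 "
        "[CrystalClear SoundFusion Audio Accelerator] (rev 01)\n"
    )
    for slot, description in find_audio_devices(sample):
        print(slot, "->", description)
```

On a live system the input could come from running lspci through Python's subprocess module. Mapping the description to a driver name such as cs46xx still requires the documentation mentioned above.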

9.5.3.5. Testing the installation

The first step to verify the installation is to confirm that the kernel module is loaded. You can use the command lsmod; it should show that the appropriate module, among others, is loaded:

% /sbin/lsmod
Module                  Size  Used by
parport_pc             21256   1 (autoclean)
lp                      6080   0 (autoclean)
parport                24512   1 (autoclean) [parport_pc lp]
3c574_cs                8324   1
serial                 43520   0 (autoclean)
cs46xx                 54472   4
soundcore               3492   3 [cs46xx]
ac97_codec              9568   0 [cs46xx]
rtc                     5528   0 (autoclean)

Here the drivers of interest are cs46xx, soundcore, and ac97_codec. When the driver detected the card the kernel should have also logged a message that you can retrieve with the dmesg command. The output is likely to be long, so you can pipe it to a pager command, such as less:

PCI: Found IRQ 11 for device 00:05.0
PCI: Sharing IRQ 11 with 00:02.0
PCI: Sharing IRQ 11 with 01:00.0
Crystal 4280/46xx + AC97 Audio, version 1.28.32, 19:55:54 Dec 29 2001
cs46xx: Card found at 0xf4100000 and 0xf4000000, IRQ 11
cs46xx: Thinkpad 600X/A20/T20 (1014:0153) at 0xf4100000/0xf4000000, IRQ 11
ac97_codec: AC97 Audio codec, id: 0x4352:0x5914 (Cirrus Logic CS4297A rev B)
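The check for loaded modules can also be scripted. The following Python sketch takes the text printed by lsmod and reports which of the expected modules are absent. The function names are hypothetical, and the list of required modules is whatever applies to your card (cs46xx, soundcore, and ac97_codec in the example above).

```python
def loaded_modules(lsmod_output):
    """Return the set of module names listed in lsmod output."""
    names = set()
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if fields and fields[0] != "Module":
            names.add(fields[0])
    return names


def missing_modules(lsmod_output, required):
    """Return the required module names that are not loaded."""
    loaded = loaded_modules(lsmod_output)
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    sample = """Module                  Size  Used by
cs46xx                 54472   4
soundcore               3492   3 [cs46xx]
ac97_codec              9568   0 [cs46xx]
"""
    print(missing_modules(sample, ["cs46xx", "soundcore", "ac97_codec"]))
```

An empty list means every expected module is present; any names printed are the ones to investigate in the module configuration.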

For ISA cards, the device file /dev/sndstat shows information about the card. This won't work for PCI cards, however. Typical output should look something like this:

% cat /dev/sndstat
OSS/Free:3.8s2++-971130
Load type: Driver loaded as a module
Kernel: Linux curly 2.2.16 #4 Sat Aug 26 19:04:06 PDT 2000 i686
Config options: 0

Installed drivers:

Card config:

Audio devices:
0: Sound Blaster 16 (4.13) (DUPLEX)

Synth devices:
0: Yamaha OPL3

MIDI devices:
0: Sound Blaster 16

Timers:
0: System clock

Mixers:
0: Sound Blaster

If these look right you can now test your sound card. A simple check to do first is to run a mixer program and verify that the mixer device is detected and that you can change the levels without seeing any errors. You'll have to see what mixer programs are available on your system. Some common ones are aumix, xmix, and kmix. Set all the levels to something reasonable.

Now try using a sound file player to play a sound file (e.g., a WAV file) and verify that you can hear it play. If you are running a desktop environment, such as KDE or GNOME, you should have a suitable media player; otherwise look for a command-line tool such as play.
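If you have no sound file at hand for this test, you can generate one. The Python sketch below uses the standard wave module to write a short sine tone as a mono, 16-bit WAV file. The function name, the default 440 Hertz pitch, and the amplitude are arbitrary choices made for this example.

```python
import math
import struct
import wave


def write_test_tone(path, frequency=440.0, seconds=1.0, sample_rate=44100):
    """Write a mono, 16-bit WAV file containing a sine tone.

    Returns the number of samples written.
    """
    frame_count = int(seconds * sample_rate)
    amplitude = 16000  # comfortably below the 16-bit maximum of 32767
    frames = bytearray()
    for n in range(frame_count):
        angle = 2 * math.pi * frequency * n / sample_rate
        frames += struct.pack("<h", int(amplitude * math.sin(angle)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # two bytes per sample, i.e. 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return frame_count


if __name__ == "__main__":
    print(write_test_tone("tone.wav"), "samples written to tone.wav")
```

The resulting tone.wav can then be handed to whichever player you are testing. A steady tone also makes it easy to tell a silent mixer from a distorted or stuttering driver.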

If playback works you can then check recording. Connect a microphone to the sound card's mic input and run a recording program, such as rec or vrec. See whether you can record input to a WAV file and play it back. Check the mixer settings to ensure that you have selected the right input device and set the appropriate gain levels.

You can also test whether MIDI files play correctly. Some MIDI player programs require sound cards with an FM synthesizer, others do not. Some common MIDI players are playmidi, kmid, and kmidi. Testing of devices on the MIDI bus is beyond the scope of this book.

A good site for general information on MIDI and MIDI devices is http://midistudio.com. The official MIDI specifications are available from the MIDI Manufacturers Association. Their web site can be found at http://www.midi.org.

9.5.3.6. Troubleshooting and common problems

This section lists some common problems and possible solutions.

Kernel modules not loaded

This could be caused by incorrect module configuration files. It will also occur if the kernel module loader (kerneld or kmod) is not running. Make sure the module is available for loading in the appropriate directory (typically something like /lib/modules/2.4.17/kernel/drivers/sound).

Sound card not detected

You are probably using the wrong kernel driver or the wrong settings for I/O address, IRQ, or DMA channel.

IRQ/DMA timeout or device conflicts

You are using the wrong settings for I/O address, IRQ, and DMA, or you have a conflict with another card that is using the same settings.

No sound after rebooting

If sound was working and then stopped when the system was rebooted, you probably have a problem with the module configuration files. This can also occur if the system init scripts are not configured to initialize PnP cards or to load the modules.

If the drivers are loaded, it could be that the mixer settings are set too low to hear any audio.

Sound works only for root

This probably indicates a permissions problem with the device files. Many systems allow only users who are members of the group "audio" to access the sound devices. Add the user(s) to this group or change the permissions on the audio devices using the chmod command.

No sound is heard but there are no error messages

If sound programs appear to be playing but nothing is heard, it is probably a problem with the mixer settings, or a problem with the connection of the speakers.

Unable to record audio

This could indicate a problem with the mixer settings. You need to set the levels and select the input device. You might also have a bad microphone or are using the wrong input jack on the sound card.

Device busy error

You likely have either a device conflict or another application that is using the sound devices. The latter could be because you are running a sound server program, such as esd or artsd.

No sound when playing audio CD

To play audio CDs you need a cable from the CD-ROM drive to your sound card. Make sure you have selected CD input using a mixer program. Try connecting headphones to the front panel jack of the CD-ROM drive. If you can hear audio, the problem is not with the drive itself. If you can't hear audio from the headphones, the problem is with the drive or CD player program.

Cannot play MIDI files

Some MIDI applications work only with a sound card that has an FM synthesizer, and not all cards have this hardware (or the kernel driver for the sound card may not support it). Other MIDI applications use the standard audio device.
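For the "Sound works only for root" case above, it helps to see at a glance who owns a device file and whether the current user can open it. The Python sketch below reports this for any path. The function name is ours, and the device names in the usage example (/dev/dsp and /dev/mixer) are assumptions based on common OSS naming, since device names vary between drivers and distributions.

```python
import grp
import os
import stat


def describe_device(path):
    """Summarize the permissions of a file as seen by the current user."""
    info = os.stat(path)
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)  # the group ID has no name on this system
    return {
        "mode": stat.filemode(info.st_mode),
        "group": group,
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Assumed device names; substitute the ones your driver provides.
    for device in ("/dev/dsp", "/dev/mixer"):
        if os.path.exists(device):
            print(device, describe_device(device))
        else:
            print(device, "does not exist")
```

If the group shown is "audio" and the device is not writable when you run this as an ordinary user, adding that user to the group is usually the less intrusive fix.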

9.5.4. Linux Multimedia Applications

Once you have your sound card up and running under Linux you'll want to run some audio applications. So many are available for Linux that they can't possibly be listed here, so we will just describe some of the general categories of programs that are available. You can look for applications using the references listed here. We'll also go into a bit more detail about one of today's most popular audio applications, playing MP3 files.

Mixer programs, for setting record and playback gain levels
Media players, for file formats such as WAV, MP3, and MIDI
CD players, for playing audio CDs
Recording tools, for generating sound files
Effects and signal processing tools, for manipulating sound
Speech tools, supporting speech recognition and synthesis
Games, which use audio to add realism
Desktop environments, such as KDE and GNOME, which support multimedia

9.5.5. MP3 Players

MP3 (MPEG-1 Layer 3) is one of the most popular file formats for digital audio, and there are a number of MP3 player applications for Linux. If you are running a desktop environment, such as KDE or GNOME, you likely already have an MP3 player program. If so, it is recommended that you use this player since it should work correctly with the sound server used by these desktop environments.

These are some of the features you should look for when selecting an MP3 player application:

Support for different sound drivers (e.g., OSS and ALSA) or sound servers (KDE and GNOME).
An attractive user interface. Many MP3 players are "skinnable," meaning that you can download and install alternative user interfaces.
Support for playlists, allowing you to define and save sequences of your favorite audio tracks.
Various audio effects, such as a graphical equalizer, stereo expansion, reverb, voice removal, and visual effects for representing the audio in graphical form.
Support for other file formats, such as audio CD, WAV, and video formats.

If you want to create your own MP3 files, you will need an encoder program. There are also programs that allow you to extract tracks from audio CDs.

While you can perform MP3 encoding with open source tools, certain patent claims have called the legality of doing so into question. Ogg Vorbis is an alternative file format and encoder that claims to be free of patent issues. To use it, your player program needs to support Ogg Vorbis files because they are not directly compatible with MP3. However, many MP3 players, such as Xmms, already support Ogg Vorbis; in other cases, there are direct equivalents (such as ogg123 for mpg123).

Installation of an MP3 player typically requires that you install the appropriate package (in RPM or deb format, depending on your Linux distribution). You may also choose to build it from source code. An MP3 player should install MIME types to associate it with MP3 and other supported file types so that you can launch it from applications such as file managers, web browsers, and email clients.

9.5.6. References

Listed here are a few sources of information related to sound under Linux:

The Linux Sound HOWTO, available from the Linux Documentation Project at http://www.tldp.org
The ALSA Project web site at http://www.alsa-project.org
The 4Front Technologies web site at http://www.opensound.com
The Sound and MIDI Software for Linux web site at http://sound.condorow.net
The book Linux Multimedia Guide, published by O'Reilly
The book Linux Music and Sound, published by No Starch Press

A number of mailing lists are related to sound and Linux. See the Sound HOWTO for details on how to join the lists.