[Desktop_architects] Making Sound On Linux Just Work

Paul Davis paul at linuxaudiosystems.com
Tue Dec 13 12:19:24 PST 2005


Given the ongoing firefighting over such trivialities as printing and
usablenessfulility, it seems time for a Now For Something Completely
Different moment. After an insane week at work, I finally managed to
collate and organize a first pass at an initial analysis of what audio
needs. Now it's your turn to shoot it down, tear it up or pat me on the
back. Anyone calling me a FUCKING IDIOT will be met with indifference
and ambiguous affectations. 

Round Two of this will involve suggestions for how to satisfy the final
state of this analysis.

--p

			Making Sound Just Work
		       ------------------------

One of the "second tier" of requirements mentioned several times at
the OSDL Portland Linux Desktop Architects workshop was "making audio
on Linux just work". Many people find it easy to leave this
requirement lying around in various lists of goals and requirements,
but before we can make any progress on defining a plan to implement
the goal, we first need to define it rather more precisely.

This list is intended to avoid any implementation details, and is
focused entirely on a task oriented analysis of the issues. Your input
is sought to complete, improve and clarify this analysis.

DEFINING THE GOAL
=================

The list below is a set of tasks that a user could reasonably expect
to perform on a computer running Linux that has access to zero, one
or more audio interfaces, as well as zero, one or more network
interfaces. 

The desired task should either work, or produce a sensible and
comprehensible error message explaining why it failed. For example,
attempting to control input gain on a device that has no hardware
mixer should explain that the device has no controls for input gain.
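
To make the "work, or explain why not" rule concrete, here is a minimal
sketch in Python. It is an illustration only, not a proposed
implementation: AudioDevice, AudioTaskError and set_input_gain are
hypothetical names invented for this example and do not correspond to
any existing library.

```python
from dataclasses import dataclass, field


@dataclass
class AudioDevice:
    """Hypothetical description of an audio interface (assumption)."""
    name: str
    # Names of the h/w mixer controls the device exposes, if any.
    controls: set = field(default_factory=set)


class AudioTaskError(Exception):
    """Carries a message intended to be shown directly to the user."""


def set_input_gain(device, gain_db):
    """Either perform the task or fail with a comprehensible reason."""
    if "input_gain" not in device.controls:
        raise AudioTaskError(
            f"Cannot set input gain on '{device.name}': "
            "this device has no controls for input gain."
        )
    # A real system would talk to the h/w mixer here; the sketch only
    # reports what it would have done.
    return f"Input gain on '{device.name}' set to {gain_db:+.1f} dB"
```

The point of the sketch is only that the failure path names the device
and the missing capability, rather than returning a bare error code.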

 CONFIGURATION (see also MIXING below)

          - identify what audio h/w exists on the system
	  - identify what network audio destinations are available
	  - choose some given audio h/w or network endpoint
	       as the default for input
          - ditto for output
	  - enable/disable given audio h/w
	  - easily (auto)load any kernel modules required for
	       given functionality

 PLAYBACK
         
          - play a compressed audio file 
	        * user driven (e.g. play(1))
		* app driven (e.g. {kde,gnome_play}_audiofile())
	  - play a PCM encoded audio file (specifics as above)
	  - hear system sounds
	  - VOIP
	  - game audio
	  - music composition
	  - music editing
	  - video post production

 RECORDING
      
          - record from hardware inputs
	      * use default audio interface
	      * use other audio interface
	      * specify which h/w input to use
	      * control input gain
	  - record from other application(s)
	  - record from live (network-delivered) audio
          	  streams
              * PCM/lossless compression (WAV, FLAC etc)
	      * lossy compression (mp3, ogg etc)


 MIXING

	  - control h/w mixer device (if any)

	       * allow use of a generic app for this
	       * NOTE to non-audio-focused readers: the h/w mixer
	         is part of the audio interface that is used
		 to control signal levels, input selection
		 for recording, and other h/w specific features.
		 Some pro-audio interfaces do not have a h/w mixer,
		 most consumer ones do. It has almost nothing
		 to do with "hardware mixing" which describes
		 the ability of the h/w to mix together multiple
		 software-delivered audio data streams.

          - multiple applications using soundcard simultaneously
	  - control application volumes independently
	  - provide necessary apps for controlling specialized
	       hardware (e.g. RME HDSP, ice1712, ice1724, liveFX)

 ROUTING

          - route audio to specific h/w among several installed devices
	  - route audio between applications
	  - route audio across network
          - route audio without using h/w (regardless of whether or
             	  not h/w is available; e.g. streaming media)
	  
 MULTIUSER

          - which of the above should work in a multi-user scenario?

 FORMATS

	  - basically, the task list is covered by the above list,
	    but there are some added criteria:

	  - audio data formats divide into:
	         - direct sample data (e.g. RIFF/WAV, AIFF)
		 - losslessly compressed (e.g. FLAC)
		 - lossy compression (e.g. Vorbis, MP3)
          - apps that can handle a given division should all
	        handle the same set of formats, with equal prowess.
		i.e. apps don't have to handle lossy compression
		formats, but if they do, they should all handle
		the same set of lossy compression formats. Principle:
		minimize user surprise.
          - user should see no or limited obstacles to handling
	        proprietary formats
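
The "same set of formats" criterion can be stated as a simple check.
The sketch below is Python and purely illustrative: the division table
uses only the example formats listed above, and the function name and
the app names in the usage note are made up for the example.

```python
# The three divisions from the list above, with its example formats.
FORMAT_DIVISIONS = {
    "direct": {"wav", "aiff"},
    "lossless": {"flac"},
    "lossy": {"vorbis", "mp3"},
}


def inconsistent_divisions(apps):
    """Find divisions where apps disagree on the formats they handle.

    `apps` maps an app name to the set of formats it handles. An app
    that handles nothing in a division is ignored for that division;
    apps that handle something in it must all handle the same subset.
    """
    problems = {}
    for division, formats in FORMAT_DIVISIONS.items():
        # Per-app subset of this division's formats (non-empty only).
        handled = {
            name: supported & formats
            for name, supported in apps.items()
            if supported & formats
        }
        # More than one distinct subset means users will be surprised.
        if len({frozenset(s) for s in handled.values()}) > 1:
            problems[division] = handled
    return problems
```

For example, given one hypothetical player that handles WAV, AIFF, MP3
and Vorbis and another that handles WAV, AIFF and MP3, the function
reports the "lossy" division only, since the two disagree on Vorbis.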

 MISC
      
          - use multiple soundcards as a single logical device
	  - use multiple sub-devices as a single logical device
	        (sub-devices are independent chipsets on
		 a single audio interface; many soundcards
		 have analog i/o and digital i/o available
		 as two different sub-devices)
	            




