What is “audio scene description”? How does it work in MPEG-4?
Just as video scenes are built from visual objects, audio scenes may be usefully described as the spatiotemporal combination of audio objects. An “audio object” is a single audio stream coded using one of the MPEG-4 coding tools, such as CELP or Structured Audio. Audio objects are combined by mixing, effects processing, switching, and delaying them, and may be spatialized to a particular 3-D location. The effects processing is described abstractly in a signal-processing language (the same language used for Structured Audio), so content providers may design their own effects empirically and include them in the bitstream.
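
To make the idea of composing audio objects concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names (delay, mix, switch, apply_effect, echo) are hypothetical and are not MPEG-4 scene-description nodes or any decoder's API, the "audio objects" are short lists of samples standing in for already-decoded streams, and 3-D spatialization is left out.

```python
# Illustrative sketch only: these helpers are hypothetical stand-ins and are
# not MPEG-4 node definitions or any real library's API.

def delay(signal, n):
    """Delay a signal by n samples by prepending silence."""
    return [0.0] * n + list(signal)

def mix(signals, gains):
    """Sum several signals with per-signal gains, zero-padding shorter ones."""
    length = max(len(s) for s in signals)
    out = [0.0] * length
    for s, g in zip(signals, gains):
        for i, x in enumerate(s):
            out[i] += g * x
    return out

def switch(signals, index):
    """Select exactly one signal from a set of alternatives."""
    return list(signals[index])

def apply_effect(signal, effect):
    """Run a content-provider-supplied effect (any callable) over a signal."""
    return effect(signal)

def echo(signal, n=2, feedback=0.5):
    """A toy effect: add one attenuated copy of the signal, delayed by n samples."""
    return mix([signal, delay(signal, n)], [1.0, feedback])

# Two decoded "audio objects" (assumed stand-ins for, say, a CELP speech
# stream and a Structured Audio music stream).
speech = [1.0, 0.0, 0.0]
music = [0.5, 0.5, 0.5, 0.5]

# The scene: speech with an echo effect, mixed with delayed, attenuated music.
scene = mix([apply_effect(speech, echo), delay(music, 1)], [1.0, 0.5])
print(scene)
```

The point of the sketch is that the scene is a small graph of operations over independently coded streams. In MPEG-4 the effect would be written in the Structured Audio language and carried in the bitstream rather than being a Python function.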