Author Topic: What should a proper audio system look like?

Crimson Wizard

What should a proper audio system look like?
« on: 19 May 2018, 15:17 »
First of all, I'd like to ask that you not assume that I, or anyone else, will jump into doing this right away (this is to prevent premature expectations).

It's well known to anyone who has ever worked with the current AGS audio system that its design is somewhat lacking (to say the least), causes lots of confusion, and feels clunky even if you know how it works.

There is an issue with the unreliable audio clip index, which makes it impossible to reference particular clips using a formula or saved data (without creating additional arrays to store clip pointers in the wanted order). I am aware of that, but I personally think it's a separate issue, more related to how resources are organized in the game (you would have the same problem with any other game item, such as characters or views).

What I'd like to discuss is what a good audio API should be.

In the past there was a suggestion to introduce a new type, something like "AudioPlayback", which would be returned from AudioClip.Play instead of the channel. Supposedly, that would reduce the problems people have when storing an AudioChannel pointer for a long time, since the channel may later have another clip played on it. Is it a good idea?
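Purely as an illustration of the idea, here is a minimal sketch of how such a type might be used from script; "AudioPlayback", its members and the clip name are assumed names, not an existing API:

Code: Adventure Game Studio
  // Hypothetical sketch only: "AudioPlayback" and its members are assumed names.
  // Play() would return a handle tied to this particular playback,
  // not to whichever channel the engine happens to reuse later.
  AudioPlayback* music = aMainTheme.Play();
  // ...much later: even if the underlying channel was given to another clip,
  // this handle would still refer only to the original playback.
  if (music != null && music.IsPlaying)
  {
      music.Volume = 40; // affects this playback only
  }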

Other questions come to mind: if there are properties that control sound, such as speed, volume, and so on, should they be properties of the AudioPlayback or of a channel? Should the user be able to set the volume of a channel, perhaps?
Should channels be created by the user in the project tree? If so, should there be one general list of channels, or separate channels per audio type?

Even if you cannot, or do not want to, consider how this should look in script, then from the user's perspective: what would you like to be able to do, and what items or parameters are essential for a game developer to control?


PS. I know that tzachs is designing an audio system in MonoAGS too, although he took the abstraction further compared to AGS. For example, in his system there is no such thing as an "audio type"; instead there are "tags", which are used to set up rules for playing a clip, such as how to choose a channel, and so on. While that approach is more flexible, I don't know if it will often be used to its full degree. AGS could have simpler built-in behavior, but provide a sufficient API to let users script their own playback rules, for example letting the user explicitly assign a clip to a particular channel.

tzachs

Re: What should a proper audio system look like?
« Reply #1 on: 19 May 2018, 16:27 »
Quote from: Crimson Wizard
In the past there was a suggestion to introduce a new type, something like "AudioPlayback", which would be returned from AudioClip.Play instead of the channel. Supposedly, that would reduce the problems people have when storing an AudioChannel pointer for a long time, since the channel may later have another clip played on it. Is it a good idea?

This is what MonoAGS is doing (only the result is called "Sound" and not "AudioPlayback"), and yes, I think it's a good idea. MonoAGS has no concept of a "channel" at all, which I think makes things simpler and less confusing.
The only use cases I see for channels are:
1. Having a "radio" in your game: you can still simulate channels in script if you need to.
2. Exhausting all the hardware channels: less of a problem on newer devices, and you can still resolve it by allowing priorities to be set on sounds, without needing to introduce channels (see the example after this list).
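For case 2, as it happens, current AGS already exposes something along these lines through the optional priority argument of AudioClip.Play (the clip names below are made up):

Code: Adventure Game Studio
  // Capping simultaneous playback via priorities rather than channels:
  // when no free channel is available, lower-priority clips get cut first.
  aFootstep.Play(eAudioPriorityLow);    // may be dropped if too much is playing
  aExplosion.Play(eAudioPriorityHigh);  // pushes out lower-priority sounds first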

Crimson Wizard

Re: What should a proper audio system look like?
« Reply #2 on: 19 May 2018, 16:53 »
Quote from: tzachs
The only use cases I see for channels are:
1. Having a "radio" in your game: you can still simulate channels in script if you need to.
2. Exhausting all the hardware channels: less of a problem on newer devices, and you can still resolve it by allowing priorities to be set on sounds, without needing to introduce channels.

It may be that the channels in the game engine are not "real" channels, but rather abstract "playback slots", which may be assigned different attributes or purposes.

For example: a way to restrict the number of simultaneous playbacks (which may be necessary not only for compatibility with the device, but also for gameplay or aesthetic reasons), a way to apply sound config presets, more explicit crossfading/mixing, and so on.

The actual question is whether it is convenient to have a channel API at the engine level, or to leave it to script modules. In the latter case, the audio API should provide sufficient capabilities for creating your own channel/mixer system.


For example, right now in AGS it is easy to restrict the number of simultaneous music clips to 1 simply by entering that in the audio type settings (iirc that's the default), and every new music track will then automatically replace the old one.
If there were no built-in channel settings, all users would have to script this on their own: store the playback pointer in a global variable and stop it before starting the next track. I do not think that would be very good for beginners.
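To make that concrete, roughly this would have to appear in every game's script; this is only a sketch, and it assumes the hypothetical "AudioPlayback" handle discussed above:

Code: Adventure Game Studio
  // Manually enforcing "only one music track at a time".
  // ("AudioPlayback" is the assumed handle type from this discussion.)
  AudioPlayback* currentMusic;
  function PlayMusicTrack(AudioClip* track)
  {
      if (currentMusic != null)
      {
          currentMusic.Stop(); // replace whatever music was playing before
      }
      currentMusic = track.Play();
  }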

Of course, it does not have to be "channels"; it could be something else with the same effect, so long as it allows doing what is necessary.

Re: What should a proper audio system look like?
« Reply #3 on: 19 May 2018, 17:27 »
Quote from: Crimson Wizard
The actual question is whether it is convenient to have a channel API at the engine level, or to leave it to script modules. In the latter case, the audio API should provide sufficient capabilities for creating your own channel/mixer system.

That's really the question, I think: advanced features in the engine, and a module on top to make things easier for beginners.

In my game I made two main modules, an SFXplayer and a MusicPlayer. Unfortunately, I have to deal with each clip's type through the explorer, since the types are read-only. Also, the channels have to be defined through the Editor instead of script, adding manual steps before the module can be used.

But I think that's pretty much it; most of the confusion comes from the lack of such a module.

About more advanced interfaces, like mixers: I love this idea. I just want to point out that the WebAudio API has never been stable and breaks a lot, so I feel that making audio stuff is just VERY hard.

Snarky

Re: What should a proper audio system look like?
« Reply #4 on: 19 May 2018, 18:02 »
I actually think there isn't much that is seriously wrong with the AGS audio system: the main problem is that it's very poorly documented and explained, with some of the information in the manual actively misleading.

There are a lot of little fixes that could be made, but I think the greatest improvement would be something like:

Quote from: Crimson Wizard
In the past there was a suggestion to introduce a new type, something like "AudioPlayback", which would be returned from AudioClip.Play instead of the channel. Supposedly, that would reduce the problems people have when storing an AudioChannel pointer for a long time, since the channel may later have another clip played on it. Is it a good idea?

I see AudioPlayback (or whatever we want to call it) as essentially a stateful wrapper for AudioChannel that keeps track of whether the clip is still playing, perhaps has a loop counter (for tracks playing on a loop), and holds the Position property currently found on AudioChannel. If the clip has ended, all function calls and property changes are ignored.

I would also store the original volume, since I seem to remember there being a problem with dynamic audio reduction because this isn't stored. Oh, and maybe the priority, allowing you to change it while it's playing (for example, if you're crossfading two tracks manually, you might want to drop the priority of the one that is fading out, to ensure that if anything is interrupted, it's the track that is ending, not the one that is starting).
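To put those properties in one place, here is a sketch of what such a wrapper's script interface might look like, written in the style of AGS's built-in headers; every member name here is an assumption, not an existing type:

Code: Adventure Game Studio
  // Assumed interface only; nothing like this exists in AGS yet.
  builtin managed struct AudioPlayback
  {
    /// false once the clip has finished playing or was stopped
    readonly import attribute bool IsPlaying;
    /// how many times a repeating clip has looped so far
    readonly import attribute int LoopCount;
    /// playback position, as on AudioChannel today
    import attribute int Position;
    /// the volume originally requested for this playback
    import attribute int Volume;
    /// priority, changeable while playing (e.g. during a manual crossfade)
    import attribute AudioPriority Priority;
    import void Stop();
  };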

I'm not certain whether we then need the plain AudioChannel any more. It might come in handy for some things, or maybe we should instead have a class for AudioTypes and do all the "general" sound controls on those.

Oh, and if we do introduce something like AudioPlayback, it becomes even more urgent to provide a way to access it for frame-linked audio such as footsteps.

Other feature enhancement requests:

-Fix the problem where all audio functions return null if sound is disabled, forcing game devs to wrap every call in a null check (see the snippet after this list).
-Make it possible to override the AudioType of an AudioClip when you play it (so that you could e.g. play a Music track as a Sound)
-Provide some way to access a useful identifier for each AudioClip, e.g. the script name as a String property. Very helpful for debugging.
-Should MaxChannels really reserve the channels exclusively? There are pros and cons. (You don't want the music to stop because there are too many character footsteps at the same time, for example.) Maybe it should be a setting. Or maybe have both MaxChannels and MinChannels properties.
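To illustrate the first point, this is the kind of guard currently needed around nearly every call (the clip name is made up), since Play() returns null when the clip could not be played, e.g. when audio is disabled:

Code: Adventure Game Studio
  // Current AGS behaviour: Play() may return null (e.g. when sound is disabled),
  // so any follow-up call needs a null check to avoid a crash.
  AudioChannel* chan = aThunder.Play();
  if (chan != null)
  {
      chan.Volume = 60;
  }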

Re: What should a proper audio system look like?
« Reply #5 on: 27 May 2018, 12:31 »
I like the AudioPlayback idea. It feels more natural to get back the playing instance of the audio clip, for which you can change panning, volume, etc., whether it's actually being played on a channel or not.
Also, just because something is no longer playing on a channel, there's no reason to kill the audio playback instance; maybe the game dev needs it for lip sync or to check when it has finished playing. Audio playback should be deterministic.
It would also be nice if there were audio buses/mixers and ways to apply special effects, but that's a plus.

- Alan



Crimson Wizard

Re: What should a proper audio system look like?
« Reply #6 on: 01 Jul 2018, 17:24 »
Quote from: Snarky
-Make it possible to override the AudioType of an AudioClip when you play it (so that you could e.g. play a Music track as a Sound)

Snarky, could you give an example of why this may be necessary? Would that mean that the user wants to apply different AudioType properties to some particular sound?
In that case, maybe the AudioType concept itself is not perfect, and some kind of "AudioSettings" preset would be a better fit?


On a different topic, this recent discussion made me think about implementing a PlayVoice function. If such a function existed, it would make it easier to create custom speech. It would also be non-blocking on its own, making it possible to use with background speech. Or, perhaps, SayBackground could support voice tags.
This brings up another question: do we need extra channels for voice-over? Or an actual customizable AudioType for voice (with Volume and MaxChannels properties)?
Should "AudioVolumeDrop" activate whenever any speech plays, or only for foreground blocking speech?

Re: What should a proper audio system look like?
« Reply #7 on: 01 Jul 2018, 18:30 »
The idea of MaxChannels seems to be mainly about accommodating a fixed number of channels. To start with, I think it needs a mixer where any number of streams can be played back. If there is a user-defined limit, it makes more sense to apply it based on the context of what you are playing (a fixed AudioType isn't really flexible enough in this case), and if there is a system-defined limit it would only be there to protect against distortion.

I'd suggest tags that are assigned to the mixer channels on playback, which could be applied by default based on AudioType.

As a very rough example of what I mean:
Code: Adventure Game Studio
  // define an AudioType with some tags
  AudioType music = new AudioType("music", "repeating", "crossfade");
  aMusic.type = music;

  // play with default tags
  aMusic.Play();

  // play with extra tags
  aMusic.Play("room2", "room3", "quiet");

  // return all channels tagged as "music"
  MixerChannel[] channels = Mixer.GetChannels("music");

  // turn up any music which was initially tagged as "quiet" and re-tag it
  foreach (MixerChannel channel in Mixer.GetChannels("quiet", "music"))
  {
      channel.Volume = 100;
      channel.RemoveTag("quiet");
      channel.AddTag("loud");
  }

Potentially you could have reserved tags like "repeating" that are implemented internally, but as long as there are no arbitrary limits or assignments on the channels, I think a script module would probably be able to implement something with traditional limits (if, for some reason, someone still wants them).