What should a proper audio system look like?

Started by Crimson Wizard, Sat 19/05/2018 15:17:59


Crimson Wizard

First of all, I'd like to ask that you not assume that I, or anyone else, will jump into doing this right away (this is to prevent premature expectations).

It's well known to anyone who has ever worked with the contemporary AGS audio system that its design is somewhat lacking (to say the least), causes lots of confusion, and feels clunky even if you know how it works.

There is an issue with the unreliable audio clip index, which makes it impossible to reference particular clips when using a formula or saved data (without creating additional arrays to store clip pointers in the wanted order). I am well aware of that, but personally think it's a separate issue, more related to resource organization in the game (you will have the same problem with any other item in the game, such as characters or views).

What I'd like to discuss is, what a good audio API should be.

In the past there was a suggestion to introduce a new type, something like "AudioPlayback", which would be returned from AudioClip.Play instead of the channel. Supposedly, that would reduce the problems people have when storing an AudioChannel pointer for a long time, as the channel may later have another clip played on it. Is it a good idea?

Other questions come to mind: if there are properties that control sound, such as speed, volume, and so on, should they be properties of the AudioPlayback or of a channel? Should the user be able to set the volume of a channel, perhaps?
Should channels be created by the user in the project tree? If yes, then should there be one general list of channels, or channels per audio type?

Even if you cannot or do not want to consider how this should look in script, then from the user's perspective: what would you like to be able to do there, and what items or parameters are essential for a game developer to control?


PS. I know that tzachs is designing an audio system in MonoAGS too, although he took the abstraction further compared to AGS. For example, in his system there is no such thing as "audio type"; instead there are "tags", which are used to set up rules for playing a clip, such as how to choose a channel, and so on. While that approach is more flexible, I don't know if it will often be used to its full degree. AGS could have simpler built-in behavior, but provide sufficient API to let users script their own playback rules. For example: let the user explicitly assign a clip to a particular channel.

tzachs

Quote from: Crimson Wizard on Sat 19/05/2018 15:17:59
In the past there was a suggestion to introduce a new type, something like "AudioPlayback", which would be returned from AudioClip.Play instead of the channel. Supposedly, that would reduce the problems people have when storing an AudioChannel pointer for a long time, as the channel may later have another clip played on it. Is it a good idea?

This is what MonoAGS is doing (only the result is called "Sound" and not "AudioPlayback"), and yes, I think it's a good idea. MonoAGS has no concept of "channel" at all, which I think makes things simpler with less confusion.
The only use-cases I see for channels are:
1. Having a "radio" in your game: you can still simulate channels if you need it.
2. Exhausting all the hardware channels: less of a problem on newer devices, but you can still resolve it by allowing setting priorities for sounds without needing to introduce channels.

Crimson Wizard

#2
Quote from: tzachs on Sat 19/05/2018 16:27:20
The only use-cases I see for channels are:
1. Having a "radio" in your game: you can still simulate channels if you need it.
2. Exhausting all the hardware channels: less of a problem on newer devices, but you can still resolve it by allowing setting priorities for sounds without needing to introduce channels.

It may be that the channels in the game engine are not "real" channels but rather abstract "playback slots", which may be assigned different attributes or purposes.

For example, a way to restrict the number of simultaneous playbacks (which may be necessary not only for compatibility with the device, but also for gameplay or aesthetic reasons). A way to apply sound config presets, more explicit crossfading/mixing, and so on.

The actual question is whether it is convenient to have a channel API on the engine level, or to leave it for script modules. In the latter case, the audio API should provide sufficient capabilities to create your own channel/mixer system.


For example, right now in AGS it is easy to restrict the number of simultaneous music clips to just 1 by entering this in the audio type settings (IIRC that's the default). And every new music track will then automatically replace the old one.
If there were no built-in channel settings, all users would have to script this on their own: store the playback pointer in a global variable and stop it before starting the next track. I do not think that would be very good for beginners.
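For illustration, a minimal sketch of that hand-rolled logic with the current API (the function and variable names here are just examples):

Code: ags

// global script: track the single music playback by hand
AudioChannel *gMusicChannel;

function PlayMusicExclusive(AudioClip *clip)
{
  // stop whatever music we started earlier before playing the next track
  if (gMusicChannel != null && gMusicChannel.IsPlaying)
  {
    gMusicChannel.Stop();
  }
  gMusicChannel = clip.Play();
}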

Of course, it does not have to be "channels"; it could be something else with the same effect, so long as it allows doing what is necessary.

eri0o

Quote
The actual question is whether it is convenient to have a channel API on the engine level, or to leave it for script modules. In the latter case, the audio API should provide sufficient capabilities to create your own channel/mixer system.

That's really the question, I think: advanced features in the engine, and a module on top to make things easier for beginners.

In my game I made two main modules, an SFXplayer and a MusicPlayer. Unfortunately, I have to deal with each clip's type through the explorer, since the types are read-only. Also, the channels have to be defined through the Editor instead of script, adding manual steps before the module can be used.

But I think that's pretty much it; most of the confusion stems from the lack of a module.

About more advanced interfaces, like mixers: I love this idea. I just want to point out that the WebAudio API has never been stable and breaks a lot, so I feel that making audio stuff is just VERY hard.

Snarky

I actually think there isn't much that is seriously wrong with the AGS audio system: the main problem is that it's very poorly documented and explained, with some of the information in the manual actively misleading.

There are a lot of little fixes that could be made, but I think the greatest improvement would be something like:

Quote from: Crimson Wizard on Sat 19/05/2018 15:17:59
In the past there was a suggestion to introduce a new type, something like "AudioPlayback", which would be returned from AudioClip.Play instead of the channel. Supposedly, that would reduce the problems people have when storing an AudioChannel pointer for a long time, as the channel may later have another clip played on it. Is it a good idea?

I see AudioPlayback (or whatever we want to call it) as essentially a stateful wrapper for AudioChannel, that keeps track of whether the clip is still playing, perhaps has a loop counter (for tracks playing on loop), and the Position property currently held by AudioChannel. If the clip has ended, all function calls and property changes are ignored.

I would also store the original volume, since I seem to remember there being a problem with dynamic audio reduction because this isn't stored. Oh, and maybe the priority, allowing you to change it while it's playing (for example, if you're crossfading two tracks manually, you might want to drop the priority of the one that is fading out, to ensure that if anything is interrupted, it's the track that is ending, not the one that is starting).
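A rough sketch of how such a wrapper might be declared in script, following the description above (all names here are hypothetical - this is not an existing AGS API):

Code: ags

// hypothetical AudioPlayback interface
builtin managed struct AudioPlayback {
  readonly import attribute bool IsPlaying;   // false once the clip has ended
  readonly import attribute int LoopCount;    // how many times a looping clip has repeated
  import attribute int Position;              // changes are ignored after playback ends
  import attribute int Volume;                // original volume is remembered for audio reduction
  import attribute AudioPriority Priority;    // can be lowered mid-crossfade
  import void Stop();
};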

I'm not certain whether we would then need the plain AudioChannel any more. It might come in handy for some things, or maybe we should instead have a class for AudioTypes and do all the "general" sound controls on those.

Oh, and if we do introduce something like AudioPlayback, it becomes even more urgent to provide a way to access it for frame-linked audio such as footsteps.

Other feature enhancement requests:

-Fix the problem where all audio functions return null if sound is disabled, forcing game devs to wrap every call in a null check (see the example after this list).
-Make it possible to override the AudioType of an AudioClip when you play it (so that you could e.g. play a Music track as a Sound)
-Provide some way to access a useful identifier for each AudioClip, e.g. the script name as a String property. Very helpful for debugging.
-Should MaxChannels really reserve the channels exclusively? There are pros and cons. (You don't want the music to stop because there are too many character footsteps at the same time, for example.) Maybe it should be a setting. Or maybe have both MaxChannels and MinChannels properties.
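To illustrate the first item, this is the boilerplate the current API forces on every call (aExplosion stands for any audio clip):

Code: ags

// current behavior: Play() returns null when sound is disabled,
// so every chained call needs a guard to avoid a runtime error
AudioChannel *chan = aExplosion.Play();
if (chan != null)
{
  chan.Volume = 50;
}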

Alan v.Drake

I like the AudioPlayback idea. It feels more natural to expect the playing instance of the audio clip, for which you can change panning, volume, etc., whether it's actually being played on a channel or not.
Also, just because something is not playing on a channel, there's no reason to kill the audio playback instance; maybe the game dev needs it for lipsync or to check when it has finished playing. Audio playback should be deterministic.
It would be also nice if there were audio buses/mixers and ways to set special effects, but that's a plus.

- Alan



Crimson Wizard

Quote from: Snarky on Sat 19/05/2018 18:02:24
-Make it possible to override the AudioType of an AudioClip when you play it (so that you could e.g. play a Music track as a Sound)

Snarky, could you give an example of why this may be necessary? Would that mean that the user wants to apply the properties of a different AudioType to some particular sound?
In that case, maybe the AudioType concept itself is not perfect, and some kind of "AudioSettings" preset would be a better fit?


On a different topic, this recent discussion made me think about implementing a PlayVoice function. If such a function existed, it would make it easier to create custom speech. It would also be non-blocking on its own, making it possible to use with background speech. Or, perhaps, SayBackground could support voice tags.
This brings another question: do we need extra channels for voice-over? Or an actual customizable AudioType for voice (with volume and MaxChannels properties)?
Should "AudioVolumeDrop" activate whenever any speech plays, or only the foreground blocking kind?

morganw

The idea of MaxChannels seems to be mainly for accommodating a fixed number of channels. To start with, I think it needs a mixer, where any number of streams can be played back. If there is a user-defined limit, it makes more sense to apply it based on the context of what you are playing (a fixed AudioType isn't really flexible enough in this case), and if there is a system-defined limit, it would only be there to protect against distortion.

I'd suggest tags that are assigned to the mixer channels on playback, which could be applied by default based on AudioType.

As a very rough example of what I mean:
Code: ags

// define an AudioType with some tags
AudioType music = new AudioType("music", "repeating", "crossfade");
aMusic.type = music;

// play with default tags
aMusic.Play();

// play with extra tags
aMusic.Play("room2", "room3", "quiet");

// return all channels tagged as "music"
MixerChannel[] channels = Mixer.GetChannels("music");

// turn up any music which was initially tagged as "quiet" and re-tag it
foreach (MixerChannel channel in Mixer.GetChannels("quiet", "music"))
{
    channel.Volume = 100;
    channel.RemoveTag("quiet");
    channel.AddTag("loud");
}


Potentially you could have reserved tags like "repeating" that are implemented internally, but as long as there are no arbitrary limits or assignments on the channels, I think a script module would probably be able to implement something with traditional limits (if, for some reason, someone still wants them).

Dualnames

#8
Apologies for the bump; I've been looking at this a lot today. In my honest opinion, the only needed audio formats would be OGG, WAV and MIDI; the rest is really pushing it. I can explain from an audio perspective: MIDI is MIDI, of course; WAV is lossless (FLAC is too); and OGG is compressed. I would argue for including MP3, but that's about it, IMHO. Anyhow, personally my biggest issue is that audio channels are wonky: they replace each other, and there's no way around it unless, each time I play a sound, I capture the state of the channels and ensure they are as they were before, plus the new sound I'm playing.

I haven't checked what AGS uses internally as an audio lib, and I wonder if increasing the limit of audio channels (from 8 to more) would fix that issue. I remember Calin coded an FMOD plugin; at the time I didn't check whether it also had the 8-channel issue, but a lot of users in that topic said that it fixed the audio stutters. Anyhow, I don't think FMOD is the way to go because of its licensing.
Worked on Strangeland, Primordia, Hob's Barrow, The Cat Lady, Mage's Initiation, Until I Have You, Downfall, Hunie Pop, and every game in the Wadjet Eye Games catalogue (porting)

Snarky

Guess I never responded to this...

Quote from: Crimson Wizard on Sun 01/07/2018 17:24:20
Quote from: Snarky on Sat 19/05/2018 18:02:24
-Make it possible to override the AudioType of an AudioClip when you play it (so that you could e.g. play a Music track as a Sound)

Snarky, could you give an example of why this may be necessary? Would that mean that the user wants to apply the properties of a different AudioType to some particular sound?

Let's say you have some audio that you're using as atmospheric sound (bird tweets, traffic noises...), playing randomly on a music/background audio channel (i.e. as a special audio type). Now you'd like to use the same clip for a scripted sound effect when you're interacting with a bird or a car drives by.

Or say you have a certain audio clip you're using for a particular effect in the game, for example the TARDIS sound of your vessel teleporting. You have it as a sound effect audio type. However, in some places (e.g. when you start a new game), you want to do a thing where the sound plays, and then a music track begins right after it finishes. The easiest way to do this would be to play it as a music track and then play the music queued. However, since they're not the same audio type, you can't do that currently, and have to write quite a bit of logic to align them manually (or import the clip twice).
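In hypothetical script terms (the type-override parameter below does not exist today; it only illustrates the request, with aTardis and aTheme standing for the clips described):

Code: ags

// aTardis is imported as a Sound-type clip, aTheme as Music, so today
// PlayQueued cannot chain them. With an AudioType override on Play:
aTardis.Play(eAudioPriorityNormal, eOnce, eAudioTypeMusic); // hypothetical 3rd parameter
aTheme.PlayQueued(); // would start as soon as the single Music channel frees up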

Quote from: Crimson Wizard on Sun 01/07/2018 17:24:20
In that case, maybe the AudioType concept itself is not perfect, and some kind of "AudioSettings" preset would be a better fit?

I think it's important to have some default behavior specified, but it would be nice to be able to override it. I think AudioType makes sense if it's linked to channel counts. If we get rid of the channel count limitation altogether, AudioSettings might be better.

Quote from: Crimson Wizard on Sun 01/07/2018 17:24:20
On a different topic, this recent discussion made me think about implementing a PlayVoice function. If such a function existed, it would make it easier to create custom speech. It would also be non-blocking on its own, making it possible to use with background speech. Or, perhaps, SayBackground could support voice tags.

That would be great! If non-blocking, we would need a way to know how long the clip is/whether it's still playing. It also means we would need multiple voice channels...

Quote from: Crimson Wizard on Sun 01/07/2018 17:24:20
This brings another question: do we need extra channels for voice-over? Or an actual customizable AudioType for voice (with volume and MaxChannels properties)?
Should "AudioVolumeDrop" activate whenever any speech plays, or only the foreground blocking kind?

Well, AGS reserves one of its 8 channels for voice-over already. I tend to think allowing multiple simultaneous voice clips would break a lot of logic both in the engine and in game scripts, for something that very rarely would actually be used or useful.

I'd rather suggest keeping the voice channel as-is (1 channel, blocking), but instead have a way to play voice clips as audio clips, and maybe voice versa (I'm not gonna fix that typo!), similarly to what I was saying about overriding AudioTypes.

Dualnames

The issue is that, with the 8 channels almost always in use, I get sounds either cut off or not playing.
The 8 channels would be:
1. Speech
2. Music
3. Footsteps
4. Ambient
5. UI click/selection click
6. Game sound playing (an example here would be a Seastar (Strangeland is a weird game) responding to a selection on a dialog UI)
7. Score sound playing
8. Game sound playing (a pitched sound rising because of your selection (Strangeland is a weird game))

Generally, if we could somehow raise the number of simultaneous sounds being played, I think that would be stellar. I've been trying to devise a way around this, so I'm open to suggestions at this point. I don't know the technicalities behind the limit of 8 channels, but from a look at the code it seemed not to be something set for a specific reason - though of course it could be, the source code is vast.
Worked on Strangeland, Primordia, Hob's Barrow, The Cat Lady, Mage's Initiation, Until I Have You, Downfall, Hunie Pop, and every game in the Wadjet Eye Games catalogue (porting)

Snarky

Do you really need 3, 5 and 7 on separate channels playing simultaneously? Isn't it OK to skip footsteps and selection clicks while the score sound is playing, and maybe selection clicks while footsteps are playing?

Dualnames

Well, that's not what I'm saying; I worded it poorly. What I'm saying is that there are multiple instances where more than 6 sounds are playing, given that 2 channels are predominantly used by AGS for Speech (channel 0) and Music (channel 1). And 6 sounds can easily get cluttered, IMHO. Given that you could have an ambient sound, that results in 5 sounds played simultaneously, and I personally run into that a lot of the time, especially because there are some parts of the game that use 2-3 ambient sounds that can't be combined into one. What I'm saying is that my biggest problem, and one I've been trying and trying to patch up, is the need for more than 6 simultaneous sounds. I've given up on using 2 music channels instead of one to do some custom crossfading, because I can't afford taking an extra channel away from the sound effects. The sounds don't have to start at the same time; there can be a point in the game where all the channels are full.

Easy example to illustrate my point.

One track of music playing.
One ambient sound of wind.
One ambient sound of a cicada whirring (this is a character that moves from room to room)
One sound of getting an item
One sound of a UI opening
One sound of a click selection
One sound of an animation related

You could be opening a MOTOR UI and solving the puzzle relatively fast as soon as the UI opens.
Poof, there go the channels. And this happens; I can upload a savegame where something like that is happening.
Worked on Strangeland, Primordia, Hob's Barrow, The Cat Lady, Mage's Initiation, Until I Have You, Downfall, Hunie Pop, and every game in the Wadjet Eye Games catalogue (porting)

Crimson Wizard

These channels are more of a logical thing; we may raise their number easily or remove the limit completely (but in that case we should perhaps demand that "reserved channels" always has an actual value, with a default of 4-8, for example).
I do not know what happens if the library runs out of real system resources, though.

Dualnames

What my question is mostly about is: is that 8-channel limit based on some sort of value? What I mean is, is it a system limit, or is it like the minimum number of simultaneous sounds a sound card can play?
Worked on Strangeland, Primordia, Hob's Barrow, The Cat Lady, Mage's Initiation, Until I Have You, Downfall, Hunie Pop, and every game in the Wadjet Eye Games catalogue (porting)

morganw

I think this is based on trying to safely initialise the audio system and avoid distortion. If you request more 'voices' than the hardware can deliver, the audio initialisation will fail and you'll have no sound at all. The audio implementation in Allegro 4 treats the first 8 voices differently by default (it doesn't adjust their volume to avoid distortion), so that would suggest that at some point 8 audio channels was the best default and most compatible value to use. Increasing the channel count as things stand may lead to inconsistent volume levels between channels, or distortion.

Snarky

It's hard to research this online because (non-expert) sources are pretty vague and often outdated, but best I can tell, it used to be limited by the polyphony (the number of "voices" that could be rendered simultaneously) of the sound card. (Specifically FM polyphony, which could potentially be different from MIDI-type polyphony.) The EAX 1.1 standard from 1999, which might be what Allegro used as a baseline, defines "8 simultaneous voices processable in hardware" as a requirement. Any more than that, and the audio had to be downmixed in software, on the CPU.

However, around 2005 or so (by which time EAX 5.0 on the latest SoundBlaster cards had support for 128 voices in hardware), doing this (for a "reasonable" number of simultaneous audio clips) got so fast on CPUs that the question of hardware support became irrelevant, and most newer audio cards/chips for general consumers no longer have hardware support for it at all, relying completely on software downmixing.

Crimson Wizard

#17
Part of the design problem of AudioChannels (I think I mentioned this earlier in this thread) is that they not only restrict the maximum number of clips playing simultaneously, but also restrict each clip type to a lower number of channels. For example, if you have 10 clip types (an arbitrary number), then you cannot let each of those have 2 fixed channels. Only the first 4 will get this reservation, and the rest will probably not play at all regardless of priority settings, since there's no channel that is not reserved and free to use for other types.

So I believe it may be worth splitting the current AudioChannel concept into real playback channels, which are not bound to any type at all, and "rules" - the settings that help decide whether a new clip should play, depending on priorities and reservation settings.

Snarky

Quote from: Crimson Wizard on Wed 20/02/2019 13:55:50
Part of the design problem of AudioChannels (I think I mentioned this earlier in this thread) is that they not only restrict the maximum number of clips playing simultaneously, but also restrict each clip type to a lower number of channels. For example, if you have 10 clip types (an arbitrary number), then you cannot let each of those have 2 fixed channels. Only the first 4 will get this reservation, and the rest will probably not play at all regardless of priority settings, since there's no channel that is not reserved and free to use for other types.

So I believe it may be worth splitting the current AudioChannel concept into real playback channels, which are not bound to any type at all, and "rules" - the settings that help decide whether a new clip should play, depending on priorities and reservation settings.

I'm a little worried that this would make the system even more complicated and be a case of overengineering. The specific problem you describe could be fixed by splitting "MaxChannels" (which is currently a misnomer) into two properties: MaxChannels and MinChannels.

MusicallyInspired

#19
Allow referencing/seeking music file playback locations/positions by sample count instead of just milliseconds. You can already do this with sound effects but not music, IIRC. This functionality is even built right into Allegro itself, so I'm rather confused why seeking via samples hasn't been supported for music at all. This is frustrating, as it would make looping music much more accurate, precise, and seamless. Currently we've had to use two separate music files, one for an "intro" and one for the loop section. But even that doesn't always work nicely, as there's the ever-so-slightest of delays, and the length of that delay seems to be random, which causes an audible click. Sometimes the second music track starts too soon, before the "intro" track has entirely finished, causing the music to jump. Most modern games loop audio by seeking sample counts within the same music track; this goes right back to the Nintendo GameCube. I don't know if it's as easy as exposing the audio sample seek function of Allegro as a new parameter in the existing music functions in AGS, or if it needs major retooling, but this is something that would make working with seamless music so much easier. Counting by milliseconds alone is simply not precise enough, and unreliable.

Being able to start a loop point at any sample count in a music file (where the song will bounce back to instead of the very beginning) and then trigger that loop point at another sample count (not necessarily the end of the track) is the ideal scenario. There could be a much more robust dynamic music system scripted with this ability.
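As a hypothetical sketch of what that could look like in script (none of these members exist in AGS; the sample numbers assume a 44.1kHz file):

Code: ags

AudioChannel *music = aBattleTheme.Play();
// loop region in samples: when playback reaches sample 2646000 (60s),
// jump back to sample 220500 (5s) instead of the start of the file
music.SetLoopRegion(220500, 2646000); // hypothetical
music.SeekSample(441000);             // hypothetical sample-accurate seek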
Buy Mage's Initiation OST (Bandcamp)
Buy Mage's Initiation (Steam, GOG, Humble)

Snarky

As it seems like AGS 4.0 is moving closer to release, I would like to revive this thread, as I feel an improved audio API should be part of that revision.

I think the basic idea of getting rid of AudioChannel in favor of something tied to the playback instance, and guaranteed to be non-null, is probably the right approach. The API might not need to change all that much, but there are a few different things to consider:

- AudioClip
- Voice clips
- Frame-linked audio (esp. on looping animations)
- AudioType
- Playback with repeat

For example, with a looping or repeating sound, is it always the same playback instance, or a new one each time? And how would you configure frame-linked audio (volume, etc)? If there was a persistent playback instance, that might offer a good API.

I think @Crimson Wizard has also sometimes mentioned more advanced effects, like changing playback speed or adding echo. I would suggest that this shouldn't be a priority to implement at this time, but it might be worth keeping in mind how it might be added in future.

Crimson Wizard

Quote from: Snarky on Sat 29/03/2025 13:09:38I think the basic idea of getting rid of AudioChannel in favor of something tied to the playback instance, and guaranteed to be non-null, is probably the right approach.

Not ready to give a full reply right now, just a couple of quick notes.
I am not certain whether getting rid of AudioChannel is a good thing or not. It may be good to not have a channel as the return value of Play(), but at the same time I have found audio channels to be an interesting logical concept for organizing multiple simultaneous playbacks.

In regards to the "null pointer" problem, there's something that hit me recently, and I wonder why I did not think about this earlier. What if Play() returned a dummy audio playback? There are 2 reasons for Play to fail: failure to decode and failure to play; the former happens when the file cannot be opened at all, and the latter when the sound device is not working, or the game is set to use "no driver".
We are mostly interested in preventing script errors in the second case, because the first case will be noticed immediately during game development. In audio software that works with filter chains there's a concept of a "null" output, where the processed sound is simply discarded. So what we could do is process the sound internally (which would even update its Position property), but then discard it through such a "null" output. I have not checked yet, but it's possible that our sound library supports that already.
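From the script's point of view, the benefit would be that timing-sensitive code keeps working even when nothing is audible - a sketch of the intended behavior:

Code: ags

// with a "null output" dummy, Play() would never return null, and
// Position would still advance even with the sound device unavailable
AudioChannel *chan = aIntroTheme.Play();
while (chan.Position < 5000)
{
  Wait(1); // cut-scene logic stays in sync with the (possibly silent) track
}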

Snarky

#22
Quote from: Crimson Wizard on Sat 29/03/2025 13:19:35Not ready to give a full reply right now, just a couple of quick notes.
I am not certain whether getting rid of AudioChannel is a good thing or not. It may be good to not have a channel as the return value of Play(), but at the same time I have found audio channels to be an interesting logical concept for organizing multiple simultaneous playbacks.

I think if there is something like an AudioPlayback instance for everything that's currently playing, that will basically take the place of AudioChannels. You might still want a way to iterate through all the playing sounds (for example, if you need to adjust the overall volume), but I'm not sure a fixed list of AudioChannels (that exist regardless of whether anything is playing on them or not) is the right way to do that if the limit on simultaneous playback is removed.
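For reference, iterating over everything that is playing looks like this with today's fixed channel list (real AGS 3.x API):

Code: ags

// halve the volume of every sound that is currently playing
int i = 0;
while (i < System.AudioChannelCount)
{
  AudioChannel *chan = System.AudioChannels[i];
  if (chan.IsPlaying)
  {
    chan.Volume = chan.Volume / 2;
  }
  i++;
}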

AudioTypes seem more useful to me as a way to easily set up game logic for sound. MaxChannels would still help control whether a track replaces what is currently playing or not, for example. It's also important to consider how speech will work. It would be good if the system could support voiced background speech; some way to control the speech "channel(s)"/get speech playback instances would be needed.

Quote from: Crimson Wizard on Sat 29/03/2025 13:19:35In regards to the "null pointer" problem, there's something that hit me recently, and I wonder why I did not think about this earlier. What if Play() returned a dummy audio playback? There are 2 reasons for Play to fail: failure to decode and failure to play; the former happens when the file cannot be opened at all, and the latter when the sound device is not working, or the game is set to use "no driver".
We are mostly interested in preventing script errors in the second case, because the first case will be noticed immediately during game development. In audio software that works with filter chains there's a concept of a "null" output, where the processed sound is simply discarded. So what we could do is process the sound internally (which would even update its Position property), but then discard it through such a "null" output. I have not checked yet, but it's possible that our sound library supports that already.

I think it's clear that if it must never return null but playback may fail for whatever reason, it must then return a "dummy." If I understand you correctly, though, what you're saying is that the dummy may still "pretend" to play, specifically in terms of reporting playback position. Yes, I think that could be useful because other game logic may depend on it (use it for timing, for example, like if a cut-scene has been designed to sync to a music track), but at the same time I think it's important that it should be possible to tell somehow that playback has failed.

I think there may be cases where the file fails to open/play but this isn't known at design time (e.g. if using AudioClip.GetByName() or Game.PlayVoiceClip(), or using an external AUDIO.VOX file; perhaps a way to load audio files from the file system or even streaming will also be reintroduced in the future), but there is already an AudioClip.IsAvailable property that can deal with that, if the AudioClip isn't simply null.

Crimson Wizard

#23
I've been busy with other things, but now I will probably return to this issue. It's been a while since I've given any thought to the audio system, so I will have to spend some time and think this through again.

I do believe that there has to be a way of limiting the number of simultaneous playbacks according to a user-defined rule. The question is whether we support this as a native engine feature, or expose enough API to let users write their own "audio mixer", or both. A native feature may still be useful, because limiting e.g. music to 1 channel with automatic replacement is very convenient functionality, used in almost every game.

There's one thing regarding the ideas mentioned above, which I do not like:

Quote from: Snarky on Sun 30/03/2025 08:00:23AudioTypes seem more useful to me as a way to easily set up game logic for sound. MaxChannels would still help control whether a track replaces what is currently playing or not, for example. It's also important to consider how speech will work. It would be good if the system could support voiced background speech; some way to control the speech "channel(s)"/get speech playback instances would be needed.

The reason I do not like this approach is that if AudioTypes control the number of simultaneous tracks, then there are still "audio channels" out there, except now they are "hidden" if AudioChannel is removed. We would remove AudioChannels from script, and instead get secret "channels" inside AudioTypes.

If we allow access to a list of AudioPlayback instances per AudioType, that makes AudioType a sort of owning container of playback references (which seems like a strange organization to me). And if we would like to let users read the contents of these containers, then a) we have to expose AudioType as a struct in script, and b) we practically recreate the audio channel in a less convenient way.

Why not keep AudioChannel, but:
* remove the hard engine limit, letting users define as many channels as they like;
* strip its API by moving playback control to AudioPlayback, keeping the channel object only as a "slot" which may contain a playback?
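Sketching how that might read in script (hypothetical API, following the two points above):

Code: ags

AudioPlayback *pb = aMusic.Play(); // playback control lives on the playback object
pb.Volume = 60;
AudioChannel *slot = pb.Channel;   // the "slot" this playback currently occupies
// the channel itself keeps no playback API of its own - it is only a container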

eri0o

One thing I would like is to be able to pass a volume right in the Play call, and have that volume be a percentage that gets multiplied by whatever the audio type's volume percentage is - so I could have a per-type sound slider in a menu.

Crimson Wizard

Quote from: eri0o on Mon 02/06/2025 21:03:21One thing I would like is to be able to pass a volume right in the Play call, and have that volume be a percentage that gets multiplied by whatever the audio type's volume percentage is - so I could have a per-type sound slider in a menu.

Having volume as a multiplier everywhere is the correct way, IMO. I've posted this somewhere before: the volume should be computed as a combination of multipliers:

System (master) volume -> Audio Type volume -> Clip volume

and optionally -> Emitter volume (e.g. character) -> Animation volume (for view frame linked sounds)
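A worked example of the chain, assuming all volumes are 0-100 percentages:

Code: ags

// master 80%, Music type 50%, clip 100% -> 0.8 * 0.5 * 1.0 = 40% effective
int master = 80;
int typeVol = 50;
int clipVol = 100;
int effective = (master * typeVol / 100) * clipVol / 100; // = 40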

eri0o

Yeah, and also, the volume could be optional right in the Play call, instead of requiring the channel?

From what I remember of previous discussions, people forget to deal with the null audio channel. I think it would help if there were some way to catch audio errors - like if there is no audio device, or the audio file doesn't exist - though I'm not sure how.

The other thing is that some people like positional audio, but they understand it as being relative either to the player character or to the position on screen. There was also at some point a question about having different regions in a room with different music and some crossover transition. But these nuances could also be left for scripting.

The other thing I remember being asked about is something like a filter for when the player is in a cave or under water. I don't think this is easy to do with MojoAL. I think this was done in Strangeland through a plugin, and maybe Laura Hunt asked about it too at some point.

eri0o

It would be nice if there were some way to connect something from the audio output to a shader input, for fun music-synced effects.

Crimson Wizard

Quote from: eri0o on Sat 14/06/2025 13:42:59It would be nice if there were some way to connect something from the audio output to a shader input, for fun music-synced effects.

Changes in shaders are done by setting constants. What remains is reading the audio. I'd just generate a separate file with instructions, similar to how voice-based lipsync is done, then read that file and apply changes to shaders using timestamps.

Crimson Wizard

#29
I keep getting distracted..., but I suppose this issue has to be dealt with somehow, at least by making a draft.

I still must sit down and think, and get a full picture in my head.

Meanwhile, as a few quick notes...



I remembered an "audio mixer" I coded for tzachs's MonoAGS project. Tzachs did not want to have AudioChannels either, IIRC, so I wanted to see how an engine user could code one themselves. The mixer was written as a separate library; here's its code (it's in C#):
https://github.com/ivan-mogilko/MonoAGSGames/tree/master/Libs/AudioMixer

Its structure is this:

A Mixer allocates a number of AudioChannels; their number is dynamic and may be changed at any time.
A Mixer also has a dictionary of "Audio Rules" attached to Tags. Audio Rules include things like default priority, default volume, and so on. A Tag is just a unique string.
AudioClips may have any combination of tags.
AudioChannels may have tags.
A clip may only be played on a channel that has no tags, or whose tags match the clip's (if a channel has tags, then the clip must have at least one matching tag). If playing is allowed, the Mixer creates an AudioPlayback object and places it on a suitable channel.

In the above concept, the AudioTypes that we know in AGS are replaced by AudioRule objects associated with Tags. Reserving channels is done by applying tags to them: e.g. you may create a "Music" tag, create 8 channels, and then assign the "Music" tag to 4 of them. So it's kind of done the other way around.
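In AGS-like pseudocode, that reversed reservation could read as follows (the Mixer and tag APIs are hypothetical, mirroring the C# library above):

Code: ags

Mixer.SetChannelCount(8); // the channel count is dynamic
// reserve 4 of the 8 channels by tagging them "Music"
int i = 0;
while (i < 4)
{
  Mixer.Channels[i].AddTag("Music");
  i++;
}
// a clip tagged "Music" may play on channels 0-3 (matching tag) or on
// the untagged channels 4-7; clips without the tag may not use 0-3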

There's a "demo game" which uses this system:
https://github.com/ivan-mogilko/MonoAGSGames/blob/master/Games/AudioMixerGame/Rooms/MixerRoom.cs

I am posting this here just as an additional thought.



Also, I did not yet reply to some notes left by @Snarky above:

Quote from: Snarky on Sat 29/03/2025 13:09:38For example, with a looping or repeating sound, is it always the same playback instance, or a new one each time? And how would you configure frame-linked audio (volume, etc)? If there was a persistent playback instance, that might offer a good API.

I think that a looping sound should be the same playback instance; it works like that internally now, and that's how VideoPlayer works in AGS 4.0 too. I doubt the opposite approach would work well.

About frame-linked audio: naturally, one would need to configure the future playback instead of doing that repeatedly every time a sound plays (which won't work reliably). OTOH, I do not think that having a "persistent playback" - in the sense that it always exists and just has different sounds played on it - is a good idea. I believe the playback object should be valid only until the sound stops. Also, character frames may run several sounds in quick succession - and these may be different sounds, even played simultaneously.

I'd rather suggest classifying objects that may play linked sounds as "Audio Emitters", and having "emitter properties" on them. I followed this principle when adding Character.AnimationVolume in AGS 3.6.0:
https://adventuregamestudio.github.io/ags-manual/Character.html#characteranimationvolume
https://adventuregamestudio.github.io/ags-manual/Object.html#objectanimationvolume
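For example, with the real 3.6.0 API:

Code: ags

// frame-linked sounds triggered by this character's animations play
// at 50% of their clip volume - an "emitter property" on the character
player.AnimationVolume = 50;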
