Speech lip-sync with Lucasarts system

Started by bx83, Fri 07/04/2017 05:25:01


bx83

Hi all
I've recently been playing with lip-sync in the latest AGS (3.4.0). I've heard that Lucasarts style games (the style I also use in my game) don't support lip sync; or that they support 2 frames; or that vowels are frame 0... or something :/
I've also tried Lipsync Centre here, but it's very complex. It appears to add a Pamela/.PAM file to Lucasarts games, but will only play it (in the portrait) of Sierra style games. E.g. the player character just does the same animation over and over, while the portrait animation chooses frames based on phonemes.

Also: when I choose "Type: Text (automatic)" in the lip-sync Properties box, is this automatically lip-synced to the audio, or not lip-synced at all and just running continuously, or...? In Lucasarts style, it just runs continuously.

Basically I just can't find any information on speech lip-sync. Is there any way to get it into Lucasarts games?
Existing plugins or scripts? All links except Lipsync Centre are broken.
Some tried-and-true method I haven't discovered yet?

Please help.

Khris

#1
The most basic way of lip-syncing is to set it to "Text (automatic)", then enter letters into the boxes.
How to do this is even explained right on the editor pane.

Create speech frames with the mouth being open, closed, wide, etc; then assign the frames to the letters by typing the letters into the box corresponding to the frame number (first speech loop frame is #0, second is #1, etc)
Skip the box that will get the most letters, then enter that number as default in the properties.
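
For example, with a four-frame speech loop, the mapping might look something like this (the letters and frames here are purely illustrative; yours depend on your sprites):

0: mouth closed - m/b/p
1: slightly open - gets most letters (t/s/d/n/k/...), so leave this box blank and enter 1 as the default frame
2: wide open - a/e/i
3: rounded - o/u/w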

This gives quite reasonable results for low res games.

Finally, the manual has an entire section on Lip sync (Other Features -> Lip sync). Did you read that?

bx83

#2
By letters I assume you mean in the Lip Sync editor, where you provide multiple letters (A/F/R/...) for a frame?
Doesn't work for Lucasarts games. Just plays the whole set of frames, over and over.

I've read the manual.

"Skip the box that will get the most letters, then enter that number as default in the properties." - what does this mean?

Scavenger

#3

For some reason, lip sync isn't supported for Lucasarts style speech. It's been asked for a lot but I'm not sure how much of a job it is to change the engine to support it.

Khris

#4
Not sure what to tell you, but it works fine for me. I imported an old 2.72 game where I had it set up into AGS 3.4.0 and it worked flawlessly.
Speech style: Lucasarts
Lip sync Type: Text (automatic)



Edit: right, in my example there's no audio involved. For some reason I kept assuming this was about matching text to speech frames, not audio.

Crimson Wizard

#5
Quote from: Scavenger on Sat 08/04/2017 05:57:25
For some reason, lip sync isn't supported for Lucasarts style speech. It's been asked for a lot but I'm not sure how much of a job it is to change the engine to support it.

As far as I remember (I double-checked the manual), it is "voice sync" that only works with Sierra-style speech (Pamela and Papagayo). Text-based lip sync should work with Lucasarts speech too.

Well, it seems that Khris's example demonstrates the latter.

I never checked how lip sync works in detail, to be honest, so I too do not know about possible difficulties there.

bx83

#6

So if I understand correctly: Lip-sync will not be based on a player's speech (audio file) for a Lucasarts game.
It will instead be based on a sequence of frames e.g. 1,2,3,4,5,...
Whereas Sierra speech will do lipsync correctly, and choose animation frames based on .pam file/speech sounds in lip sync editor/etc.
??

In which case: Is there anything a game developer can do, in code exposed through the AGS editor (ie. script files, not AGS's DLLs), to fix things for Lucasarts?
Why is it, specifically, that Lucasarts fails? Is there an untreated bug in the system; or do "Lucasarts pty ltd" games not have lip sync, so "Lucasarts style games in AGS" don't have lip sync?

bx83

#7

Will 'Text (automatic)' speech, in a Lucasarts style game, base the frames it chooses on: a) the sound files  b) the text written on screen for character.Say  c) nothing at all, just playing the frame sequence over and over? (i.e. 1,2,3,4,1,2,3,4,1,2,3,4,...)

Snarky

#8

Quote from: bx83 on Tue 11/04/2017 06:28:30
So if I understand correctly: Lip-sync will not be based on a player's speech (audio file) for a Lucasarts game.
It will instead be based on a sequence of frames e.g. 1,2,3,4,5,...
Whereas Sierra speech will do lipsync correctly, and choose animation frames based on .pam file/speech sounds in lip sync editor/etc.
??

No. AGS has support for two forms of "lipsync": one is based on the letters in the written line, and is independent of the speech audio clip. It will go through the text, and for each letter (or combination of letters) it will display a certain frame, like in Khris's example. This is the "standard lipsync" that AGS officially supports, and it works for both Sierra-style and LucasArts speech.

There's also an ability to analyze the speech clips with Pamela or Papagayo and play lipsync animations based on that. This "voice sync" feature is listed as "unofficial" and does not work with LucasArts speech.

All this is explained quite clearly in the manual.

Quote from: bx83 on Tue 11/04/2017 06:28:30In which case: Is there anything a game developer can do, in code exposed through the AGS editor (ie. script files, not AGS's DLLs), to fix things for Lucasarts?
Why is it, specifically, that Lucasarts fails? Is there an untreated bug in the system; or do "Lucasarts pty ltd" games not have lip sync, so "Lucasarts style games in AGS" don't have lip sync?

It's just never been implemented. Probably because usually with LucasArts-style speech, the characters aren't detailed enough that you get a lot out of it. And few AGS games use lipsync anyway (even most commercial AGS games don't bother with lip animation).

Grundislav

#9

Someone made a module that supported this a few years back, but unfortunately all the links seem to be broken.

Snarky

#10
That's this thread, for the record: http://www.adventuregamestudio.co.uk/forums/index.php?topic=45301.0

Fortunately Calin posted his raw original code to the forum, so it shouldn't be too difficult to recreate a module:

http://www.adventuregamestudio.co.uk/forums/index.php?topic=36284.msg554642#msg554642

... in fact, I'm not busy, I'll give it a shot.

Crimson Wizard

#11
Quote from: Snarky on Wed 12/04/2017 17:25:57
http://www.adventuregamestudio.co.uk/forums/index.php?topic=36284.msg554642#msg554642

Welp, here is another AGS wishlist which I've never seen before... or maybe seen but forgot. I wonder how many of these ended up documented in the suggestion tracker?

Snarky

#12
In another thread there was a link to a tracker entry (Edit: Here - http://www.adventuregamestudio.co.uk/forums/index.php?topic=25227.msg320328#msg320328 - The link is broken and the tracker ID now points to a different issue, but maybe there's some way you can find it.). Mind you, that was at least ten years ago.

I've hit a little bit of a snag in modularizing the code, and to test I need some audio clips, matching PAM files and a character with lip sync animation frames. In particular it would be very helpful if someone has a character ready.


Snarky

#13

OK, so here's an alpha version of a general-purpose lip sync module: https://www.dropbox.com/s/uytb6r05qfa5xue/TotalLipSync.scm?dl=0

It should work for all speech styles, and the idea is that it will eventually support a number of different lip sync data formats (PAMELA, Papagayo, Annosoft SAPI, Rhubarb): they're all pretty similar and pretty straightforward to parse, so that's fairly easy to do. However, at the moment only Pamela support has been fully implemented. You can also choose whether you want to distinguish stressed and unstressed vowels, or ignore that distinction (which makes the phoneme mapping more straightforward).

To use, initialize the module on launch:

Code: ags
function game_start()
{
   TotalLipSync.Init(eLipSyncPamelaUnstressed);
   TotalLipSync.AutoMapPhonemes();                // This sets up a default mapping of phonemes to frames. Or you can map them manually.
}


The phoneme mapping should match the frames in your character's speech view. Then when you want someone to use lip sync speech, just call the method Character.SaySync():

Code: ags
  cDaniel.SaySync("&312 It's a gosh-darn debacle, is what it is!");


This will, by default, look for the file dani312.pam (i.e. the same filename format AGS expects) in the $INSTALLDIR$ folder to read the lip sync data from. If someone finds it necessary, I might at some point add the capability to bundle all the individual sync files into one big file.

The module has not been thoroughly tested, so if you find any bugs, let me know.

bx83

#14

Holy balls,  I never knew getting a module would be so easy :D  :P

Do I enter my own phoneme-to-frame mapping in the Lip Sync editor (there only seem to be about 20 phonemes...?),
or does your module find the phonemes automatically, so it's only necessary to make the animation frames? How do I match one to the other?

This is great work, thank you :)

Snarky

#15
The module works as a replacement for AGS's built-in lip sync support. Like AGS, it doesn't create the lip syncing for you: it just reads phoneme sync data from files and plays the corresponding animation. You have to use some other tool to analyze the voice clips and create the phoneme sync data. And obviously you have to draw the different animation frames: see references here and here - there are others with more and fewer frames and in other art styles, just google "lip sync animation".

I've updated the module so it can read files from these applications/in these formats: PAMELA, Papagayo, the AGS Lip Sync Manager plugin, SAPI5.1 Lipsync, and Rhubarb.
So you first need to decide which of these lip sync tools to use. Note that PAMELA and Papagayo are manual: you enter the line as text to convert it into phonemes, but you then have to align them to the audio clip by hand. In contrast, the Lip Sync Manager plugin, SAPI5.1 Lipsync and Rhubarb work by speech recognition and are automatic (though the plugin also supports editing to fix speech recognition errors).

So I would recommend Rhubarb or the AGS Lip Sync Manager plugin. (In my very minimal experience, based on testing with just a couple of clips, Rhubarb gives much better results and doesn't require any manual editing, so that would be my first choice.) Your choice of program determines the set of phonemes you have to map to animation frames.

The Lip Sync Manager plugin uses the Pamela set of phonemes, which is very large, so you'll almost certainly want to map multiple phonemes to the same animation frame (though unlike the built-in AGS system, this module has no limitation on the number of different frames you can have). To do so, you call TotalLipSync.AddPhonemeMappings(String phonemes, int frame), with the phonemes string containing all the phonemes you want to map to that frame, separated by a '/'. For example, TotalLipSync.AddPhonemeMappings("AY/AA/AH/AE", 1) means that all those phonemes will use frame 1 of the character's speech view (frame 0 is used for the mouth closed/silent frame, by convention).
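
For instance, a manual mapping for a hypothetical five-frame speech view might look like the sketch below. Only the "AY/AA/AH/AE" line is taken from the description above; the other groupings, frame numbers and the "None" silence phoneme are illustrative assumptions, so check them against your own data files:

Code: ags
function game_start()
{
   TotalLipSync.Init(eLipSyncPamelaUnstressed);
   // Manual mapping instead of AutoMapPhonemes().
   // Groupings/frame numbers below are illustrative only:
   TotalLipSync.AddPhonemeMappings("None", 0);          // mouth closed/silent
   TotalLipSync.AddPhonemeMappings("AY/AA/AH/AE", 1);   // open-mouth vowels
   TotalLipSync.AddPhonemeMappings("P/B/M", 2);         // lips pressed together
   TotalLipSync.AddPhonemeMappings("F/V", 3);           // lower lip under teeth
   TotalLipSync.AddPhonemeMappings("UW/OY/W", 4);       // rounded lips
}

I don't know offhand how the module handles unmapped phonemes, so it's safest to make sure the mapping covers every phoneme your sync files can produce.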

To make this job simpler, I've added the TotalLipSync.AutoMapPhonemes() method, which sets up a default phonemes-to-frames mapping for each data format (.anno mapping not yet implemented). The Pamela automapping currently uses this mapping, suggested by AGD2. This may change in the final release, as I'm trying to figure out a mapping that is consistent between the different formats. For Rhubarb (which doesn't output the phoneme itself, but a phoneme class labeled A-H or X), the default mapping is what is shown on the Rhubarb page, with X as frame 0. In Pamela terms, this roughly means:

0: X (None)
1: A (P/B/M)
2: B (EE/K/S/T/D/TH/DH...)
3: C (EH/AH/EY)
4: D (AA/AE/AY)
5: E (AO)
6: F (UW/OY/W)
7: G (F/V)
8: H (L)

Edit: Oh, and have a look at the module header. It has a comment explaining the whole thing.

bx83

#16

Okay, I think I'll use Rhubarb - though it's not yet implemented :P (I found a todo next to parseRhubarb - maybe I'm wrong?)
By the look of the Rhubarb page, I need to download it and compile for Windows and Linux, and then it will be called by your script?


Snarky

#17
If you download v0.2 from the link in the last post, it has the Rhubarb implementation: https://www.dropbox.com/s/o0xxxjyyerl8o6i/TotalLipSync-0.2.scm?dl=0

You don't need to compile Rhubarb, just get the latest release for Windows or OSX: https://github.com/DanielSWolf/rhubarb-lip-sync/releases

The script doesn't call Rhubarb; you'll have to do all of that yourself. Take all the speech clips, convert them to .wav if necessary, copy them into the Rhubarb directory, and for each one, call "rhubarb.exe myclip.wav > myclip.tsv" (where "myclip" is the name of the clip). You can also put the text corresponding to each clip in individual .txt files to assist with the speech recognition, and then you'd call "rhubarb.exe myclip.wav -d myclip.txt > myclip.tsv". Then once that's done, copy all the .tsv files over into the directory of your compiled AGS game, and the module will read them.

Obviously this process is tedious, and also Rhubarb takes quite a while to process each clip, so if you have more than a couple of dozen clips you'll definitely want to automate it (you could write a batch file to go through and process each .wav file in the directory), but that's sort of outside the scope of the module. There might also be a way to automatically or at least more easily export the text of each line with an associated voice clip into a separate text file, perhaps using the Speech Center plugin, but I haven't looked into it.
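
As a starting point, a batch file along those lines might look like this (an untested sketch; it assumes rhubarb.exe, the .wav clips and any optional matching .txt transcripts all sit in the same directory):

Code:
@echo off
rem Process every .wav in the current directory with Rhubarb.
for %%f in (*.wav) do (
    if exist "%%~nf.txt" (
        rhubarb.exe "%%f" -d "%%~nf.txt" > "%%~nf.tsv"
    ) else (
        rhubarb.exe "%%f" > "%%~nf.tsv"
    )
)

%%~nf expands to the filename without its extension, so myclip.wav produces myclip.tsv, matching the naming the module expects.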

Snarky

#18
OK, I wrapped up the code and added some more documentation, fixed some bugs (Sierra-style speech wasn't actually working before: I didn't realize just how hardcoded that whole thing is in the engine, and it required a ridiculous workaround to make happen, but now it's there), and released it as a proper module: http://www.adventuregamestudio.co.uk/forums/index.php?topic=54722.0

Edit: Further discussion moved to the module thread.
