The module works as a replacement for AGS's built-in lip sync support. Like AGS, it doesn't
create the lip syncing for you: it just reads phoneme sync data from files and plays the corresponding animation. You have to use some other tool to analyze the voice clips and create the phoneme sync data. And obviously you have to draw the different animation frames: see references
here and
here. There are other charts with more or fewer frames and in other art styles; just google "lip sync animation".
I've
updated the module so it can read phoneme sync files produced by several different applications: PAMELA, Papagayo, the AGS Lip Sync Manager plugin, SAPI 5.1 Lipsync, and Rhubarb.
So you first need to decide which of these lip sync tools to use. Note that PAMELA and Papagayo are manual: you type in the line of dialog and the tool converts it into phonemes, but you then have to align them to the audio clip by hand. In contrast, the Lip Sync Manager plugin, SAPI 5.1 Lipsync and Rhubarb work by speech recognition and are automatic (though the plugin also supports editing to fix speech recognition errors).
So I would recommend Rhubarb or the AGS Lip Sync Manager plugin. (In my very minimal experience, based on testing with just a couple of clips, Rhubarb gives much better results and doesn't require any manual editing, so that would be my first choice.) Your choice of program determines the set of phonemes you have to map to animation frames.
The Lip Sync Manager plugin uses the Pamela set of phonemes, which is very large, so you'll almost certainly want to map multiple phonemes to the same animation frame (though unlike the built-in AGS system, this module has no limitation on the number of different frames you can have). To do so, you call
TotalLipSync.AddPhonemeMappings(String phonemes, int frame), with the
phonemes string containing all the phonemes you want to map to that frame, separated by a '/'. For example,
TotalLipSync.AddPhonemeMappings("AY/AA/AH/AE", 1) means that all those phonemes will use frame 1 of the character's speech view (frame 0 is used for the mouth closed/silent frame, by convention).
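As a sketch, here is what a partial Pamela-style mapping could look like in your global script. I'm assuming the calls go in game_start() so the mappings are in place before any speech plays, and the frame numbers and phoneme groupings here are purely illustrative, not a recommended mapping:

```ags
function game_start()
{
  // Map groups of Pamela phonemes to frames of the character's speech view.
  TotalLipSync.AddPhonemeMappings("None", 0);         // frame 0: mouth closed/silent (by convention)
  TotalLipSync.AddPhonemeMappings("AY/AA/AH/AE", 1);  // open-mouth vowels
  TotalLipSync.AddPhonemeMappings("M/B/P", 2);        // closed-lip consonants
  TotalLipSync.AddPhonemeMappings("F/V", 3);          // lip-on-teeth sounds
  // ...and so on, until every phoneme the tool can output maps to some frame
}
```

Any phonemes you don't map will have to be handled somehow, so it's worth being systematic about covering the whole set your chosen tool can emit.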
To make this job simpler, I've added the
TotalLipSync.AutoMapPhonemes() method, which sets up a default phonemes-to-frames mapping for each data format (.anno mapping not yet implemented). The Pamela automapping currently uses
this mapping, suggested by AGD2. This may change in the final release, as I'm trying to figure out a mapping that is consistent between the different formats. For Rhubarb (which doesn't output the phoneme itself, but a phoneme class labeled A-H or X), the default mapping is what is shown on the
Rhubarb page, with X as frame 0. In Pamela terms, this roughly means:
0: X (None)
1: A (P/B/M)
2: B (EE/K/S/T/D/TH/DH...)
3: C (EH/AH/EY)
4: D (AA/AE/AY)
5: E (AO)
6: F (UW/OY/W)
7: G (F/V)
8: H (L)
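If you'd rather set the Rhubarb mapping up by hand (for instance, to tweak a frame or two), the table above translates into calls like these. This is a sketch equivalent to what AutoMapPhonemes() does for Rhubarb data, again assuming the calls go in game_start():

```ags
function game_start()
{
  // Rhubarb outputs phoneme classes A-H and X rather than raw phonemes,
  // so each "phonemes" string is just the class letter.
  TotalLipSync.AddPhonemeMappings("X", 0);  // silence/mouth closed
  TotalLipSync.AddPhonemeMappings("A", 1);  // P/B/M
  TotalLipSync.AddPhonemeMappings("B", 2);  // EE/K/S/T/D/TH/DH...
  TotalLipSync.AddPhonemeMappings("C", 3);  // EH/AH/EY
  TotalLipSync.AddPhonemeMappings("D", 4);  // AA/AE/AY
  TotalLipSync.AddPhonemeMappings("E", 5);  // AO
  TotalLipSync.AddPhonemeMappings("F", 6);  // UW/OY/W
  TotalLipSync.AddPhonemeMappings("G", 7);  // F/V
  TotalLipSync.AddPhonemeMappings("H", 8);  // L
}
```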
Edit: Oh, and have a look at the module header. It has a comment explaining the whole thing.