TotalLipSync
(Character head/animation by Preston Blair. King Graham sprite ripped from King's Quest II VGA by AGDI. Background from AGS Awards Ceremony by Ali.)

TotalLipSync is a module for voice-based lip sync. It allows you to play back speech animations that have been synchronized with voice clips. TotalLipSync has all the same capabilities as the voice-based lip sync that is built into AGS, and offers the following additional advantages:
- It works with LucasArts-style speech (as well as with Sierra-style and full-screen speech modes).
- In addition to the Pamela (.pam) and Papagayo/Moho Switch (.dat) formats supported by AGS, it also reads Annosoft SAPI 5.1 (.anno) and Rhubarb (.tsv) lip sync files.
- In particular, Rhubarb support means that lip syncing can be 100% automated (with decent results): no manual tracking of the speech clips is required.
- It is more flexible: you can switch speech styles mid-game, change the phoneme mapping, use files with different data formats, etc.
- You don't have to do the phonemes-to-frames mapping manually: The module comes with a default auto-mapping.
How to use

- Create the lip sync data files for the speech clips, using a tool that outputs one of the supported formats (personally I would recommend Papagayo for manual tracking and Rhubarb for automatic lip syncing, but the Lip Sync Manager plugin is good too). The filename of each sync file should be the same as the speech clip except for the extension, and you need to place the files in your compiled game folder (by default, in a folder named "sync/" inside the game folder).
- Create the speech animation for your character(s), with different animation frames for the different phonemes (see below), and set it as their speech view.
- Download and import the TotalLipSync module into your AGS project.
- Make sure your game settings are correct: the AGS built-in lip sync (in the project tree under "Lip sync") should be set to "disabled".
- If you are going to use Sierra-style (or full-screen) speech for your lip sync animations, you must create a dummy view. Make sure to give it exactly one loop and one frame. If you name the view TLS_DUMMY it will automatically be used by the module. Otherwise you can set the view to use with TotalLipSync.SetSierraDummyView().
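If you go that route, here is a minimal sketch of pointing the module at a custom dummy view (MY_SPEECH_DUMMY is a hypothetical view name; the call is assumed to take the view by its script-name constant, i.e. its view number):

// Only needed for Sierra-style / full-screen speech when the dummy view
// (one loop, one frame) is NOT named TLS_DUMMY:
TotalLipSync.SetSierraDummyView(MY_SPEECH_DUMMY);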
You are now ready to use the module. Add the code to initialize TotalLipSync on startup:
function game_start()
{
  TotalLipSync.Init(eLipSyncRhubarb); // Or whatever lip sync format you're using
  TotalLipSync.AutoMapPhonemes();
}
Or if you want a custom phonemes-to-frames mapping:
function game_start()
{
  TotalLipSync.Init(eLipSyncPamelaIgnoreStress);
  TotalLipSync.AddPhonemeMappings("None", 0);
  TotalLipSync.AddPhonemeMappings("B/M/P", 1);
  TotalLipSync.AddPhonemeMappings("S/Z/IH/IY/SH/T/TH/D/DH/JH/N/NG/ZH", 2);
  TotalLipSync.AddPhonemeMappings("EH/CH/ER/EY/G/K/R/Y/HH", 3);
  TotalLipSync.AddPhonemeMappings("AY/AA/AH/AE", 4);
  TotalLipSync.AddPhonemeMappings("AO/AW/UH", 5);
  TotalLipSync.AddPhonemeMappings("W/OW/OY/UW", 6);
  // Frame 7 unassigned to match default Moho mapping
  TotalLipSync.AddPhonemeMappings("F/V", 8);
  TotalLipSync.AddPhonemeMappings("L", 9);
}
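Because the module can switch speech styles and data formats mid-game, you can also re-initialize it later on. A minimal sketch, assuming that simply re-running Init() and rebuilding the mapping is the intended way to do this (here swapping from Rhubarb to Pamela files part-way through the game):

// Somewhere in your game script, after the original setup:
TotalLipSync.Init(eLipSyncPamelaIgnoreStress);  // switch to Pamela (.pam) data files
TotalLipSync.AutoMapPhonemes();                 // rebuild the phoneme-to-frame mapping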
To speak a line with lip syncing, you simply call the extender functions Character.SaySync() or Character.SayAtSync(), using a speech clip prefix:
cGraham.SaySync("&1 This line will be animated with lip sync");
cGraham.SayAtSync(320, 100, 240, "&2 ... and so will this"); // x_left, y_top, width, message
And that's all there is to it! (If you don't use a speech clip prefix, or if there is no matching sync file, the speech animation won't play at all.)
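For instance (the clip numbers here are hypothetical):

// With a clip prefix and a matching sync file in "sync/", the line is lip synced:
cGraham.SaySync("&3 This line has a voice clip and a sync file.");
// Without a clip prefix (or without a matching sync file), the speech text is
// still displayed, but no speech animation plays:
cGraham.SaySync("This line will not be animated.");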
Phoneme-to-frame mappings

The principle of lip syncing is that different sounds (phonemes) correspond to different mouth shapes. If we display an animation frame with the right mouth shape at the same time as that sound appears in the audio being played, the animation will seem to match the speech. The first step, then, is to identify the phonemes and their timing in the speech (that's what the tools mentioned above are for), and the second step is to choose an appropriate animation frame for each phoneme. We usually don't use a different animation frame for every phoneme, so we combine phonemes into groups that are all mapped to a single frame. The different tools have different sets of phonemes (or phoneme groups), so we have to define different mappings from phonemes to frames.
So here is the default mapping for each data format used by TotalLipSync. It has been set up for a speech animation with ten different frames, each representing a different mouth position. (This is a fairly standard setup.) If you stick to these frames and these mappings, you can use the same speech view no matter what lip sync tool or data format you use:
Frame | Description | Rhubarb phoneme ID | Moho phoneme | Pamela phonemes
------|-------------|--------------------|--------------|----------------
0 | Mouth closed (or slack; can simply be the same frame as 1) | X | rest | None
1 | M, B, P | A | MBP | M/B/P
2 | Various consonants (Rhubarb: Ee-type sounds) | B | etc | K/S/T/D/G/DH/TH/R/HH/CH/Y/N/NG/SH/Z/ZH/JH
3 | Eh-type sounds (non-Rhubarb: Ee-type sounds) | C | E | IH/IY/EH/AH/EY/AW/ER
4 | Ah-type and I-type sounds | D | AI | AA/AE/AY
5 | Aww-type sounds; Ow-type sounds (can also go in 6) | E | O | AO/OW
6 | U-type and Oo-type sounds (non-Moho: W) | F | U | UW/OY/UH
7 | Moho: W | [same as 6] | WQ | W
8 | F, V | G | FV | F/V
9 | L (Th-type sounds can also go here, rather than in 2) | H | L | L
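As an illustration, here is what the Rhubarb column of the table amounts to if you were to set the mapping up by hand instead of calling AutoMapPhonemes() (a sketch derived from the table above; the module's built-in auto-mapping may differ in detail):

TotalLipSync.Init(eLipSyncRhubarb);
TotalLipSync.AddPhonemeMappings("X", 0);  // mouth closed / rest
TotalLipSync.AddPhonemeMappings("A", 1);  // M, B, P
TotalLipSync.AddPhonemeMappings("B", 2);  // various consonants
TotalLipSync.AddPhonemeMappings("C", 3);  // Eh-type sounds
TotalLipSync.AddPhonemeMappings("D", 4);  // Ah-type and I-type sounds
TotalLipSync.AddPhonemeMappings("E", 5);  // Aww/Ow-type sounds
TotalLipSync.AddPhonemeMappings("F", 6);  // U/Oo-type sounds and W
TotalLipSync.AddPhonemeMappings("G", 8);  // F, V
TotalLipSync.AddPhonemeMappings("H", 9);  // L
// Frame 7 is left unused: Rhubarb has no separate W shape.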
Where to get it

TotalLipSync is hosted on Github (mainly just as a way for me to learn about how Github works):
https://github.com/messengerbag/TotalLipSync
You can download the current release from there.
Known bugs

None
Change log

0.5
-Added APIs to get the currently lip syncing character, the current phoneme and current frame.
0.4
-Fixed support for Sierra-style speech
-Minor bug fixes for edge cases
-Documentation
0.2 (pre-release)
-Added support for Papagayo/Moho Switch (.dat), Annosoft SAPI 5.1 LipSync (.anno) and Rhubarb (.tsv)
0.1 (pre-release)
-Pamela support for LucasArts-style speech
Originally based on code by Calin Leafshade (though very little of it remains in the current version).
Thanks to Grundislav for providing a speech view used in development and testing of the module!