stuh: "Theoretical barrier" doesn't necessarily mean "IMPOSSIBLE! IMPOSSIBLE! IMPOSSIBLE!". In this case, however, it does mean "IMPOSSIBLE! IMPOSSIBLE! At least without insane advances in AI, for example!". There are, as DarkStalkey mentioned, pitch shifters that can alter the pitch of a sound and its formants independently, so we have something a little better than just speeding things up or slowing them down, which moves pitch and formants together. They're still not all that useful for your purposes if you don't do your own acting, though, and their range is limited (not even talking about quality - that's a whole different issue).
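To show what "just speeding things up" does, here's a rough sketch in Python with NumPy. Everything in it is my own assumption for illustration: the 100 Hz and 700 Hz sine waves are only crude stand-ins for a voice's pitch and one formant (real formants are resonances, not single tones), and `tone_mix`, `speed_change` and `peak_frequency` are made-up helper names. It shows only the naive approach, not how an actual formant-preserving pitch shifter works.

```python
import numpy as np

SR = 16000  # sample rate in Hz, arbitrary for this sketch


def tone_mix(freqs, seconds=1.0, sr=SR):
    """Sum of equal-amplitude sine waves, a crude stand-in for a voice."""
    t = np.arange(int(seconds * sr)) / sr
    return sum(np.sin(2 * np.pi * f * t) for f in freqs)


def speed_change(signal, factor):
    """Play the signal `factor` times faster via linear-interpolation resampling."""
    n_out = int(len(signal) / factor)
    positions = np.arange(n_out) * factor  # fractional read positions
    return np.interp(positions, np.arange(len(signal)), signal)


def peak_frequency(signal, lo, hi, sr=SR):
    """Frequency (Hz) of the strongest spectral bin between lo and hi."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    band = (freqs >= lo) & (freqs <= hi)
    return float(freqs[band][np.argmax(spectrum[band])])


# Hypothetical "voice": 100 Hz stands in for pitch, 700 Hz for a formant.
voice = tone_mix([100.0, 700.0])
faster = speed_change(voice, 1.5)

before = (peak_frequency(voice, 50, 400), peak_frequency(voice, 400, 2000))
after = (peak_frequency(faster, 50, 400), peak_frequency(faster, 400, 2000))

print("pitch, formant before:", before)
print("pitch, formant after: ", after)
```

Both peaks move up by the same factor, which is why a sped-up voice sounds like a chipmunk rather than like a different person. A shifter that handles formants separately would have to move the first number while leaving the second roughly where it was.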
Quote
if the distinction between male and female can be determined, then the change can also be made...
The difference is harder to tell than you think, though I wouldn't necessarily call it impossible with today's technology - and being able to make the change doesn't magically follow from it. It gets even harder with emotional content - there are so many things to consider. Little sounds on the edge of speech. Short transitions to, oh, shouting or growling or whispering or wailing, or almost doing one of these, at fitting points of the text. And there are too many different ways to do each of these, all kinds of little details that make up an individual's voice.
This means that even if we had the technology for all that, you'd most likely still have to keep track of a large number of parameters and do much of the work yourself by sequencing this or that, adjusting transitions... and on top of that it's a lot harder to do that to a recording than to synthesise it outright. Maybe we'll start seeing software for that in a few years, and perhaps it'll sound almost acceptable after a while.
I don't recommend holding your breath.