speech to text v2


Chris C(Posted 2006) [#1]
It now uses wavs for the allophones, and should be cross-platform.

Where I have the allophone wavs right, the words sound *really* good.

HOWEVER, I'm no sound editor and I need YOUR help

If you have a good ear for sounds, a nice voice and can chop up wavs, please contact me via the email address in my profile. (Is there any latency with oggs?)

Thanks


Chris C(Posted 2006) [#2]
getting there

http://homepage.ntlworld.com/chris.camacho/example.ogg

still need help...


Physt(Posted 2006) [#3]
Hi Chris,

Concatenating wavs will improve the quality of the speech; however, the phonemes will not blend well. Have a listen to my ChipTalk software. It uses .wavs from the original SP0256-AL2. It would sound better if all the wavs were re-recorded with a human voice, but since I was trying to recreate the SP0256 sound, I never bothered.

http://www.speechchips.com/shop/item.asp?itemid=6

If you were going to do a PC speech synth you would have the luxury of recording 100 or 200 common words and only resorting to phonemes when you encounter an unknown word. The best speech synthesizers use thousands of pre-canned words and phrases and only build a small percentage of their words on the fly.
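As a rough illustration of that lookup-then-fallback idea, here is a minimal sketch in Python (chosen only for brevity; it should port to BlitzMax easily). The clip file names and the letter-to-allophone table are made-up placeholders, not anything taken from ChipTalk or the SP0256-AL2, and a real synth would need proper text-to-phoneme rules rather than one sound per letter.

```python
# Hypothetical data: both tables are illustrative placeholders only.
WORD_CLIPS = {"hello": "hello.wav", "world": "world.wav"}
LETTER_ALLOPHONES = {"c": "k", "a": "ae", "t": "t"}


def clips_for_word(word):
    """Return the list of clips to play for one word."""
    word = word.lower()
    # Prefer a pre-recorded whole word when one exists.
    if word in WORD_CLIPS:
        return [WORD_CLIPS[word]]
    # Otherwise fall back to one allophone clip per known letter.
    return [LETTER_ALLOPHONES[ch] + ".wav" for ch in word if ch in LETTER_ALLOPHONES]


def clips_for_text(text):
    """Return the clips for a whole sentence, word by word."""
    clips = []
    for word in text.split():
        clips.extend(clips_for_word(word))
    return clips
```

For example, clips_for_text("hello cat") would use the canned clip for "hello" and build "cat" from three allophone clips.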

Also, when building words, they sometimes store the transitions between phonemes so that there is some smoothing between sounds.
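Storing recorded transitions needs extra samples for each phoneme pair. A much cruder stand-in, and not the stored-transition approach itself, is a short linear crossfade where two clips join. The sketch below assumes each clip is already decoded into a plain list of float samples; the function name and the overlap parameter are my own choices for illustration.

```python
def crossfade_concat(a, b, overlap):
    """Join sample lists a and b, blending the last `overlap` samples
    of a with the first `overlap` samples of b."""
    # Never overlap more samples than either clip actually has.
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b  # plain concatenation, no smoothing

    head = a[:len(a) - overlap]   # part of a that is kept untouched
    tail = b[overlap:]            # part of b that is kept untouched
    blended = []
    for i in range(overlap):
        # t rises from just above 0 to just below 1 across the overlap.
        t = (i + 1) / (overlap + 1)
        blended.append(a[len(a) - overlap + i] * (1 - t) + b[i] * t)
    return head + blended + tail
```

The result is shorter than the two clips laid end to end by `overlap` samples, so a long overlap will noticeably clip short allophones.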


Chris C(Posted 2006) [#4]
Recording 100 or 200 common words seems a bit (a lot) like cheating to me!

It's coming along okay. I'll probably have a bash at redoing the allophones some time, now that I have a working framework.

What allophones did the SP0256-AL2 use?


Physt(Posted 2006) [#5]
Check the SP0256-AL2 datasheet for a list of phonemes. Also, there is a subdirectory under the ChipTalk installation with all the phoneme wavs.