It’s a sad state of affairs when Siri, Apple’s voice recognition assistant, becomes national news. But that’s exactly what happened last week when people wrongly ascribed political leanings to Apple in the wake of reports that Siri didn’t direct interested users to abortion clinics. Never mind that places like Planned Parenthood don’t advertise themselves as abortion clinics; people, as you know, have a proclivity for complaining about anything that might make headlines.
Abortion nonsense aside, Siri is by all accounts useful for the vast majority of iPhone 4S users who use it. But let’s be honest: for as great as Siri is, it does come up short in a few respects. Siri, remember, is still in Beta.
Some cynics were quick to accuse Apple of releasing beta software only as a means to stave off any potential disappointment over the lack of a larger-screened iPhone 5. This line of reasoning, however, completely ignores the way voice recognition software works and the factors necessary for it to evolve and improve.
Benoit Maison recently wrote a blog post detailing why a software feature like Siri necessarily should be released in Beta. Drawing on years of experience working on speech recognition at IBM, Maison explains that the key ingredient in developing world-class voice recognition software is data. A helluva lot of it.
Transcribed speech recordings are used to train acoustic models (how sound waveforms relate to phonemes), pronunciation lexicons (how people actually mispronounce words, especially people and place names), language models (spoken phrases rarely conform to English grammar), and natural language processors. And that for each supported language! More training data means the recognizer can handle more variations in voices, accents, manners of speech, etc.
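To see why sheer data volume matters, consider just the language-model piece of Maison’s list. The sketch below is a deliberately toy bigram counter in Python — it reflects nothing of Apple’s or IBM’s actual implementations — but the principle holds: every transcribed utterance adds evidence about which word sequences are likely, so a model trained on more usage data can resolve contexts a smaller one has simply never seen.

```python
from collections import defaultdict

def train_bigram_model(transcripts):
    """Count bigram frequencies from a list of transcribed utterances.

    A toy stand-in for a recognizer's language model: each transcript
    seen adds evidence about which word follows which.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for utterance in transcripts:
        words = utterance.lower().split()
        for prev, curr in zip(words, words[1:]):
            counts[prev][curr] += 1
    return counts

def most_likely_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)

# With only a handful of transcripts, the model knows very little:
small = train_bigram_model(["call my mom"])

# With more usage data, it can resolve contexts the small model cannot:
large = train_bigram_model([
    "call my mom",
    "call my office",
    "text my mom",
    "call my mom please",
])
```

Here `most_likely_next(small, "text")` returns `None` — the small model has never seen the word — while the larger model handles it fine. Multiply that effect across acoustic models, lexicons, and every supported language, and the case for releasing early to millions of users makes itself.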
In other words, for Siri to be as effective and efficient as possible, it needed, and still needs, to be trained. Extensive testing by Apple engineers might have been sufficient to get it off the ground and acceptable for launch, but that by itself pales in comparison to the amount of data Apple is currently accumulating from the millions of iPhone 4S users around the globe who use Siri on a daily basis.
Siri had to be “released in its current form,” Maison concludes, “to get exposure to as much variability as possible all the way from the acoustics to the interpretation of natural language.”