Hands-On Natural Language Processing with Python
上QQ阅读APP看书,第一时间看更新

Converting voice-to-text

Sometimes, a voice has to be converted to text. This is a speech recognition problem. The Google speech recognition system works in 120 languages. The audio can be streamed, or a prerecorded video can be sent. Formatting can be done for different categories, such as proper nouns and punctuation. The following example is from https://cloud.google.com/speech-to-text/:

There are different models provided, for videos, phone calls, and search-based audio. This works even when there is background noise, and the system can filter inappropriate content.