Microsoft's Text-to-Speech Service No Longer Sounds Like a Machine

2021-12-19 13:54

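Today I came across the text-to-speech (TTS) service that Microsoft released in May 2021. I tried it, and I genuinely could not tell that a machine was reading. It distinguishes polyphonic Chinese characters, such as 大 in 士大夫 (dà) versus 大夫 (dài), it links erhua (儿化音) endings smoothly, and it infers tone and emotion automatically. It is impressively smart, and it feels as though broadcasters may soon be out of work.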

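Listen to an official sample first and see whether you can tell it was read by a machine.

If you are interested, you can try it yourself here^[1].

If the robot voice in WeChat Read (微信读书) could read like this, the experience would be much better.

Microsoft also provides Python code for calling the service: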

import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config with specified subscription key and service region.
# Replace with your own subscription key and service region (e.g., "westus").
speech_key, service_region = "YourSubscriptionKey", "YourServiceRegion"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

# Creates a speech synthesizer using the default speaker as audio output.
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

# Receives a text from console input.
print("Type some text that you want to speak...")
text = input()

# Synthesizes the received text to speech.
# The synthesized speech is expected to be heard on the speaker with this line executed.
result = speech_synthesizer.speak_text_async(text).get()

# Checks result.
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized to speaker for text [{}]".format(text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        if cancellation_details.error_details:
            print("Error details: {}".format(cancellation_details.error_details))
    print("Did you update the subscription info?")
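
The sample above reads plain text with the default voice. The SDK also accepts SSML through `speak_ssml_async`, which lets you choose a voice and a speaking style explicitly instead of relying on what the service infers. Below is a minimal sketch, not Microsoft's official sample: the helper names `build_ssml` and `speak_ssml` are hypothetical, and the voice `zh-CN-XiaoxiaoNeural` and style `cheerful` are assumptions that you should check against the voices available in your region.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="zh-CN-XiaoxiaoNeural", style=None, lang="zh-CN"):
    """Wrap plain text in an SSML document for one voice and an optional style.

    The default voice and any style passed in are assumptions; not every
    voice supports every style.
    """
    body = escape(text)  # escape &, <, > so the text cannot break the XML
    if style:
        body = f"<mstts:express-as style={quoteattr(style)}>{body}</mstts:express-as>"
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


def speak_ssml(ssml, key, region):
    """Send an SSML string to the service and play it on the default speaker."""
    # Imported lazily so build_ssml can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    return synthesizer.speak_ssml_async(ssml).get()


if __name__ == "__main__":
    # Prints the SSML only; call speak_ssml(...) with your own key to hear it.
    print(build_ssml("他是一位大夫。", style="cheerful"))
```

The result returned by `speak_ssml` can be checked with the same `result.reason` logic as in the sample above.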

To run the code above, you need a Microsoft Azure account. A free trial is available, and the tutorial^[2] is linked at the end of this post.
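
Once you have a key and region from your Azure account, you may prefer not to paste them into the script. The sketch below reads them from environment variables and writes the audio to a WAV file instead of the speaker. The variable names `SPEECH_KEY` and `SPEECH_REGION`, the helper names, and the `output.wav` path are assumptions chosen for illustration.

```python
import os


def load_speech_settings(env=os.environ):
    """Return (key, region) from environment variables.

    SPEECH_KEY and SPEECH_REGION are assumed names; use whatever you exported.
    """
    key = env.get("SPEECH_KEY")
    region = env.get("SPEECH_REGION")
    if not key or not region:
        raise RuntimeError("Set SPEECH_KEY and SPEECH_REGION before running.")
    return key, region


def synthesize_to_wav(text, path="output.wav"):
    """Synthesize text and save it to a WAV file instead of playing it."""
    # Imported lazily so the settings helper works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    key, region = load_speech_settings()
    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioOutputConfig(filename=path)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config
    )
    return synthesizer.speak_text_async(text).get()
```

Calling `synthesize_to_wav("你好，世界")` with both variables set should leave the audio in `output.wav`; the returned result can be inspected in the same way as in Microsoft's sample.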

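Closing thoughts

I believe that in the near future we will be completely unable to tell whether a voice we hear comes from a real person or from a machine.

References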

[1] Here: https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/?ocid=AID3027325#features

[2] Tutorial: https://docs.microsoft.com/zh-cn/azure/cognitive-services/speech-service/get-started-text-to-speech?tabs=script%2Cwindowsinstall&pivots=programming-language-python
