Amazon to the rescue with TTS

Two weeks ago I published a blog post about Microsoft shutting down their Data Market place with a deadline of March 25th 2017, leaving many smart home owners with almost no options for reasonably good Text-to-Speech (TTS) event notifications and announcements in their smart homes.

To make things worse, Microsoft is now redirecting all customers from http://datamarket.azure.com to their standard Azure website. As an existing user of the Microsoft Translator service, you will look for your existing service and your authorization keys without any luck. All the links on the Azure website try to make you sign up for a new Azure account with a $200 credit… unless you still have the old URL for your service, which is https://datamarket.azure.com/dataset/bing/microsofttranslator, or alternatively you can use https://datamarket.azure.com/account/.

As an end user I have to say that this kind of customer handling is unacceptable, especially after Microsoft emailed every customer that their access would remain available until March 25th 2017; this was even stated in a top banner on their old Data Market place website. I posted screenshots of both in my previous blog post here: http://homeautomation.expert/azure-datamarket-shutdown.

With all this uncertainty about the future of Text-to-Speech (TTS) for smart home owners, Amazon yesterday announced the release of their new service called “Amazon Polly”: https://aws.amazon.com/polly/.

“Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products. Polly is an Amazon AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. Polly includes 47 lifelike voices spread across 24 languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries.”

Here is an example of the quality of Amazon Polly.

Amazon Polly is offered under the Amazon Free Tier for 12 months free of charge, starting from the day an end user creates his/her AWS account. Under the Free Tier an end user can submit up to 5.000.000 characters per month. After the Free Tier period has ended, the end user pays $4.00 per 1.000.000 characters.

Let’s compare the currently active Microsoft TTS service and Amazon Polly, even though Microsoft is shutting down their Data Market place and moving this feature into their “Cognitive Services accounts” category in Azure, which is currently available in preview only, with no pricing information unless you sign up for an Azure account:

Amazon also provides example use cases that let end users estimate how many characters certain voice tasks will consume. The examples range from a number of requests with a number of characters per request to emails, books and news articles. For this exercise, I examined typical smart home usage using the following formula:

~50 characters per request x 14 requests per hour x 24 hours per day x 30 days per month = 504.000 characters / month

Those numbers are averages normalized over the duration of one year. A smart home owner would have to double the number of requests or the length of the announcements to cross the 1.000.000 character barrier into the next price range of Amazon Polly.
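For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the estimate above. The function names are made up for this post, and the cost function assumes that Polly bills pro rata at $4.00 per million characters with an optional free allowance; please check the current AWS pricing page before relying on it.

```python
def monthly_characters(chars_per_request, requests_per_hour,
                       hours_per_day=24, days_per_month=30):
    """Estimate how many characters a smart home sends to a TTS service per month."""
    return chars_per_request * requests_per_hour * hours_per_day * days_per_month


def polly_monthly_cost(characters, price_per_million=4.00, free_characters=0):
    """Estimate the monthly cost in USD.

    Assumption (not confirmed by this post): usage is billed pro rata per
    character, and `free_characters` models a free allowance such as the
    Free Tier quota.
    """
    billable = max(0, characters - free_characters)
    return billable / 1_000_000 * price_per_million


if __name__ == "__main__":
    chars = monthly_characters(chars_per_request=50, requests_per_hour=14)
    print(f"{chars} characters / month")
    print(f"Estimated cost after the Free Tier: ${polly_monthly_cost(chars):.2f}")
    print(f"Estimated cost during the Free Tier: "
          f"${polly_monthly_cost(chars, free_characters=5_000_000):.2f}")
```

Doubling either the request rate or the announcement length doubles the result, which is why the typical usage above stays well below one million characters per month.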

Efficiency

The other important aspect of comparing those two Text-to-Speech (TTS) services is their efficiency. By efficiency I mean file size and transfer time.

The example voice output above consumes 48 KB using Amazon Polly. The same text synthesized using the Microsoft TTS engine consumes 142 KB. The total time to upload the text, synthesize it into voice output and push the result back to the end user is affected by the file size and the number of characters.

Both engines allow the output file format to be defined, although the most commonly used output for smart homes is, and will continue to be, .mp3 for compatibility reasons.

Amazon offers a comprehensive tutorial about Polly and code examples for Python, iOS and Android. Microsoft offers examples for AJAX, SOAP and HTTP. For both TTS services the end user has to create credentials to use the actual service. For Microsoft the end user creates a client ID and a client secret, which are used to authenticate the application/end user.

With Amazon the security model is much more sophisticated. Amazon uses Identity and Access Management (IAM), where the end user has a root account, which can and should be protected with multi-factor authentication. From there the end user can create various users and groups, which can then use the Amazon Polly service.

The Polly service offers two policies by default: Full Access and Read Only Access. These can be assigned to user accounts to use the Amazon Polly service, utilizing the Signature Version 4 Test Suite from Amazon for the signing process.

One more important item to mention is that Amazon Polly supports Speech Synthesis Markup Language (SSML). Amazon Polly generates speech from both plain text input and Speech Synthesis Markup Language (SSML) documents that conform to SSML version 1.1. Using SSML tags, you can customize and control aspects of speech such as pronunciation, volume, and speech rate as defined in the W3C recommendation https://www.w3.org/TR/2010/REC-speech-synthesis11-20100907/.
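To give an idea of what SSML input looks like in practice, here is a small Python sketch. The helper names `build_ssml` and `synthesize_to_mp3` are my own for illustration. The second function assumes the third-party boto3 AWS SDK is installed and that credentials and a region for an IAM user with Polly access are configured locally; "Joanna" is just one of the available voices. Treat it as a starting point, not a finished integration.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, rate="medium", volume="medium", pause_ms=0):
    """Wrap plain text in SSML <speak>/<prosody> tags, escaping XML special characters."""
    body = escape(text)
    if pause_ms > 0:
        # Optional pause before the announcement starts.
        body = f'<break time="{int(pause_ms)}ms"/>' + body
    return (f"<speak><prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>"
            f"{body}</prosody></speak>")


def synthesize_to_mp3(ssml, path, voice_id="Joanna"):
    """Send an SSML document to Amazon Polly and save the returned audio as .mp3.

    Assumption: boto3 is installed and AWS credentials plus a region are configured.
    """
    import boto3  # imported lazily so build_ssml works without the AWS SDK

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


if __name__ == "__main__":
    ssml = build_ssml("Washer & dryer done", rate="slow", volume="loud")
    print(ssml)
    # Uncomment once AWS credentials are configured:
    # synthesize_to_mp3(ssml, "announcement.mp3")
```

Escaping the text matters because characters such as "&" or "<" in an announcement would otherwise produce an invalid SSML document.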

In summary… Amazon released their Text-to-Speech (TTS) service Amazon Polly at the right time, offering superior efficiency in terms of response time and file size, while being 2.5x more cost efficient than Microsoft’s Text-to-Speech (TTS) service today.

The first integration attempts into smart home hubs are already in progress, e.g. Lua code was being shared within 48 hours of Amazon releasing Polly.

HomeAutomation.Expert

Disclaimer: This blog and tweets represent my own view points and not of my employer, Amazon Web Services.

Chinese Alexa or Chinese Google Home

With Google Home entering the VCD (Voice Command Device) market, people assumed that there would be a rivalry between Google and Amazon. Out of the blue, a third competitor from China entered the stage. This device, named “Ling Long Ding Dong” (the name is for real), will address the Chinese VCD market.

The DingDong, which costs the equivalent of $118, provides news, weather, and stock updates. It answers questions, manages schedules, provides directions, and plays music and audiobooks. It is the first product from Beijing LingLong Co., a $25 million joint venture between JD.com, China’s largest online retailer, and voice recognition powerhouse iFlytek.

The gadget weighs about 3 pounds and stands 9.5 inches tall. It is circular at the top and square on the bottom, and available in white, red, black, and purple. The shape symbolizes tiānyuán dìfāng—the notion that “heaven is round, Earth is square,” a concept that Liu says is central to LingLong’s design language. The colors also are imbued with meaning; white is associated with purity, and red with prosperity.

Three commands wake the device: DingDong DingDong, Xiaowei Xiaowei (a girl’s nickname), and BaiLing BaiLing (skylark). The DingDong comes in Mandarin and Cantonese versions (the engines required to understand the languages are too complex to include both in one device). Most people speak Mandarin, and the myriad accents and dialects present a Herculean challenge. Still, the company claims the DingDong understands roughly 95 percent of the population.

The company will have their own skill market place, which will include applications and skills for home automation.

Azure DataMarket shutdown

Website screenshot
Shutdown Email

Microsoft announced the shutdown of their Data Market place, as you can see in that email. One of their services used in smart home deployments is their TTS (Text-to-Speech) service, which allows smart homes to announce events by voice in different languages and genders.

This service was free of charge for up to 2.000.000 characters, which was more than enough for the most common smart homes. Anything beyond 2M was reasonably priced, if needed.

This Microsoft TTS service became very popular when Google implemented CAPTCHA (a program or system intended to distinguish human from machine input, typically as a way of thwarting spam and automated extraction of data from websites), which meant smart homes could no longer use Google to announce events by voice.

There are other options like Mary TTS, FreeTTS, Acapela, etc., where you can install a local TTS server at home to replace a cloud-based TTS service. However, not everybody has the skills and knowledge to install and maintain a local TTS server. The benefit of a local TTS server is independence: even if your internet connectivity is down, you still get voice announcements for your smart home events.

VoiceRSS is another cloud-based option, offering up to 350 requests per day at no cost. With an average of ~45 characters per request x 350 requests per day x 30 days per month, that comes to ~500.000 characters per month, compared to Microsoft’s 2.000.000 characters per month.

However, quality of voice is another aspect to consider. There are plenty of TTS services out there and THE biggest complaint about those is the robotic sound of those voices or even worse not being able to understand sentences, while understanding single words. This is a huge challenge, as you want a smart home to sound like a smart home and not like a robot from the 70s.

This will be an interesting market to watch and more options will arise in the future, but for now people are looking for alternatives to Microsoft’s TTS service given that it is being shut down March 31st 2017.

Ecobee3 Lite

Ecobee released a new product: the Ecobee 3 lite. Unlike competitors, who release a gen 2 or gen 3 product, Ecobee has trimmed the feature set for households that don’t need all the fancy features, allowing them to bring the price down significantly.

To allow a lower price point, features like the remote sensors, geofencing, etc. had to go. All the main features of a smart thermostat are still there with the same look and feel, and even the interface is the same.

If you don’t need the follow me feature using the remote sensors or if your HVAC system doesn’t provide the option to heat certain parts of your house separately, then you don’t need the Ecobee 3 and you have the option now to go with the Ecobee 3 lite. Remote sensors cannot be added later to the Ecobee 3 lite, but having the option now to downsize the price for features you really need is convenient and helps with your budget.

The Ecobee 3 comes in at $249, while the new Ecobee 3 lite has a price point of $169. This price difference is not insignificant. Time will tell how many households require the more feature-rich model, especially as all the other features are still there, e.g. integration with IFTTT, Amazon Alexa (Echo, Echo Dot), Samsung, Wink, Vera, etc. for home automation.

GoogleHome

Google finally revealed more details about their Google Home assistant product on Oct 4th 2016. They now also accept pre-orders ahead of the official release date of Nov 4th 2016.

Google is now entering the voice control market, after announcing earlier this year that they were working on an “Alexa like product but much better”. At the Oct 4th 2016 Google event, where Google announced a variety of new products and product refreshes, they provided detailed insight into the new Google Home product line.

Let’s go over the major features and functions announced for Google Home. Appearance is the first item Google addressed, by offering a variety of textures and colors to choose from. Amazon Echo is also available in white now, and so will be the Echo Dots when they are released at the end of October 2016.

Here is a video from Google demonstrating some use cases for Google Home.  Check it out.

A very important part of any voice assistant product line is their interoperability with “smart devices”. Alexa from Amazon has a huge head start, but Google is now starting and investing heavily in this area as well. Here is a comparison chart between Google Home and Amazon Alexa.

Far-Field Speaker comparison

Comparing the two systems

The Google Home Speaker seems impressive around all corners and in all directions. Once the product is released more detailed sound comparisons can be conducted.

The Alexa speaker has proven to be quite impressive in terms of sound quality and bass. Given the size of the product, the sound it produces is excellent.

In summary:

Google is releasing a major competitor to Amazon’s Echo/Alexa family. It lacks in some areas, while it has an advantage in others. In the end, ease of use, integration and interoperability options will drive customer adoption. Either way, competition in the marketplace is always good because it benefits the end consumer… us!
