Today at AWS re:Invent, Amazon Web Services announced five new machine learning services and a deep learning-enabled wireless video camera for developers.
Amazon SageMaker is a fully managed service for developers and data scientists to quickly build, train, deploy, and manage their own machine learning models.
AWS also introduced AWS DeepLens, a deep learning-enabled wireless video camera that can run real-time computer vision models to give developers hands-on experience with machine learning.
In addition, AWS announced four new application services that allow developers to build applications that emulate human-like cognition: Amazon Transcribe for converting speech to text; Amazon Translate for translating text between languages; Amazon Comprehend for understanding natural language; and Amazon Rekognition Video, a new computer vision service for analysing videos in batches and in real time.
Amazon SageMaker and AWS DeepLens make machine learning accessible to all developers
Today, implementing machine learning is complex, involves a great deal of trial and error, and requires specialised skills.
Developers and data scientists must first visualise, transform, and pre-process data to get it into a format that an algorithm can use to train a model.
Even simple models can require massive amounts of compute power and a great deal of time to train, and companies may need to hire dedicated teams to manage training environments that span multiple GPU-enabled servers.
All of the phases of training a model—from choosing and optimising an algorithm, to tuning the millions of parameters that impact the model’s accuracy—involve a great deal of manual effort and guesswork.
Then, deploying a trained model within an application requires a different set of specialised skills in application design and distributed systems.
As data sets and variables grow, customers have to repeat this process again and again as models become outdated and need to be continuously retrained to learn and evolve from new information. All of this takes a lot of specialised expertise, access to massive amounts of compute power and storage, and a great deal of time.
To date, machine learning has been out of reach for most developers.
Amazon SageMaker is a fully managed service that removes the heavy lifting and guesswork from each step of the machine learning process.
Amazon SageMaker makes model building and training easier by providing pre-built development notebooks, popular machine learning algorithms optimised for petabyte-scale datasets, and automatic model tuning.
Amazon SageMaker also dramatically simplifies and accelerates the training process, automatically provisioning and managing the infrastructure to both train models and run inference to make predictions using these models.
AWS DeepLens was designed from the ground up to help developers get hands-on experience in building, training, and deploying models by pairing a physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services to support learning and experimentation.
“Our original vision for AWS was to enable any individual in his or her dorm room or garage to have access to the same technology, tools, scale, and cost structure as the largest companies in the world. Our vision for machine learning is no different,” says Swami Sivasubramanian, AWS Machine Learning VP.
“We want all developers to be able to use machine learning much more expansively and successfully, irrespective of their machine learning skill level. Amazon SageMaker removes a lot of the muck and complexity involved in machine learning to allow developers to easily get started and become competent in building, training, and deploying models.”
With Amazon SageMaker, developers can:
- Easily build machine learning models with performance-optimised algorithms: Amazon SageMaker's fully managed machine learning notebook environment makes it easy for developers to explore and visualise data they have stored in Amazon Simple Storage Service (Amazon S3), and transform it using all of the popular libraries, frameworks, and interfaces. Amazon SageMaker includes ten of the most common machine learning algorithms (e.g. k-means clustering, factorization machines, linear regression, and principal component analysis), which AWS has optimised to run up to ten times faster than standard implementations. Developers simply choose an algorithm and specify their data source, and Amazon SageMaker installs and configures the underlying drivers and frameworks. Amazon SageMaker includes native integration with TensorFlow and Apache MXNet, with additional framework support coming soon. Developers can also specify any framework and algorithm they choose by uploading them into a container on the Amazon EC2 Container Registry.
- Fast, fully managed training: Amazon SageMaker makes training easy. Developers simply select the type and quantity of Amazon EC2 instances and specify the location of their data. Amazon SageMaker sets up the distributed compute cluster, performs the training, outputs the result to Amazon S3, and tears down the cluster when complete. Amazon SageMaker can automatically tune models with hyper-parameter optimisation, adjusting thousands of different combinations of algorithm parameters to arrive at the most accurate predictions.
- Deploy models into production with one click: Amazon SageMaker takes care of launching instances, deploying the model, and setting up a secure HTTPS endpoint for the application to achieve high-throughput, low-latency predictions, as well as auto-scaling Amazon EC2 instances across multiple Availability Zones (AZs). It also provides native support for A/B testing. Once in production, Amazon SageMaker eliminates the heavy lifting involved in managing machine learning infrastructure, performing health checks, applying security patches, and conducting other routine maintenance.
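As a rough illustration of the training step described above, the sketch below assembles the kind of request a developer might send to Amazon SageMaker's CreateTrainingJob API through the boto3 SDK for Python: the type and quantity of instances, the Amazon S3 location of the training data, and the Amazon S3 location for the output. The job name, container image, IAM role, bucket paths, default instance type, volume size, and time limit are all placeholder assumptions, not values from the announcement.

```python
from typing import Any, Dict


def build_training_job_request(
    job_name: str,
    training_image: str,
    role_arn: str,
    train_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.m4.xlarge",  # assumed default, for illustration only
    instance_count: int = 1,
) -> Dict[str, Any]:
    """Assemble keyword arguments for SageMaker's CreateTrainingJob call."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": training_image,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Where the training data lives in Amazon S3.
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        # Where the trained model artefacts are written.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # The type and quantity of instances in the training cluster.
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,  # assumed value
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},  # assumed limit
    }


def submit_training_job(request: Dict[str, Any]) -> str:
    """Send the request with boto3; requires the boto3 package and AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("sagemaker")
    response = client.create_training_job(**request)
    return response["TrainingJobArn"]


if __name__ == "__main__":
    # All identifiers below are hypothetical placeholders.
    request = build_training_job_request(
        job_name="example-job",
        training_image="example-training-image-uri",
        role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
        train_s3_uri="s3://example-bucket/train/",
        output_s3_uri="s3://example-bucket/output/",
        instance_count=2,
    )
    print(request["ResourceConfig"])
```

Only `submit_training_job` talks to AWS. Keeping the request builder separate means the parameters can be inspected or tested without an account.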
With AWS DeepLens, developers can:
- Get hands-on machine learning experience: AWS DeepLens is the first of its kind: a deep learning-enabled, fully programmable video camera, designed to put deep learning into the hands of any developer, literally. AWS DeepLens includes an HD video camera with on-board compute capable of running sophisticated deep learning computer vision models in real time. The custom-designed hardware, capable of running over 100 billion deep learning operations per second, comes with sample projects, example code, and pre-trained models so even developers with no machine learning experience can run their first deep learning model in less than ten minutes. Developers can extend these tutorials to create their own custom, deep learning-powered projects with AWS Lambda functions. For example, AWS DeepLens could be programmed to recognize the numbers on a license plate and trigger a home automation system to open a garage door, or AWS DeepLens could recognize when the dog is on the couch and send a text to its owner.
- Train models in the cloud and deploy them to AWS DeepLens: AWS DeepLens integrates with Amazon SageMaker so that developers can train their models in the cloud with Amazon SageMaker and then deploy them to AWS DeepLens with just a few clicks in the AWS Management Console. The camera runs the models, in real-time, on the device.
New speech, language, and vision services allow app developers to easily build intelligent applications
For those developers who are not experts in machine learning, but are interested in using these technologies to build a new class of apps that exhibit human-like intelligence, Amazon Transcribe, Amazon Translate, Amazon Comprehend, and Amazon Rekognition Video provide high-quality, high-accuracy machine learning services that are scalable and cost-effective.
“Today, customers are storing more data than ever before, using Amazon Simple Storage Service (Amazon S3) as their scalable, reliable, and secure data lake,” Sivasubramanian says.
“These customers want to put this data to use for their organization and customers, and to do so they need easy-to-use tools and technologies to unlock the intelligence residing within this data.
“We’re excited to deliver four new machine learning application services that will help developers immediately start creating a new generation of intelligent apps that can see, hear, speak, and interact with the world around them,” he adds.
- Amazon Transcribe (available in preview) converts speech to text, allowing developers to turn audio files stored in Amazon S3 into accurate, fully punctuated text. Amazon Transcribe has been trained to handle even low-fidelity audio, such as contact center recordings, with a high degree of accuracy. Amazon Transcribe can generate a time stamp for every word so that developers can precisely align the text with the source file. Today, Amazon Transcribe supports English and Spanish, with more languages to follow. In the coming months, Amazon Transcribe will have the ability to recognize multiple speakers in an audio file, and will also allow developers to upload custom vocabulary for more accurate transcription of those words.
- Amazon Translate (available in preview) uses state-of-the-art neural machine translation techniques to provide highly accurate translation of text from one language to another. Amazon Translate can translate short or long-form text and supports translation between English and six other languages (Arabic, French, German, Portuguese, Simplified Chinese, and Spanish), with many more to come in 2018.
- Amazon Comprehend (available today) can understand natural language text from documents, social network posts, articles, or any other textual data stored in AWS. Amazon Comprehend uses deep learning techniques to identify text entities (e.g. people, places, dates, organizations), the language the text is written in, the sentiment expressed in the text, and key phrases with concepts and adjectives, such as ‘beautiful,’ ‘warm,’ or ‘sunny.’ Amazon Comprehend has been trained on a wide range of datasets, including product descriptions and customer reviews from Amazon.com, to build best-in-class language models that extract key insights from text. It also has a topic modelling capability that helps applications extract common topics from a corpus of documents. Amazon Comprehend integrates with AWS Glue to enable end-to-end analytics of text data stored in Amazon S3, Amazon Redshift, Amazon Relational Database Service (Amazon RDS), Amazon DynamoDB, or other popular Amazon data sources.
- Amazon Rekognition Video (available today) can track people, detect activities, and recognize objects, faces, celebrities, and inappropriate content in millions of videos stored in Amazon S3. It also provides real-time facial recognition across millions of faces for live stream videos. Amazon Rekognition Video’s easy-to-use API is powered by computer vision models that are trained to accurately detect thousands of objects and activities, and extract motion-based context from both live video streams and video content stored in Amazon S3. Amazon Rekognition Video can automatically tag specific sections of video with labels and locations (e.g. beach, sun, child), detect activities (e.g. running, jumping, swimming), detect, recognize, and analyze faces, and track multiple people, even if they are partially hidden from view in the video.
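To make the per-word time stamps mentioned for Amazon Transcribe more concrete, here is a minimal sketch of how a developer might align transcribed words with positions in the source audio. It assumes the transcript JSON has a `results.items` list in which spoken words are marked `pronunciation` and carry `start_time` and `end_time` strings in seconds. The sample transcript and the helper names `word_timings` and `words_between` are made up for illustration.

```python
from typing import Any, Dict, List, Tuple

WordTiming = Tuple[str, float, float]


def word_timings(transcript: Dict[str, Any]) -> List[WordTiming]:
    """Return (word, start_seconds, end_seconds) for each spoken word."""
    timings: List[WordTiming] = []
    for item in transcript["results"]["items"]:
        if item.get("type") != "pronunciation":
            continue  # punctuation items carry no time stamps
        word = item["alternatives"][0]["content"]
        timings.append((word, float(item["start_time"]), float(item["end_time"])))
    return timings


def words_between(timings: List[WordTiming], start: float, end: float) -> List[str]:
    """Return the words spoken entirely within [start, end] seconds."""
    return [word for word, s, e in timings if s >= start and e <= end]


# Hypothetical transcript in the assumed shape; not real service output.
SAMPLE_TRANSCRIPT = {
    "results": {
        "transcripts": [{"transcript": "Hello, world. Goodbye"}],
        "items": [
            {"type": "pronunciation", "start_time": "0.04", "end_time": "0.45",
             "alternatives": [{"content": "Hello", "confidence": "0.99"}]},
            {"type": "punctuation",
             "alternatives": [{"content": ",", "confidence": "0.0"}]},
            {"type": "pronunciation", "start_time": "0.50", "end_time": "0.92",
             "alternatives": [{"content": "world", "confidence": "0.98"}]},
            {"type": "punctuation",
             "alternatives": [{"content": ".", "confidence": "0.0"}]},
            {"type": "pronunciation", "start_time": "1.30", "end_time": "1.90",
             "alternatives": [{"content": "Goodbye", "confidence": "0.97"}]},
        ],
    }
}

if __name__ == "__main__":
    timings = word_timings(SAMPLE_TRANSCRIPT)
    # Words spoken during the first second of audio.
    print(words_between(timings, 0.0, 1.0))  # prints ['Hello', 'world']
```

An application could use the same lookup to jump a media player to the moment a given word was spoken, or to caption a clip cut from a longer recording.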