AWS launches machine learning-driven text and data extractor Amazon Textract
Amazon Web Services has made Amazon Textract generally available, adding a powerful machine learning tool to its cloud services offering.
Textract, a fully managed service, can automatically extract text and data from the vast majority of document types without manual review, custom code or machine learning experience.
According to AWS’s press release, Textract can interpret the information it reads in context, based on the document's format and the fields it contains.
The firm cited names, social security numbers, tax documents, mortgage guarantees, contracts and product SKUs as a small sample of the types of information and documents it can identify.
That information can then be indexed to power smart searches across large archives, or loaded into a database for use in operating or developing software applications.
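For developers, the field-level extraction described above might look something like the following Python sketch. It is not taken from AWS's announcement. It assumes the boto3 SDK's `analyze_document` call with the `FORMS` feature, and it assumes the response is a flat list of blocks in which key blocks point to their value blocks and to their words by ID. The helper names `form_fields` and `analyze_file` are hypothetical.

```python
from typing import Dict, List


def _text_of(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the WORD children of a block into a single string."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def form_fields(response: dict) -> Dict[str, str]:
    """Turn a Textract-style response into a {field label: field value} dict.

    Assumes the block layout described in the lead-in: KEY_VALUE_SET blocks
    tagged KEY carry a VALUE relationship pointing at the matching value block.
    """
    by_id = {block["Id"]: block for block in response["Blocks"]}
    fields: Dict[str, str] = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        label = _text_of(block, by_id)
        value = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                value = " ".join(_text_of(by_id[i], by_id) for i in rel["Ids"])
        fields[label] = value
    return fields


def analyze_file(path: str) -> Dict[str, str]:
    """Send one local image to Textract and return its form fields.

    Needs AWS credentials and the boto3 package, so boto3 is imported here
    rather than at module level.
    """
    import boto3

    client = boto3.client("textract")
    with open(path, "rb") as handle:
        response = client.analyze_document(
            Document={"Bytes": handle.read()},
            FeatureTypes=["FORMS"],
        )
    return form_fields(response)
```

The resulting dictionary is the kind of structured record that could then be written to a database or a search index, as the article describes. Calling `analyze_file("scanned_form.png")` (a made-up file name) would make a billable request to the service, whereas `form_fields` can be exercised offline against a saved response.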
“The power of Amazon Textract is that it accurately extracts text and structured data from virtually any document with no machine learning experience required,” said Swami Sivasubramanian, Vice President of Amazon Machine Learning, in AWS’s press release.
“Subsequently, developers can analyze and query the extracted text and data using our database and analytics services like Amazon Elasticsearch Service, Amazon DynamoDB, and Amazon Athena and integrate with other machine learning services like Amazon Comprehend, Amazon Comprehend Medical, Amazon Translate, and Amazon SageMaker to help customers derive deeper meaning from the extracted text and data.”
He continued: “In addition to the integration with other AWS services, the rich partner community developing around Amazon Textract makes it possible for customers to gain real meaning from their file collections, operate more efficiently, improve security compliance, automate data entry, and facilitate faster business decisions.”