Amazon Textract

  • automatically detects and extracts text, handwriting, and data from scanned documents
    • also the relationships between pieces of text (e.g. item/price in an invoice)
    • metadata (where text occurs)
    • document analysis (names, addresses, birth dates)
    • receipt analysis (price, vendor, line items, dates)
    • identity documents (extracts fields, e.g. DocumentID)
  • input = document (JPEG, PNG, PDF, TIFF)
  • output = extracted text, structure of text & analysis
  • synchronous (real-time) for most documents, asynchronous for large (e.g. multipage) documents
  • pay as you go
  • integrate with your own applications
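
Rough sketch of the points above in Python with boto3 (a sketch under assumptions, not a reference implementation). The boto3 Textract operations used (`analyze_document`, `start_document_text_detection`, `get_document_text_detection`) are real; the helper names, the bucket/key arguments, and the polling interval are made up for illustration. The two `extract_*` helpers only walk the `Blocks` list of a response, so they can be tried on a saved response without AWS credentials.

```python
import time
from typing import Any


def extract_lines(response: dict[str, Any]) -> list[tuple[str, dict[str, float]]]:
    """Return (text, bounding box) for every LINE block: the text plus where it occurs."""
    return [
        (block["Text"], block.get("Geometry", {}).get("BoundingBox", {}))
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def _child_text(block: dict[str, Any], by_id: dict[str, dict[str, Any]]) -> str:
    """Join the WORD children of a block into one string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id.get(child_id, {})
            if child.get("BlockType") == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(response: dict[str, Any]) -> dict[str, str]:
    """Pair KEY blocks with their VALUE blocks (relationships such as item/price)."""
    by_id = {b["Id"]: b for b in response.get("Blocks", [])}
    pairs = {}
    for block in by_id.values():
        if block.get("BlockType") != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        value_ids = [
            i
            for rel in block.get("Relationships", [])
            if rel["Type"] == "VALUE"
            for i in rel["Ids"]
        ]
        value = " ".join(_child_text(by_id[i], by_id) for i in value_ids if i in by_id)
        pairs[_child_text(block, by_id)] = value
    return pairs


def analyze_form_sync(image_bytes: bytes) -> dict[str, Any]:
    """Synchronous call: send the document bytes, get the analysis back directly."""
    import boto3  # imported lazily so the parsing helpers work without boto3

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": image_bytes}, FeatureTypes=["FORMS"]
    )


def detect_text_async(bucket: str, key: str, poll_seconds: float = 5.0) -> dict[str, Any]:
    """Asynchronous call for a large document stored in S3: start a job, then poll.

    Returns only the first page of results; following NextToken is omitted here.
    """
    import boto3

    client = boto3.client("textract")
    job = client.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    while True:
        result = client.get_document_text_detection(JobId=job["JobId"])
        if result["JobStatus"] != "IN_PROGRESS":
            return result
        time.sleep(poll_seconds)
```

Usage idea: `extract_key_values(analyze_form_sync(open("invoice.png", "rb").read()))` for a single image (the file name is a placeholder); `detect_text_async` for a multipage PDF already uploaded to S3. Polling is the simplest option for a sketch; a production setup would more likely use a completion notification instead of a loop.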