Amazon DataZone

  • Data management service
    • Facilitates cataloging, discovering, sharing, and governing data
  • Supports data from
    • AWS
    • On-premises sources
    • Third-party sources
  • Users
    • Engineers
    • Data scientists
    • Product managers
    • Analysts, business users
  • Easy and secure access to data with governance and transparency
  • Being integrated into Amazon SageMaker Unified Studio
  • Producers: S3, Redshift, Glue
  • Consumers: Athena, SageMaker, QuickSight

Use Cases

  • Catalog and discover data
    • Automated metadata management
  • Govern data access
    • Fine-grained access controls
    • Governance workflows
  • Collaborate across teams
    • Projects, shared analytics tools
  • Automate workflows
    • Share data between producers and consumers
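
As a rough illustration of "catalog and discover", the sketch below models a tiny in-memory catalog whose assets carry glossary terms and descriptions, and a keyword search over that metadata. The `Asset` class, the `search` function, and the sample assets are hypothetical names made up for this example; this is not the DataZone API, only a picture of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical catalog entry; not a DataZone API type."""
    name: str
    source: str  # producer system, e.g. "S3", "Redshift", "Glue"
    glossary_terms: set = field(default_factory=set)
    description: str = ""


def search(catalog, keyword):
    """Return names of assets whose name, description, or glossary terms contain keyword."""
    kw = keyword.lower()
    hits = []
    for asset in catalog:
        haystack = [asset.name, asset.description, *asset.glossary_terms]
        if any(kw in text.lower() for text in haystack):
            hits.append(asset.name)
    return hits


# Sample data, invented for illustration only.
catalog = [
    Asset("orders_raw", "S3", {"Sales"}, "Raw order events"),
    Asset("customer_dim", "Redshift", {"Customer", "PII"}, "Customer dimension table"),
    Asset("clickstream", "Glue", {"Marketing"}, "Web click logs"),
]
```

Calling `search(catalog, "pii")` matches on a glossary term rather than on the asset name, which is the point of attaching business metadata to technical assets.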

Key Components

  • Domains
    • Organizational entities to group users, data, and projects
  • Data portal
    • Web application, separate from the AWS Management Console
    • Catalog, discover, govern, share, analyze data
    • Authentication with IAM credentials or SSO (IAM Identity Center)
  • Business Data Catalog
    • Define taxonomy / glossary
  • Data projects
    • Group people, datasets, and analytics tools
  • Data environments
    • Provide infrastructure within projects (storage, analytics tools)
  • Governance and access control
    • Built-in workflows for requesting data access and approving it
    • Manages permissions through AWS Lake Formation and Amazon Redshift
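
The request-and-approve workflow can be pictured as a small state machine: a consumer project files a request, a data owner reviews it, and only an approval produces a grant. The sketch below is a simplified model under that assumption. `Status`, `AccessRequest`, and `Governance` are hypothetical names, and the in-memory `grants` set merely stands in for the permissions that the real service manages through Lake Formation or Redshift.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class AccessRequest:
    """Hypothetical access request; not a DataZone API type."""
    asset: str
    consumer_project: str
    reason: str
    status: Status = Status.PENDING


class Governance:
    """Toy approval workflow; grants stand in for real permissions."""

    def __init__(self):
        self.grants = set()  # (asset, project) pairs that may read

    def request(self, asset, project, reason):
        # A new request always starts in the PENDING state.
        return AccessRequest(asset, project, reason)

    def review(self, req, approve):
        # Each request can be reviewed exactly once.
        if req.status is not Status.PENDING:
            raise ValueError("request already reviewed")
        req.status = Status.APPROVED if approve else Status.REJECTED
        if approve:
            self.grants.add((req.asset, req.consumer_project))
        return req

    def can_read(self, asset, project):
        return (asset, project) in self.grants
```

For example, after `gov = Governance()` and `req = gov.request("customer_dim", "churn-analysis", "model training")`, the check `gov.can_read("customer_dim", "churn-analysis")` is false until `gov.review(req, approve=True)` has run.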