Unity Catalog
- Centralized governance solution across all your workspaces on any cloud.
- Unifies governance for all data and AI assets: files, tables, machine learning models, dashboards
- Access control is based on standard SQL
- Configuration and rules are set once and apply to all workspaces
- The metastore is separated from the workspace and managed through the account console; it can be assigned to one or more workspaces
UC metastore != Hive metastore
The Hive metastore is tied to a single workspace. The UC metastore sits above the workspaces and offers improved security and advanced features.
UC adds a third level to the namespace (the catalog):
-- from (two-level namespace)
SELECT * FROM schema.table
-- to (three-level namespace)
SELECT * FROM catalog.schema.table
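As a minimal sketch of how the three-level name is used in practice, a session can set a default catalog and schema so that shorter names still resolve. The names `main`, `sales`, and `orders` are hypothetical placeholders, not objects from these notes.

```sql
-- Hypothetical objects: catalog `main`, schema `sales`, table `orders`.

-- Fully qualified three-level name: works regardless of session defaults.
SELECT * FROM main.sales.orders;

-- Set the session defaults, then refer to the table by its short name.
USE CATALOG main;
USE SCHEMA sales;
SELECT * FROM orders;  -- resolves to main.sales.orders
```

Fully qualified names are usually the safer choice in shared notebooks and jobs, since they do not depend on whatever defaults the session happens to have.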
3 types of identities:
- Users: identified by e-mail addresses
- Account administrator
- Service Principals: identified by Application IDs
- Service Principals with administrative privileges
- Groups: collections of Users and Service Principals
- Groups can be nested within other groups
- e.g. HR and Finance groups within an Employees group
Identity Federation
- Identities are created once in the account console, then assigned to workspaces
- therefore, there is no need to configure them individually for each workspace
- at account level
- at workspace level
Privileges in UC:
- CREATE
- USAGE
- SELECT
- MODIFY
- READ FILES
- WRITE FILES
- EXECUTE (allows executing user-defined functions)
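The privileges above are granted and revoked with SQL statements. The following is a sketch only: the catalog `main`, schema `sales`, table `orders`, and group `analysts` are hypothetical, and it assumes a recent Unity Catalog release in which, to my understanding, the USAGE privilege is expressed as USE CATALOG and USE SCHEMA.

```sql
-- Hypothetical objects: catalog `main`, schema `sales`, table `orders`, group `analysts`.

-- Assumption: newer releases split USAGE into USE CATALOG and USE SCHEMA.
GRANT USE CATALOG ON CATALOG main TO `analysts`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `analysts`;

-- Read access to a single table.
GRANT SELECT ON TABLE main.sales.orders TO `analysts`;

-- Inspect what has been granted on the table.
SHOW GRANTS ON TABLE main.sales.orders;

-- Remove the read access again.
REVOKE SELECT ON TABLE main.sales.orders FROM `analysts`;
```

Note that SELECT on the table alone is not enough to query it; the principal also needs the usage-type privileges on the parent catalog and schema.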
Accessing legacy Hive metastore
- The Hive metastore can still be accessed even when Unity Catalog is enabled
- The Hive metastore remains local to each individual workspace
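As a sketch of what this looks like in a query: in a Unity Catalog enabled workspace the legacy Hive metastore is exposed as a catalog named `hive_metastore`, so its tables are reachable through the same three-level naming. The schema and table names below (`default.legacy_orders`, `main.sales.orders`) and the `order_id` column are hypothetical.

```sql
-- Hypothetical legacy table in the workspace-local Hive metastore.
SELECT * FROM hive_metastore.default.legacy_orders;

-- Legacy and Unity Catalog tables can be combined in one query.
SELECT l.order_id
FROM hive_metastore.default.legacy_orders AS l
JOIN main.sales.orders AS o
  ON l.order_id = o.order_id;
```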
Features of Unity Catalog:
- Centralized governance for data and AI
- Built-in data search and discovery
- Automated lineage
- No hard migration required when it is enabled