Snowflake - Schema Evolution

Semi-structured data tends to evolve over time. Systems that generate data add new columns to accommodate additional information, which requires downstream tables to evolve accordingly.

INFER SCHEMA
- The schema infers when loading in a file, so it is schema-on-read

-- Create a file format that sets the file type as Parquet.  
CREATE OR REPLACE FILE FORMAT my_json_format  
TYPE = json,  
STRIP_OUTER_ARRAY=TRUE;  
  
  
-- Query the INFER_SCHEMA function.  
--- the JSON file name is "json_sample_v1.json" & The file format is "my_json_format"  
SELECT *  
FROM TABLE(  
INFER_SCHEMA(  
LOCATION=>'@stage_for_json/json_sample_v1.json'  
, FILE_FORMAT=>'my_json_format'  
)  
);

enable_schema_evolution=true
- as a table setting
- new data loads in → the columns will be updated

--enable for the schema evolution  
alter table demo_json_table_v1 set enable_schema_evolution=true;

--Load the table with new column using the COPY INTO command, a new column will be added
COPY INTO demo_json_table_v1  
FROM '@stage_for_json/'  
FILES=('json_file_2.json')  
FILE_FORMAT=my_json_format  
match_by_column_name=case_insensitive ---This is the important step  
purge=true;

https://docs.snowflake.com/en/user-guide/data-load-schema-evolution

Stanley Chan's Note🧠

Explorer

Snowflake - Schema Evolution

Graph View

Backlinks