Snowflake is 1. Cloud-based Data Warehousing platform.
2. Provides a fully-managed, scalable, and high-performance solution for storing, processing, and analyzing data.
3. It is designed to handle both structured and semi-structured data.
STGAE
- In Snowflake, a stage is a location where data is temporarily stored before it is loaded into or after it has been unloaded from Snowflake tables.
- Stages are used to manage data as it moves in and out of Snowflake and serve as an intermediary storage layer.
- By default, each user and table in Snowflake is automatically allocated an internal stage for staging data files to be loaded.
- In addition, you can create named internal stages. Types: CSV,
Parquet, Avro, Json, ORC, XML.
- User stage: Each user has a Snowflake stage, by default, for storing files.
a. convenient option->files will only be accessed by a single user, but need to be copied into multiple tables.
b. User stages are referenced @~; e.g. use LIST @~; to list the files in a user stage.
c. Unlike named stages, user stages cannot be altered or dropped.
- Table stages: each table has a Snowflake stage allocated to it for storing files.
a. Use when->if you only need to copy files into a single table, but want to make the files accessible to multiple users.
b. A table stage has the same name as the table.
c. A table stage is also not appropriate if you need to copy file data into multiple tables.
d. Table stages don’t support transforming data while loading it
- Named stages: Named stages are database objects that provide the greatest degree of flexibility for data loading.
a. recommended : when you plan regular data loads that could involve multiple users and/or tables
b. Users with the appropriate privileges on the stage can load data into any table.
File Format: Creates a named file format that describes a set of staged data to access or load into Snowflake tables.
Example of named stage:
Step 1: create table
create table employee.employee_schema.EMPLOYEE_1
(Emp_ID varchar,
First_Name varchar,
last_name varchar );
Step 2: create file format
create file format employee.employee_schema.FILE_FORMAT_CSV
type='csv',
field_delimiter=',',
skip_header=1;
Step 3: create stage: From Snowflake cloud UI open EMPLOYEE/EMPLOYEE_SCHEMA schema> create>Stage> Snowflake Managed>Enter stage name as: EMPLOYEE_STAGE>select
Client-side encryption>create. Then upload csv file.
Step 4: copy data load data from staging to table
copy into EMPLOYEE_1
from @EMPLOYEE_STAGE
file_format=(format_name='employee.employee_schema.FILE_FORMAT_CSV');
Please let me know if faced any challenges. Thanks😄😄