2023-06-30

How to Implement Separate Development and Production Environments in Snowflake

Separate Dev and Prod Environments in Snowflake

In the world of software development, the concepts of development and production environments are fundamental. A development environment is a space where developers work on new features or fix bugs, testing their changes without affecting the actual live system, or "production" environment. This separation helps ensure the stability and security of the production environment by preventing untested or potentially unstable code from reaching it.

Separate environments are also crucial in the realm of data management and data warehouse platforms like Snowflake. Establishing separate environments for development, testing, and production allows teams to experiment with data, test new code, and implement changes without risking disruption to the live production database. In Snowflake, this typically involves creating different databases or accounts for each environment.

One Database per Environment

One approach to separate environments in Snowflake is to use a single Snowflake account and create one database for each environment. Following this structure, you would create databases like DEV_MY_DATABASE, TEST_MY_DATABASE, and PROD_MY_DATABASE for development, testing, and production environments, respectively.

Creating separate databases for each environment provides a clear boundary between them, reducing the risk of accidental data manipulation in the wrong environment. It also offers an intuitive naming convention, helping teams easily identify and switch between databases.

One Schema per Environment

An alternative approach is to have a single database with a separate schema for each environment. In this scenario, you might have MY_DATABASE with schemas DEV_MY_SCHEMA, TEST_MY_SCHEMA, and PROD_MY_SCHEMA.

Creating separate schemas for each environment within a single database can help keep the data organized and accessible. Each schema acts as a separate namespace within the database, allowing different environments to coexist without interfering with each other.

However, there are significant drawbacks to this approach. There is a risk of accidentally running development code on the production schema, which could result in severe data issues. Also, securing each schema can be tedious and prone to errors. Consequently, this method is generally not recommended unless the team is highly disciplined and experienced in managing data security at the schema level.

Schema Naming Conventions

A well-thought-out naming convention makes it easier for developers, data analysts, and other team members to understand the purpose of a given schema at a glance.

Consider the following prefixes for schema names:

LND
To indicate a landing schema used to hold newly ingested data.
RAW
To suggest a raw staging area where data first arrives before any processing or transformation.
INT
To denote an integration area where raw data is combined and cleaned before analysis.
MRT
To signify a data mart holding conformed and cleaned data ready for reporting.

These prefixes provide clarity about the schema's role in the data lifecycle. While these prefixes are just examples, they represent common practices in data warehousing and can be a good starting point for designing your schema naming convention.

One Account per Environment

One Account per Environment approach might work for some very large systems as it maintains strict isolation. However, this approach has significant drawbacks:

How to Implement Separate Development and Production Environments in Snowflake

Separate Dev and Prod Environments in Snowflake

One Database per Environment

One Schema per Environment

Schema Naming Conventions

One Account per Environment

References

Stored Procedures vs. UDFs in Snowflake

Data Types in Snowflake

Ryusei Kakujo