Materialized Views: When to Use & When to Avoid
✨ Story Time: "Why is My Query So Slow?"
Emma, a data analyst, runs a complex aggregation on a 1 billion-row sales table.
Query runtime: 5 minutes.
Dashboard refresh: slow.
Coffee: cold.
Her friend, Alex, says:
"Why don't you try a Materialized View? Snowflake can precompute and store the results!"
Materialized Views (MV) are Snowflake's performance optimization superheroes, but they come with caveats.
🧩 What is a Materialized View?
A Materialized View is a precomputed result set, stored like a table, that holds the results of a query.
- Unlike regular views, MVs store data physically
- They refresh automatically as the underlying table changes
- They are ideal for queries that are repeated often or computationally expensive
Example:
CREATE MATERIALIZED VIEW top_customers_mv AS
SELECT customer_id, SUM(amount) AS total_spent
FROM orders
GROUP BY customer_id;
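Once the view exists, it can be read like any table. A minimal sketch, assuming the top_customers_mv defined above; the cutoff of ten rows is an arbitrary choice for illustration:

```sql
-- Read the precomputed totals from the materialized view.
-- ORDER BY and LIMIT belong in the query, because Snowflake does not
-- allow them inside a materialized view definition.
SELECT customer_id, total_spent
FROM top_customers_mv
ORDER BY total_spent DESC
LIMIT 10;
```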
How It Works
- Snowflake executes the MV's query once and stores the results
- When the underlying table changes:
  - The MV is incrementally updated
  - Querying the MV stays faster than scanning the base table
Analogy: Think of it as caching the query results in a smart, auto-refreshing way.
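To check how current a view is, Snowflake exposes maintenance metadata. The sketch below assumes the top_customers_mv and orders objects from the earlier example; the exact output columns may differ between Snowflake releases, so treat the column names in the comments as indicative.

```sql
-- List the view with maintenance metadata, such as when it was
-- last refreshed and how far it lags behind its base table.
SHOW MATERIALIZED VIEWS LIKE 'top_customers_mv';

-- Inspect the plan of a query written against the base table.
-- The optimizer may choose to answer it from the materialized view
-- when it judges that to be cheaper.
EXPLAIN
SELECT customer_id, SUM(amount) AS total_spent
FROM orders
GROUP BY customer_id;
```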
🎯 When to Use Materialized Views
1. Repeated Heavy Queries
- Aggregations like SUM, COUNT, AVG
- Filters or projections over one large table (Snowflake MVs cannot contain joins)
- Queries that power dashboards
2. Performance-Critical Dashboards
- Users expect near-real-time response
- An MV reduces warehouse load and speeds up queries
3. Incremental Refresh Fits Your Data
- Tables with frequent but moderate changes
- MV updates incrementally without reprocessing entire table
When to Avoid Materialized Views
- Tables with high-volume, rapid inserts
- Queries that change structure often
- Very small tables (MV overhead > benefit)
- Queries that join multiple tables (a Snowflake MV can reference only one table)
- Ad-hoc queries with varying columns or filters
Key: An MV is a trade-off between storage cost, refresh overhead, and query speed.
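The refresh side of that trade-off can be measured. A rough sketch using the MATERIALIZED_VIEW_REFRESH_HISTORY table function; the 7-day window is an arbitrary assumption, and the session needs a current database so that INFORMATION_SCHEMA resolves:

```sql
-- Credits consumed by background MV maintenance over the last 7 days,
-- grouped per materialized view.
SELECT materialized_view_name,
       SUM(credits_used) AS credits_last_7_days
FROM TABLE(INFORMATION_SCHEMA.MATERIALIZED_VIEW_REFRESH_HISTORY(
       DATE_RANGE_START => DATEADD('day', -7, CURRENT_TIMESTAMP())))
GROUP BY materialized_view_name
ORDER BY credits_last_7_days DESC;
```

If a view's maintenance credits approach what the original query cost to run, the MV is probably not paying for itself.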
🧪 Real-World Example
Scenario: A retail company wants daily top-selling products for its dashboards:
- Base table: sales (10M rows/day)
- MV:
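-- Snowflake rejects non-deterministic functions such as CURRENT_DATE inside
-- a materialized view, so aggregate per day and filter by date at query time.
CREATE MATERIALIZED VIEW daily_top_products_mv AS
SELECT sale_date, product_id, SUM(quantity) AS total_sold
FROM sales
GROUP BY sale_date, product_id;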
- Queries return quickly because the totals are already computed
- Dashboard updates automatically as new sales arrive
- No need to scan billions of rows each time
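A dashboard query against this view might look like the following sketch. It assumes the view keeps one row per sale_date and product_id (the sale_date column is an assumption of this example), so that the date filter and ranking are applied at query time; the top-10 cutoff is illustrative.

```sql
-- Today's best sellers, read from the precomputed daily totals.
SELECT product_id, total_sold
FROM daily_top_products_mv
WHERE sale_date = CURRENT_DATE   -- non-deterministic functions are fine in the query itself
ORDER BY total_sold DESC
LIMIT 10;
```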
⚡ Benefits
- Speeds up frequent queries
- Reduces compute cost on large tables
- Automatically updated (incremental refresh)
- Needs no Task or scheduler to stay fresh, which keeps pipelines simpler
🧠 Best Practices
- Use for frequently accessed, compute-heavy queries
- Monitor refresh time and credit usage; avoid MVs on tables with very high insert rates
- Combine with clustering keys for better performance
- Query the MV with a direct SELECT; it needs no additional indexes, so keep the definition simple
- Avoid overly complex queries that MV cannot incrementally maintain
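Two of these practices can be sketched in SQL. The view name sales_by_day_mv is hypothetical, and the columns are assumed to match the sales table used in the example earlier; whether clustering on sale_date helps depends on how the view is actually filtered.

```sql
-- Cluster the materialized view on the column that dashboards filter by.
CREATE MATERIALIZED VIEW sales_by_day_mv
  CLUSTER BY (sale_date)
AS
SELECT sale_date, product_id, SUM(quantity) AS total_sold
FROM sales
GROUP BY sale_date, product_id;

-- Pause background maintenance during a heavy bulk load, then resume it.
ALTER MATERIALIZED VIEW sales_by_day_mv SUSPEND;
ALTER MATERIALIZED VIEW sales_by_day_mv RESUME;
```

A suspended view cannot be queried until it is resumed, so this is better suited to planned load windows than to live dashboards.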
Summary
- Materialized Views store precomputed query results physically for faster access.
- Best for frequently queried, heavy aggregation queries and dashboard optimization.
- Avoid on rapidly changing or very small tables, where refresh cost outweighs the benefit.
- No Task is needed to refresh an MV; note that Streams cannot track changes on an MV, and an MV is cloned only together with its schema or database.
- MV is a powerful tool but requires careful planning to balance performance, cost, and maintenance.
Next Topic
Secure Views & Secure Data Sharing Between Teams