Warehouse Sizing β Small vs Medium vs Large Explained With Cost
β¨ Story Time β βWhich Warehouse Size Should We Use?ββ
Meet Daniel, a data engineer who joined a company that recently migrated to Snowflake. Everything is going well⦠until his manager asks:
βDaniel, which warehouse size should we use: Small, Medium, or Large?
And how much will it cost us?β
Daniel freezes.
Small is cheaperβ¦
Medium is fasterβ¦
Large is powerfulβ¦
XL, 2XL, 3XL existβ¦
Which one is right?
Choosing wrong = you waste money or slow down the company.
So letβs walk through Snowflake warehouse sizing in the simplest possible way.
π§© What Is a Snowflake Virtual Warehouse?β
A warehouse in Snowflake is your compute engine used for:
- SELECT queries
- INSERT/UPDATE/MERGE
- ELT pipelines
- BI dashboards
- Data transformations
Storage and compute are separate, so warehouse size affects speed + concurrency + cost, not storage.
ποΈ Snowflake Warehouse Sizes (XS β 6XL)β
Snowflake offers 8 warehouse sizes:
| Size | Power Multiplier | Cost per Hour* |
|---|---|---|
| X-Small | 1x | Cheapest |
| Small | 2x | 2Γ XS |
| Medium | 4x | 4Γ XS |
| Large | 8x | 8Γ XS |
| X-Large | 16x | 16Γ XS |
| 2X-Large | 32x | 32Γ XS |
| 3X-Large | 64x | 64Γ XS |
| 4X-Large | 128x | 128Γ XS |
π° Actual cost depends on cloud + region, but scaling is linear.
If XS costs $2/hour, then:
- SMALL = $4/hour
- MEDIUM = $8/hour
- LARGE = $16/hour
You pay more β you get more compute power β faster and more concurrent work.
β‘ How Snowflake Scales Performanceβ
Larger warehouses give you:
β More compute powerβ
Faster execution of large transformations.
β More parallelismβ
Snowflake splits your query into micro-partitions and tasks.
β Higher concurrencyβ
More users or pipelines can run at once without queuing.
β Better performance for:β
- Massive JOINs
- Heavy aggregations
- Large MERGE statements
- Complex ETL pipelines
But this doesn't mean Large is always better.
You might pay 8Γ more with no noticeable speed improvement.
π Small vs Medium vs Large β The Real Differencesβ
π© X-Small / Smallβ
Best for:
- Light BI dashboards
- Simple SELECT queries
- Small data loads
- Development workloads
- Ad-hoc SQL
Why use it:
- Cheapest
- Surprisingly powerful for small datasets
- Autoscaling makes it even better
π§ Mediumβ
Best for:
- Moderate ETL pipelines
- Multi-user analytical workloads
- Use cases where Small is βalmost enough but not quiteβ
Why use it:
- 2Γ the power of Small
- Handles larger joins
- Good balance of cost and performance
π₯ Largeβ
Best for:
- Heavy transformations
- Huge JOIN operations
- Daily ELT workloads processing 100M+ rows
- High-concurrency dashboards
- MERGEs on large tables
Why use it:
- 2Γ Mediumβs compute
- Scales up CPU + threads + parallelism
- Produces noticeable speed boosts
π§ͺ Real-World Story β Daniel Runs a Testβ
Daniel runs the same query on three warehouse sizes:
Query:β
SELECT customer_id, SUM(amount)
FROM transactions
GROUP BY 1;
Dataset size: 250 million rows
Results:β
| Size | Time | Cost per Run |
|---|---|---|
| Small | 4 min | $0.25 |
| Medium | 1.8 min | $0.24 |
| Large | 1.2 min | $0.32 |
What Daniel learned:β
- Large was fastest, but not the most cost-efficient
- Medium gave best cost-per-performance
- Small was too slow for production workloads
He chose Medium β smart & balanced.
π¦ Cost Efficiency Tips (Most Companies Ignore)β
β 1. Enable Auto-Suspendβ
Set to 1 minute if possible.
No one should pay for idle warehouses.
β 2. Use Auto-Resumeβ
Instant resume = no delays.
β 3. Use Multi-Cluster Only When Neededβ
Multi-cluster is great for concurrency, but costs grow quickly.
β 4. Scale Up Instead of Scaling Outβ
For big queries: β‘ Scale up (Medium β Large) instead of scaling out.
For concurrency: β‘ Scale out (multi-cluster Small).
β 5. Donβt oversize warehousesβ
Larger doesnβt always mean faster β your query may not be parallelizable.
β 6. Measure performance using Query Profileβ
Check bottlenecks before increasing warehouse size.
π§ Simple Rules for Choosing the Right Sizeβ
π¦ Choose X-Small or Small for:β
- dev/testing
- simple SELECT queries
- BI dashboards under 10 users
π§ Choose Medium for:β
- daily ETL
- multi-user SQL workloads
- medium dashboards
- tables up to ~1β5 TB
π₯ Choose Large for:β
- very large joins
- 100Mβ2B row transformations
- BI with 20β50 concurrent users
- major nightly pipelines
π Summaryβ
- Snowflake warehouse size impacts speed, concurrency, and cost.
- Scaling is linear: Medium = 4Γ XS; Large = 8Γ XS.
- Bigger isn't always better β cost-per-performance matters.
- Most production workloads run best on Medium or Large.
- Use auto-suspend, auto-resume, and performance testing to avoid overspending.
Choosing the right warehouse is not about powerβ¦ Itβs about balance, smart scaling, and real workload understanding.
π Next Topic
Automatic Query Optimization β How Snowflake Internally Works