Why is my Snowflake bill so high?

Five reasons your Snowflake bill keeps climbing in 2026, the sixth one nobody is pricing in, and the fixes you can ship this week.

Melt Team, Builders of Melt · May 18, 2026

Every Monday morning, somebody on a data team opens the warehouse credit chart, sees the line going up and to the right, and posts a screenshot in Slack with a single emoji. The bill is high again. Nothing obviously broke. Nobody added a workload they can name.

If that's you, this is the field guide. We've gone through dozens of agent-era Snowflake bills with design partners, query by query. Here are the reasons bills run hot in 2026, ranked by how often they're the real cause, and the fixes that actually move the number.

The short version

If your Snowflake bill feels too high, it is almost always one of six things:

  1. Idle warehouse time dominates short-query workloads.
  2. Warehouses sized for the worst query, not the average.
  3. Too many warehouses, each with its own idle tax.
  4. Refresh schedules that run more often than the business needs.
  5. Multi-cluster scaling that never scales back down.
  6. Agent-shaped traffic, the one most teams have not priced in yet.

The first five are well known, and you can attack them with config changes you can make this week. The sixth is the reason a lot of bills suddenly broke their historical pattern in late 2025. We'll get to that.

1. Idle warehouse time is the biggest cost most teams ignore

Snowflake bills warehouse credits by the second after a 60-second minimum on each spin-up. That minimum matters more than people think. A warehouse that runs a 200 ms query and then sits idle for 58 seconds still bills the full minute. If queries land sparsely, the bulk of your credits go to a warehouse doing literally nothing.
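To put a number on that, here is a rough worked example. It assumes a single 200 ms query that resumes a suspended X-Small warehouse (1 credit/hour) and nothing else arriving within the minute; the figures are illustrative, not taken from a real bill.

```sql
-- Illustrative arithmetic only: one 200 ms query on an X-Small (1 credit/hour)
-- that resumes a suspended warehouse and triggers the 60-second minimum.
select
  0.2::float / 3600 * 1  as credits_for_actual_work,    -- 200 ms of execution
  60::float  / 3600 * 1  as credits_billed_at_minimum,  -- the 60-second minimum
  60 / 0.2               as billed_to_used_ratio;       -- 300
```

Under those assumptions you are billed for 300 times the compute the query actually used. The ratio does not depend on warehouse size; a bigger warehouse just multiplies both sides.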

Run this on WAREHOUSE_METERING_HISTORY against the past 7 days. Compute credits minus the credits attributed to query execution is the idle portion of your bill:

select
  warehouse_name,
  sum(credits_used)                                  as total_credits,
  sum(credits_used_compute)                          as compute_credits,
  sum(credits_used_cloud_services)                   as cloud_services_credits,
  -- idle = compute credits billed while no query was executing
  -- (credits_used - credits_used_compute would only give cloud services)
  round(100.0 * (sum(credits_used_compute)
                 - sum(credits_attributed_compute_queries))
              / nullif(sum(credits_used_compute), 0), 1) as idle_pct
from snowflake.account_usage.warehouse_metering_history
where start_time >= dateadd(day, -7, current_timestamp())
group by 1
order by total_credits desc;

The fix is mechanical:

  • Set AUTO_SUSPEND=60 on every warehouse. The default is 600 seconds, which means a warehouse stays warm for ten minutes after the last query. For bursty traffic, that is mostly idle.
  • Drop MIN_CLUSTER_COUNT to 1 for any multi-cluster warehouse that doesn't need a permanent floor.

For most accounts we've reviewed, those two changes alone take 15–30% off the bill in the first billing cycle. They're free; do them today if you haven't.
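As a minimal sketch, the two changes look like this. The warehouse name `reporting_wh` is a placeholder; substitute the warehouses that show up at the top of the idle-time query.

```sql
-- `reporting_wh` is a hypothetical name; use one of your own warehouses.
alter warehouse reporting_wh set auto_suspend = 60;      -- value is in seconds

-- Only meaningful on multi-cluster warehouses.
alter warehouse reporting_wh set min_cluster_count = 1;

-- Confirm the new settings took effect.
show warehouses like 'reporting_wh';
```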

2. Warehouses are sized for the worst query, not the average

The pattern: a team picks a LARGE warehouse because one nightly aggregate runs on it. Then every dashboard refresh, every BI query, every ad-hoc SELECT lands on the same LARGE. Snowflake's per-size credit rate doubles each step (XS=1, S=2, M=4, L=8, XL=16, 2XL=32 credits/hour), so an XSMALL-shaped query running on a LARGE costs eight times what it should.

Use QUERY_HISTORY to find the queries that are most over-warehoused. A simple heuristic: short queries on big warehouses are the smoking gun.

select
  warehouse_name,
  warehouse_size,
  count(*)                                 as small_short_queries,
  round(avg(total_elapsed_time) / 1000, 2) as avg_seconds
from snowflake.account_usage.query_history
where start_time >= dateadd(day, -7, current_timestamp())
  and warehouse_size in ('Large', 'X-Large', '2X-Large')
  and total_elapsed_time < 2000
  and bytes_scanned < 100 * 1024 * 1024
group by 1, 2
order by small_short_queries desc;

The classic fix is to split the workload across multiple warehouses by size. The honest tradeoff is that splitting adds idle time (reason #3), so it only pays off if each warehouse stays busy enough.
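If you do split, one way to sketch it is a dedicated small warehouse for the short queries. The name `short_query_wh` and the X-Small size are assumptions for illustration; pick the size your own query history supports.

```sql
-- Hypothetical warehouse for short, small-scan queries.
create warehouse if not exists short_query_wh
  warehouse_size      = 'XSMALL'
  auto_suspend        = 60      -- seconds of inactivity before suspending
  auto_resume         = true
  initially_suspended = true;   -- no credits until the first query arrives

-- Point a session (or a BI tool's connection) at it.
use warehouse short_query_wh;
```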

3. Too many warehouses means too much idle time

The reflex after #2 is to fan out: a small warehouse for BI, a medium for dbt, a large for analytics, a 2XL for the nightly job. The problem is that each of those warehouses pays its own 60-second minimum. If you spin up six warehouses and the morning dashboard refresh hits four of them, you're paying four minimums of warm credits for what should have been one warehouse-minute of work.

The honest rule: fewer, busier warehouses beat more, idler ones. Aim to keep each warehouse above ~40% utilization during business hours. If you can't, consolidate.
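One possible way to check that threshold is to treat utilization as the share of billed compute credits that Snowflake attributes to query execution. That definition, the weekday filter, and the 9-to-17 window are all assumptions here; adjust them to your own working hours and session time zone.

```sql
-- Rough utilization proxy: query-attributed credits / billed compute credits,
-- restricted to assumed business hours (Mon-Fri, 09:00-17:59, session time zone).
select
  warehouse_name,
  round(100.0 * sum(credits_attributed_compute_queries)
              / nullif(sum(credits_used_compute), 0), 1) as utilization_pct
from snowflake.account_usage.warehouse_metering_history
where start_time >= dateadd(day, -7, current_timestamp())
  and dayofweekiso(start_time) between 1 and 5
  and hour(start_time) between 9 and 17
group by 1
having sum(credits_used_compute) > 0
order by utilization_pct;
```

Warehouses at the top of this list, well under 40%, are the first candidates for consolidation.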

Consolidation also helps the cache. Snowflake's local disk cache is per-warehouse, per-cluster, so splitting a workload across warehouses means each one has to warm its own copy.

4. Refreshes run more often than the business needs

The vast majority of "the bill is too high" investigations end at a scheduler. dbt that runs hourly when the business needs daily. A materialized view that refreshes on every write. A Looker dashboard with auto-refresh on for an unattended TV.

Audit refresh cadence against actual consumption. If nobody looks at a model between 10pm and 6am, don't run it on a 4-hour cadence overnight. Materialized views in particular surprise people: they're billed serverless, and a high-churn source table can make them more expensive than the dashboard they back.
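For the materialized view part of the audit, a starting point might be the sketch below. It assumes you have access to the ACCOUNT_USAGE share and simply ranks views by maintenance credits over the past week.

```sql
-- Serverless maintenance credits per materialized view, last 7 days.
select
  database_name,
  schema_name,
  table_name                   as materialized_view,
  round(sum(credits_used), 2)  as credits_7d
from snowflake.account_usage.materialized_view_refresh_history
where start_time >= dateadd(day, -7, current_timestamp())
group by 1, 2, 3
order by credits_7d desc;
```

Compare the top rows against how often the dashboards behind them are actually opened.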

5. Multi-cluster scaling that never scales back

Multi-cluster warehouses are intended to absorb concurrency spikes. They're priced per cluster, so if you set MIN_CLUSTER_COUNT=2 "just in case," you're paying for two clusters every second the warehouse is up, whether or not there's any concurrency to absorb.

The fix:

  • Set MIN_CLUSTER_COUNT=1 unless you have a verified concurrency floor.
  • Set SCALING_POLICY='ECONOMY' on warehouses that can tolerate a small queue. The default 'STANDARD' scales aggressively, which is great for latency and bad for cost.
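A sketch of the audit and the fix follows. The lowercase, double-quoted column names are the ones SHOW WAREHOUSES returns as far as we know; check them against your own output. `reporting_wh` is a placeholder name.

```sql
-- List multi-cluster warehouses with their current floor and scaling policy.
show warehouses;

select "name", "size", "min_cluster_count", "max_cluster_count", "scaling_policy"
from table(result_scan(last_query_id()))
where "max_cluster_count" > 1;

-- Apply both settings to a warehouse that can tolerate a short queue.
alter warehouse reporting_wh set
  min_cluster_count = 1
  scaling_policy    = 'ECONOMY';
```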

6. Agent-shaped traffic, the one nobody priced in

Here's the pattern we started seeing on bills late last year: a tidy historical line, then a step change, then a curve that doesn't flatten out. The first time we saw it, we assumed it was a scheduling bug. By February 2026, we'd seen six of them, all on accounts where a product team had stood up an agent that talks to Snowflake.

Snowflake's billing model assumes humans drive the workload. Analysts refresh dashboards in the morning, dbt models materialize overnight, somebody asks a question and somebody runs a query. The warehouse spins up, runs, suspends. Most queries are scheduled, batched, and roughly predictable.

Agents don't run on that cadence. One prompt fans out into dozens of small filters and aggregates while the model iterates toward an answer. Across a fleet of agents, the warehouse never spins down, the cluster keeps autoscaling, and the bill chart goes up and stays up.

What makes it expensive is not that any single agent query is slow. It's that there are thousands of them and they all look the same: scans under 200 MB, single-table filters, simple group-bys, top-N. The kind of query DuckDB answers in milliseconds against a Parquet file on S3. On Snowflake, every one of them either resumes a suspended warehouse and pays the 60-second minimum, or lands on a running one and keeps it from suspending.

Here's the diagnostic. Run this and look at agent_shaped_pct:

select
  count(*)                                            as queries,
  sum(case when bytes_scanned < 200 * 1024 * 1024
            and total_elapsed_time < 5000
            and query_type = 'SELECT'
            then 1 else 0 end)                        as agent_shaped,
  round(100.0 * sum(case when bytes_scanned < 200 * 1024 * 1024
                          and total_elapsed_time < 5000
                          and query_type = 'SELECT'
                          then 1 else 0 end)
              / nullif(count(*), 0), 1)               as agent_shaped_pct
from snowflake.account_usage.query_history
where start_time >= dateadd(day, -30, current_timestamp());

For the accounts we've reviewed, the answer ranges from 12% (bills still mostly look human-shaped) to 91% (every dollar past a baseline is going to small reads). If your number is above 50%, the standard playbook above won't move the needle. The bill is dominated by traffic the playbook doesn't address.
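If the percentage is high, the next question is who is sending the traffic. A hedged follow-up, reusing the same thresholds as the diagnostic, groups the agent-shaped queries by user and warehouse; agents usually connect through a small number of service accounts, so they tend to stand out.

```sql
-- Which users and warehouses account for the agent-shaped queries?
select
  user_name,
  warehouse_name,
  count(*) as agent_shaped_queries
from snowflake.account_usage.query_history
where start_time >= dateadd(day, -30, current_timestamp())
  and bytes_scanned < 200 * 1024 * 1024   -- under 200 MB scanned
  and total_elapsed_time < 5000           -- under 5 seconds (milliseconds)
  and query_type = 'SELECT'
group by 1, 2
order by agent_shaped_queries desc
limit 20;
```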

What manual tuning can't fix

Reasons 1–5 are config problems. You can fix them in a Friday afternoon with ALTER WAREHOUSE and a scheduler cleanup, and most teams should. They get a bill back to where it would have been a year ago.

Reason 6 is a routing problem. A 100 MB filter against a synced Iceberg or DuckLake table costs effectively nothing on DuckDB. The same query on Snowflake costs you a minute of warm warehouse credits at the size you've provisioned. No amount of warehouse tuning closes that gap, because the cheapest Snowflake warehouse still pays the per-minute floor.

The fix is to not send the query to Snowflake in the first place. Per query, decide whether the lake engine can answer it; if it can, run it on DuckDB; if it can't, pass it through to Snowflake unchanged. That's what we built Melt to do.

How the math actually plays out

On a canonical agent-shaped workload (100,000 queries per month, 150 ms average latency, baseline LARGE warehouse, $3/credit), query routing alone takes a $100/month bill to about $15. Warehouse routing on top of that takes it to about $3.41. The full math, with a calculator you can run on your own numbers, lives on the methodology page.
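The $100 baseline appears to be pure execution time at the LARGE rate of 8 credits/hour, with no idle time or 60-second minimums counted. That reading is our reconstruction from the stated inputs, sketched below; the $15 and $3.41 figures depend on routing assumptions that live on the methodology page and are not reproduced here.

```sql
-- Reconstructing the baseline from the stated inputs (execution time only).
select
  100000 * 0.150                 as execution_seconds,  -- 15,000 s
  100000 * 0.150 / 3600          as execution_hours,    -- about 4.17 h
  100000 * 0.150 / 3600 * 8      as credits,            -- about 33.3 credits on LARGE
  100000 * 0.150 / 3600 * 8 * 3  as baseline_usd;       -- about $100 at $3/credit
```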

The headline is that for an agent-shaped workload, the combined savings sit between 75% and 97%, depending on what fraction of your queries can be answered against the lake and whether warehouse routing is enabled.

A 60-minute plan if your bill is high right now

  1. Run the idle-time query above. If idle_pct is above 50% on any major warehouse, set AUTO_SUSPEND=60 and MIN_CLUSTER_COUNT=1 on it. (10 minutes)
  2. Run the small-short-queries query. If any warehouse has hundreds of small queries per week, those queries don't belong on that warehouse. (10 minutes)
  3. Audit your scheduled refreshes. Anything running more often than the business needs? Halve it. (15 minutes)
  4. Run the agent-shaped diagnostic. If it comes back above 50%, the rest of the playbook above won't get you where you want to go. You need routing, not tuning. (5 minutes)
  5. Open the cost calculator and plug in your real numbers. (10 minutes)
  6. If the number is interesting, talk to us. The repo is open and the contact form goes to a real person. (10 minutes)

A high Snowflake bill is almost never one big bug. It's usually five small things layered on top of a workload pattern that wasn't there 18 months ago. The five small things have known fixes. The workload pattern doesn't, yet, in most stacks. That's the part we built Melt for.
