This article is written by David A. Spezia and originally appeared on the Snowflake Blog here: https://www.snowflake.com/blog/understanding-snowflakes-resource-optimization-capabilities/
The only certainty in today’s world is change, and nowhere is that more apparent than in the way organizations consume data. A typical company might have thousands of analysts and business users accessing dashboards daily, hundreds of data scientists building and training models, and a large team of data engineers designing and running data pipelines. Each of these workloads has distinct compute and storage needs, and those needs can change significantly from hour to hour and day to day. The challenge is making sure each of these workloads is fast, stable, and efficient.
Efficiency means providing the best performance at the lowest cost while reducing waste. Unfortunately, legacy data platforms are the epitome of waste and inefficiency. Their fixed resources are sized to accommodate peak demand, which means that for most of the day a large portion of those resources sits idle while still incurring costs. At the same time, complex queries or surges in usage create bottlenecks because legacy platforms cannot scale instantly to meet business demand.
VIRTUALLY UNLIMITED SCALE AND FLEXIBLE RESOURCES EXACTLY MATCHED TO BUSINESS REQUIREMENTS
Snowflake’s cloud data platform is different. As a consumption-based service, it gives customers access to a virtually unlimited set of resources that can be turned on nearly instantly and then automatically scaled down or turned off completely when no longer needed. This instant elasticity provides the flexibility to tightly match resources to the exact needs of every user, team, department, and workload every second of the day. The same flexibility applies to storage, compute, and serverless tasks: each resource can be scaled independently. The result is that Snowflake customers pay only for the resources they need, when they need them, which maximizes efficiency, minimizes waste, and lowers costs.
MANAGED SERVICE WITH AUTOMATED RESOURCE OPTIMIZATION
Snowflake automates resource optimization, tunes queries, and eliminates basic maintenance tasks such as vacuuming, partitioning, and indexing, all of which reduces IT administrator overhead and eliminates costly service disruptions. Through seamless ongoing updates, Snowflake is designed to be faster and more efficient every year. Since June of 2019, Snowflake has reduced cloud services execution time by 42% and query compilation time by 16%. In addition, the same recurring queries that customers run every day take 4,400 fewer hours to run each day than they did a year ago. These efficiency improvements translate directly into faster performance and lower costs.
A COMPETITIVE DATA ADVANTAGE CAN DELIVER HIGHER ROI AND LOWER TCO
As organizations launch more data workloads and onboard more business users to their data platforms, they need to justify spend by directly tying it back to business value while also improving efficiency to lower TCO.
Snowflake’s inherent architectural efficiency, automated resource optimization, and ability to reduce operating costs provide customers with superior price-performance, delivering immense cost savings for organizations migrating from legacy on-premises solutions and other cloud data platforms. More importantly, Snowflake helps customers generate a tremendous return on their data investment by providing better analytics across the business and enabling new revenue streams that were previously impossible. A newly released Forrester TEI report concluded that Snowflake can deliver an ROI of 612% over three years, including infrastructure and database management savings worth $5.9 million.
Snowflake’s multi-cluster shared data architecture delivered 10 times the performance of Uniper’s previous platform at a 30% lower cost.
Snowflake’s as-a-service model, multi-cluster shared data architecture, and per-second pricing provided an 800% cost savings while supporting exponentially more data and computing.
POWERFUL MONITORING AND MANAGEMENT TOOLS
Snowflake’s built-in resource monitoring features provide customers with complete transparency into usage and billing, enabling granular chargeback and showback capabilities tied to individual budgets. Granular performance and consumption data is available in Snowsight or through external BI tools for advanced usage forecasting. To complement its monitoring capabilities, Snowflake provides powerful alerting and usage management tools that can be applied at the user, resource, workload, and account level. And unlike some cloud platforms, Snowflake provides in-depth tuning capabilities for advanced scenarios.
CONSUMPTION-BASED PRICING ENABLES EFFICIENCY AND AGILITY
Snowflake uses resources far more efficiently than traditional solutions, enabling significant cost savings. For example:
- Compute resources can be dynamically scaled up or down for each individual workload as demand for concurrency or raw compute power changes. Each individual workload can be set to prioritize performance or enforce tight cost controls, depending upon business need.
- Snowflake can automatically shrink and even completely suspend compute resources. Coupled with per-second billing, this means you stop incurring costs as soon as tasks are completed.
- Storage costs for Snowflake are generally a pass-through cost from the underlying cloud provider. Just as with compute, there are no size limits and no required capacity planning for storage; you simply load data into Snowflake as needed and pay for what you use. In addition, Snowflake automatically compresses all data, usually by a factor of 3 to 5, resulting in significantly lower storage utilization than when the equivalent raw data is stored in traditional data warehouses or file storage.
- Snowflake provides a robust set of serverless services that are optimized for very short, infrequent, or light compute jobs. Tasks such as continuous batch loading of data through Snowpipe, search acceleration through Snowflake’s search optimization service, and database replication and failover are available on an as-needed basis without the need to assign a dedicated virtual warehouse. All this results in significant efficiency improvements by limiting the amount of time resources sit idle.
- Administration and metadata operations such as query parsing, SHOW commands, and serving pre-cached data fall under Snowflake’s Cloud Services layer, which provides free usage directly proportional to an account’s total daily compute consumption. This means that the vast majority of our customers do not get charged for these services.
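To make the arithmetic behind these points concrete, here is a minimal Python sketch. It is not an official pricing calculator. The credits-per-hour table, the 60-second minimum charge per run, and the 10% daily cloud services allowance are assumptions brought in for illustration; none of them are stated in this article, so check them against your own contract and Snowflake's current documentation. The function names are hypothetical.

```python
# Illustrative arithmetic only. All rates below are assumptions, not figures from this article.

# Assumption: credits consumed per hour of runtime, by warehouse size.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8}

# Assumption: each run is billed per second, with a 60-second minimum.
MIN_BILLED_SECONDS = 60

# Assumption: cloud services usage is free up to 10% of daily compute credits.
CLOUD_SERVICES_FREE_RATIO = 0.10


def warehouse_credits(size, run_seconds):
    """Credits for one continuous run of a warehouse that suspends afterwards."""
    if run_seconds <= 0:
        return 0.0
    billed_seconds = max(run_seconds, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


def billable_cloud_services(compute_credits, cloud_services_credits):
    """Cloud services credits left over after the daily free allowance."""
    allowance = compute_credits * CLOUD_SERVICES_FREE_RATIO
    return max(0.0, cloud_services_credits - allowance)


def compressed_storage_tb(raw_tb, compression_ratio):
    """Stored size in TB after compression by the given ratio."""
    return raw_tb / compression_ratio


if __name__ == "__main__":
    # Three 15-minute pipeline runs on a Medium warehouse that suspends in between.
    elastic = sum(warehouse_credits("M", 15 * 60) for _ in range(3))
    # The same warehouse left running for a full day.
    always_on = warehouse_credits("M", 24 * 3600)
    print("credits with auto-suspend:", elastic)
    print("credits if always on:", always_on)
    print("billable cloud services:", billable_cloud_services(elastic, 0.2))
    print("10 TB raw at 4x compression (TB):", compressed_storage_tb(10, 4))
```

Under these assumed rates, the three short runs cost 3 credits in total, while the same warehouse left running all day would cost 96. The 0.2 credits of cloud services usage falls inside the daily allowance and is not billed.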
FLEXIBILITY REQUIRES OVERSIGHT AND CONTROL
Snowflake’s highly elastic compute usage is charged by the second, so customers should monitor usage, growth, and resource efficiency on an ongoing basis to make sure they match performance requirements and budgets. Even though Snowflake automates resource optimization, there are opportunities for account admins to further tune their deployment, especially as their organization’s total compute footprint grows.
10 WAYS TO OPTIMIZE SNOWFLAKE RESOURCES
By enabling basic monitoring and resource optimization capabilities, you can easily prevent cost overruns and discover inefficiencies.
Part two of this blog post series will dive into the top 10 ways every Snowflake admin should optimize their resource and credit consumption, including:
- The basics of setting up auto-suspend for virtual warehouses
- Detailed examples of using ACCOUNT_USAGE and INFORMATION_SCHEMA
- How to enable alerting and usage thresholds with resource monitors
- Pre-built usage dashboards available from Snowflake’s BI partners
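As a rough preview of the first three topics, the Python sketch below builds the kind of SQL statements an admin might run. It only generates strings and does not connect to Snowflake. The helper names, warehouse and monitor names, quota, and thresholds are all hypothetical, and the statements should be checked against the current Snowflake SQL reference before use.

```python
import re

# Accept only simple unquoted identifiers so names can be safely interpolated.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def _ident(name):
    if not _IDENT.fullmatch(name):
        raise ValueError(f"not a simple unquoted identifier: {name!r}")
    return name


def auto_suspend_sql(warehouse, idle_seconds=60):
    """Statement that suspends a warehouse after idle_seconds of inactivity."""
    return (
        f"ALTER WAREHOUSE {_ident(warehouse)} "
        f"SET AUTO_SUSPEND = {int(idle_seconds)} AUTO_RESUME = TRUE;"
    )


def resource_monitor_sql(monitor, warehouse, credit_quota, notify_at=80):
    """Statements that create a monthly resource monitor and attach it to a warehouse."""
    m, w = _ident(monitor), _ident(warehouse)
    return [
        f"CREATE OR REPLACE RESOURCE MONITOR {m} "
        f"WITH CREDIT_QUOTA = {int(credit_quota)} "
        f"FREQUENCY = MONTHLY START_TIMESTAMP = IMMEDIATELY "
        f"TRIGGERS ON {int(notify_at)} PERCENT DO NOTIFY "
        f"ON 100 PERCENT DO SUSPEND;",
        f"ALTER WAREHOUSE {w} SET RESOURCE_MONITOR = {m};",
    ]


def weekly_credits_sql(days=7):
    """ACCOUNT_USAGE query: credits consumed per warehouse over the last N days."""
    return (
        "SELECT warehouse_name, SUM(credits_used) AS credits "
        "FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY "
        f"WHERE start_time >= DATEADD(day, -{int(days)}, CURRENT_TIMESTAMP()) "
        "GROUP BY warehouse_name ORDER BY credits DESC;"
    )


if __name__ == "__main__":
    # Hypothetical names and limits, for illustration only.
    print(auto_suspend_sql("REPORTING_WH"))
    for statement in resource_monitor_sql("REPORTING_RM", "REPORTING_WH", 100):
        print(statement)
    print(weekly_credits_sql())
```

Generating the statements from one helper per concern keeps settings such as the idle timeout and credit quota in one reviewable place, which may be convenient when the same policy is applied to many warehouses.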