DataRobot Integrates Feature Discovery Capability with Snowflake

This article was written by Josh Klaben-Finegold and originally appeared on the DataRobot Blog here:


When DataRobot and Snowflake, the Data Cloud company, announced a new partnership in 2018, their stated goal was to “accelerate the adoption of AI in the enterprise.” For over two years, the two companies have worked closely to better understand how their businesses can integrate seamlessly to serve their customers’ needs.

After months of fruitful collaboration, DataRobot and Snowflake are excited to announce the first integration points under the partnership. DataRobot’s Feature Discovery — automated feature engineering that enables the creation of valuable new features for machine learning models — is available for Snowflake. Now, Snowflake customers have a greater ability to accelerate innovation with AI and machine learning initiatives and uncover business-driving predictive insights.

DataRobot’s Feature Discovery: Next-Generation Automated Feature Engineering

Feature engineering is one of the most critical tasks in AI, since the features you create often determine the success or failure of your machine learning projects.

The problem is that raw data rarely contains the right features, and multiple data sources usually need to be consolidated into a single dataset before you can train models and make predictions. This typically involves joining multiple tables and exploring many aggregations (e.g., sum, max, avg, count, entropy) over different derivation windows (e.g., the last 30 days or the last week).
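To make the manual version of this work concrete, here is a minimal sketch in Python with pandas. It is not DataRobot's implementation; the function name `windowed_aggregates`, the tables, and all column names are hypothetical and chosen only for illustration. It joins a secondary table to a primary table and computes a few aggregations (sum, max, mean, count) over two derivation windows ending at each row's prediction date. It assumes the two timestamp columns have different names.

```python
import pandas as pd


def windowed_aggregates(primary, secondary, key, primary_time, secondary_time,
                        value, windows_days, aggs=("sum", "max", "mean", "count")):
    """Aggregate `value` from `secondary` over look-back windows that end at
    each primary row's timestamp. All names here are illustrative."""
    result = primary.copy()
    # Tag each primary row so aggregates can be mapped back after the join.
    result["_row_id"] = range(len(result))
    joined = result[["_row_id", key, primary_time]].merge(
        secondary[[key, secondary_time, value]], on=key, how="left"
    )
    for days in windows_days:
        window_start = joined[primary_time] - pd.Timedelta(days=days)
        # Keep only secondary rows strictly before the prediction point,
        # so no information from the future leaks into the features.
        in_window = joined[
            (joined[secondary_time] >= window_start)
            & (joined[secondary_time] < joined[primary_time])
        ]
        stats = in_window.groupby("_row_id")[value].agg(list(aggs))
        stats.columns = [f"{value}_{agg}_{days}d" for agg in stats.columns]
        # Left join: primary rows with no data in the window get NaN.
        result = result.join(stats, on="_row_id")
    return result.drop(columns="_row_id")


# Hypothetical example data: one row per customer and prediction date,
# plus a transaction log to derive features from.
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "prediction_date": pd.to_datetime(["2021-03-01", "2021-03-01"]),
})
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2],
    "transaction_date": pd.to_datetime(
        ["2021-02-25", "2021-02-10", "2020-12-01", "2021-02-27"]
    ),
    "amount": [10.0, 20.0, 99.0, 5.0],
})

features = windowed_aggregates(
    customers, transactions,
    key="customer_id",
    primary_time="prediction_date",
    secondary_time="transaction_date",
    value="amount",
    windows_days=[7, 30],
)
print(features.filter(like="amount_").T)
```

Even this small case produces eight candidate features from one numeric column and two windows. Adding more tables, columns, aggregations, and windows multiplies the search space quickly, which is why doing this by hand becomes tedious.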

DataRobot’s Automated Feature Discovery simplifies and accelerates this feature engineering process through the automation of expert data science best practices.

Unlike typical automated feature engineering solutions, it can draw on data from multiple datasets rather than just one, and it automatically discovers, tests, and creates hundreds of valuable new features for your machine learning models, which can dramatically improve their accuracy.

Now Faster with Snowflake

Exploring multiple data sources has always required transferring large amounts of data between systems, which is resource-intensive and time-consuming.

DataRobot’s new Snowflake integration pushes Feature Discovery operations into Snowflake to minimize data movement, resulting in faster results and lower operating costs.

In many ways, Feature Discovery is an extension of DataRobot’s AutoML (Automated Machine Learning) product in terms of the automation that it brings to the data science process. With this integration, users can build even more accurate models from the data in their Snowflake Data Cloud.