Welcome to the Velocity monthly newsletter for October 2020, where we share our latest use cases, upcoming events, product updates, and more...
Bring Your Enterprise Data to the Cloud with Snowflake and Qlik
- How can you move your data to the cloud in real time?
- How can you accelerate your analytics projects in a cloud data platform?
- How can you automate your entire data warehouse lifecycle?
INDUSTRY VIEWPOINTS
Governing Cloud Data Stores
As you move data from legacy systems to a cloud data platform, you need to ensure the quality and overall governance of that data. In this digital age, data and its governance are the responsibility of the entire organization. Read the article.
Why Valuable Data Needs To Be Identifiable to The Entire Business
One of the greatest challenges that organizations are grappling with starts right at the beginning of the data pipeline. Research with IDC revealed that a staggering 96 percent of global business leaders reported that it is challenging for their company to identify potentially valuable data sources – with 56 percent stating that it is either very or extremely challenging. Read the article.
Customer Use Case
Qlik points way ahead for DCH
Dah Chong Hong Holdings (DCH) faced a challenge of scale. Its consumer products business extends across the supply chain and includes the distribution of thousands of household favorites, with a network of services that spans the entire supply chain and ensures the highest levels of traceability, quality control, and product excellence. It is the largest agency distributor in Hong Kong, with more than 500 brands of food and FMCG products under the Sims, IMSA and DCH banners. With such a widespread operation, DCH realized that it needed to sustain a deep understanding of both its customers and its business.
Solution Updates
Alteryx Data Science Portal
Alteryx has launched its Data Science Portal in the Alteryx Community (under the Blogs & Podcasts menu), where data scientists and machine learning enthusiasts can find all data science content offered across the Alteryx Academy, podcasts, blogs, discussions and more.
- Next-Level Feature Discovery: An enhanced dataset relationship workflow makes it much easier to select multiple datasets and to define, edit, and visualize all your relationships at the same time. You can now access logs to get details on which features were explored, discarded, and generated. You can also download the full training dataset, including all the derived features.
- Comprehensive Autopilot Mode (PUBLIC BETA): This new mode runs every single model in the repository for your project, taking as long as necessary to maximize accuracy when you need it most. We also have a new Get More Accuracy feature, so you can kick off Autopilot in Quick mode, then start Comprehensive mode only after you’ve seen your initial results.
- Anomaly Assessment Insights: New in Automated Time Series 6.2, this interactive visualization allows you to quickly investigate anomalies and anomalous regions in your data, and access SHAP scores for the underlying features causing the anomaly. This helps you understand the root cause and explore all of your anomalies, panning through different time segments and zooming in to see the detail.
- Model Comparison Reimagined: The ability to compare models has taken a huge leap forward in Release 6.2. We worked with our most experienced data scientists to give you the best possible user experience where you can compare models and choose the best for deployment. We have enhanced the Lift Charts, ROC Curve, and Profit Curve, added support for more bins, and have added new tooltips to enhance overall ease of use.
- Governed Approval Workflows: Customizable governance policies and review and approval workflows come to MLOps in 6.2. This introduces accountability in your production AI, and enables you to continue to deploy and manage production models while at the same time increasing your overall level of AI governance for your entire organization.
- Connect to Remote Repositories: We recognize that your data science teams often govern and manage their models in popular open source code repositories. In Release 6.2, MLOps allows you to connect directly to your GitHub and S3 repositories and dynamically pull model code and model artifacts into DataRobot, making it simple to package, test, deploy, and monitor them in your production environment of choice.
Augmented intelligence
- Cluster Chart – Shows clusters using the new k-means clustering function.
- Correlation Chart – Shows correlations using the correlation function.
- Control Chart – Shows how a process changes over time.
- Number formatting of master measures
- Turn on and off borders in containers
- Custom sorting in Sankey Chart
- Frequency counts in filter pane
- WMS (web map service) layer opacity
- Hover icons toggle
- Columns for name, description, owner, published (when applicable), Data last reloaded (when applicable), details
- Sort by clicking on the column headers
- Keyboard shortcut just like grid (Ctrl + g) and list view (Ctrl + Shift + l)
- Navigate with keyboard
- Support for Qlik-supported screen readers
- Automated continuous data loading for Google Cloud Storage stages: Aligning with the spirit of near-zero maintenance as a core tenet of Snowflake, this feature enables automated continuous data loads for Google Cloud Storage (GCS) without requiring administrative intervention. Using event notifications, Snowpipe automatically loads new data from cloud storage into Snowflake, making it more readily available to analytics teams for faster time to insight.
- Continuous data loading to Snowflake on AWS: Many Snowflake customers require a multi-cloud approach, whether it’s to avoid vendor lock-in or to accommodate different platforms resulting from mergers and acquisitions. Often organizations look for ways to use Snowflake’s cloud data platform across the leading cloud providers Snowflake supports, namely Amazon, Google, and Microsoft.
- Support for pattern matching: Event-driven architectures are commonplace today. As certain types of events occur (such as fraud detection or real-time customer service requests), using data to take the right action is critical.
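To make the pattern-matching idea above concrete, here is a minimal sketch in plain Python. It is not Snowflake's implementation or syntax. It only illustrates the general technique of scanning an ordered stream of events for a sequence of interest. The event names, the one-letter codes, and the function `find_suspicious_runs` are all hypothetical and chosen purely for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical one-letter codes for event types (illustrative only).
EVENT_CODES = {"login_failed": "F", "login_ok": "S", "password_reset": "R"}


def find_suspicious_runs(events: List[str], min_failures: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, both inclusive, where at least
    `min_failures` consecutive failed logins are immediately followed
    by a successful login."""
    # Encode each event as a single character so that a regular
    # expression can describe the sequence we are looking for.
    # Unknown event types become "?" and never match the pattern.
    encoded = "".join(EVENT_CODES.get(event, "?") for event in events)
    pattern = re.compile("F{%d,}S" % min_failures)
    return [(m.start(), m.end() - 1) for m in pattern.finditer(encoded)]


# Events are assumed to be already ordered by time.
events = [
    "login_ok",
    "login_failed",
    "login_failed",
    "login_failed",
    "login_ok",
    "password_reset",
]
matches = find_suspicious_runs(events)
```

In this sample, `matches` holds one index pair covering the three failed logins and the successful login that follows them. Encoding events as characters keeps the sketch short; a production system would more likely express the same pattern in the query language of the data platform itself.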
Snowsight Usability Updates
Snowsight, the Snowflake user interface, is where analysts can build queries and interact with Snowflake’s cloud data platform. Snowflake continually works to extend Snowsight functionality to include the capabilities required for all Snowflake-supported workloads.
- User refresh of schema metadata: Previously in Snowsight, the schema metadata was refreshed automatically every 24 hours. When a new table or view was created in a database, users with the right roles and permissions could query the object; however, the schema browser would not list the new object immediately, and the new object could not be searched, meaning important updates or new data could be missed.
- Execution of multiple SQL statements sequentially in a single worksheet: Analysts use worksheets within Snowsight to build and run SQL statements on Snowflake. Previously, analysts had to either build complex scheduling routines or execute SQL statements manually.
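As a rough illustration of what running several SQL statements in sequence looks like, the sketch below uses Python's built-in `sqlite3` module as a stand-in. It does not use Snowsight or Snowflake, and the table, view, and function names are hypothetical. The point is only that each statement in the script runs in order, so later statements can rely on objects created by earlier ones.

```python
import sqlite3

# Three statements that depend on one another: the INSERT needs the table,
# and the view needs both. All names here are hypothetical.
SCRIPT = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
INSERT INTO orders (amount) VALUES (10.0), (25.5);
CREATE VIEW order_totals AS SELECT SUM(amount) AS total FROM orders;
"""


def run_script(sql_script: str) -> float:
    """Run every statement in `sql_script` in order against an in-memory
    SQLite database, then read the total back from the view."""
    conn = sqlite3.connect(":memory:")
    try:
        # executescript() executes the statements one after another.
        conn.executescript(sql_script)
        (total,) = conn.execute("SELECT total FROM order_totals").fetchone()
        return total
    finally:
        conn.close()


total = run_script(SCRIPT)
```

If the statements were run out of order, the `INSERT` would fail because the table would not yet exist, which is why sequential execution in one place is convenient for analysts.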