This article was written by D House and originally appeared on the Alteryx Analytics Blog here: https://community.alteryx.com/t5/Analytics/Define-Your-Own-Snowflake-Functions-with-Alteryx/ba-p/769318
If you’re like most Alteryx users, you’ve got valuable data living in databases and data warehouses in your organization that you want to analyze. At Alteryx, we take accelerating your time to value with big data seriously. For example, Alteryx recently announced a partnership with cloud data platform Snowflake, meaning you can leverage Alteryx In-DB tools with your valuable Snowflake data.
If you frequently work with large datasets, you’re probably already a fan of our well-loved In-DB tools. In fact, Alteryx In-DB capabilities are so helpful in reducing processing time and enabling analysis across huge datasets, that I can no longer count the number of times I’ve been asked, “How can I do this In-DB?” Well, if you’re a Snowflake user, I’m excited to share that Snowflake has just launched public-preview capabilities that will help you basically do whatever you want in DB!
Details, you say?? Snowflake has introduced Java in their user defined functions, also known as UDFs. With Java UDFs, customers can bring functions they have in JVM (Java Virtual Machine) and execute right inside of Snowflake’s data cloud with Snowflake’s powerful processing engine. How does this lead to your total In-DB nirvana? Let me explain with a demonstration.
Using Snowflake Java UDFs for Fun and Profit
Here’s how you can use this functionality to perform sentiment analysis on a dataset in Snowflake using the in-database, Java UDF extensibility.
Let’s imagine you own a music supply company and you have a continuous stream of data regarding customer reviews of the products you sell at a music store coming into a Snowflake table. You, of course, want satisfied customers and you want to quickly address any issues they may have had with the products you sell. Whenever there is a negative comment about one of your products, you want your customer service team to reach out to the customer to help rectify the issue.
So, let’s dive into Alteryx to show you how this is done and then we will show you how to set this up on your own.
Put on Your Snow(flake?) Boots, We’re Going for a Hike
Now that you’ve seen the power of moving some of the processing upstream, let us take you through how to do this in your own environment.
- Install SnowSQL on the machine that will be pushing the files (https://sfc-repo.snowflakecomputing.com/snowsql/index.html) We used the Windows 64-bit 1.2.14 version.
- Download these Java files
- You need the ability to create the necessary stage and function within Snowflake (i.e. SYSADMIN or similar privileges)
- Create a (or use an existing) stage in the schema where you want the function to be defined.
- The SQL command to do this in Snowflake, within your chosen database and schema, is “CREATE STAGE YOUR_STAGE;”.
- In your file system, navigate to where you have downloaded the Java files
- If you have not already done, so, unzip the files into their own directory.
- Inside of this directory is a file “Config.cmd”. Right click, choose edit in Notepad (or similar text editor)
- Replace the values in this file with your values. (TIP: Do not put a space between the “=” sign and your value)
- Save and close the file.
- In the same folder, find “put_in_stage.sql”. Right click, choose edit in Notepad (or similar text editor)
- On the fourth line, “@VIDEO.TEST.AYXDEV” represents the @DATABASE.SCHEMA.STAGE. Replace the values with your values.
- Save and close the file
- In the same folder, find “declare_udf.sql”. Right click, choose edit in Notepad (or similar text editor)
- On the sixth line, “@ALTERYX.DEV.AYXDEV” also represents the @DATABASE.SCHEMA.STAGE. Replace the values with your values.
- Now open a command line window and browse to the folder where the Java files are located.
- PRO TIP: If you still have the file browser open to the location of the unzipped files, just type CMD in the address bar (where the file path is listed) and hit Enter. This will open a command line tool already at the location.
- In the command line, run the first batch script by typing “upload_jar.bat” and hitting Enter. If you have completed the prerequisites, it should prompt you for the password of the user you entered in the Config.cmd file. Enter the Password and hit Enter.
- You should see the jar file begin to upload to the specified location
- Upon success, you now need to create the UDF using the resources we just uploaded. On the command line type “declare_udf.bat” and hit Enter.
- It will again prompt you for the password; enter the password and hit Enter.
- You should see the function get create and a message that the statement was executed successfully.
- Now, hop over to Snowflake, log in, and navigate to the database and schema where you created the function (based on your entries in the Config.cmd file)
- In a Snowflake worksheet in the database and schema you previously chose, run the statement
- SELECT sentiment(‘bad’);
- You should have 1 row returned with a number that represents the sentiment of the word “bad” on a scale from -1 to 1; -1 representing very negative and 1 representing very positive and 0 being neutral.
- You are now ready to move to Alteryx and use your function!
Build Your Own Snowflake In-DB Breakthrough
Now that you’ve seen how it works, I encourage you to go it on your own! Here’s what to do next:
- Download the Alteryx macro attached to this article and these Java files.
- Watch the video and build your workflow alongside it using your Snowflake account.
Hope you enjoy!