Alteryx Multi-Threaded Processing (AMP)

This article was written by Garabujo7 and originally appeared on the Alteryx Engine Works Blog here: https://community.alteryx.com/t5/Engine-Works/Alteryx-Multi-Threaded-Processing-AMP/ba-p/805984

AMP, Alteryx Multi-Threaded Processing Engine

The Alteryx Engine is the component that processes the data in Alteryx. We don't normally ask ourselves how Alteryx processes the data or what is under the hood (beyond the RAM and the hardware), but every workflow runs through this processing engine.

This new version, the AMP Engine, includes parallel processing ...

Parallel processing? Here is a brief explanation:

AMP Animation 2.gif

To make it easier, let us think of a service office that has only one window, so only one person can be served at a time. It is a sequential processing model: for the next person to be served, the one before has to finish. In this example, if each person takes 5 minutes to resolve their issue, serving four people takes 20 minutes.

Parallel processing, by contrast, is like having four service windows that serve four people at the same time, so in the same twenty minutes the office could serve 16 people, four times as many.
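The arithmetic behind the service-window analogy can be worked through in a few lines. This is only an illustration of the analogy, not of anything inside Alteryx; the function name `people_served` is made up for this sketch, and it assumes every person takes exactly 5 minutes.

```python
def people_served(windows: int, minutes_available: int, minutes_per_person: int = 5) -> int:
    """Count how many people are fully served within the time available."""
    # Each window works through its own queue one person at a time.
    served_per_window = minutes_available // minutes_per_person
    return windows * served_per_window


print(people_served(windows=1, minutes_available=20))  # 4
print(people_served(windows=4, minutes_available=20))  # 16
```

With one window, 20 minutes is enough for 4 people; with four windows working side by side, the same 20 minutes covers 16.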

What the AMP Engine does is break the data into packets that are processed in parallel, for faster execution. In other words, the Engine will use all your processing cores and RAM when you run the workflow.
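To make the idea of "packets processed in parallel" more concrete, here is a minimal Python sketch of the general split, process, and recombine pattern. It is not how the AMP Engine is implemented; the function names, the packet size, and the "double every value" step are all assumptions chosen for illustration. Note also that Python threads do not speed up CPU-bound work the way a native engine can, so this shows the structure of the approach rather than its performance.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_packets(records, packet_size):
    """Yield consecutive slices of `records`, each at most `packet_size` long."""
    for start in range(0, len(records), packet_size):
        yield records[start:start + packet_size]


def process_packet(packet):
    """Stand-in for the real per-record work: here, double every value."""
    return [value * 2 for value in packet]


def run_in_parallel(records, packet_size=1000, workers=None):
    """Process packets on a pool of worker threads and reassemble the results in order."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map hands packets to the workers and returns results in input order.
        processed = pool.map(process_packet, split_into_packets(records, packet_size))
        return [value for packet in processed for value in packet]


result = run_in_parallel(list(range(10)), packet_size=3, workers=4)
print(result)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Each packet is independent of the others, which is what allows several workers to handle them at the same time; the results are stitched back together in the original order at the end.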

How do I use AMP in my workflows? First, you need Alteryx Designer version 2020.2 or newer. To check which version of Alteryx you have installed, go to Help -> About.

Garabujo7_4-1594997609405.png

Garabujo7_5-1594997609409.png

If you do not have this version yet, you can update it at any time to start using AMP. Go to Help -> Alteryx Downloads.

Garabujo7_6-1594997609411.png

Choose version 2021.2 (or newer):

Garabujo7_0-1628265833727.png

If you have questions about which version to download or the installation process, you can consult this article with a quick guide to get you started with Alteryx Designer.

AMP is available for all workflows, but in this version you have to enable the AMP Engine for each workflow individually. Alternatively, you can choose to use the AMP Engine for all new workflows in the User Settings:

CristonS_0-1629824435931.png

To enable AMP for a single workflow, left-click on any blank part of the canvas.

Garabujo7_9-1594997609437.png

Then, in the Workflow - Configuration window on the left side of the screen, select Runtime and check the last option at the bottom, Use AMP Engine.

Garabujo7_10-1594997609445.png

Now you can run your workflow and feel the power of Alteryx's parallel processing engine, AMP.

How do you check whether you are using AMP? Look in the Results window for the following message: This is AMP Engine.

Garabujo7_11-1594997609452.png

I ran the following workflow on my local computer. It reads 10.4 million records from a SQL Server database and blends them with three files: two Excel files with 99K and 21K records, and one CSV file with 2.4K records.

Garabujo7_12-1594997609459.png

The blend is made with a Find Replace tool. The process takes one minute and ten seconds.

111.png

It is a big difference for a relatively large volume of data, although the best test is the one you carry out on your computer and with your data to validate it.

Considerations:

Like everything in life, results can vary and depend on many factors, such as the complexity of the workflow, which analytical blocks (tools) you use, the size of the data, and the hardware you have available.

Requirements for AMP:

The AMP Engine needs at least 400 MB of memory for each thread it uses to process a workflow. For example, with 8 threads, there must be at least 3.2 GB (8 x 400 MB) of memory available to AMP at run time.
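The memory requirement is a simple multiplication, and a short sketch makes it easy to check for your own machine. The 400 MB per thread figure comes from the requirement above; the function names are hypothetical, and the sketch assumes 1 GB = 1000 MB, which matches the 3.2 GB figure in the example.

```python
MB_PER_THREAD = 400  # minimum memory per thread, as stated in the requirement above


def min_memory_mb(threads: int) -> int:
    """Minimum memory (in MB) that must be available to run `threads` threads."""
    return threads * MB_PER_THREAD


def max_threads(available_mb: int) -> int:
    """Largest number of threads that fits in `available_mb` of memory."""
    return available_mb // MB_PER_THREAD


print(min_memory_mb(8))    # 3200 MB, i.e. 3.2 GB
print(max_threads(16000))  # 40
```

Running the numbers the other way round is useful too: if only 2 GB is free at run time, no more than 5 threads fit within the stated minimum.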