This article is by David Hare and originally appeared on the Alteryx Engine Works Blog here: https://community.alteryx.com/t5/Engine-Works-Blog/Tackling-Queued-Jobs-With-Queueing-Theory-Part-2/ba-p/486450
Disclaimers
- This is the second article in a two-part series on using Queueing Theory to address queued jobs in Alteryx Server. If you missed Part 1, you can find it here. That article covers many concepts that make this one easier to follow, so it is highly recommended (dare I say required?) to read Part 1 before proceeding.
- Attached you'll find a sample workflow. Please note, this is just that, a sample workflow. This is not an official sizing application. This doesn't factor in other considerations such as additional applications or processes that are also running, Controller, Gallery, Insights, Map Rendering, etc… For an official Server sizing and architecture discussion, please reach out to your Alteryx representative.
The Sample Workflow
Attached is the sample workflow. It leverages Interface tools and Reporting tools and is intended to be executed through the Gallery. Note, you will need to run this workflow with Administrative privileges the first time as there are some Python packages that need to be installed for all the Python code to work.
The Inputs
The sample Server Queue Calculator requires two inputs:
- The job arrival rate (λ): the average number of jobs that 'arrive' per hour. This should include both scheduled jobs and on-demand jobs executed by users in the Gallery. Obviously some hours will be busier than others. The great thing about this workflow is its interactive nature: you can run the analysis with an average number, a peak usage number, and so on.
- The average job runtime (in seconds): how long jobs run, on average. In Part 1 we looked at the exponential distribution of job runtimes on Server. Try to pick a representative average number to start, then explore the behavior as you move towards the tail.
A couple of tips... If you need help determining these values, the Alteryx Server Usage Report is a great resource. Also, when in doubt, round up. It's always better to have slightly more resources than not enough.
The workflow then calculates the job service rate (µ) with the equation:
µ = 3600 / (avg job runtime)
This gives us the number of jobs that can be 'serviced' in one hour, assuming one 'server', which in our case is the number of simultaneously running workflows. After that simple calculation, the magic of Python kicks in and calculates an entire table of values for increasing 'server' amounts.
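As a rough illustration of that first step, the service rate calculation can be sketched in a few lines of Python. The function name and the 60 second runtime are assumptions made for this example. They are not taken from the attached workflow.

```python
def service_rate_per_hour(avg_runtime_seconds: float) -> float:
    """Jobs that one simultaneous workflow slot can complete per hour (µ)."""
    if avg_runtime_seconds <= 0:
        raise ValueError("average job runtime must be positive")
    return 3600 / avg_runtime_seconds


# Hypothetical input: jobs that take 60 seconds on average.
mu = service_rate_per_hour(60)
print(f"Service rate: {mu:.1f} jobs per hour per simultaneous workflow")
```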
Note, the Queue Calculator logic starts going wonky when the number of 'servers' goes beyond 100. If the input parameters are too high, a friendly message will be displayed.
The Outputs
The sample Server Queue Calculator outputs 4 reports into a single output page. Let's look at each of these...
Summary Report
The summary report prints out the input parameters for validation, and then a set of Minimum and Recommended values.
The Minimum # of Simultaneous Workflows is a simple calculation we looked at near the end of Part 1, which just ensures we have enough 'servers' to handle the incoming arrival rate based on our service rate:
c = CEIL (λ / µ)
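Here is a minimal Python sketch of that formula. The function name and the sample rates (100 jobs arriving per hour, 60 jobs serviced per hour) are hypothetical and chosen only for illustration.

```python
import math


def minimum_simultaneous_workflows(arrival_rate: float, service_rate: float) -> int:
    """Smallest whole number of 'servers' c such that c * µ >= λ."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival rate must be >= 0 and service rate must be > 0")
    return math.ceil(arrival_rate / service_rate)


# Hypothetical inputs: λ = 100 jobs/hour, µ = 60 jobs/hour per 'server'.
print(minimum_simultaneous_workflows(100, 60))
```

One thing worth keeping in mind: when λ / µ happens to be a whole number, this minimum leaves the 'servers' 100% utilized, and in standard queueing models the queue then never settles down, so treat the minimum as a floor rather than a target.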
Also printed out are the Engine Utilization (ρ), Probability of Queued Jobs, Average number of Queued Jobs, and the Average Queue Time. All these values are based on the Minimum # of Simultaneous Workflows value.
Additionally, a "Recommended" value for # of Simultaneous Workflows is displayed, which is based on the Knee in the curve values in the subsequent charts. The calculated Engine Utilization, Probability of Queued Jobs, Average number of Queued Jobs, and the Average Queue Time are displayed to compare against the "Minimum" values above.
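To make these four values more concrete, here is a minimal sketch of how they could be computed, assuming the standard M/M/c queueing model (the Erlang C formula) with Little's law for the queue time. The attached workflow's actual Python code may differ, and the function and key names are hypothetical.

```python
import math


def mmc_queue_metrics(arrival_rate: float, service_rate: float, servers: int) -> dict:
    """Queue metrics for an assumed M/M/c model. Rates are in jobs per hour."""
    if arrival_rate <= 0 or service_rate <= 0 or servers < 1:
        raise ValueError("rates must be positive and servers must be at least 1")
    offered_load = arrival_rate / service_rate   # a = λ / µ
    utilization = offered_load / servers         # ρ = λ / (c * µ)
    if utilization >= 1:
        raise ValueError("utilization must be below 100% for a stable queue")

    # Erlang C: probability that an arriving job finds every 'server' busy.
    waiting_term = offered_load ** servers / math.factorial(servers) / (1 - utilization)
    other_terms = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    prob_queued = waiting_term / (other_terms + waiting_term)

    avg_queued_jobs = prob_queued * utilization / (1 - utilization)
    # Little's law: average queue time = average queued jobs / arrival rate.
    avg_queue_seconds = avg_queued_jobs / arrival_rate * 3600
    return {
        "utilization": utilization,
        "prob_queued": prob_queued,
        "avg_queued_jobs": avg_queued_jobs,
        "avg_queue_seconds": avg_queue_seconds,
    }


# Hypothetical inputs: 100 jobs/hour arriving, 60 second average runtime (µ = 60).
for c in (2, 4):
    print(c, mmc_queue_metrics(100, 60, c))
```

With these assumed inputs, the sketch gives roughly 3.8 queued jobs and about 136 seconds of queue time at 2 simultaneous workflows, and only a few seconds of queue time at 4.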
Queueing & Utilization Curve
This chart packs in a lot of useful information. It shows how the overall engine utilization and the probability of queued jobs are expected to change as the number of simultaneous workflows increases. You can think of the engine utilization here as the % of time that the configured Simultaneous Workflows will be busy running jobs. The green Knee line indicates the point beyond which further increasing the number of simultaneous workflows doesn't provide a significant reduction in the probability of queued jobs. After this point, the return on investment starts to drop.
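For readers curious how a 'knee' can be located numerically, here is one common heuristic: normalize both axes, then pick the point farthest from the straight line joining the first and last points of the curve. This is only a sketch of a generic technique and not necessarily the method the sample workflow uses. The probabilities below are hypothetical, rounded values.

```python
def find_knee_index(xs: list, ys: list) -> int:
    """Index of the point farthest from the chord between the curve's endpoints."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) points of equal length")
    x_span = xs[-1] - xs[0]
    y_span = ys[-1] - ys[0]
    if x_span == 0 or y_span == 0:
        raise ValueError("curve must vary along both axes")

    best_index, best_distance = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # After normalization the chord runs from (0, 0) to (1, 1), so the
        # distance to it is proportional to |nx - ny|.
        nx = (x - xs[0]) / x_span
        ny = (y - ys[0]) / y_span
        distance = abs(nx - ny)
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index


# Hypothetical curve: probability of queued jobs for 2..8 simultaneous workflows.
servers = [2, 3, 4, 5, 6, 7, 8]
prob_queued = [0.758, 0.300, 0.102, 0.030, 0.008, 0.002, 0.0004]
knee = find_knee_index(servers, prob_queued)
print(f"Knee at {servers[knee]} simultaneous workflows")
```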
Number of Queued Jobs
This chart provides an in depth look at how many jobs we can expect to be in "queued" state at any given moment, given the configured number of simultaneous workflows. In this example, with 2 simultaneous workflows, we can expect roughly 4 jobs in queue all the time. But with 4 simultaneous workflows, that number drops to almost zero.
Average Queue Time
This chart provides an in-depth look at how much time jobs will spend in "queued" state before getting a chance to run. In this example, with 2 simultaneous workflows, jobs would on average spend 136 seconds in the queue, compared to just 2 seconds in the queue with 4 simultaneous workflows.
Turning Theory Into Action
The "Recommended" number of simultaneous workflows is based on the maximum of the Knee in curve values, and assuming accurate job arrival rates and job run times, using the recommended number should produce good results. However, configuring Alteryx Server with this many Simultaneous Workflows will NOT guarantee there will be no queued jobs. It's important to remember that the above calculations are based on averages to give an idea of how Alteryx Server might behave. They don't factor in sudden bursts of job arrivals or increased job run times.
Additionally, and this is extremely important, the above calculations are only accurate with the assumption that each 'server' (Simultaneous Workflow) has the same resource capacity as was used in the initial inputs that produced the average job runtime. This means I can't simply increase the # of Simultaneous Workflows setting on my Worker and expect amazing results. Why not?
Consider a standard 4-core 16GB Alteryx Server, which is using our default recommendation of Simultaneous Workflows = 2 (1/2 of physical cores). Each engine icon below represents a configured Simultaneous Workflow.
If the sample Server Queue Calculator recommends that we should be using 4 Simultaneous Workflows, and we simply increase that number in our Worker System Settings, we'd have a server that looks like this:
The problem with this is that each of those Engines (Simultaneous Workflows) will have fewer resources available to process its workflows, since there are more running processes to share the system with. This almost certainly means that jobs will take longer to complete than before, and thus the calculations are no longer accurate.
Instead, if we provision an additional Alteryx Server Worker with 4-cores and 16GB of memory (identical to the existing server), and configure the 2nd server to also have 2 Simultaneous Workflows, then we've kept the resource allocation to each Engine the same as before, and the calculations hold true. In this scenario, job run times should be consistent with before no matter which Worker they execute on, and the number of jobs we can service per hour has doubled compared to before.
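The arithmetic behind that doubling can be sketched as follows. The function name is hypothetical, and the 60 second runtime is an assumed input. The sketch only holds under the condition described above: every Simultaneous Workflow keeps the same resources, so the average runtime does not change.

```python
def hourly_capacity(workers: int, workflows_per_worker: int, avg_runtime_seconds: float) -> float:
    """Jobs per hour that all Simultaneous Workflows can service together.

    Assumes identical Workers, so the average runtime is the same on each.
    """
    if workers < 1 or workflows_per_worker < 1 or avg_runtime_seconds <= 0:
        raise ValueError("workers, workflows per worker and runtime must be positive")
    return workers * workflows_per_worker * 3600 / avg_runtime_seconds


# Hypothetical 60 second average runtime, 2 Simultaneous Workflows per Worker.
one_worker = hourly_capacity(1, 2, 60)
two_workers = hourly_capacity(2, 2, 60)
print(f"1 Worker: {one_worker:.0f} jobs/hour, 2 Workers: {two_workers:.0f} jobs/hour")
```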
Conclusion
Whether you are currently experiencing frequent queued jobs, or you want to plan for future growth, understanding Queueing Theory and how to apply it to Alteryx Server can have a tremendous impact on improving the end user experience. The attached sample workflow allows you to understand queueing behavior based on current job arrival rates and run times. Or you can play 'what if' scenarios to understand the impact of a new department being added, increasing job arrival rate by 20%, or if data sizes increase causing job run times to take 10% longer.
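Outside the workflow, the same kind of 'what if' comparison can be sketched in plain Python. The example below assumes an M/M/c model (Erlang C) and hypothetical baseline inputs of 100 jobs per hour, a 60 second average runtime, and 4 simultaneous workflows. None of these names or numbers come from the attached workflow.

```python
import math


def avg_queue_seconds(jobs_per_hour: float, runtime_seconds: float, servers: int) -> float:
    """Average time a job waits in queue, in seconds, under an assumed M/M/c model."""
    if jobs_per_hour <= 0 or runtime_seconds <= 0 or servers < 1:
        raise ValueError("inputs must be positive")
    load = jobs_per_hour * runtime_seconds / 3600   # offered load a = λ / µ
    rho = load / servers                            # utilization ρ
    if rho >= 1:
        raise ValueError("utilization must be below 100% for a stable queue")
    waiting = load ** servers / math.factorial(servers) / (1 - rho)
    others = sum(load ** k / math.factorial(k) for k in range(servers))
    prob_queued = waiting / (others + waiting)
    queued_jobs = prob_queued * rho / (1 - rho)
    return queued_jobs / jobs_per_hour * 3600       # Little's law


# Hypothetical scenarios: (jobs per hour, average runtime in seconds).
scenarios = {
    "baseline": (100, 60),
    "20% more arrivals": (100 * 1.2, 60),
    "10% longer runtimes": (100, 60 * 1.1),
}
for name, (rate, runtime) in scenarios.items():
    print(f"{name}: {avg_queue_seconds(rate, runtime, servers=4):.1f} s in queue")
```

Under these assumptions, both scenarios lengthen the average queue time compared to the baseline, which is the kind of impact you would want to see before the new department arrives rather than after.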
Feel free to leave a comment if you have any questions, similar ideas, or most importantly, if you thought of another 5 consecutive vowel word!