This article is by Michael F and originally appeared on the Alteryx Engine Works Blog here: https://community.alteryx.com/t5/Engine-Works-Blog/An-Introduction-to-Troubleshooting-Alteryx-Server/ba-p/401682
When Alteryx Server returns an error message, several log files can help identify the cause of the error. Identifying which log file to consult is the first task in troubleshooting. A brief description of the primary logs, as well as where and how to access them, is documented here.
In addition to obtaining the appropriate logs, the other basic steps in troubleshooting are working out how to replicate the issue and doing initial research into the error messages. The steps below are described in detail to give a deeper understanding of what should stand out when troubleshooting Server.
1. Understand which Server node is experiencing the issue
Troubleshooting your Server instance depends on which component of the Server is returning an error.
If a workflow isn’t working on the Server, you will need to investigate the Worker node and the existing Engine settings of the Server. Many people understand how to debug a workflow in Designer, so troubleshooting a workflow from the Worker node is the easiest strategy for fixing a workflow on Server. Open the workflow causing the error in Designer on the Worker node and troubleshoot it the same way you would on your local Designer instance. The goal is to understand what in the deployed workflow is causing the error, and then to make the appropriate changes. If this is not an option, turn on Engine logs in the System Settings; they mimic the Results window of a workflow execution.
If a Gallery action isn’t working, investigate the Gallery Node. Most of this information will be logged in the Gallery logs. Whether it’s uploading a workflow, sharing a workflow, adding users, or simply getting your Gallery up and running, the Gallery logs should be able to help pinpoint the cause.
If the Scheduler and assignments aren’t working, then investigate the Controller Node. The primary log file that captures Service activity will be the Service log. Since the Server is primarily run through the Service, it is never a bad idea to gather the Service log. In that same vein, it’s never a bad idea to get all log files.
On any of these nodes, crashes and performance issues are captured in the Event Viewer on whichever node is affected.
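The routing described in this step can be written down as a small lookup. This is only a sketch: the symptom keys (`workflow`, `gallery`, `scheduler`), the `Lead` class, and the `where_to_look` function are hypothetical names invented for illustration, and only the node and log pairings come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lead:
    node: str    # Server node to investigate first
    logs: tuple  # logs to gather, most relevant first


# Symptom keys are hypothetical labels; the pairings follow the article.
TRIAGE = {
    "workflow": Lead("Worker", ("Engine logs",)),
    "gallery": Lead("Gallery", ("Gallery logs",)),
    "scheduler": Lead("Controller", ("Service log",)),
}


def where_to_look(symptom: str, crash_or_slowdown: bool = False) -> Lead:
    """Return the node and logs to start with for a given symptom."""
    lead = TRIAGE[symptom]
    logs = lead.logs
    # Gathering the Service log is never a bad idea, so always include it.
    if "Service log" not in logs:
        logs = logs + ("Service log",)
    # Crashes and performance issues also show up in the Event Viewer.
    if crash_or_slowdown:
        logs = logs + ("Event Viewer",)
    return Lead(lead.node, logs)


print(where_to_look("gallery", crash_or_slowdown=True))
```

Keeping the mapping in one table makes it easy to extend as you learn which logs matter in your own environment.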
2. Replicate, if possible
In many cases, replicating an issue helps the most in understanding it, and in escalating it if needed. In the broken Server environment, identify the steps that were taken since things were seemingly “ok”, or since a known baseline. The simpler the environment, the easier it is to troubleshoot.
For example, if the Alteryx Service is running but the Gallery isn’t starting, revert the URL to localhost/gallery and test if the original settings work. Understanding a baseline for replication is a great way to start eliminating possibilities and understanding the root cause.
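A baseline check like the one in the example can be scripted so it is easy to repeat after each settings change. The sketch below uses only the Python standard library; the function name `gallery_status` is hypothetical, and the default URL assumes the Gallery has been reverted to `http://localhost/gallery` as described above. It only tells you whether a web server answers at that address, not whether the Gallery itself is healthy.

```python
import urllib.error
import urllib.request
from typing import Optional


def gallery_status(url: str = "http://localhost/gallery",
                   timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code served at `url`, or None if nothing answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A server answered, but with an error status such as 404 or 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, name not resolved, or the request timed out.
        return None


if __name__ == "__main__":
    status = gallery_status()
    print("no response" if status is None else f"HTTP {status}")
```

Running it once against the baseline URL and once against the customized URL shows quickly whether the URL change is what breaks the Gallery.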
3. Initial error message research
Now that the Server environment has been reverted to its simplest form in which the error can be replicated, and the Server node has been identified, the next step toward the correct solution is decoding the actual error message. Many times, an error will be specific to the Alteryx Server settings themselves; for example, the error X would indicate that the Server System Settings are improperly configured. If the error seems Alteryx related, searching the Alteryx Community should be the first thing to do.
If the error message seems to point to permissions, Active Directory, or the like, then it may be useful to research native Windows-related errors. Alteryx Server is directly dependent on the environment it runs on, so understanding the Windows environment and what could be wrong with it helps expose these errors.
If the error message seems to point to the underlying database, Mongo, then looking through Mongo documentation or error codes would be most helpful.
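The three research directions above can be sketched as a simple keyword triage. Everything in this block is an illustrative assumption: the keyword lists are guesses rather than an official catalogue of Alteryx, Windows, or Mongo error text, the function name `research_area` is hypothetical, and the sample message is made up rather than copied from a real log.

```python
# Each entry pairs a research direction with keywords that hint at it.
# The keywords are assumptions for illustration; tune them to the
# messages you actually see in your environment.
KEYWORDS = (
    ("Mongo documentation and error codes", ("mongo",)),
    ("Windows and Active Directory documentation",
     ("permission", "access is denied", "active directory")),
)

# Anything that looks Alteryx-specific goes to the Community first.
DEFAULT_AREA = "Alteryx Community"


def research_area(message: str) -> str:
    """Suggest where to start researching an error message."""
    text = message.lower()
    for area, words in KEYWORDS:
        if any(word in text for word in words):
            return area
    return DEFAULT_AREA


# Made-up sample message, not taken from a real Alteryx log.
print(research_area("Access is denied when writing to the staging folder"))
```

A match only suggests a starting point; a message that mentions permissions can still turn out to be caused by a Server setting.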
4. Ask Support!
While most cases can be solved with proper troubleshooting techniques, even the simplest errors sometimes require assistance. Alteryx Support will help in identifying the root cause and finding a resolution.
As a final checklist, following these steps will help you reach the fastest resolution to an Alteryx Server issue:
- Understand the node that’s breaking and gather the appropriate logs
- Understand the behavior by replicating
- Identify common causes with initial research
- Ask Support