How To Optimize Angelfish Performance

Creation date: 4/18/2022 10:44 PM    Updated: 5/31/2022 7:21 PM
This article lists recommendations for optimizing the performance of your Angelfish instance. These recommendations are especially relevant for large-scale or high-traffic environments.

If you have a question about anything on this page, please open a support ticket.

HARDWARE

At a basic level, Angelfish is a database application.  The best way to improve performance is to use the best hardware available: fast disk, multiple CPU cores, and enough RAM for large requests to complete in-memory.

Disk I/O
We recommend storing Angelfish's application files and Data Directory on local storage, ideally local SSD storage.  Network-attached storage can be used for log files, but not for Angelfish application and data files: as a database application, Angelfish depends on fast disk access.

CPU Threads
Dedicated servers typically have cores, and virtual or container-based servers typically have threads.  In this article, "threads" means CPU cores or CPU threads, depending on your server type.

We recommend a minimum of 4 threads for your Angelfish instance.  By default, each individual Angelfish report request uses 1 thread.  If more threads are available, you can allocate multiple threads to each report request on a per-Profile basis, using the API Threads setting in the Advanced tab.

The law of diminishing returns applies here: each additional thread provides an improvement, but a smaller one than the thread before it.
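The shape of that curve can be illustrated with Amdahl's law, a general model of parallel speedup. This is only a sketch: the function name and the 80% parallel fraction are assumptions chosen for illustration, not measured Angelfish behavior.

```python
def amdahl_speedup(threads: int, parallel_fraction: float) -> float:
    """Theoretical speedup when only part of a job can run in parallel."""
    if threads < 1:
        raise ValueError("threads must be >= 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Assumption for illustration only: 80% of a report request parallelizes.
    previous = amdahl_speedup(1, 0.8)
    for threads in range(2, 9):
        current = amdahl_speedup(threads, 0.8)
        print(f"{threads} threads: {current:.2f}x (gain {current - previous:.2f})")
        previous = current
```

Under this model the gain from each extra thread keeps shrinking, which is why it pays to test a Profile with 2 or 3 threads before allocating more.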

When allocating 3 or more threads for each report request, disk I/O becomes a bottleneck unless SSD storage is used.

RAM
Active data processing jobs and report requests use RAM, and you want to have enough RAM so your OS doesn't use the page file.  We recommend a minimum of 8 GB, increasing as your environment dictates.

Angelfish will work with less RAM (e.g. 2-4 GB), but keep an eye on your resources when making large report requests.

CONFIG


Remove Unwanted Data
Angelfish reports are date range-based: when a report is requested, all activity during the date range is evaluated while the report is built. The more activity exists in the date range, the longer the report takes to load.  

If you see unwanted data in your reports, we recommend doing two things:
  • Use the Delete Specific Visits feature on the Run/Data Management tab to remove the data
  • Create a Filter (or edit an existing Filter) to block the data from returning

Help Article: Filters Overview

Disable IT Reports
Disabling IT Reports makes sense when several Profiles read the same logs (e.g. a Profile for each subsite) and IT Reports data isn't needed in every Profile. You can disable all IT Reports by selecting "Disable" for the Hit File Types and Download File Types fields in the IT Reports section of the Settings tab.

Limit Page Query Parameters
Most Page Query Parameters are useful for the web server but are irrelevant from a reporting perspective. If you need Query Parameters in your reports, we recommend using the Include option and only adding necessary query parameters.
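To show the effect of an include list, here is a minimal Python sketch. It does not use Angelfish's configuration syntax; the function name and the example parameter names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def keep_query_params(url: str, include: set) -> str:
    """Return the URL with every query parameter dropped except those in `include`."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in include
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Hypothetical example: only "q" and "page" matter for reporting.
    print(keep_query_params("/search?q=shoes&sessionid=abc123&page=2", {"q", "page"}))
```

With fewer distinct parameter combinations, fewer unique Page values need to be stored and evaluated for each report.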

Reduce Page File Types
If you use a log-based tracking method (SID, USR, IPUA or IP), you must specify the File Types that will be counted as Pages.  This is configured in the "Pageview File Types" field in the Settings tab.  We recommend using the Include option with a list of File Types.
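The idea behind an include list of File Types can be sketched as follows. This is an illustration only: the list of extensions and the function name are assumptions, and Angelfish's own matching rules (for example, how extensionless URLs are handled) may differ.

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Hypothetical include list; use the File Types your site actually serves as Pages.
INCLUDE_TYPES = {"html", "htm", "php", "aspx"}


def is_pageview(request_url: str, include_types: set = INCLUDE_TYPES) -> bool:
    """Return True when the request's file extension is in the include list."""
    path = urlsplit(request_url).path
    extension = splitext(path)[1].lstrip(".").lower()
    return extension in include_types


if __name__ == "__main__":
    for url in ("/docs/guide.html?x=1", "/img/logo.png", "/css/site.css"):
        print(url, is_pageview(url))
```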

Strip Unique Strings from Pages
Some web servers embed a session ID in the URL, which greatly increases Page cardinality because every session produces a different URL for the same Page. You can use an Advanced filter to strip these strings from your Pages.
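As an illustration of the kind of pattern involved, the Python sketch below strips a `;jsessionid=...` segment, the form that Java servlet containers append to URLs. It is not Angelfish filter syntax, and the function name and sample URLs are hypothetical; adapt the regular expression to the session string your server emits.

```python
import re

# Matches ";jsessionid=<value>" up to the next "?", "#", "/" or ";".
SESSION_ID = re.compile(r";jsessionid=[^?#/;]*", re.IGNORECASE)


def strip_session_id(page: str) -> str:
    """Remove the session ID segment so identical Pages collapse to one value."""
    return SESSION_ID.sub("", page)


if __name__ == "__main__":
    pages = [
        "/cart/view.jsp;jsessionid=1A2B3C?item=5",
        "/cart/view.jsp;jsessionid=9F8E7D?item=5",
    ]
    # Both requests reduce to the same Page once the session ID is removed.
    print({strip_session_id(page) for page in pages})
```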

External Sites Only: Enable "Ignore Inflated Visits"
This feature excludes Visits that exceed a Pageview count threshold: the default setting is 100 Pageviews, and this number is editable.  Most visits that exceed this number are robots / crawlers / scanners.
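The threshold logic amounts to the following sketch. The data structure and function name are hypothetical; only the default of 100 Pageviews comes from the setting described above.

```python
def drop_inflated_visits(pageviews_by_visit: dict, threshold: int = 100) -> dict:
    """Keep only Visits whose Pageview count does not exceed the threshold."""
    return {
        visit_id: pageviews
        for visit_id, pageviews in pageviews_by_visit.items()
        if pageviews <= threshold
    }


if __name__ == "__main__":
    # Hypothetical Visits: the third one looks like a crawler.
    visits = {"visit-1": 12, "visit-2": 100, "visit-3": 4500}
    print(drop_inflated_visits(visits))
```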

agf.conf


The agf.conf file is located in the root Angelfish directory and contains various config options for your Angelfish instance. If you make any changes to agf.conf, you must restart Angelfish for the changes to be applied.

max_log_processors
This variable sets the number of Profiles that Angelfish will process simultaneously: the default value is 2.  You can increase this value, but we've found that low-powered systems can become I/O bound in the 3-4 range.  Edit with care.

max_threads
This variable allocates CPU threads for each report request for all Profiles in the Application.  We don't recommend changing this value (default = 1).

Instead, use the API Threads setting in each Profile's Advanced tab to allocate threads to the individual Profiles that need a performance boost.