ADF Performance Monitor – Major New Version 9.0 (Part 1)
I’m very excited to announce a major new version of the ADF Performance Monitor: version 9.0!
We have added many valuable new features: new metrics that can detect and help explain poor performance, disruptions, and hiccups, and that help troubleshoot ADF applications. Among them are operating system metrics: the CPU usage of the ADF application, the total CPU usage of the whole underlying operating system, the total used and free physical (RAM) memory of the whole system, and the Linux load averages. High CPU and memory usage may indicate a poorly tuned or designed application, and optimizing the application can lower CPU utilization. Generic APM tools offer these kinds of metrics too in some form, but combining system metrics with the ADF-specific metrics of the ADF Performance Monitor makes it easier to relate performance problems to what is happening on the system.
Another reason to pay attention to system metrics is that more and more applications are deployed in the cloud, very likely on shared virtual machines and resources (CPU, memory, network). Applications and processes can influence each other when other processes frequently use a large share of the available CPU or memory capacity.
This blog (part 1) describes the first part of these new features. Part 2 describes the CPU execution time of individual HTTP requests and click actions. It answers the question: “What request/click action in the application is responsible for burning that CPU?”
CPU Usage and Performance
All critical IT infrastructure devices contain a central processing unit (CPU) that ultimately dictates the performance, stability, and capacity of the device. Overall machine performance is influenced by many resources, such as physical memory, disks, network, external services, the quality of the software, etc. For servers, CPU utilization is typically the primary metric that indicates current system performance levels as well as available capacity.
CPU performance latency occurs when one or more processes consume most of the processor time. Threads that are ready to be executed must wait in a queue for the processor to become available. Trying to overcome a processor bottleneck by throwing hardware at the problem (e.g., more CPU, more memory, faster disks, more network connections) will often not help.
Monitoring CPU usage helps you to analyze spikes in CPU load and identify overactive CPU usage. Depending on the CPU performance behavior, you can:
- upgrade the CPU or add multiple processors
- cut load
- find underlying performance bottlenecks caused by (ADF) program code
- avoid exorbitant costs arising due to unnecessary upgrades
- identify unnecessary background processes running
- find out the resource utilization of a process or application and its impact on the system as a whole
1 – CPU Load % of JVM Process and Whole Operating System
The monitor shows on day, hour and minute overviews:
- CPU load of the JVM process of the ADF application (grey). This depends on the activity in the ADF application program code or in libraries. It shows the impact of the application, especially when it puts (too) much load on the available CPUs. If this load is 100%, the CPUs are actively running threads from the JVM 100% of the time (this includes application threads as well as JVM internal threads). If the JVM process load is frequently (on average) more than 60% during peak hours, the monitor gives a warning; this should be a trigger to investigate and to try to bring it down. In the hour overview screenshot above, the CPU load of the JVM process was quite high: from 20:50 to 20:55 it was even more than 80% (!).
- CPU load of the whole underlying operating system (pink). This load depends on all the activity in the whole system. When other big background processes run on the same machine (!), it is important to monitor them, because the performance of the ADF application will likely be influenced. This can also be the case when multiple applications are deployed on the same machine. If the total system load is (on average) more than 80% during peak hours, the monitor gives a warning. It is wise to investigate this warning and to limit the CPU-burning processes. The hour overview screenshot above shows that from 20:20 to 20:30 the total system CPU load was 100% (!).
Another example in the following hour overview screenshot:
In this case we had a Linux server on which both an ADF application and a Jenkins server were deployed. A Jenkins job ran from 16:38 to 16:48 and temporarily took more than 80% of the total system CPU for 10 minutes. This was on a test server, but if this had happened on a busy production server under high load, the end-users would very likely have experienced slower response times than normal.
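To make the two load figures concrete, here is a minimal Java sketch of how a JVM can read its own process CPU load and the load of the whole system. It is an illustration only, not necessarily how the ADF Performance Monitor collects its data. It assumes a HotSpot-style JDK that provides `com.sun.management.OperatingSystemMXBean`; the class name `CpuLoadSample` and its helper methods are made up for this example, and the 60% and 80% thresholds are simply the warning levels mentioned above.

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of the ADF Performance Monitor.
final class CpuLoadSample {
    // Warning thresholds quoted in the text above, as fractions of 1.0.
    static final double JVM_WARN = 0.60;
    static final double SYSTEM_WARN = 0.80;

    /** Averages the samples, skipping negative values (the JDK's "not available" marker). */
    static double average(double[] samples) {
        double sum = 0.0;
        int count = 0;
        for (double s : samples) {
            if (s >= 0.0) {
                sum += s;
                count++;
            }
        }
        return count == 0 ? -1.0 : sum / count;
    }

    /** True when the average load is above the threshold; a negative (unavailable) average never is. */
    static boolean exceeds(double averageLoad, double threshold) {
        return averageLoad > threshold;
    }

    /** Formats a 0.0 to 1.0 load as a percentage, or "n/a" when it is unavailable. */
    static String percent(double load) {
        return load < 0.0 ? "n/a" : String.format(Locale.ROOT, "%.1f%%", load * 100.0);
    }

    public static void main(String[] args) throws InterruptedException {
        OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        int n = 5;
        double[] jvm = new double[n];
        double[] system = new double[n];
        for (int i = 0; i < n; i++) {
            jvm[i] = os.getProcessCpuLoad();   // load caused by this JVM process
            system[i] = os.getSystemCpuLoad(); // load of the whole system (getCpuLoad() on JDK 14+)
            Thread.sleep(1000);
        }
        double jvmAvg = average(jvm);
        double systemAvg = average(system);
        System.out.println("JVM process CPU load: " + percent(jvmAvg)
                + (exceeds(jvmAvg, JVM_WARN) ? " (warning)" : ""));
        System.out.println("System CPU load:      " + percent(systemAvg)
                + (exceeds(systemAvg, SYSTEM_WARN) ? " (warning)" : ""));
    }
}
```

Averaging several samples before comparing against a threshold mirrors the "on average" wording above: a single short spike should not trigger a warning by itself.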
2 – CPU Application Usage in Time
It is very useful to monitor the total execution time of the JVM process. This corresponds to some extent with the application CPU usage percentage, but it is a different measure: the total CPU time, in seconds, that the JVM process (the server on which the ADF application is deployed) has used. It gives insight into the CPU time consumption of the ADF application, especially when the load is high. Different releases can be compared, and ADF developers can try to bring down the (unnecessary) CPU time of expensive, ‘CPU burning’ operations in the ADF application. More on that in part 2.
The overview above shows the total CPU usage in time for a day (bottom right). We can clearly see a typical day pattern: end-users start around 08:00-09:00, high peak times are between 10:00 and 12:00, they have lunch between 12:00 and 13:00, and leave the office after 17:00.
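The difference between CPU time and CPU load percentage can be sketched in a few lines of Java. The example below reads the cumulative CPU time of the JVM process before and after a piece of work and relates it to the elapsed wall-clock time. It is only a sketch under assumptions: it relies on `com.sun.management.OperatingSystemMXBean` being available, the class `CpuTimeSample` and its methods are hypothetical names, and the busy loop merely stands in for real application work.

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration; not part of the ADF Performance Monitor.
final class CpuTimeSample {
    /** Converts a CPU time in nanoseconds to seconds. */
    static double nanosToSeconds(long nanos) {
        return nanos / 1_000_000_000.0;
    }

    /** Fraction of the total CPU capacity (all processors) used during the interval. */
    static double utilization(long cpuNanos, long wallNanos, int processors) {
        return (double) cpuNanos / ((double) wallNanos * processors);
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        long cpuStart = os.getProcessCpuTime(); // cumulative nanoseconds, -1 if unsupported
        if (cpuStart < 0) {
            System.out.println("Process CPU time is not supported on this platform");
            return;
        }
        long wallStart = System.nanoTime();

        // Stand-in for application work: burn some CPU.
        double sink = 0.0;
        for (int i = 1; i <= 50_000_000; i++) {
            sink += Math.sqrt(i);
        }

        long cpuUsed = os.getProcessCpuTime() - cpuStart;
        long wallUsed = System.nanoTime() - wallStart;
        int processors = Runtime.getRuntime().availableProcessors();

        System.out.println("Work result (ignore): " + sink);
        System.out.println("CPU time used (s):    " + nanosToSeconds(cpuUsed));
        System.out.println("Wall time (s):        " + nanosToSeconds(wallUsed));
        System.out.println("Share of capacity:    " + utilization(cpuUsed, wallUsed, processors));
    }
}
```

Because CPU time is cumulative, taking the difference between two readings gives the seconds consumed in that interval, which is the kind of figure that can be compared across releases.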
3 – Used System Memory vs Free System Memory
Just as it is important to monitor the JVM heap memory and garbage collections, it is important to monitor the memory of the whole operating system for over-consumption. How much is used, and how much is free? When an operating system runs out of physical memory, it starts swapping memory in and out, which is expensive and resource consuming. It is better to avoid that. The ADF Performance Monitor shows the used and free system memory:
The overview above shows a lack of free memory from 20:20 to 20:38, and an increase in system memory usage to the maximum of 16 GB.
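As a rough illustration of the used versus free figures, the following Java sketch asks the operating system for its total and free physical memory. It assumes `com.sun.management.OperatingSystemMXBean` is available; `SystemMemorySample` and its helpers are hypothetical names, and this is not presented as the monitor's own implementation. What exactly counts as "free" is platform dependent; on Linux, for example, memory used for file caching may not be counted as free.

```java
import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration; not part of the ADF Performance Monitor.
final class SystemMemorySample {
    /** Used physical memory is whatever is not reported as free. */
    static long usedBytes(long totalBytes, long freeBytes) {
        return totalBytes - freeBytes;
    }

    /** Converts bytes to GiB (1024^3 bytes). */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        // Deprecated since JDK 14 in favour of getTotalMemorySize()/getFreeMemorySize(),
        // but still present, so this compiles on older and newer JDKs.
        long total = os.getTotalPhysicalMemorySize();
        long free = os.getFreePhysicalMemorySize();

        System.out.println("Total physical memory (GiB): " + toGiB(total));
        System.out.println("Used physical memory (GiB):  " + toGiB(usedBytes(total, free)));
        System.out.println("Free physical memory (GiB):  " + toGiB(free));
    }
}
```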
4 – System Load Average (Linux Load Averages)
An alternative and excellent way to look at the load of the whole system is the Linux load averages. Linux load averages are “system load averages” that show the running thread (task) demand on the system as an average number of running threads on the processors plus waiting threads (threads queued for the available processors). This measures demand, which can be greater than what the system is currently processing. The ADF Performance Monitor shows the Linux load average of the last minute (the first of the three values that Linux reports in /proc/loadavg and in the uptime command).
It looks a bit like the CPU system load and CPU process time, but the difference is that the Linux load average includes threads waiting in the queue. During a very high (over)load, spikes are much more visible in this chart because it includes the waiting threads queued for the CPUs.
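For readers who want to look at this number themselves, here is a small Java sketch. The standard `java.lang.management.OperatingSystemMXBean.getSystemLoadAverage()` call returns the load average of the last minute (a negative value where it is not available, for instance on Windows), and the same value is the first field of a `/proc/loadavg` line on Linux. The class `LoadAverageSample` and its helpers are hypothetical, and dividing the load by the number of processors is a common rule of thumb for reading the value, not something the monitor is stated to do.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper class for illustration; not part of the ADF Performance Monitor.
final class LoadAverageSample {
    /** Parses the one-minute value from a /proc/loadavg line such as "0.52 0.58 0.59 1/389 12345". */
    static double parseOneMinute(String procLoadavgLine) {
        return Double.parseDouble(procLoadavgLine.trim().split("\\s+")[0]);
    }

    /** Load per processor (rule of thumb): above 1.0, demand exceeded the available processors. */
    static double perProcessor(double loadAverage, int processors) {
        return loadAverage / processors;
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double load = os.getSystemLoadAverage(); // last minute; negative if not available
        int processors = os.getAvailableProcessors();

        if (load < 0) {
            System.out.println("System load average is not available on this platform");
            return;
        }
        System.out.println("Load average (1 min): " + load);
        System.out.println("Processors:           " + processors);
        System.out.println("Load per processor:   " + perProcessor(load, processors));
    }
}
```

Unlike a CPU load percentage, this value is not capped: with 4 processors, a load average of 8 means that on average as many threads were waiting as were running.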
Read here an excellent article that explains Linux load averages and their interesting history in detail.
In part 2 of this blog I write about other new features of this major new release. It answers the question: “What request/click action in the application is responsible for burning that CPU?”
Free 10 Day Trial
We also have a free 10-day trial; you can request it on the main page of our website (adfpm.com).
You can purchase a license of the ADF Performance Monitor at our order page.