Interpreting esxtop Statistics

The paper is titled “Interpreting esxtop Statistics”. esxtop is a utility provided by VMware that can be used to monitor and collect performance data. The sections covered include Introduction, CPU, Worlds and Groups, PCPUs, Global Statistics, Adapter/Device/VM screens, and I/O Throughput Statistics.
|Published (Last):||28 July 2016|
Esxtop allows monitoring and collection of data for all system resources: CPU, memory, disk and network. In batch mode, data can be redirected to a file for offline use. Many esxtop statistics are computed as rates. A rate is computed based on the refresh interval, the time between successive snapshots. The default refresh interval can be changed by the command line option “-d”, or the interactive command ‘s’.
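As a minimal sketch of how a rate statistic follows from two snapshots, the Python below divides the growth of a cumulative counter by the time elapsed between the snapshots. The function name and the sample values are hypothetical and chosen only for illustration; esxtop's internal counters are not shown in the text.

```python
def rate_percent(used_before, used_after, elapsed_seconds):
    """Percentage of the interval consumed between two snapshots.

    used_before / used_after: cumulative used time (seconds) at each snapshot.
    elapsed_seconds: time between the snapshots (the refresh interval).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 100.0 * (used_after - used_before) / elapsed_seconds


# Hypothetical snapshots taken 5 seconds apart; the counter grew by 2 seconds.
print(rate_percent(10.0, 12.0, 5.0))
```

A shorter refresh interval makes such a rate more responsive but also noisier, because each value averages over less time.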
The return key can be pressed to force a refresh. In each screen, data is presented at different levels of aggregation. It is possible to drill down to expanded views of this data. Each screen provides different expansion options. It is possible to select all or some fields for which data collection is done. In the case of interactive use of esxtop, the order in which the selected fields are displayed can be selected.
In the following sections, this document will describe the esxtop statistics shown in each screen and their usage. Esxtop uses worlds and groups as the entities to show CPU usage.
A group contains multiple worlds. Let’s use a VM as an example.
A powered-on VM has a corresponding group, which contains multiple worlds. The guest activities are represented mostly by the vcpu worlds. The VMX world assists the vcpu worlds (the hypervisor).
The usage of the VMX world is out of the scope of this document. There is only one vmx world for each VM. Note that groups can be organized in a hierarchical manner in ESX.
However, esxtop shows, in a flat form, the groups that contain some worlds. A more detailed discussion of groups is out of scope. CPU stats won’t show vmm worlds; this is not a problem. In esxtop, a PCPU refers to a physical hardware execution context.
When hyper-threading is unavailable or disabled, a PCPU is the same as a core. When hyper-threading is enabled, there are two PCPUs on each core. CPU load average: the arithmetic mean of CPU loads over 1 minute, 5 minutes, and 15 minutes, based on 6-second samples. CPU load accounts for the run time and ready time of all the groups on the host.
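The averaging described above can be sketched as a plain arithmetic mean over the most recent 6-second samples that fit in each window. This is an illustration only: the function name is hypothetical, the sample values are made up, and how each individual load sample is derived from run and ready time is not shown here.

```python
def load_average(samples, window_seconds, sample_period=6):
    """Arithmetic mean of the most recent samples covering window_seconds."""
    count = max(1, window_seconds // sample_period)
    recent = samples[-count:]
    if not recent:
        raise ValueError("no samples available")
    return sum(recent) / len(recent)


# Hypothetical history: 150 samples (15 minutes), with load rising in the last minute.
samples = [0.5] * 140 + [1.0] * 10

one_minute = load_average(samples, 60)        # last 10 samples
five_minutes = load_average(samples, 300)     # last 50 samples
fifteen_minutes = load_average(samples, 900)  # all 150 samples
print(one_minute, five_minutes, fifteen_minutes)
```

Comparing the three values shows the trend: a 1-minute average well above the 15-minute average indicates that load has risen recently.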
A high CPU load means that you are using a lot of CPU resources. A core is utilized if either or both of the PCPUs on this core are utilized. The percentage utilization of a core is not the sum of the percentage utilization of both PCPUs. Let’s use a few examples to illustrate this. Based on esxtop batch output, we can use something like below.
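As an illustration of the rule that a core counts as utilized whenever either of its PCPUs is utilized, and not as the paper's own expression, the sketch below derives the range in which core utilization must fall given the two PCPU utilizations. The function name is hypothetical. The bounds concern time-based utilization only: the core is busy at least as long as its busier PCPU, and at most as long as both PCPUs combined if their busy periods never overlap.

```python
def core_util_bounds(pcpu0_util, pcpu1_util):
    """Return (lower, upper) bounds on core utilization, in percent.

    lower: the busy periods of the two PCPUs overlap completely.
    upper: the busy periods never overlap (capped at 100%).
    """
    lower = max(pcpu0_util, pcpu1_util)
    upper = min(100.0, pcpu0_util + pcpu1_util)
    return lower, upper


print(core_util_bounds(30.0, 20.0))  # overlap decides where in the range the core falls
print(core_util_bounds(60.0, 50.0))  # the sum exceeds 100%, so the cap applies
```

The second call shows why summing the two PCPU values is not valid: the sum would be 110%, which a single core cannot reach.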
The two PCPUs in a core share a lot of hardware resources, including the execution units and cache.
Let’s use some examples to illustrate this. Please note that the above inequalities may not hold due to frequency scaling, which is discussed next.
The frequency of a PCPU may be changed due to power management. Obviously, a PCPU does less “effective work” in a unit of time when the frequency is lower. If the effective frequency is 1. Please note that since the CPU frequency may change often, you can go to the esxtop power screen by pressing ‘p’ to see how much time the PCPU spends in each state, which can help estimate the effective frequency.
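A rough sketch of the frequency effect, under the assumption that effective work scales linearly with clock frequency: utilization is scaled by the ratio of effective to nominal frequency, and the effective frequency is estimated as a time-weighted average over power states. The function names, frequencies, and residency figures are hypothetical and are not taken from esxtop output.

```python
def estimate_effective_ghz(state_residency):
    """Time-weighted average frequency.

    state_residency maps a frequency in GHz to the share of time
    (any consistent unit, e.g. percent) spent at that frequency.
    """
    total = sum(state_residency.values())
    if total <= 0:
        raise ValueError("residency shares must sum to a positive value")
    return sum(ghz * share for ghz, share in state_residency.items()) / total


def effective_work_percent(util_percent, effective_ghz, nominal_ghz):
    """Scale time-based utilization by the effective/nominal frequency ratio."""
    if nominal_ghz <= 0:
        raise ValueError("nominal_ghz must be positive")
    return util_percent * effective_ghz / nominal_ghz


# Hypothetical PCPU: half the time at 2.4 GHz, half at 1.6 GHz, nominal 2.4 GHz.
effective = estimate_effective_ghz({2.4: 50.0, 1.6: 50.0})
print(effective)
print(effective_work_percent(80.0, effective, 2.4))
```

With turbo mode the same ratio exceeds 1, so the scaled value can be higher than the time-based utilization.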
Please also note that turbo mode may make the effective frequency higher than the nominal frequency. If we want to take both reasons into account, just to make it more complicated, we can have something like this. It is very likely that hyper-threading is enabled. Suppose that the CPU frequency is fixed to the base frequency. It is likely due to hyper-threading. The stats are related, but not the same.
A group statistic is the sum of the world statistics for all the worlds contained in that group. So, this section focuses on worlds. You may apply the description to the group as well, unless stated otherwise. ESX can make use of hyper-threading technology, so the performance counters take hyper-threading into consideration as well. But, to simplify this document, we will ignore HT-related issues.
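The relationship between group and world statistics can be sketched as a simple sum. The world names, the `used` field, and the values below are hypothetical and serve only to show the aggregation.

```python
def group_stat(worlds, field):
    """Sum one statistic over all worlds contained in a group."""
    return sum(world[field] for world in worlds)


# Hypothetical worlds of one powered-on VM with two VCPUs.
vm_worlds = [
    {"name": "vmx", "used": 1.5},
    {"name": "vcpu-0", "used": 40.0},
    {"name": "vcpu-1", "used": 35.5},
]

group_used = group_stat(vm_worlds, "used")
world_count = len(vm_worlds)
print(group_used, world_count)
```

Because the group value is a sum, it can exceed 100% for a VM with several busy VCPU worlds.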
Please refer to the “Resource Management Guide” for more details. The percentage of physical CPU time accounted to the world. If a system service runs on behalf of this world, the time spent by that service is charged to this world.
If not, the time spent is not charged to this world. Yes, if the system service runs on a different PCPU for this world. The system services are accounted to VCPU 0. The group stats are the sum of the world stats. NWLD is the number of worlds in the group. Among all the worlds, the VCPU worlds best represent the guest. You may expand the group to see which worlds use the most CPU. The percentage of time spent by system services on behalf of the world. The possible system services are interrupt handlers, bottom halves, and system worlds.
They are totally different. For Linux OS, user (system) time for a process is the time spent in user (kernel) mode. The percentage of time spent by system services on behalf of other worlds. In more detail, let’s use an example. The time spent by ‘S’, annotated as ‘t’, is included in the run time of ‘W1’.
It does not necessarily mean the VM is under resource constraint. This may happen even when there are plenty of free CPU cycles. If your application speed in the VM is OK, you may tolerate a higher threshold.
A world can be in different states, either scheduled to run, ready to run but not scheduled, or not ready to run waiting for some events. It means the VM is possibly under resource contention.
The percentage of time the world was ready to run but deliberately wasn’t scheduled because that would violate the “CPU limit” settings.
If you want to improve the performance of this VM, you may increase its limit. However, keep in mind that it may reduce the performance of others.
The percentage of time the world spent in the ready, co-deschedule state. VMM worlds represent the guest behavior the best. Most of the time, the other worlds are waiting for events. This is a very rough estimate, due to two reasons. These worlds are waiting for events most of the time. The percentage of time the VCPU world is in the idle loop. It is important to note that some statistics refer to guest physical memory while others refer to machine memory.
Let’s use the following figure to explain. In the figure, two VMs are running on an ESX host, where each block represents 4 KB of memory and each color represents a different set of data on a block. Inside each VM, the guest OS maps the virtual memory to its physical memory. ESX Kernel maps the guest physical memory to machine memory.
Due to ESX Page Sharing technology, guest physical pages with the same content can be mapped to the same machine page. Memory overcommit is the ratio of total requested memory to the “managed memory”, minus 1. VMKernel computes the total requested memory as a sum of several components. An overcommit value greater than 0 means that the total requested guest physical memory is more than the machine memory available.
This is fine, because ballooning and page sharing allow memory overcommit. This metric does not necessarily mean that you will have performance issues. See the above description for details. Roughly speaking, it reflects the ratio of requested memory to available memory.
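The overcommit definition given earlier (total requested memory divided by managed memory, minus 1) can be written out as a short sketch. The function name and the memory sizes are hypothetical, and the components that make up the total requested memory are taken as a single input here.

```python
def memory_overcommit(total_requested_mb, managed_mb):
    """Ratio of total requested memory to managed memory, minus 1."""
    if managed_mb <= 0:
        raise ValueError("managed_mb must be positive")
    return total_requested_mb / managed_mb - 1.0


# Hypothetical host with 16 GB of managed memory.
print(memory_overcommit(24576, 16384))  # VMs request 24 GB: positive, overcommitted
print(memory_overcommit(12288, 16384))  # VMs request 12 GB: negative, not overcommitted
```

A positive result only says that more memory was requested than is managed; whether performance suffers depends on how much of it the guests actively use.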
It is the machine memory reported by the BIOS.