
Dell InsightIQ 4.3.0.0 User Guide

Modules in Performance Reports

Modules display pre-configured collections of information about the monitored cluster.

Table 1. Module names and descriptions
Module Description
Active Clients Shows the number of unique client addresses that generate protocol traffic on the monitored cluster. Clients that are connected, but not generating any traffic, are not counted. You can optionally break out this data by protocol or node.

For NFS, the recommended maximum number of active clients that are supported is 1,000 per node.

For SMB 2.0 and later, the recommended maximum number of active clients that are supported is 1,500 per node.

NOTE: Some protocols might issue a type of ping message, which can cause clients to appear active even if they are only sending these ping messages.
Average Cached Data Age Indicates the average amount of time that data has been in the L1 and L2 caches. Shorter times indicate that data is moving faster through cache and that older data is being replaced quickly.
Average Disk Hardware Latency Shows the average amount of time that it takes for the physical disk hardware to service an operation or transfer. You can optionally break out this data by node or disk.
Average Disk Operation Size Shows the average size of the operations or transfers that the disks in the cluster are servicing. You can optionally break out this data by direction, node, or disk.
Average Pending Disk Operations Count Shows the average number of operations or transfers that are in the processing queue for each disk in the cluster, that is, the queue depth of disk operations. You can optionally break out this data by node or disk. Counts of 10 or fewer usually mean that the drives are operating as intended. Counts higher than 10 could indicate backlogs in disk operations, which sometimes result in performance issues with the cluster. The graph is most meaningful when viewing the disk detail breakout.
Blocking File System Events Rate Shows the number of file blocking events occurring in the file system per second. You can optionally break out this data by path or node.
Cache Hits Shows the number of hits to L2 and L3 cache per second. You can optionally break out this data by job, node, or service.
Cluster Capacity Shows the amount of storage on the cluster.
  • Total Capacity - The total amount of storage capacity on the cluster.
  • Allocated Capacity - The storage capacity of nodes that belong to node pools that include three or more nodes. In the OneFS command-line interface, Allocated Capacity is referred to as "size" in the output of the isi status command. If all node pools include at least three nodes, the Allocated Capacity is the same as the Total Capacity.
  • Writable Capacity - The amount of storage capacity that user data can be written to on the cluster.
  • User Data Including Protection - The amount of storage capacity that is in use by user data and protection.
Connected Clients Shows the number of unique client addresses with established TCP connections to the cluster on known ports. You can optionally break out this data by protocol or node.
NOTE: UDP connections do not appear as connected. Also, some short-lived TCP connections might not appear as connected, although they are active.
Contended File System Events Rate Shows the number of file contention events, such as lock contention or read/write contention, occurring in the file system per second. You can optionally break out this data by path or node.
CPU % Use Shows the average CPU usage for all nodes in the monitored cluster. Because some nodes may consume significantly more or less CPU than others, the cluster-wide value is derived from the individual CPU-usage averages of each node.
NOTE: You can optionally break out this data by node. This breakout indicates the average CPU usage of each node. For example, a breakout value of 14.35% at 10:52:22 AM on October 31, 2022 means that the specified node was using 14.35% of its total available CPU capacity at that time.
CPU Usage Rate Shows the rate of CPU time that is consumed on all cores per second. You can optionally break out this data by job, node, or service.
Deadlocked File System Events Rate Shows the number of file system deadlock events that the file system is processing per second. This information can be useful if you want to identify a specific file state that might be contributing to performance issues. Deadlocked events occur regularly during normal cluster operation, and the file system is designed to detect and break them. You can optionally break out this data by path or node.
Deduplication Summary (Logical) Shows the amount of space that deduplication has saved on the cluster and the amount of data that has been deduplicated. This module refers to the logical space and data. The file metadata and protection overhead are not considered.
Deduplication Summary (Physical) Shows the amount of space that deduplication has saved on the cluster and the amount of data that has been deduplicated. This module refers to the estimated physical space and data. The file metadata and protection overhead are considered.
Disk Activity Shows the average percentage of time that disks in the cluster spend performing operations instead of sitting idle. You can optionally break out this data by node or disk.
Disk IOPS Shows the rate of disk read and write operations per second. You can optionally break out this data by job, node, or service.
Disk Operations Rate Shows the average rate at which the disks in the cluster are servicing data read/write/change requests, also referred to as operations or disk transfers. You can optionally break out this data by disk, direction, or node.
Disk Throughput Rate Shows the total amount of data being read from and written to the disks in the cluster. You can optionally break out this data by disk, direction, or node.
External Network Errors Shows the number of errors that are generated for the external network interfaces. You can optionally break out this data by direction, interface, or node.
NOTE: During normal operations, this chart indicates an error count of 0. Errors reported in this chart are often the result of network infrastructure issues (for example, malformed frames or headers, handoff errors, or queuing errors) rather than OneFS issues. When investigating the cause of these errors, first review the logs and reports for the network switch and other network infrastructure components.
External Network Packets Rate Shows the total number of packets that passed through the external network interfaces in the monitored cluster. You can optionally break out this data by direction, interface, or node.
External Network Throughput Rate Shows the total amount of data that passed through the external network interfaces in the monitored cluster. You can optionally break out this data by interface, direction, client, operation class, protocol, or node. For example, if you have an application that uses a particular SmartConnect zone, you can view the throughput on those nodes to observe that application's performance.
File System Events Rate Shows the number of file system events, or operations, such as read, write, lookup, or rename, that the file system is servicing per second. You can optionally break out this data by direction, operation class, path, node, or event.
File System Throughput Rate Shows the rate at which data is being read from and written to the file system.
Job Workers Shows the number of active and assigned workers on the cluster. An active worker is a worker that is performing a system job. An assigned worker is a worker that has been assigned to a system job but is not currently performing the job. You can optionally break out this data by job name or job ID.
Jobs Shows the number of active and inactive jobs on the cluster. An active job is a system job that workers are currently performing. An inactive job is a system job that has been assigned workers, but the workers are not currently performing the job. You can optionally break out this data by job name or job ID.
L1 and L2 Cache Prefetch Throughput Rate Shows the amount of data that was prefetched for L1 and L2 and how much of the prefetched data was requested.
  • Starts - Indicates the amount of data that was requested.
  • Hits - Indicates the amount of requested data that was available.
L1 Cache Throughput Rate Shows the amount of data that was requested from the L1 cache and how much of the requested data was available in the L1 cache.
  • Starts - Indicates the amount of data that was requested.
  • Hits - Indicates the amount of requested data that was available.
  • Waits - Indicates the amount of requested data that existed in cache but was not available because the data was in use.
  • Misses - Indicates the amount of requested data that did not exist in cache.
  • Prefetch Hits - Indicates the amount of data that was requested from prefetch.
L2 Cache Throughput Rate Shows the amount of data that was requested from the L2 cache and how much of the requested data was available in the L2 cache.
  • Starts - Indicates the amount of data that was requested.
  • Hits - Indicates the amount of requested data that was available.
  • Waits - Indicates the amount of requested data that existed in cache but was not available because the data was in use.
  • Misses - Indicates the amount of requested data that did not exist in cache.
  • Prefetch Hits - Indicates the amount of data that was requested from prefetch.
L3 Cache Throughput Rate Shows the amount of data that was requested from the L3 cache and how much of the requested data was available in the L3 cache.
  • Starts - Indicates the amount of data that was requested.
  • Hits - Indicates the amount of requested data that was available.
  • Misses - Indicates the amount of requested data that did not exist in cache.
Locked File System Events Rate Shows the number of file lock operations occurring in the file system per second. You can optionally break out this data by path or node.
Overall Cache Hit Rate Shows the percentage of data requests that returned hits. A hit means that the requested data was available in cache. Higher hit rates indicate that more of the requested data is being retrieved from cache.
  • L1 Hit Rate - Indicates the hit rate for L1 cache, which is private to the node that services the input/output requests.
  • L2 Hit Rate - Indicates the hit rate for L2 cache, which is global across all nodes.
  • L3 Hit Rate - Indicates the hit rate for L3 cache, which is read-only and stores data on a solid-state drive (SSD) instead of RAM.
Overall Cache Throughput Rate Shows the amount of data that was requested from cache and how much of the requested data was available in cache.
  • L1 Starts - Indicates the amount of data that was requested from the L1 cache.
  • L1 Hits - Indicates the amount of requested data that was available in the L1 cache.
  • L2 Starts - Indicates the amount of data that was requested from the L2 cache.
  • L2 Hits - Indicates the amount of requested data that was available in the L2 cache.
  • L3 Starts - Indicates the amount of data that was requested from the L3 cache.
  • L3 Hits - Indicates the amount of requested data that was available in the L3 cache.
Pending Disk Operations Latency Shows the average amount of time that disk operations spend in the input/output scheduler. This value represents the time that an operation waits in the scheduler, not the actual latency of the disk operation itself. You can optionally break out this data by node or disk.
Protocol Operations Average Latency Shows the average amount of time that is required for protocols to process incoming operations. These values are typically represented in fractions of seconds. You can optionally break out this data by client, operation class, protocol, or node. For example, view the latency by protocol, then at the operation class for each protocol, to check for specific operations that might be causing excessive latency.
Protocol Operations Rate Shows the total number of requests per client for all file data access protocols. Combined with data from the disk throughput and disk operations rate data elements, this information can help you identify specific clients that might be contributing to cluster load. You can optionally break out this data by client, operation class, protocol, or node.
Slow Disk Access Rate Shows the rate at which slow, or long-latency, disk operations occur. You can optionally break out this data by node or disk.
Workload CPU Time Displays the CPU usage across workloads visible for different datasets. Filtering down with specific metrics can help determine where CPU resources are being consumed.
Workload IOPS Displays the input/output operations across workloads visible for different datasets. Filtering down with specific metrics can help determine where disk operations are being carried out.
Workload L2/L3 Cache Hits Displays the rate at which the cache is accessed during various operations on the monitored cluster. Filtering down with specific metrics can help determine what operations are making maximum use of the cache.
Workload Latency Displays the average amount of time spent on disk operations. Filtering down with specific metrics can help determine where your disk operations might be slowing down.
Workload Throughput Displays the rate at which data is being read from and written to the file system. Filtering down with specific metrics can help determine where the file transfer rate is peaking.
NOTE: Depending on which version of the OneFS operating system the monitored cluster is running, certain InsightIQ features may not be available.
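
The numeric guidance in Table 1 (the recommended per-node active client maximums, the pending disk operation count of 10, and the definition of a cache hit rate) can be turned into simple checks. The following is a minimal sketch, not an InsightIQ API: the NodeSample structure, its field names, and the function names are hypothetical, and the hit-rate formula (hits divided by starts) is an assumption based on the Starts and Hits definitions in the table. It assumes you have already obtained the values by some other means, for example by reading them from a report.

```python
from __future__ import annotations

from dataclasses import dataclass

# Guidance values quoted from Table 1.
NFS_MAX_ACTIVE_CLIENTS_PER_NODE = 1000
SMB2_MAX_ACTIVE_CLIENTS_PER_NODE = 1500  # applies to SMB 2.0 and later
PENDING_DISK_OPS_THRESHOLD = 10


@dataclass
class NodeSample:
    """Hypothetical per-node sample; not an InsightIQ data structure."""
    node: str
    nfs_active_clients: int
    smb_active_clients: int  # assumed to count SMB 2.0 and later clients
    pending_ops_by_disk: dict[str, float]  # disk name -> average pending count


def client_warnings(sample: NodeSample) -> list[str]:
    """Return a message for each protocol above its recommended maximum."""
    warnings = []
    if sample.nfs_active_clients > NFS_MAX_ACTIVE_CLIENTS_PER_NODE:
        warnings.append(
            f"{sample.node}: {sample.nfs_active_clients} NFS active clients "
            f"exceed the recommended {NFS_MAX_ACTIVE_CLIENTS_PER_NODE}"
        )
    if sample.smb_active_clients > SMB2_MAX_ACTIVE_CLIENTS_PER_NODE:
        warnings.append(
            f"{sample.node}: {sample.smb_active_clients} SMB active clients "
            f"exceed the recommended {SMB2_MAX_ACTIVE_CLIENTS_PER_NODE}"
        )
    return warnings


def backlogged_disks(sample: NodeSample) -> list[str]:
    """Return the disks whose average pending operation count is above 10."""
    return sorted(
        disk
        for disk, depth in sample.pending_ops_by_disk.items()
        if depth > PENDING_DISK_OPS_THRESHOLD
    )


def hit_rate_percent(starts: float, hits: float) -> float:
    """Assumed formula: hits / starts * 100; 0.0 when nothing was requested."""
    if starts <= 0:
        return 0.0
    return 100.0 * hits / starts


if __name__ == "__main__":
    sample = NodeSample(
        node="node-1",
        nfs_active_clients=1200,
        smb_active_clients=900,
        pending_ops_by_disk={"bay1": 4.0, "bay2": 12.5, "bay3": 10.0},
    )
    for message in client_warnings(sample):
        print(message)
    print("Disks to inspect:", backlogged_disks(sample))
    print("L2 hit rate (%):", hit_rate_percent(starts=200.0, hits=150.0))
```

A count of exactly 10 is treated as healthy here, matching the "10 or less" wording in the table. A flagged disk is only a prompt to look at the disk detail breakout, because the table says that higher counts could indicate a backlog, not that they always do.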
