B.2 oclumon dumpnodeview
Use the oclumon dumpnodeview command to view log information from the system monitor service in the form of a node view.
Usage Notes
A node view is a collection of all metrics collected by Cluster Health Monitor for a node at a point in time. Cluster Health Monitor attempts to collect metrics every five seconds on every node. Some metrics are static while other metrics are dynamic.
A node view consists of eight views when you display verbose output:
- SYSTEM: Lists system metrics such as CPU COUNT, CPU USAGE, and MEM USAGE
- TOP CONSUMERS: Lists the top consuming processes in the following format:
  metric_name: 'process_name(process_identifier) utilization'
- CPUS: Lists statistics for each CPU
- PROCESSES: Lists process metrics such as PID, name, number of threads, memory usage, and number of file descriptors
- DEVICES: Lists device metrics such as disk read and write rates, queue length, and wait time per I/O
- NICS: Lists network interface card metrics such as network receive and send rates, effective bandwidth, and error rates
- FILESYSTEMS: Lists file system metrics, such as total, used, and available space
- PROTOCOL ERRORS: Lists any protocol errors
You can generate a summary report that contains only the SYSTEM and TOP CONSUMERS views.
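The TOP CONSUMERS entry format shown above, metric_name: 'process_name(process_identifier) utilization', can be split into its parts with a regular expression. The following is a minimal Python sketch, not part of oclumon: the function name parse_top_consumers is hypothetical, and the sample text is copied from the TOP CONSUMERS block of Example B-5.

```python
import re

# One entry looks like: metric_name: 'process_name(process_identifier) utilization'
ENTRY = re.compile(r"(\w+):\s*'([^'(]+)\((\d+)\)\s+([\d.]+)'")

def parse_top_consumers(text):
    """Return {metric_name: (process_name, pid, utilization)} for each entry found."""
    return {
        metric: (name, int(pid), float(value))
        for metric, name, pid, value in ENTRY.findall(text)
    }

# Sample text taken from the TOP CONSUMERS block of Example B-5.
sample = (
    "topcpu: 'osysmond.bin(30981) 2.40' topprivmem: 'oraagent.bin(14599) 682496'\n"
    "topshm: 'ora_dbw2_oss_3(7049) 2156136' topfd: 'ocssd.bin(29986) 274'\n"
    "topthread: 'java(32255) 53'"
)
consumers = parse_top_consumers(sample)
print(consumers["topcpu"])
```

The unit of the utilization value depends on the metric (for example, percent for topcpu and a count for topfd), so the sketch returns it as a plain number.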
Syntax
oclumon dumpnodeview [-allnodes | -n node1 ...] [-last duration | -s timestamp -e timestamp] [-i interval] [-v | [-system][-process][-procag][-device][-filesystem][-nic][-protoerr][-cpu][-topconsumer]] [-format format type] [-dir directory [-append]]
Parameters
Table B-2 oclumon dumpnodeview Command Parameters
Parameter | Description |
---|---|
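-allnodes | Use this option to dump the node views of all the nodes in the cluster. |
-n node1 ... | Specify one node or several nodes in a space-delimited list for which you want to dump the node view. |
-last duration | Use this option to specify a duration, enclosed in double quotation marks, over which to retrieve the most recent metrics. For example: -last "00:15:00" |
-s timestamp -e timestamp | Use the -s option to specify the time stamp at which a range of queries starts, and the -e option to specify the time stamp at which the range ends. Note: Specify these two options together to obtain a range. |
-i interval | Specify a collection interval, in five-second increments. |
-v | Displays verbose node view output. |
[-system][-process][-procag][-device][-filesystem][-nic][-protoerr][-cpu][-topconsumer] | Dumps each specified node view part. |
-format format type | Specify the output format, for example csv. The default format is mostly tabular, with legacy format used for node view parts that have only one row. |
-dir directory [-append] | Dumps the node view to files in the directory that you specify. If you run the command twice with the same directory, the second run overwrites the data dumped by the previous run. Specify the -append option to append the data of the current run to the data of the previous run instead. |
-procag | Outputs the processes of the node view, aggregated by category: DBBG (DB backgrounds), DBFG (DB foregrounds), CLUST (Cluster), and OTHER (other processes). |
-h | Displays online help for the oclumon dumpnodeview command. |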
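When the command is driven from a script, it can help to check option combinations before running it. The following Python sketch assembles an argument list from the options in the Syntax line. It is an illustration only: build_dumpnodeview_args is a hypothetical helper, and its checks (node selectors are alternatives, -s and -e go together, -i is a multiple of five) are assumptions drawn from the syntax and parameter descriptions, not validation that oclumon is documented to perform.

```python
def build_dumpnodeview_args(nodes=None, all_nodes=False, last=None,
                            start=None, end=None, interval=None, verbose=False):
    """Assemble an argument list for oclumon dumpnodeview (hypothetical helper)."""
    if all_nodes and nodes:
        raise ValueError("-allnodes and -n are alternatives")
    if (start is None) != (end is None):
        raise ValueError("-s and -e must be specified together")
    if last is not None and start is not None:
        raise ValueError("-last and -s/-e are alternatives")
    if interval is not None and (interval <= 0 or interval % 5 != 0):
        raise ValueError("-i must be a positive multiple of five seconds")

    args = ["oclumon", "dumpnodeview"]
    if all_nodes:
        args.append("-allnodes")
    elif nodes:
        args += ["-n", *nodes]
    if last is not None:
        args += ["-last", last]
    if start is not None:
        args += ["-s", start, "-e", end]
    if interval is not None:
        args += ["-i", str(interval)]
    if verbose:
        args.append("-v")
    return args

# Same options as the second command in Example B-2.
print(build_dumpnodeview_args(all_nodes=True, last="00:15:00", interval=30))
```

Because the result is a list, a value such as the duration is passed as a single argument and needs no shell quoting.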
Usage Notes
- In certain circumstances, data can be delayed for some time before the command replays the data. For example, the crsctl stop cluster -all command can cause data delay. After running crsctl start cluster -all, it may take several minutes before oclumon dumpnodeview shows any data collected during the interval.
- The default is to continuously dump node views. To stop continuous display, use Ctrl+C on Linux and Microsoft Windows.
- Both the local system monitor service (osysmond) and the cluster logger service (ologgerd) must be running to obtain node view dumps.
- The oclumon dumpnodeview command displays only 127 CPUs of the CPU core, omitting a CPU at random from the list.
Metric Descriptions
The following tables describe the metrics in seven of the views that make up a node view.
Table B-3 oclumon dumpnodeview SYSTEM View Metric Descriptions
Metric | Description |
---|---|
|
Number of physical CPUs. |
|
Number of CPU cores in the system. |
|
Number of logical compute units. |
|
CPU hyperthreading enabled (Y) or disabled (N). |
|
Name of the CPU vendor. |
|
Average CPU utilization per processing unit within the current sample interval (%). |
Percentage of over all CPU cores. 100% indicates that all cores are spent for that metric. |
|
|
Total CPU usage = |
|
CPU used by processes in kernel mode. |
|
CPU used by normal processes in user mode. |
|
CPU used by "niced" processes (low priority). |
|
CPU waiting for I/O. |
|
Virtual CPU waiting for physical CPU to be freed by other VM. |
|
Number of processes waiting in the run queue within the current sample interval. |
|
Amount of free RAM (KB). |
|
Amount of total usable RAM (KB). |
|
Shared memory. |
|
Amount of physical RAM used for file buffers plus the amount of physical RAM used as cache memory (KB). On Microsoft Windows systems, this is the number of bytes currently being used by the file system cache. Note: This metric is not available on Solaris. |
|
Amount of swap memory free (KB) |
|
Total amount of physical swap memory (KB) |
|
Total size of huge pages (KB). Note: This metric is not available on Solaris or Microsoft Windows systems. |
|
Free size of huge pages (KB). Note: This metric is not available on Solaris or Microsoft Windows systems. |
|
Smallest unit size of a huge page. Note: This metric is not available on Solaris or Microsoft Windows systems. |
|
Average total disk read rate within the current sample interval (KB per second). |
|
Average total disk write rate within the current sample interval (KB per second). |
|
Average disk I/O operation rate within the current sample interval (I/O operations per second). |
|
Average swap in rate within the current sample interval (KB per second). Note: This metric is not available on Microsoft Windows systems. |
|
Average swap out rate within the current sample interval (KB per second). Note: This metric is not available on Microsoft Windows systems. |
|
Average page in rate within the current sample interval (pages per second). |
|
Average page out rate within the current sample interval (pages per second). |
|
Average total network receive rate within the current sample interval (KB per second). |
|
Average total network send rate within the current sample interval (KB per second). |
|
Number of processes. |
|
The current number of processes running on the CPU. |
|
Number of processes currently blocked waiting for I/O. |
|
Number of real-time processes. |
|
The current number of real-time processes running on the CPU. |
|
Number of open file descriptors, or number of open handles on Microsoft Windows. |
|
System limit on the number of file descriptors. Note: This metric is not available on either Solaris or Microsoft Windows systems. |
|
Number of disks. |
|
Number of network interface cards. |
|
Average total network error rate within the current sample interval (errors per second). |
|
Number of network file systems. |
|
Load average (average number of jobs in the run queue or waiting for disk I/O) of the last 1, 5, 15 minutes. |
Table B-4 oclumon dumpnodeview PROCESSES View Metric Descriptions
Metric | Description |
---|---|
|
The name of the process executable. |
|
The process identifier assigned by the operating system. |
|
PID of the parent process. For example, if process 1 spawns process 2, then ppid of process 2 is pid of process 1. |
|
The total amount of CPU time for which this process has been scheduled to run since it started, measured in microseconds. |
|
Limit on number of file descriptors for this process. Note: This metric is not available on Microsoft Windows, AIX, and HP-UX systems. |
|
Process CPU utilization (%). Note: The utilization value can be up to 100 times the number of processing units. |
|
Process virtual memory usage (KB). |
|
Process private memory usage (KB). |
|
Process shared memory usage (KB). Note: This metric is not available on Microsoft Windows, Solaris, and AIX systems. It is supported only on Linux systems. |
|
Working set of a program (KB) Note: This metric is only available on Microsoft Windows. |
|
Number of file descriptors open by this process, or number of handles open by this process on Microsoft Windows. |
|
Number of threads created by this process. |
|
The process priority. |
|
The nice value of the process. Note: This metric is not applicable to Microsoft Windows systems. |
|
The state of the process. Note: This metric is not applicable to Microsoft Windows systems. |
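Because process CPU utilization can be up to 100 times the number of processing units, a value above 100 is not an error. The following Python sketch converts such a value to a share of the whole machine. The function name is hypothetical, and treating the #vcpus value as the number of processing units is an assumption. The sample numbers (225.44 for java on a 24-vCPU node) come from Example B-3.

```python
def share_of_total_cpu(cpuusage_percent, processing_units):
    """Scale per-process CPU utilization (0 to 100 * units) to a 0-100 machine-wide share."""
    if processing_units <= 0:
        raise ValueError("processing_units must be positive")
    return cpuusage_percent / processing_units

# topcpu in Example B-3 reports 225.44 on a node with 24 vCPUs (assumed processing units).
print(round(share_of_total_cpu(225.44, 24), 2))  # 9.39
```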
Table B-5 oclumon dumpnodeview DEVICES View Metric Descriptions
Metric | Description |
---|---|
ior | Average disk read rate within the current sample interval (KB per second). |
iow | Average disk write rate within the current sample interval (KB per second). |
ios | Average disk I/O operation rate within the current sample interval (I/O operations per second). |
qlen | Number of I/O requests waiting in the queue (queue length). |
wait | Average wait time per I/O within the current sample interval (msec). |
type | If applicable, identifies what the device is used for, for example SYS. |
Table B-6 oclumon dumpnodeview NICS View Metric Descriptions
Metric | Description |
---|---|
netrr | Average network receive rate within the current sample interval (KB per second). |
netwr | Average network send rate within the current sample interval (KB per second). |
neteff | Average effective bandwidth within the current sample interval (KB per second). |
nicerrors | Average error rate within the current sample interval (errors per second). |
pktsin | Average incoming packet rate within the current sample interval (packets per second). |
pktsout | Average outgoing packet rate within the current sample interval (packets per second). |
errsin | Average error rate for incoming packets within the current sample interval (errors per second). |
errsout | Average error rate for outgoing packets within the current sample interval (errors per second). |
indiscarded | Average drop rate for incoming packets within the current sample interval (packets per second). |
outdiscarded | Average drop rate for outgoing packets within the current sample interval (packets per second). |
inunicast | Average packet receive rate for unicast within the current sample interval (packets per second). |
type | Whether PUBLIC or PRIVATE. |
innonunicast | Average packet receive rate for multicast (packets per second). |
latency | Estimated latency for this network interface card (msec). |
Table B-7 oclumon dumpnodeview FILESYSTEMS View Metric Descriptions
Metric | Description |
---|---|
total | Total amount of space (KB). |
mount | Mount point. |
type | File system type, whether local file system, NFS, or other. |
used | Amount of used space (KB). |
available | Amount of available space (KB). |
used% | Percentage of used space (%). |
ifree% | Percentage of free file nodes (%). Note: This metric is not available on Microsoft Windows systems. |
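As a worked example of the space metrics, the following Python sketch computes a used-space percentage from the total and used values. The formula (used divided by total) is an assumption for illustration. This section does not state how Cluster Health Monitor derives the percentage, although the result is consistent with the FILESYSTEMS entry in Example B-5, which reports 14.

```python
def used_percent(total_kb, used_kb):
    """Used space as a percentage of total space (assumed formula: used / total)."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return 100.0 * used_kb / total_kb

# Values from the FILESYSTEMS block of Example B-5 (mount point /).
print(round(used_percent(563657948, 78592012)))  # 14
```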
Table B-8 oclumon dumpnodeview PROTOCOL ERRORS View Metric Descriptions
Metric | Description |
---|---|
IPHdrErr | Number of input datagrams discarded due to errors in the IPv4 headers of the datagrams. |
IPAddrErr | Number of input datagrams discarded because the IPv4 address in their IPv4 header's destination field was not a valid address to be received at this entity. |
IPUnkProto | Number of locally addressed datagrams received successfully but discarded because of an unknown or unsupported protocol. |
IPReasFail | Number of failures detected by the IPv4 reassembly algorithm. |
IPFragFail | Number of IPv4 datagrams discarded due to fragmentation failures. |
TCPFailedConn | Number of times that TCP connections have made a direct transition to the CLOSED state from either the SYN-SENT state or the SYN-RCVD state, plus the number of times that TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state. |
TCPEstRst | Number of times that TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE-WAIT state. |
TCPRetraSeg | Total number of TCP segments retransmitted. |
UDPUnkPort | Total number of received UDP datagrams for which there was no application at the destination port. |
UDPRcvErr | Number of received UDP datagrams that could not be delivered for reasons other than the lack of an application at the destination port. |
Table B-9 oclumon dumpnodeview CPUS View Metric Descriptions
Metric | Description |
---|---|
|
Virtual CPU. |
|
CPU usage in system space. |
|
CPU usage in user space. |
|
CPU usage by niced (low-priority) processes for a specific CPU. |
|
CPU usage for a specific CPU. |
|
CPU wait time for I/O operations. |
Example B-2 dumpnodeview -n
The following example dumps node views from node1, node2, and node3 collected over the last 12 hours:
$ oclumon dumpnodeview -n node1 node2 node3 -last "12:00:00"
The following example displays node views from all nodes collected over the last 15 minutes at a 30-second interval:
$ oclumon dumpnodeview -allnodes -last "00:15:00" -i 30
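To estimate how much output such a command can produce, divide the duration by the interval. The following Python sketch does this arithmetic. The function names are hypothetical, and the result is an upper bound that assumes one node view per interval for each node with no gaps in collection.

```python
def duration_to_seconds(duration):
    """Convert an hours:minutes:seconds string such as "00:15:00" to seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def expected_views_per_node(last, interval):
    """Upper bound on node views per node for a -last duration and -i interval."""
    return duration_to_seconds(last) // interval

print(expected_views_per_node("00:15:00", 30))  # 30
print(expected_views_per_node("12:00:00", 5))   # 8640
```

The second call uses the five-second collection period described in the Usage Notes, which is assumed to apply when -i is not specified.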
Example B-3 dumpnodeview -format csv
The following example shows how to use the -format csv option to output content in comma-separated values file format:
# oclumon dumpnodeview -format csv
dumpnodeview: Node name not given. Querying for the local host
----------------------------------------
Node: node1 Clock: '2016-09-02 11.18.00-0700' SerialNo:310668
----------------------------------------
SYSTEM:
"#pcpus","#cores","#vcpus","cpuht","chipname","cpuusage[%]","cpusys[%]","cpuuser[%]",
"cpunice[%]","cpuiowait[%]","cpusteal[%]","cpuq","physmemfree[KB]","physmemtotal[KB]",
"mcache[KB]","swapfree[KB]","swaptotal[KB]","hugepagetotal","hugepagefree","hugepagesize",
"ior[KB/S]","iow[KB/S]","ios[#/S]","swpin[KB/S]","swpout[KB/S]","pgin[#/S]","pgout[#/S]",
"netr[KB/S]","netw[KB/S]","#procs","#procsoncpu","#procs_blocked","#rtprocs","#rtprocsoncpu",
"#fds","#sysfdlimit","#disks","#nics","loadavg1","loadavg5","loadavg15","#nicErrors"
2,12,24,Y,"Intel(R) Xeon(R) CPU X5670 @ 2.93GHz",68.66,5.40,63.26,0.00,0.00,0.00,0,820240,
73959636,61520568,4191424,4194300,0,0,
2048,143,525,64,0,0,0,279,600.888,437.070,951,24,0,58,N/A,33120,6815744,13,5,19.25,17.67,16.09,0
TOPCONSUMERS:
"topcpu","topprivmem","topshm","topfd","topthread"
"java(25047) 225.44","java(24667) 1008360","ora_lms1_prod_1(28913) 4985464","polkit-gnome-au(20730) 1038","java(2734) 209"
Example B-4 dumpnodeview -procag
The following example shows how to output node views, aggregated by category: DBBG (DB backgrounds), DBFG (DB foregrounds), CLUST (Cluster), and OTHER (other processes).
# oclumon dumpnodeview -procag
----------------------------------------
Node: node1 Clock: '2016-09-02 11.14.15-0700' SerialNo:310623
----------------------------------------
PROCESS AGGREGATE:
cpuusage[%] privatemem[KB] maxshmem[KB] #threads #fd #processes category sid
0.62 45791348 4985200 187 10250 183 DBBG prod_1
0.52 29544192 3322648 191 10463 187 DBBG webdb_1
17.81 8451288 967924 22 511 22 DBFG webdb_1
75.94 34930368 1644492 64 1067 64 DBFG prod_1
3.42 3139208 120256 480 3556 25 CLUST
1.66 1989424 16568 1110 4040 471 OTHER
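The PROCESS AGGREGATE rows can be post-processed to total a metric per category. The following Python sketch parses the rows shown above and sums cpuusage by category. The function names are hypothetical, and the parser assumes whitespace-separated columns in the order of the header line, with the sid column absent for CLUST and OTHER rows.

```python
from collections import defaultdict

# Rows copied from the PROCESS AGGREGATE block of Example B-4.
SAMPLE = """\
cpuusage[%] privatemem[KB] maxshmem[KB] #threads #fd #processes category sid
0.62 45791348 4985200 187 10250 183 DBBG prod_1
0.52 29544192 3322648 191 10463 187 DBBG webdb_1
17.81 8451288 967924 22 511 22 DBFG webdb_1
75.94 34930368 1644492 64 1067 64 DBFG prod_1
3.42 3139208 120256 480 3556 25 CLUST
1.66 1989424 16568 1110 4040 471 OTHER
"""

def parse_procag(text):
    """Parse aggregate rows into dictionaries, skipping header and separator lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue
        try:
            cpu = float(fields[0])
        except ValueError:
            continue  # not a data row (for example, the header line)
        rows.append({
            "cpuusage": cpu,
            "privatemem_kb": int(fields[1]),
            "category": fields[6],
            "sid": fields[7] if len(fields) > 7 else None,
        })
    return rows

def cpu_by_category(rows):
    """Sum cpuusage over all rows that share a category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["cpuusage"]
    return dict(totals)

rows = parse_procag(SAMPLE)
totals = cpu_by_category(rows)
print({category: round(value, 2) for category, value in totals.items()})
```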
Example B-5 Node View Output
----------------------------------------
Node: rwsak10 Clock: '2016-05-08 02.11.25-0800' SerialNo:155631
----------------------------------------
SYSTEM:
#pcpus: 2 #vcpus: 24 cpuht: Y chipname: Intel(R) cpu: 1.23 cpuq: 0
physmemfree: 8889492 physmemtotal: 74369536 mcache: 55081824 swapfree: 18480404
swaptotal: 18480408 hugepagetotal: 0 hugepagefree: 0 hugepagesize: 2048 ior: 132
iow: 236 ios: 23 swpin: 0 swpout: 0 pgin: 131 pgout: 235 netr: 72.404
netw: 97.511 procs: 969 procsoncpu: 6 rtprocs: 62 rtprocsoncpu N/A #fds: 32640
#sysfdlimit: 6815744 #disks: 9 #nics: 5 nicErrors: 0
TOP CONSUMERS:
topcpu: 'osysmond.bin(30981) 2.40' topprivmem: 'oraagent.bin(14599) 682496'
topshm: 'ora_dbw2_oss_3(7049) 2156136' topfd: 'ocssd.bin(29986) 274'
topthread: 'java(32255) 53'
CPUS:
cpu18: sys-2.93 user-2.15 nice-0.0 usage-5.8 iowait-0.0 steal-0.0
.
.
.
PROCESSES:
name: 'osysmond.bin' pid: 30891 #procfdlimit: 65536 cpuusage: 2.40 privmem: 35808
shm: 81964 #fd: 119 #threads: 13 priority: -100 nice: 0 state: S
.
.
.
DEVICES:
sdi ior: 0.000 iow: 0.000 ios: 0 qlen: 0 wait: 0 type: SYS
sda1 ior: 0.000 iow: 61.495 ios: 629 qlen: 0 wait: 0 type: SYS
.
.
.
NICS:
lo netrr: 39.935 netwr: 39.935 neteff: 79.869 nicerrors: 0 pktsin: 25
pktsout: 25 errsin: 0 errsout: 0 indiscarded: 0 outdiscarded: 0
inunicast: 25 innonunicast: 0 type: PUBLIC
eth0 netrr: 1.412 netwr: 0.527 neteff: 1.939 nicerrors: 0 pktsin: 15
pktsout: 4 errsin: 0 errsout: 0 indiscarded: 0 outdiscarded: 0
inunicast: 15 innonunicast: 0 type: PUBLIC latency: <1
FILESYSTEMS:
mount: / type: rootfs total: 563657948 used: 78592012 available: 455971824
used%: 14 ifree%: 99 GRID_HOME
.
.
.
PROTOCOL ERRORS:
IPHdrErr: 0 IPAddrErr: 0 IPUnkProto: 0 IPReasFail: 0 IPFragFail: 0
TCPFailedConn: 5197 TCPEstRst: 717163 TCPRetraSeg: 592 UDPUnkPort: 103306
UDPRcvErr: 70
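Most views in this output consist of name: value pairs, which makes simple post-processing possible. The following Python sketch extracts pairs from two lines of the SYSTEM view and derives the percentage of free physical memory. It is a rough illustration with a hypothetical function name: it does not handle values that contain spaces (such as the quoted TOP CONSUMERS entries), and it skips tokens that have no colon (such as rtprocsoncpu N/A).

```python
import re

# A name (optionally containing # or %), a colon, whitespace, then a value without spaces.
PAIR = re.compile(r"([#\w%]+):\s+(\S+)")

def parse_pairs(text):
    """Return a dictionary of name -> value strings found in the text."""
    return dict(PAIR.findall(text))

# Two lines copied from the SYSTEM view of Example B-5.
sample = (
    "physmemfree: 8889492 physmemtotal: 74369536 mcache: 55081824 swapfree: 18480404\n"
    "swaptotal: 18480408 hugepagetotal: 0 hugepagefree: 0 hugepagesize: 2048 ior: 132\n"
)
system = parse_pairs(sample)
free_pct = 100 * int(system["physmemfree"]) / int(system["physmemtotal"])
print(f"{free_pct:.1f}% of physical memory free")
```

The percentage treats only physmemfree as free. Memory reported under mcache is counted as in use here, which may understate the memory that is actually reclaimable.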