The CPU Monitor (a part of the diagnostic tool Event Monitor Services
(EMS), and not a part of the vPars Monitor) monitors cache parity
errors within the CPUs on the system. Through its Dynamic Processor
Resilience (DPR) feature, if the CPU Monitor detects a pre-determined
number of errors, it deactivates a CPU for the current boot session.
If the problems are severe enough, the CPU Monitor deconfigures the
socket for the next boot of the system.
Deactivation of a CPU means
that the OS attempts to stop using the CPU by migrating all
threads off it. Deactivation of a CPU is not persistent across an OS or system reboot.
Deconfiguration of a socket
means that the EMS issues a firmware call, marking the socket for
deconfiguration on the next system boot. On the next system boot,
none of the cores in the target socket are visible to either the OS
in standalone mode or the OS instances of the virtual partitions.
The deconfiguration is persistent across system boots.
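The difference in persistence between the two states can be illustrated with a small model. This is only a sketch under simplifying assumptions: the class `SystemModel` and all of its method names are hypothetical and do not correspond to any EMS or vPars interface, and the model ignores the replacement logic of the vPars Monitor.

```python
from dataclasses import dataclass, field


@dataclass
class SystemModel:
    """Toy model of deactivation vs. deconfiguration (hypothetical names)."""
    sockets: dict                                   # socket id -> list of CPU (core) ids
    deactivated: set = field(default_factory=set)   # CPUs taken out of use for this boot session
    marked: set = field(default_factory=set)        # sockets marked in firmware for the next system boot
    deconfigured: set = field(default_factory=set)  # sockets hidden as of the last system boot

    def deactivate(self, cpu):
        # CPU-level action: effective immediately, not persistent.
        self.deactivated.add(cpu)

    def mark_for_deconfiguration(self, socket):
        # Socket-level action: recorded now, effective on the next system boot.
        self.marked.add(socket)

    def reboot_os(self):
        # An OS (or virtual partition) reboot clears deactivations only.
        self.deactivated.clear()

    def reboot_system(self):
        # A system boot clears deactivations and applies pending marks,
        # which then persist across later system boots.
        self.deactivated.clear()
        self.deconfigured |= self.marked
        self.marked.clear()

    def usable_cpus(self):
        # CPUs on configured sockets that are not currently deactivated.
        return sorted(
            cpu
            for socket, cpus in self.sockets.items()
            if socket not in self.deconfigured
            for cpu in cpus
            if cpu not in self.deactivated
        )
```

In this model, marking a socket changes nothing until `reboot_system()` is called, after which every core on that socket disappears and stays hidden on later system boots.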
Note the following two items:
A deactivation of a CPU does not mean a deconfiguration
of its socket. The CPU Monitor is able to determine whether the CPU
needs to be deactivated or whether it needs to take further action
and deconfigure the socket.
A reboot of a virtual partition is not the same as
a reboot of the system (the entire box or nPartition).
The exceptions to the deactivation of CPUs are
the boot processor of each OS instance (the boot processor has a logical
instance of zero) and the last CPU in a cell or nPartition. The exception
to the deconfiguration of sockets is that the last remaining socket
will not be deconfigured (otherwise, the system could not boot).
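The exceptions above amount to two eligibility checks. The following is a minimal sketch, assuming the caller already knows the CPU's logical instance number and the relevant counts; the function names and parameters are hypothetical and are not part of the CPU Monitor.

```python
def may_deactivate(logical_instance, active_cpus_in_cell):
    """Return True if a CPU is eligible for deactivation.

    logical_instance: the CPU's logical instance within its OS instance
        (the boot processor is instance 0).
    active_cpus_in_cell: number of CPUs still active in the cell or nPartition.
    """
    if logical_instance == 0:
        return False  # boot processor of the OS instance
    if active_cpus_in_cell <= 1:
        return False  # last CPU in the cell or nPartition
    return True


def may_deconfigure(configured_sockets):
    """Return True if a socket may be deconfigured.

    The last remaining socket is never deconfigured, because the system
    could not boot without it.
    """
    return configured_sockets > 1
```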
If any spare iCAP (formerly known as iCOD) or
PPU CPUs are available, the necessary number of CPUs will be activated
to replace the deactivated CPUs.
NOTE: On a vPars system, when a virtual partition that contains a
deconfigured or deactivated CPU goes down, the vPars Monitor will
try to decommission the CPU from use and replace it with another good
CPU if possible. If this is not possible, the vPars Monitor will not
allow the partition to boot until the deconfigured or deactivated
CPU can be taken out of use. Following are some cases where the vPars
Monitor may not allow the virtual partition to boot:
There is a deconfigured or deactivated CPU that has
been reserved for the partition as part of the total (cpu::num) request,
and the vPars Monitor does not have any free CPUs with which to replace
it. To correct this, you can delete CPUs from other partitions or from
this partition.
There is a deconfigured or deactivated CPU that has
been bound to the partition by hardware path (cpu:hw_path) and that
the vPars Monitor is not able to replace with another available CPU.
To correct this, you can remove the CPU specified by hardware path
using -d cpu:hw_path to allow the deconfigured or deactivated CPU to
be decommissioned and replaced with another (working) CPU.
There is a deconfigured CPU that has been reserved
for the partition as part of a CLP request (cell:cell_ID:cpu::num), and
there are no free CLPs in that cell. To correct this, you can make CPUs
from that cell available by deleting the CPUs that are part of this cell
from other partitions, or delete the CPUs from the cell in this partition.
Dual-core processors have two CPUs (that is, cores)
per processor. Deactivation happens at the CPU level, but deconfiguration
happens at the socket level. If a processor’s socket is deconfigured,
both CPUs sharing the socket will be unavailable.
(Integrity only) If a CPU is marked for deconfiguration using an EFI
command and the nPartition is not rebooted (for example, the vPars
Monitor is immediately booted), the vPars Monitor will not know or
indicate (including with vparstatus) that the CPU has been marked for
deconfiguration and will use the CPU like any other working CPU.
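The three cases listed in the note, in which a bad CPU may keep a virtual partition from booting, can be summarized as a small decision function. This is an illustrative sketch only: the `Assignment` enum, the function `blocks_boot`, and its parameters are hypothetical, and the real vPars Monitor logic may consider conditions not described here.

```python
from enum import Enum


class Assignment(Enum):
    """How the bad CPU was assigned to the partition (labels from the note)."""
    BY_COUNT = "cpu::num"
    BY_HW_PATH = "cpu:hw_path"
    BY_CELL = "cell:cell_ID:cpu::num"


def blocks_boot(assignment, free_cpus=0, free_clps_in_cell=0):
    """Return True if a bad CPU with this assignment would block the boot.

    free_cpus: unassigned CPUs available as replacements (assumed input).
    free_clps_in_cell: unassigned CLPs in the same cell (assumed input).
    """
    if assignment is Assignment.BY_HW_PATH:
        # Bound by hardware path: not replaceable until the binding is removed.
        return True
    if assignment is Assignment.BY_CELL:
        # A CLP request can only be satisfied from the same cell.
        return free_clps_in_cell == 0
    # Part of the total count: any free CPU can stand in.
    return free_cpus == 0
```

Each `True` result corresponds to one of the corrective actions in the note: free up CPUs, remove the hardware-path binding, or free up CPUs in the affected cell.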