A
|
---|
application restart | | Starting an application, usually on another node,
after a failure. An application can be restarted manually, which may
be necessary if data must be restored before the application can run
(Business Recovery Services work this way). Applications
can be restarted by an operator using a script, which can reduce
human error. Or applications can be started on the local or remote
site automatically after detecting the failure of the primary site.
|
---|
arbitrator | | Nodes in a disaster tolerant architecture that act
as tie-breakers in case all of the nodes in a data center go down
at the same time. These nodes are full members of the Serviceguard
cluster and must conform to the minimum requirements. The arbitrator
must be located in a third data center to ensure that the failure
of an entire data center does not bring the entire cluster down.
See also quorum server.
|
---|
asymmetrical cluster | | A cluster that has more nodes at one site than at
another. For example, an asymmetrical metropolitan cluster may have
two nodes in one building, and three nodes in another building. Asymmetrical
clusters are not supported in all disaster tolerant architectures.
|
---|
asynchronous data replication | | Local I/O will complete without waiting for the replicated
I/O to complete; however, it is expected that asynchronous data
replication will process the I/Os in the original order.
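The ordering guarantee above can be modeled in a few lines. This is an illustrative sketch only, not product code; the class and method names are invented for this example. The local write completes immediately, while a FIFO queue preserves the original write order for the replica:

```python
from collections import deque

class AsyncReplicator:
    """Toy model of asynchronous replication (illustrative names only)."""

    def __init__(self):
        self.local = []          # stands in for the local volume
        self.pending = deque()   # FIFO of writes awaiting replication
        self.remote = []         # stands in for the replica volume

    def write(self, block):
        self.local.append(block)     # local I/O completes at once
        self.pending.append(block)   # replication happens later, in order

    def drain(self):
        # Replicate queued writes in their original order.
        while self.pending:
            self.remote.append(self.pending.popleft())

r = AsyncReplicator()
for b in ("w1", "w2", "w3"):
    r.write(b)
# Before draining, the replica lags (it is consistent but not current).
r.drain()
```

The replica may be behind the local copy at any instant, but because the queue is FIFO it never sees writes out of order.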
|
---|
automatic failover | | Failover directed by automation scripts or software
(such as Serviceguard) and requiring no human intervention. In a ContinentalClusters environment,
the start-up of package recovery groups on the Recovery Cluster without
intervention. See also application restart.
|
---|
B
|
---|
BC | | (Business Copy) A PVOL or SVOL in an HP StorageWorks
XP series disk array that can be split from or merged into a normal PVOL
or SVOL. It is often used to create a snapshot of the data taken
at a known point in time. Although this copy, when split, is often
consistent, it is not usually current.
|
---|
BCV | | (Business Continuity Volume) An EMC Symmetrix term
that refers to a logical device on the EMC Symmetrix that may be merged
into or split from a regular R1 or R2 logical device. It is often
used to create a snapshot of the data taken at a known point in
time. Although this copy, when split, is often consistent, it is
not usually current.
|
---|
bi-directional configuration | | A continental cluster configuration in which each
cluster serves the roles of primary and recovery cluster for different
recovery groups. Also known as a mutual recovery configuration.
|
---|
Business Recovery Service | | Service provided by a vendor to host the backup systems
needed to run mission critical applications following a disaster.
|
---|
C
|
---|
campus cluster | | A single cluster that is geographically dispersed
within the confines of an area owned or leased by the organization
such that it has the right to run cables above or below ground between buildings
in the campus. Campus clusters are usually spread out in different
rooms in a single building, or in different adjacent or nearby buildings.
See also Extended Distance Cluster.
|
---|
cascading failover | | The ability of an application
to fail over from a primary location to a secondary location, and then
to fail over to a recovery location on a different site. The primary location
contains a metropolitan cluster built with Metrocluster EMC SRDF,
and the recovery location has a standard Serviceguard cluster.
|
---|
client reconnect | | User access to the backup site after failover.
Client reconnect can be transparent, where the user is automatically
connected to the application running on the remote site, or manual, where
the user selects a site to connect to.
|
---|
cluster | | A Serviceguard cluster is a networked grouping
of HP 9000 Series 800 and/or HP Integrity servers (host systems
known as nodes) having sufficient redundancy of software and hardware
that a single failure will not significantly disrupt service. Serviceguard
software monitors the health of nodes, networks, application services,
and EMS resources, and makes failover decisions based on where the application
is able to run successfully.
|
---|
cluster alarm | | Time at which a message is sent indicating that
the Primary Cluster is probably in need of recovery. The cmrecovercl command is enabled at this time.
|
---|
cluster alert | | Time at which a message is sent indicating a problem
with the cluster.
|
---|
cluster event | | A cluster condition that occurs when the cluster
goes down or enters an UNKNOWN state, or
when the monitor software returns an error. This event may cause
an alert message to be sent out, or it may cause an alarm condition
to be set, which allows the administrator on the Recovery Cluster to issue
the cmrecovercl command. The return of the cluster to the UP state
results in a cancellation of the event, which may be accompanied
by a cancel event notice. In addition, the cancellation disables the
use of the cmrecovercl command.
|
---|
cluster quorum | | A dynamically calculated majority used to determine
whether any grouping of nodes is sufficient to start or run the
cluster. Cluster quorums prevent split-brain syndrome, which can
lead to data corruption or inconsistency. Currently at least 50%
of the nodes plus a tie-breaker are required for a quorum. If no
tie-breaker is configured, then greater than 50% of the nodes is
required to start and run a cluster.
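The quorum arithmetic described above can be sketched as follows. This is a hypothetical illustration, not Serviceguard's actual implementation; the function name and signature are invented for this example:

```python
def has_quorum(nodes_present: int, nodes_configured: int,
               tie_breaker: bool = False) -> bool:
    """Illustrative quorum check (not the actual product algorithm).

    A grouping may form the cluster if it holds strictly more than 50%
    of the configured nodes, or exactly 50% plus a tie-breaker
    (an arbitrator node or quorum server).
    """
    if nodes_present * 2 > nodes_configured:    # strictly more than 50%
        return True
    if nodes_present * 2 == nodes_configured:   # exactly half: tie-breaker decides
        return tie_breaker
    return False
```

For example, in a four-node cluster split evenly between two data centers, neither half of two nodes can form the cluster alone; whichever half reaches the tie-breaker wins, which is why this rule prevents split-brain syndrome.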
|
---|
command device | | A disk area in the HP StorageWorks XP series disk
array used for internal system communication. You create two command
devices on each array, each with alternate links (PV links).
|
---|
consistency group | | A set of Symmetrix RDF devices that are configured
to act in unison to maintain the integrity of a database. Consistency
groups allow you to configure R1/R2 devices on multiple Symmetrix
frames in Metrocluster with EMC SRDF.
|
---|
continental cluster | | A group of clusters that use routed networks and/or
common carrier networks for data replication and cluster communication
to support package failover between separate clusters in different
data centers. Continental clusters are often located in different
cities or different countries and can span 100s or 1000s of kilometers.
|
---|
Continuous Access | | A facility provided by the Continuous Access software
option available with the HP StorageWorks E Disk Array XP series.
This facility enables physical data replication between XP series disk
arrays.
|
---|
D
|
---|
data center | | A physically proximate collection of nodes and
disks, usually all in one room.
|
---|
data consistency | | Whether data are logically correct and immediately
usable; the validity of the data after the last write. Inconsistent
data, if not recoverable to a consistent state, is corrupt.
|
---|
data currency | | Whether the data contain the most recent transactions,
and/or whether the replica database has all of the committed transactions
that the primary database contains; speed of data replication may
cause the replica to lag behind the primary copy, and compromise
data currency.
|
---|
data loss | | The inability to recover data after a failure. Data
loss can result from copied transactions that were lost when
a failure occurred, non-committed transactions that were rolled
back as part of a recovery process, data in the process of being replicated
that never reached the replica because of a failure, or transactions
that were committed after the last tape backup when a failure
required a reload from that backup. Transaction processing
monitors (TPMs), message queuing software, and synchronous
data replication are measures that can protect against data loss.
|
---|
data mirroring | | See mirroring.
|
---|
data recoverability | | The ability to take action that results in data
consistency, for example database rollback/roll forward recovery.
|
---|
data replication | | The scheme by which data is copied from one site
to another for disaster tolerance. Data replication can be either
physical (see physical data replication)
or logical (see logical data replication).
In a ContinentalClusters environment, the process by which data that is used
by the Primary Cluster packages is transferred to the Recovery Cluster and made
available for use on the Recovery Cluster in the event of a recovery.
|
---|
database replication | | A software-based logical data replication scheme
that is offered by most database vendors.
|
---|
disaster | | An event causing the failure of multiple components
or entire data centers, rendering all services at a single
location unavailable. Disasters include natural events such as earthquake, fire,
or flood, as well as acts of terrorism or sabotage and large-scale power outages.
|
---|
disaster protection | | Processes, tools,
hardware, and software that provide protection in the event of an
extreme occurrence that causes application downtime such that the
application can be restarted at a different location within a fixed
period of time.
|
---|
disaster recovery | | The process of restoring access to applications
and data after a disaster. Disaster recovery can be manual, meaning
human intervention is required, or it can be automated, requiring
little or no human intervention.
|
---|
disaster recovery services | | Services and products offered by companies that
provide the hardware, software, processes, and people necessary
to recover from a disaster.
|
---|
disaster tolerant | | The characteristic of being able to recover quickly
from a disaster. Components of disaster tolerance include redundant
hardware, data replication, geographic dispersion, partial or complete recovery
automation, and well-defined recovery procedures.
|
---|
disaster tolerant architecture | | A cluster architecture that protects against multiple points
of failure or a single catastrophic failure that affects many components
by locating parts of the cluster at a remote site and by providing
data replication to the remote site. Other components of disaster tolerant
architecture include redundant links, either for networking or data replication,
that are installed along different routes, and automation of most
or all of the recovery process.
|
---|
E, F
|
---|
ESCON | | Enterprise Systems Connection. A type of fiber-optic
channel used for inter-frame communication between EMC Symmetrix
frames using EMC SRDF or between HP StorageWorks E Disk Array XP series
units using Continuous Access XP.
|
---|
event log | | The default location (/var/adm/cmconcl/eventlog) where events are logged on the monitoring ContinentalClusters system.
All events are written to this log, as well as all notifications that
are sent elsewhere.
|
---|
Extended Distance Cluster | | A cluster with alternate nodes located in different data
centers separated by some distance. Formerly known as campus
cluster.
|
---|
failback | | Failing back from a backup node, which may or may
not be remote, to the primary node that the application normally runs
on.
|
---|
failover | | The transfer of control of an application or service
from one node to another node after a failure. Failover can be manual,
requiring human intervention, or automated, requiring little or
no human intervention.
|
---|
filesystem replication | | The process of replicating filesystem changes from
one node to another.
|
---|
G
|
---|
gatekeeper | | A small EMC Symmetrix device configured to function
as a lock during certain state change operations.
|
---|
H, I
|
---|
heartbeat network | | A network that provides reliable communication among nodes
in a cluster, including the transmission of heartbeat messages,
signals from each functioning node, which are central to the operation
of the cluster, and which determine the health of the nodes in the
cluster.
|
---|
high availability | | A combination of technology, processes, and support partnerships
that provide greater application or system availability.
|
---|
J, K, L
|
---|
local cluster | | A cluster located in a single data center. This
type of cluster is not disaster tolerant.
|
---|
local failover | | Failover on the same node; this is most often applied
to hardware failover. For example, local LAN failover is switching to
the secondary LAN card on the same node after the primary LAN card
has failed.
|
---|
logical data replication | | A type of on-line data replication that replicates
logical transactions that change either the filesystem or the database.
Complex transactions may result in the modification of many diverse
physical blocks on the disk.
|
---|
LUN | | (Logical Unit Number) A SCSI term that refers to
a logical disk device composed of one or more physical disk mechanisms, typically
configured into a RAID level.
|
---|
M
|
---|
M by N | | A type of Symmetrix grouping in which up to two
Symmetrix frames may be configured on either side of a data replication
link in a Metrocluster with EMC SRDF configuration. M by N configurations include
1 by 2, 2 by 1, and 2 by 2.
|
---|
manual failover | | Failover requiring human intervention to start an
application or service on another node.
|
---|
Metrocluster | | A Hewlett-Packard product that allows a customer
to configure a Serviceguard cluster as a disaster tolerant metropolitan
cluster.
|
---|
metropolitan cluster | | A cluster that is geographically dispersed within
the confines of a metropolitan area requiring right-of-way to lay
cable for redundant network and data replication components.
|
---|
mirrored data | | Data that is copied using mirroring.
|
---|
mirroring | | Disk mirroring hardware or software, such as MirrorDisk/UX.
Some mirroring methods may allow splitting and merging.
|
---|
mission critical application | | Hardware, software, processes and support services that
must meet the uptime requirements of an organization. Examples of
mission critical applications that must be able to survive regional
disasters include financial trading services, e-business operations,
911 phone service, and patient record databases.
|
---|
mission critical solution | | The architecture and processes that provide the
required uptime for mission critical applications.
|
---|
multiple points of failure (MPOF) | | More than one point of failure that can bring down a
Serviceguard cluster.
|
---|
multiple system high availability | | Cluster technology and architecture that increases
the level of availability by grouping systems into a cooperative
failover design.
|
---|
mutual recovery configuration | | A continental cluster configuration in which each
cluster serves the roles of primary and recovery cluster for different
recovery groups. Also known as a bi-directional configuration.
|
---|
N
|
---|
network failover | | The ability to restore a network connection after
a failure in network hardware when there are redundant network links
to the same IP subnet.
|
---|
notification | | A message that is sent following a cluster or package
event.
|
---|
O
|
---|
off-line data replication | | Data replication by storing data off-line, usually
on a backup tape or disk stored in a safe location; this method is
best for applications that can accept a 24-hour recovery time.
|
---|
on-line data replication | | Data replication by copying to another location
that is immediately accessible. On-line data replication is usually
done by transmitting data over a link in real time or with a slight delay
to a remote site; this method is best for applications requiring
quick recovery (within a few hours or minutes).
|
---|
P
|
---|
package alert | | Time at which a message is sent indicating a problem
with a package.
|
---|
package event | | A package condition such as a failure that causes
a notification message to be sent. Package events can be accompanied
by alerts, but not alarms. Messages are for information only; the cmrecovercl command is not enabled for a package event.
|
---|
package recovery group | | A set of one or more packages with a mapping between their
instances on the Primary Cluster and their instances on the Recovery Cluster.
|
---|
physical data replication | | An on-line data replication method that duplicates
I/O writes to another disk on a physical block basis. Physical replication
can be hardware-based where data is replicated between disks over
a dedicated link (for example EMC’s Symmetrix Remote Data Facility
or the HP StorageWorks E Disk Array XP Series Continuous Access),
or software-based where data is replicated on multiple disks using
dedicated software on the primary node (for example, MirrorDisk/UX).
|
---|
planned downtime | | An anticipated period of time when nodes are taken
down for hardware maintenance, software maintenance (OS and application),
backup, reorganization, upgrades (software or hardware), etc.
|
---|
PowerPath | | A host-based software product from EMC that
delivers intelligent I/O path management. PowerPath is required for
M by N Symmetrix configurations using Metrocluster with EMC SRDF.
|
---|
Primary Cluster | | A cluster in production that has packages protected
by the HP ContinentalClusters product.
|
---|
primary package | | The package that normally runs on the Primary Cluster in
a production environment.
|
---|
pushbutton failover | | Use of the cmrecovercl command to allow all package recovery groups to start
up on the Recovery Cluster following a significant cluster event on the Primary Cluster.
|
---|
PV links | | A method of LVM configuration that allows you to
provide redundant disk interfaces and buses to disk arrays, thereby protecting
against single points of failure in disk cards and cables.
|
---|
PVOL | | A primary volume configured in an XP series disk
array that uses Continuous Access. PVOLs are the primary copies
in physical data replication with Continuous Access on the XP.
|
---|
Q
|
---|
quorum | | See cluster quorum.
|
---|
quorum server | | A system outside the cluster that acts as a tie-breaker in a
disaster tolerant architecture in case all of the nodes in a data center
go down at the same time. See also arbitrator.
|
---|
R
|
---|
R1 | | The Symmetrix term indicating the data copy that
is the primary copy.
|
---|
R2 | | The Symmetrix term indicating the remote data copy
that is the secondary copy. It is normally read-only by the nodes
at the remote site.
|
---|
Recovery Cluster | | A cluster on which recovery of a package takes place
following a failure on the Primary Cluster.
|
---|
recovery group failover | | A failover of a package recovery group from one
cluster to another.
|
---|
recovery package | | The package that takes over on the Recovery Cluster in
the event of a failure on the Primary Cluster.
|
---|
regional disaster | | A disaster, such as an earthquake or hurricane,
that affects a large region. Local, campus, and proximate metropolitan
clusters are less likely to protect from regional disasters.
|
---|
remote failover | | Failover to a node at another data center or remote
location.
|
---|
resynchronization | | The process of making the data between two sites
consistent and current once systems are restored following a failure.
Also called data resynchronization.
|
---|
rolling disaster | | A second disaster that occurs before recovering
from a previous disaster, for example, while data is being synchronized
between two data centers after a disaster, one of the data centers
fails, interrupting the data synchronization process. Rolling disasters
may result in data corruption that requires a reload from tape backups.
|
---|
S
|
---|
single point of failure (SPOF) | | A component of a cluster or node that, if it fails,
affects access to applications or services. See also multiple
points of failure.
|
---|
single system high availability | | Hardware design that results in a single system
that has availability higher than normal. Hardware design examples
are: on-line addition or replacement of I/O cards, memory,
etc.
|
---|
special device file | | The device file name that the HP-UX operating system
gives to a single connection to a node, in the format /dev/devtype/filename.
|
---|
split-brain syndrome | | A condition in which a cluster reforms with equal numbers of nodes
at each site, and each half of the cluster believes it is the authority,
starts up the same set of applications, and tries to modify
the same data, resulting in data corruption. Serviceguard architecture
prevents split-brain syndrome in all cases unless dual cluster locks
are used.
|
---|
SRDF | | (Symmetrix Remote Data Facility) A level 1-3 protocol
used for physical data replication between EMC Symmetrix disk arrays.
|
---|
SVOL | | A secondary volume configured in an XP series disk
array that uses Continuous Access. SVOLs are the secondary copies
in physical data replication with Continuous Access on the XP.
|
---|
SymCLI | | The Symmetrix command line interface used to configure
and manage EMC Symmetrix disk arrays.
|
---|
Symmetrix device number | | The unique device number that identifies an EMC logical
volume.
|
---|
synchronous data replication | | Each data replication I/O waits for the preceding
I/O to complete before beginning another replication. Minimizes
the chance of inconsistent or corrupt data in the event of a rolling
disaster.
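The lockstep behavior described above can be sketched as follows. This is an illustrative model only, not product code; the class and method names are invented for this example. Each local write returns only after the replica acknowledges it, so the two copies never diverge by more than the in-flight write:

```python
class SyncReplicator:
    """Toy model of synchronous replication (illustrative names only)."""

    def __init__(self):
        self.local = []    # stands in for the local volume
        self.remote = []   # stands in for the replica volume

    def write(self, block):
        self.local.append(block)
        ack = self._replicate(block)   # block until the remote site acknowledges
        return ack                     # only then does the local I/O complete

    def _replicate(self, block):
        # Stands in for the replication link and remote write.
        self.remote.append(block)
        return True

s = SyncReplicator()
s.write("w1")
s.write("w2")
# After every completed write, local and remote hold identical data.
```

The cost of this guarantee is latency: every application write waits on the replication link, which is why synchronous replication is usually limited to shorter distances than asynchronous replication.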
|
---|
T
|
---|
transaction processing monitor (TPM) | | Software that provides a reliable mechanism to ensure
that all transactions are successfully committed. A TPM allows you
to modify an application to store in-flight transactions in an external
location until the transaction has been committed to all possible copies
of the database or filesystem, thus ensuring completion of all copied
transactions. A TPM protects against data loss at the expense of the
CPU overhead involved in applying the transaction in each database
replica. A TPM may also provide load balancing among nodes.
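The core idea can be sketched as follows. This is a hypothetical illustration of the concept, not any real TPM product's API; the class and method names are invented for this example. An in-flight transaction is held in an external store until it has been applied to every replica, so a committed transaction cannot be lost by the failure of any single copy:

```python
class SimpleTPM:
    """Toy model of a transaction processing monitor (illustrative names only)."""

    def __init__(self, replicas):
        self.replicas = replicas   # dicts standing in for database copies
        self.in_flight = {}        # external store of uncommitted transactions

    def begin(self, txn_id, data):
        # Persist the in-flight transaction before touching any replica.
        self.in_flight[txn_id] = data

    def commit(self, txn_id):
        data = self.in_flight[txn_id]
        for replica in self.replicas:   # apply to every copy of the data
            replica[txn_id] = data
        del self.in_flight[txn_id]      # forget only after every copy has it

db_a, db_b = {}, {}
tpm = SimpleTPM([db_a, db_b])
tpm.begin("t1", "debit $100")
tpm.commit("t1")
```

If a failure interrupts `commit`, the transaction is still present in `in_flight` and can be re-applied on recovery, which is the essence of the data-loss protection described above.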
|
---|
transparent failover | | Failover in which the client application automatically reconnects
to a new server without the user taking any action.
|
---|
transparent IP failover | | Moving the IP address from one network interface
card (NIC), in the same node or another node, to another NIC that
is attached to the same IP subnet so that users or applications
may always specify the same IP name/address whenever they connect,
even after a failure.
|
---|
U-Z
|
---|
volume group | | In LVM, a set of physical volumes such that logical
volumes can be defined within the volume group for user access.
A volume group can be activated by only one node at a time unless
you are using Serviceguard OPS Edition. Serviceguard can activate
a volume group when it starts a package. A given disk can belong
to only one volume group. A logical volume can belong to only one
volume group.
|
---|
WAN data replication solutions | | Data replication that functions over leased or switched
lines. See also continental cluster.
|
---|