Most feature pages are outdated, but they provide historical design context.
They are not user documentation and should not be treated as such.
Detailed PM Health Check
Power Management Health Check
Summary
The requirement is to add a periodic health check of all hosts with configured power management (PM). A scheduled job will periodically send a status command to all PM-enabled hosts (once an hour by default) and raise alerts for failed operations.
Owner
- Feature owner: Eli Mesika (emesika)
- Engine Component owner: Eli Mesika (emesika)
- QA Owner: Pavel Stehlik (pstehlik)
- Email: emesika@redhat.com
Current status
- Target Release: 3.5
- Status: Design
- Last updated date: May 3, 2014
Detailed Description
Add a class, PmHealthCheckManager, to handle the scheduled check. This class will:
- Read the related configuration values (see Configuration) and, if the feature is enabled, read the PMHealthCheckIntervalInSec configuration variable.
- Create the Quartz job in its initialize() method, which will be called from backend::initialize().
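The initialization flow described above might look roughly like the following. This is a minimal sketch only: it uses the JDK's ScheduledExecutorService in place of Quartz so that it stays self-contained, and the class name PmHealthCheckSchedulerSketch, its constructor parameters, and the Runnable representing one check cycle are illustrative assumptions, not engine code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the manager; the actual design schedules a Quartz job.
final class PmHealthCheckSchedulerSketch {
    private final boolean enabled;       // value of PMHealthCheckEnabled
    private final long intervalInSec;    // value of PMHealthCheckIntervalInSec
    private final Runnable healthCheck;  // one health check cycle (assumed shape)
    private ScheduledExecutorService scheduler;

    PmHealthCheckSchedulerSketch(boolean enabled, long intervalInSec, Runnable healthCheck) {
        this.enabled = enabled;
        this.intervalInSec = intervalInSec;
        this.healthCheck = healthCheck;
    }

    // Returns true if the periodic job was scheduled by this call,
    // false if the feature is disabled or the job was already scheduled.
    synchronized boolean initialize() {
        if (!enabled || scheduler != null) {
            return false;
        }
        scheduler = Executors.newSingleThreadScheduledExecutor();
        // First run happens after one full interval, then repeats with the same delay.
        scheduler.scheduleWithFixedDelay(healthCheck, intervalInSec, intervalInSec, TimeUnit.SECONDS);
        return true;
    }

    // Stops the periodic job, if any.
    synchronized void shutdown() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }
}
```

With the feature disabled, initialize() does nothing, which matches the default value of PMHealthCheckEnabled.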
CRUD
N/A
DAO
N/A
Metadata
N/A
Configuration
The following configuration variables will be added to vdc_options:
- PMHealthCheckEnabled (boolean, false by default) - Enables/disables the PM Health Check scheduled job
- PMHealthCheckIntervalInSec (int, default 3600) - Determines the number of seconds between runs of the PM Health Check operation
These configuration values should be exposed through the engine-config tool.
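As an illustration of how the two options and their defaults could be consumed, here is a small sketch. It assumes the options arrive as a plain string map; the class PmHealthCheckConfigSketch and the fromOptions helper are hypothetical and do not reflect how the engine actually reads vdc_options.

```java
import java.util.Map;

// Hypothetical holder for the two options described above.
final class PmHealthCheckConfigSketch {
    static final boolean DEFAULT_ENABLED = false;     // PMHealthCheckEnabled default
    static final int DEFAULT_INTERVAL_IN_SEC = 3600;  // PMHealthCheckIntervalInSec default

    final boolean enabled;
    final int intervalInSec;

    private PmHealthCheckConfigSketch(boolean enabled, int intervalInSec) {
        this.enabled = enabled;
        this.intervalInSec = intervalInSec;
    }

    // Builds the configuration from option name/value pairs, falling back to
    // the documented defaults when an option is missing.
    static PmHealthCheckConfigSketch fromOptions(Map<String, String> options) {
        boolean enabled = Boolean.parseBoolean(
                options.getOrDefault("PMHealthCheckEnabled", String.valueOf(DEFAULT_ENABLED)));
        int interval = Integer.parseInt(
                options.getOrDefault("PMHealthCheckIntervalInSec", String.valueOf(DEFAULT_INTERVAL_IN_SEC)));
        if (interval <= 0) {
            // Assumption: a non-positive interval is treated as a configuration error.
            throw new IllegalArgumentException("PMHealthCheckIntervalInSec must be positive");
        }
        return new PmHealthCheckConfigSketch(enabled, interval);
    }
}
```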
Business Logic
The PmHealthCheckManager (if enabled) will create a Quartz job that runs every PMHealthCheckIntervalInSec seconds and does the following:
- Search for all hosts with defined and enabled power management.
- For each host:
  - If the host has only a primary card, send a status command to this card. If the command fails, an alert is generated; if it succeeds, check whether there is an active alert for this host and remove it.
  - If the host has primary and secondary cards:
    - For sequential devices, both are tested, but only a warning alert is generated if one card is OK and the other fails.
    - For concurrent devices, both are tested and an alert is generated no matter which card fails.
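The per-host rules above can be summarized as a small decision function. The following is a sketch under stated assumptions: the names PmAlertDecisionSketch, AgentMode, and Outcome are invented for illustration, and the case where both sequential cards fail is not covered by the text, so treating it as a full alert is an assumption.

```java
// Hypothetical summary of the alert rules; not the engine's actual code.
final class PmAlertDecisionSketch {
    enum AgentMode { PRIMARY_ONLY, SEQUENTIAL, CONCURRENT }
    enum Outcome { NO_ALERT, WARNING, ALERT }

    // primaryOk / secondaryOk are the results of the status command on each card.
    // secondaryOk is ignored when the host has only a primary card.
    static Outcome decide(AgentMode mode, boolean primaryOk, boolean secondaryOk) {
        switch (mode) {
            case PRIMARY_ONLY:
                // On success the caller would also remove any active alert for the host.
                return primaryOk ? Outcome.NO_ALERT : Outcome.ALERT;
            case SEQUENTIAL:
                if (primaryOk && secondaryOk) {
                    return Outcome.NO_ALERT;
                }
                if (primaryOk || secondaryOk) {
                    // One card still works, so only a warning is raised.
                    return Outcome.WARNING;
                }
                // Assumption: both cards failing is treated as a full alert.
                return Outcome.ALERT;
            case CONCURRENT:
                // Any failing card produces an alert.
                return (primaryOk && secondaryOk) ? Outcome.NO_ALERT : Outcome.ALERT;
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }
}
```

For example, decide(AgentMode.SEQUENTIAL, true, false) yields WARNING, while the same card results for CONCURRENT yield ALERT.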
API
N/A
User Experience
N/A
Installation/Upgrade
New configuration values will be installed (see Configuration)
User work-flows
Users may see alerts generated by the PM Health Check job listed with other PM alerts generated by the system. In 3.5, users may be able to clear those alerts like any other alerts on the system (see Dismiss Alerts).
Enforcement
The code should verify that a PM Health Check cycle cannot start while another cycle is active. During a general power failure or shutdown, looping over all hosts and waiting for the communication timeouts may be time consuming, so a cycle may still be running when the next one is due.
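One simple way to enforce this is a "running" flag that a new cycle must acquire before it starts. The sketch below assumes this approach; the class NonOverlappingCycleSketch and its runIfIdle method are hypothetical names, and the design does not state which mechanism the engine will use.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard that skips a cycle while a previous one is still active.
final class NonOverlappingCycleSketch {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Runnable cycle;  // one full health check pass over all hosts (assumed)

    NonOverlappingCycleSketch(Runnable cycle) {
        this.cycle = cycle;
    }

    // Returns true if this call ran the cycle, false if it was skipped
    // because another cycle was still active.
    boolean runIfIdle() {
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            cycle.run();
            return true;
        } finally {
            // Always release the flag, even if the cycle throws.
            running.set(false);
        }
    }
}
```

Skipping, rather than queueing, the overlapping run keeps slow cycles from piling up when many hosts time out.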
Dependencies / Related Features and Projects
Affected oVirt projects
See RFE
Documentation / External references
Future Directions
N/A
Open Issues
N/A