High-Availability and Clusters
|
| Minimum Setup |
- Get at least 2 machines and mirror their data
- Use "good" hardware
- Provide clean electrical power and ventilation
- Establish good processes, procedures and policies
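A rough sketch of why the second machine matters. It uses the standard formula for redundant systems, 1 - (1 - a)^n, and assumes that the machines fail independently and that failover is instant and flawless, which real setups only approximate. The function name and the 99% figure are illustrative, not taken from any product.

```python
def combined_availability(single, machines):
    """Availability of `machines` mirrored hosts when any one of them
    is enough to keep the service up.

    Assumption: failures are independent and failover costs no time.
    `single` is the availability of one host as a fraction (0.99 = 99%).
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if machines < 1:
        raise ValueError("need at least one machine")
    # The service is down only when every machine is down at once.
    return 1.0 - (1.0 - single) ** machines


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} machine(s) at 99% each: {combined_availability(0.99, n):.6f}")
```

Under these assumptions, two 99% machines give about 99.99% and three give about 99.9999%. Shared failure points (same power feed, same T1, same admin mistake) break the independence assumption and lower the real figure.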
|
| Failure Modes |
- Incoming T1 line outside the facility
  - get a second T1 from somewhere else
- CSU/DSU and hub
- Router, firewalls, gateway, mail, web, vpn
  - have a hot-swap server ( expensive )
- untested updates deployed to production
- security breaches and data loss
  - use 2 systems with load sharing, so you know the second one is up to date
- cpu and power supply fans
  - have spare fans on hand
  - shut down the system upon fan failures
- cabling - ethernet cables, power cords
- disk drives
  - use RAID 5 to keep running through a single-disk failure
- user mistakes, system admin goofs, management and security mistakes
  - lots of training, and use experienced people to manage systems, processes and procedures
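The "shut down upon fan failure" item can be sketched as a simple threshold check. This is only an illustration: the fan names, the 1000 RPM threshold and the readings dictionary are made up, and how the readings are obtained (and how the shutdown is triggered) depends on your hardware and OS, so both are left out.

```python
def failed_fans(fan_rpm, min_rpm=1000):
    """Return the names of fans spinning slower than `min_rpm`.

    `fan_rpm` maps a fan name to its current speed in RPM; a stopped
    fan reports 0. Both the names and the threshold are hypothetical.
    """
    return sorted(name for name, rpm in fan_rpm.items() if rpm < min_rpm)


def should_shut_down(fan_rpm, min_rpm=1000):
    """True if at least one fan is below the threshold."""
    return bool(failed_fans(fan_rpm, min_rpm))


if __name__ == "__main__":
    # Example readings; a real monitor would poll the sensors periodically.
    readings = {"cpu_fan": 3200, "psu_fan": 0}
    if should_shut_down(readings):
        print("fan failure:", ", ".join(failed_fans(readings)))
        # An orderly shutdown would be triggered here.
```

Shutting down on the first failed fan trades availability for hardware safety, which is why the second machine from the minimum setup needs to be ready to take over.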
|
| HA Mailing Lists |
|
| High Availability HOWTO |
|
| Load Balancing |
|
| Monitoring & Failover |
|
| Other HA Apps |
|
| Down Time per Year for Maintenance |
- While one server is down for maintenance, make sure HA still works with the remaining servers that are still online ---> at least 3 systems available to look like one server
- Keep in mind that a system takes a minute or two to shut down
and another minute or two to reboot
ITX-Blades.net/HA 99.99% Uptime
99.999% uptime implies roughly 5 minutes of down time per year, or about 21 minutes over a 4 year period
Motorola.com 99.9999%
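The uptime percentages above translate into a yearly downtime budget with simple arithmetic. The sketch below assumes an average year of 365.25 days and, as an upper bound on the "minute or two" figures above, 4 minutes for one full shutdown plus reboot. The function names are illustrative.

```python
# Minutes in an average year (365.25 days); an assumption of this sketch.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960


def downtime_minutes(availability_percent, years=1.0):
    """Allowed downtime in minutes for a given uptime percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR * years


def reboots_within_budget(availability_percent, minutes_per_reboot=4.0):
    """Whole shutdown/reboot cycles that fit into one year's budget.

    Assumes 4 minutes per cycle (2 to shut down, 2 to boot) by default.
    """
    return int(downtime_minutes(availability_percent) // minutes_per_reboot)


if __name__ == "__main__":
    for level in (99.99, 99.999, 99.9999):
        print(f"{level}% uptime: {downtime_minutes(level):.2f} min/year, "
              f"{reboots_within_budget(level)} reboot(s) of 4 min each")
```

At 99.99% the budget is about 52.6 minutes per year, at 99.999% about 5.3 minutes, and at 99.9999% about half a minute. A single-server reboot already uses most of a 99.999% budget, which is why maintenance has to happen behind redundant systems.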
|
| Commercial HA Vendors |
|