Disaster recovery planning
This article details disaster recovery (DR) planning for a MediaWiki 1.40 installation. It’s geared towards system administrators and those responsible for maintaining the availability of a MediaWiki site. Effective DR planning ensures minimal downtime and data loss in the event of a catastrophic failure.
Understanding Disaster Recovery
Disaster recovery isn't just about backups; it's a comprehensive strategy encompassing prevention, mitigation, and restoration. A *disaster* can be anything from a hardware failure to a natural disaster, or even a cyberattack. A solid DR plan should address several key areas: data backup and recovery, server redundancy, and failover procedures. Consider also how a disaster affects the data the wiki depends on, such as user accounts and content pages.
Data Backup and Recovery
This is the foundational element of any DR plan. Regularly scheduled backups are crucial. We will cover several backup strategies.
Backup Methods
- **Full Backups:** The simplest approach – copies *all* data. Time-consuming and resource-intensive, but provides the fastest restore.
- **Differential Backups:** Copies all changes made since the *last full backup*. Faster than full backups, but restore time is longer as you need the full backup *and* the latest differential.
- **Incremental Backups:** Copies only the changes made since the *last backup* (full, differential, or incremental). Fastest backup time, but the slowest restore, requiring all incremental backups *and* the full backup.
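
To make the restore-time trade-off concrete, the following minimal sketch works out which backups a restore needs under each method. The `Backup` record and the `restore_chain` function are hypothetical names used for illustration; they are not part of MediaWiki or of any backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # day number on which the backup was taken
    kind: str  # "full", "differential", or "incremental"


def restore_chain(backups: list[Backup]) -> list[Backup]:
    """Return the backups needed, in order, to restore the latest state.

    Raises ValueError if the list contains no full backup.
    """
    ordered = sorted(backups, key=lambda b: b.day)
    # Every restore starts from the most recent full backup.
    full_idx = max(i for i, b in enumerate(ordered) if b.kind == "full")
    chain = [ordered[full_idx]]
    for b in ordered[full_idx + 1:]:
        if b.kind == "differential":
            # A differential holds every change since the last full backup,
            # so it replaces anything applied after that full backup.
            chain = [chain[0], b]
        elif b.kind == "incremental":
            # An incremental only holds changes since the previous backup,
            # so it must be applied on top of everything before it.
            chain.append(b)
    return chain


incremental_week = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 4)]
differential_week = [Backup(0, "full")] + [Backup(d, "differential") for d in range(1, 4)]
print(len(restore_chain(incremental_week)))   # 4
print(len(restore_chain(differential_week)))  # 2
```

After three days, the incremental scheme needs four files to restore while the differential scheme needs two, which is the restore-time difference described in the list above.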
Backup Frequency
The frequency depends on the rate of change of your wiki.
Frequency | Description | Recommended for |
---|---|---|
Daily | Full backup every day. | Small wikis with minimal daily changes. |
Weekly + differential | Full backup weekly, with differential backups daily. | Medium-sized wikis with moderate changes. |
Weekly + incremental | Full backup weekly, with incremental backups daily. | Large, active wikis with frequent edits. |
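
The schedules in the table can be expressed as a small decision function. This is a sketch only: running the weekly full backup on Sunday, the strategy labels, and the name `backup_kind_for` are all assumptions made for the example.

```python
from datetime import date

WEEKLY_STRATEGIES = ("weekly-differential", "weekly-incremental")


def backup_kind_for(day: date, strategy: str) -> str:
    """Pick the backup type to run on `day`.

    `strategy` is "daily", "weekly-differential", or "weekly-incremental"
    (hypothetical labels matching the three rows of the table).
    """
    if strategy == "daily":
        return "full"
    if strategy not in WEEKLY_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    # date.weekday() returns 0 for Monday and 6 for Sunday.
    if day.weekday() == 6:
        return "full"  # assumption: the weekly full backup runs on Sunday
    if strategy == "weekly-differential":
        return "differential"
    return "incremental"
```

A cron job or systemd timer could call such a function once a day and dispatch to the matching backup command.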
Backup Targets
Backups should *never* be stored on the same physical hardware as the primary wiki. Options include:
- **Offsite Storage:** Dedicated backup servers in a geographically separate location.
- **Cloud Storage:** Services like Amazon S3, Google Cloud Storage, or Azure Blob Storage.
- **Tape Storage:** While less common now, still viable for long-term archiving.
Backups *must* include the following:
- The MediaWiki installation directory.
- The `images` directory.
- The MySQL/MariaDB database (using `mysqldump` or similar tools). See Manual:Backing up a wiki for more information.
- The `LocalSettings.php` file (securely stored!). This file contains critical configuration information regarding your database connection and other essential settings.
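
As a rough sketch of how the items above could be collected in one pass, the script below dumps the database and archives the installation directory. It assumes the default layout, in which `images` and `LocalSettings.php` live inside the installation directory, and it assumes that database credentials come from a MySQL option file such as `~/.my.cnf` rather than the command line. The function names, paths, and database name are hypothetical.

```python
import subprocess
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def dump_command(db_name: str, out_file: Path) -> list[str]:
    """Build a mysqldump invocation (credentials are read from an option file)."""
    return [
        "mysqldump",
        "--single-transaction",       # consistent snapshot for InnoDB tables
        f"--result-file={out_file}",  # write the dump to this file
        db_name,
    ]


def archive_files(wiki_dir: Path, out_file: Path) -> None:
    """Archive the installation directory, including images/ and LocalSettings.php."""
    with tarfile.open(out_file, "w:gz") as tar:
        tar.add(wiki_dir, arcname=wiki_dir.name)


def run_backup(wiki_dir: Path, db_name: str, dest: Path) -> tuple[Path, Path]:
    """Create a timestamped database dump and file archive under `dest`.

    `dest` must be outside `wiki_dir`, otherwise the archive would include itself.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest.mkdir(parents=True, exist_ok=True)
    sql_file = dest / f"{db_name}-{stamp}.sql"
    tar_file = dest / f"{wiki_dir.name}-{stamp}.tar.gz"
    subprocess.run(dump_command(db_name, sql_file), check=True)
    archive_files(wiki_dir, tar_file)
    return sql_file, tar_file
```

A call might look like `run_backup(Path("/var/www/mediawiki"), "wikidb", Path("/var/backups/wiki"))`, with all three values being placeholders. The resulting files still need to be copied to one of the backup targets listed above, since they are created on the wiki host itself.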
Server Redundancy and Failover
Redundancy minimizes downtime. Failover automates switching to a backup server in case of primary server failure.
Load Balancing
Load balancing distributes traffic across multiple servers. If one server fails, the others continue to handle requests. This requires a load balancer (hardware or software). Consider placing a reverse proxy in front of the web servers as well; see Manual:Configuration settings#Reverse proxy for the relevant settings.
Database Replication
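Database replication copies changes from the primary database to a secondary server in near real time. MySQL/MariaDB replication is asynchronous by default, so the secondary can lag slightly behind the primary. If the primary database fails, the secondary server can be promoted to primary. MySQL/MariaDB offers several replication options. See Manual:Database setup for details.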
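
Because MySQL/MariaDB replication is asynchronous by default, it is worth monitoring how far the secondary lags during normal operation. The sketch below reads the lag from one row of `SHOW REPLICA STATUS` (MySQL 8.0.22 and later) or `SHOW SLAVE STATUS` (MariaDB and older MySQL), passed in as a column-name mapping. Fetching that row is left to whatever database client you use, and the function names are hypothetical.

```python
from typing import Any, Mapping, Optional


def replication_lag_seconds(status: Mapping[str, Any]) -> Optional[int]:
    """Return the replica's lag in seconds, or None if it is not reported.

    The server reports NULL (None here) when replication is not running,
    so None should be treated as a problem, not as zero lag.
    """
    # Newer MySQL uses Seconds_Behind_Source; MariaDB and older MySQL
    # use Seconds_Behind_Master.
    for key in ("Seconds_Behind_Source", "Seconds_Behind_Master"):
        value = status.get(key)
        if value is not None:
            return int(value)
    return None


def lag_within_limit(status: Mapping[str, Any], max_lag_seconds: int) -> bool:
    """True only if lag is reported and does not exceed the limit."""
    lag = replication_lag_seconds(status)
    return lag is not None and lag <= max_lag_seconds
```

A monitoring job could call `lag_within_limit(row, 60)` on a schedule and raise an alert when it returns `False`; the 60-second limit is an arbitrary example value.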
Failover Automation
Failover automation switches traffic to a backup server without manual intervention. This can be achieved using tools like:
- **Keepalived:** Manages virtual IP addresses and automatically fails over to a backup server.
- **Pacemaker:** A more complex cluster resource manager.
- **Cloud Provider Solutions:** Many cloud providers offer automated failover services.
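
The tools above differ widely in how they work, but a common idea is to fail over only after several consecutive failed health checks, so that a single dropped request does not trigger a switch. The class below is a minimal sketch of that idea under stated assumptions; `FailoverMonitor` is a hypothetical name, and this is not how Keepalived or Pacemaker is implemented.

```python
from typing import Callable


class FailoverMonitor:
    """Signal failover after `threshold` consecutive failed health checks."""

    def __init__(self, check: Callable[[], bool], threshold: int = 3) -> None:
        self.check = check          # returns True when the primary is healthy
        self.threshold = threshold
        self.failures = 0
        self.failed_over = False

    def tick(self) -> bool:
        """Run one health check; return True only when failover triggers now."""
        if self.failed_over:
            return False            # already switched, nothing more to do
        if self.check():
            self.failures = 0       # a healthy response resets the counter
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.failed_over = True
            return True
        return False
```

In practice `check` would be something like an HTTP request to the wiki or a TCP connection to the database, and a `True` result from `tick()` would start the documented failover procedure.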
Server Specifications (Example)
This table outlines specifications for primary and secondary servers in a redundant setup.
Component | Primary Server | Secondary Server (Hot Standby) |
---|---|---|
CPU | 16 Core Intel Xeon | 12 Core Intel Xeon |
RAM | 64 GB DDR4 | 32 GB DDR4 |
Storage | 2 TB SSD (RAID 1) | 1 TB SSD (RAID 1) |
Network | 1 Gbps Dedicated | 1 Gbps Dedicated |
Operating System | Linux (Ubuntu Server 22.04) | Linux (Ubuntu Server 22.04) |
Testing and Documentation
A DR plan is useless if it hasn’t been tested.
Regular Testing
- **Backup Restoration Tests:** Periodically restore backups to a test environment to verify integrity and recovery time.
- **Failover Drills:** Simulate server failures and verify that the failover process works as expected.
- **Database Failover Tests:** Test the database replication and failover procedures.
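
One small, automatable part of a restoration test is confirming that a backup file has not been corrupted since it was written. The sketch below compares a file's SHA-256 digest against a value recorded at backup time. It assumes you store such a digest alongside each backup, and the function names are hypothetical. It complements an actual test restore and does not replace one.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large dumps do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path: Path, expected_sha256: str) -> bool:
    """Return True if the file's digest matches the recorded one."""
    return sha256_of(path) == expected_sha256.strip().lower()
```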
Documentation
Detailed documentation is *essential*. This should include:
- Backup procedures.
- Restore procedures.
- Failover procedures.
- Contact information for key personnel.
- Server configurations.
- Database configurations.
- A clear and concise runbook detailing the steps to take during a disaster.
Recovery Time Objective (RTO) and Recovery Point Objective (RPO)
These metrics define your DR goals.
- **RTO (Recovery Time Objective):** The maximum acceptable downtime. For example, an RTO of 4 hours means the wiki must be back online within 4 hours of a disaster.
- **RPO (Recovery Point Objective):** The maximum acceptable data loss. For example, an RPO of 1 hour means you can afford to lose up to 1 hour of data.

These objectives should be clearly defined and aligned with business requirements.
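
A quick way to sanity-check a plan against these two metrics: without replication, the worst-case data loss equals the interval between backups, and the downtime to compare against the RTO is the restore time measured in a test. The sketch below encodes that comparison; `meets_objectives` is a hypothetical helper, and the figures in the usage lines are made-up examples.

```python
from datetime import timedelta


def meets_objectives(
    backup_interval: timedelta,
    measured_restore: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> dict[str, bool]:
    """Compare a backup schedule and a measured restore time with RPO and RTO."""
    return {
        # A failure just before the next backup loses one full interval.
        "rpo_met": backup_interval <= rpo,
        # The restore has to finish within the allowed downtime.
        "rto_met": measured_restore <= rto,
    }


# Daily backups and a 3-hour restore, checked against RPO = 1 h and RTO = 4 h.
result = meets_objectives(
    backup_interval=timedelta(hours=24),
    measured_restore=timedelta(hours=3),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=4),
)
print(result)  # {'rpo_met': False, 'rto_met': True}
```

In this example the RTO is met but the RPO is not, which would point towards more frequent backups or database replication.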
Security Considerations
Don't forget security! DR plans should incorporate security best practices:
- **Secure Backups:** Encrypt backups, both in transit and at rest, to protect sensitive data such as user accounts and the credentials in `LocalSettings.php`. Store the encryption keys separately from the backups.
- **Access Control:** Limit access to backup and recovery systems to authorized personnel only.
- **Regular Security Audits:** Ensure the security of your DR infrastructure.
Additional Resources
- Manual:Installation
- Manual:Configuration
- Manual:Database
- Manual:Upgrading