# Data Versioning on MediaWiki Servers

This article describes the data versioning strategy used on our MediaWiki servers. The strategy underpins data integrity, rollback operations, and disaster recovery. The guide is aimed at new server engineers and administrators.

## Overview

Data versioning is the practice of retaining multiple versions of data over time. For our MediaWiki installation it covers three things: the database (page content, user data, and configuration settings), the file uploads directory (images, documents, and other media), and configuration files. We use a multi-layered approach that combines database replication, regular backups, and filesystem snapshots. This lets us recover from accidental data loss, software errors, and malicious attacks. It requires coordination with our Database Administration team and adherence to the established Backup Procedures.

## Database Versioning

The core of our MediaWiki data resides within the database. We employ a master-slave replication setup to provide both redundancy and read scalability. This is supplemented by regular full and incremental backups.

### Replication

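Replication is asynchronous. The master server handles all write operations and acknowledges them without waiting for the slave servers, which apply the changes afterwards. A slave can therefore lag behind the master, and a failover can lose the most recently committed writes.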
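
To make the consequence of asynchronous propagation concrete, here is a toy in-memory model. It is not how MySQL implements replication; the `ToyMaster` and `ToySlave` classes are hypothetical and only show that a write is acknowledged before the slave has applied it.

```python
from collections import deque


class ToyMaster:
    """Toy model: the master commits locally and queues changes for a slave."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # changes the slave has not applied yet

    def write(self, key, value):
        # The write completes as soon as the master has stored it;
        # the master does not wait for the slave.
        self.data[key] = value
        self.pending.append((key, value))


class ToySlave:
    """Toy model: the slave applies queued changes some time later."""

    def __init__(self):
        self.data = {}

    def apply_from(self, master, max_changes):
        # Apply up to max_changes queued changes, in commit order.
        for _ in range(min(max_changes, len(master.pending))):
            key, value = master.pending.popleft()
            self.data[key] = value


master, slave = ToyMaster(), ToySlave()
master.write("page:1", "rev 1")
master.write("page:1", "rev 2")

slave.apply_from(master, max_changes=1)
print(slave.data["page:1"])        # rev 1 (the slave is one change behind)

slave.apply_from(master, max_changes=1)
print(slave.data == master.data)   # True (the slave has caught up)
```

If the master were lost between the two `apply_from` calls, the slave would only hold `rev 1`, which is why replication lag matters for failover.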

| Server Role | Purpose | Status |
|---|---|---|
| Master | Handles all write operations; primary data source. | Active |
| Slave 1 | Read-only replica for reporting and read-heavy operations. | Active |
| Slave 2 | Read-only replica; geographical redundancy. | Active |

Replication is implemented with MySQL Replication. Replication lag is monitored on the Monitoring Dashboard, and an alert is triggered when it exceeds the acceptable threshold. Failover to a slave server is documented in the Disaster Recovery Plan.
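
As an illustration of a lag alert rule, the sketch below classifies a `Seconds_Behind_Master` reading, the lag column that MySQL reports in `SHOW SLAVE STATUS` (newer MySQL versions name it `Seconds_Behind_Source`). The threshold values and the `classify_lag` function are assumptions for this example; the real thresholds are whatever the Monitoring Dashboard is configured with.

```python
from typing import Optional

# Hypothetical thresholds in seconds, chosen only for illustration.
WARN_SECONDS = 30
CRIT_SECONDS = 300


def classify_lag(seconds_behind_master: Optional[int]) -> str:
    """Map a Seconds_Behind_Master reading to an alert level."""
    if seconds_behind_master is None:
        # MySQL reports NULL when replication is not running,
        # so a missing value is treated as the worst case.
        return "critical"
    if seconds_behind_master >= CRIT_SECONDS:
        return "critical"
    if seconds_behind_master >= WARN_SECONDS:
        return "warning"
    return "ok"


readings = {"Slave 1": 2, "Slave 2": None}
for name, lag in readings.items():
    print(name, classify_lag(lag))
```

Treating a NULL reading as critical is a deliberate choice here: a stopped replica reports no lag at all, so a rule that only compares numbers would stay silent.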

### Backups

Full database backups are performed weekly, and incremental backups are performed daily. These backups are stored offsite, ensuring protection against site-wide disasters.

| Backup Type | Frequency | Retention Period | Storage Location |
|---|---|---|---|
| Full Backup | Weekly | 4 Weeks | Offsite Secure Storage |
| Incremental Backup | Daily | 7 Days | Offsite Secure Storage |
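
The retention periods in the table can be expressed as a small pruning rule. This is a minimal sketch, not the actual backup tooling: the `expired_backups` function, the tuple layout, and the sample dates are all made up for illustration.

```python
from datetime import datetime, timedelta

# Retention periods from the backup table.
RETENTION = {
    "full": timedelta(weeks=4),
    "incremental": timedelta(days=7),
}


def expired_backups(backups, now):
    """Return the backups whose age exceeds the retention for their type.

    `backups` is a list of (backup_type, created_at) tuples.
    """
    return [
        (kind, created)
        for kind, created in backups
        if now - created > RETENTION[kind]
    ]


now = datetime(2024, 6, 30, 3, 0)
backups = [
    ("full", datetime(2024, 5, 26, 3, 0)),         # 35 days old
    ("full", datetime(2024, 6, 23, 3, 0)),         # 7 days old
    ("incremental", datetime(2024, 6, 22, 3, 0)),  # 8 days old
    ("incremental", datetime(2024, 6, 29, 3, 0)),  # 1 day old
]
for kind, created in expired_backups(backups, now):
    print(kind, created.date())
# full 2024-05-26
# incremental 2024-06-22
```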

Database backups are tested regularly through Restore Drills, which verify their integrity and confirm that the recovery time objective (RTO) can be met. Restoration procedures are outlined in the Database Restoration Guide.
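
A restore from weekly full and daily incremental backups needs the right set of files in the right order. The sketch below selects that set for a target time. It assumes each incremental contains the changes since the previous backup of either type, which this article does not state; `restore_chain` and the sample dates are hypothetical, and the Database Restoration Guide remains the authoritative procedure.

```python
from datetime import datetime


def restore_chain(backups, target):
    """Pick the backups needed to restore the database as of `target`.

    `backups` is a list of (backup_type, created_at) tuples. The result is
    the newest full backup at or before `target`, followed by every later
    incremental up to `target`, in the order they must be applied.
    """
    usable = sorted((b for b in backups if b[1] <= target), key=lambda b: b[1])
    fulls = [b for b in usable if b[0] == "full"]
    if not fulls:
        raise ValueError("no full backup exists at or before the target time")
    base = fulls[-1]
    increments = [b for b in usable if b[0] == "incremental" and b[1] > base[1]]
    return [base] + increments


backups = [
    ("full", datetime(2024, 6, 16, 3, 0)),
    ("incremental", datetime(2024, 6, 20, 3, 0)),
    ("full", datetime(2024, 6, 23, 3, 0)),
    ("incremental", datetime(2024, 6, 24, 3, 0)),
    ("incremental", datetime(2024, 6, 25, 3, 0)),
    ("incremental", datetime(2024, 6, 26, 3, 0)),
]
for kind, created in restore_chain(backups, datetime(2024, 6, 25, 12, 0)):
    print(kind, created.date())
# full 2024-06-23
# incremental 2024-06-24
# incremental 2024-06-25
```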

## File Uploads Versioning

The `mw-config.php` file defines the location of our uploads directory (stock MediaWiki sets this through `$wgUploadDirectory` in `LocalSettings.php`). Files in this directory are versioned through filesystem snapshots and regular archiving.

### Filesystem Snapshots

Filesystem snapshots are taken hourly, and the process is automated using LVM Snapshots. Each snapshot is a point-in-time copy of the entire uploads directory, stored on a separate storage volume, and allows a quick rollback to a previous version of a file.
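
For orientation, this is roughly what one hourly snapshot call could look like. The sketch only builds the `lvcreate` command line and does not run it. The volume path `/dev/vg0/uploads`, the `uploads-` naming scheme, and the `5G` size are assumptions, not values from our servers. The size is the space reserved for blocks that change while the snapshot exists, and a classic LVM snapshot becomes unusable if that space fills up.

```python
from datetime import datetime


def snapshot_command(origin, taken_at, size="5G"):
    """Build (but do not run) an lvcreate call for one hourly snapshot."""
    # Name the snapshot after the hour it belongs to, e.g. uploads-20240630-1400.
    name = "uploads-" + taken_at.strftime("%Y%m%d-%H00")
    return ["lvcreate", "--snapshot", "--name", name, "--size", size, origin]


cmd = snapshot_command("/dev/vg0/uploads", datetime(2024, 6, 30, 14, 5))
print(" ".join(cmd))
# lvcreate --snapshot --name uploads-20240630-1400 --size 5G /dev/vg0/uploads
```

Returning the argument list instead of executing it keeps the example safe to run anywhere; a scheduler would pass the list to something like `subprocess.run` with root privileges.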

### Archiving

Older versions of files (beyond the snapshot retention period) are archived to long-term storage. This archival process is managed by the Storage Management team.

| File Versioning Method | Retention Period | Purpose |
|---|---|---|
| Filesystem Snapshots | 24 Hours | Quick rollback of recent file changes. |
| Archival Storage | Indefinite | Long-term preservation of file history. |
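
With hourly snapshots kept for 24 hours, a rollback first has to find the newest snapshot taken before the unwanted change. The sketch below shows that lookup; `pick_snapshot` and the sample times are hypothetical. A result of `None` means no retained snapshot predates the incident, so the file has to come from archival storage instead.

```python
from datetime import datetime, timedelta


def pick_snapshot(snapshot_times, incident_time):
    """Return the newest snapshot taken before the incident, or None."""
    earlier = [t for t in snapshot_times if t < incident_time]
    return max(earlier) if earlier else None


now = datetime(2024, 6, 30, 14, 0)
# 24 hourly snapshots: 2024-06-29 15:00 through 2024-06-30 14:00.
snapshots = [now - timedelta(hours=h) for h in range(24)]

print(pick_snapshot(snapshots, datetime(2024, 6, 30, 9, 20)))
# 2024-06-30 09:00:00
print(pick_snapshot(snapshots, datetime(2024, 6, 29, 12, 0)))
# None
```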

## Configuration File Versioning

Configuration files (such as `LocalSettings.php`, `mw-config.php`, and web server configuration files) are versioned with Git.

### Git Repository

All configuration files are stored in a private Git repository. Changes are made through pull requests, reviewed by senior engineers, and then merged into the main branch. This ensures that all changes are tracked, auditable, and easily reversible.
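
One way to reverse a merged change without rewriting history is `git revert`, which adds a new commit that undoes an earlier one. The sketch below only builds the command. The repository path `/srv/mediawiki-config` and the commit id `a1b2c3d` are placeholders, and whether reverts here go through `git revert` or another route is an assumption.

```python
def revert_command(commit, repo_dir):
    """Build (but do not run) a git call that reverts one commit."""
    # -C runs git inside repo_dir; --no-edit keeps the default revert message.
    # The revert is itself a new commit, so it can go through a pull request
    # like any other change.
    return ["git", "-C", repo_dir, "revert", "--no-edit", commit]


cmd = revert_command("a1b2c3d", "/srv/mediawiki-config")
print(" ".join(cmd))
# git -C /srv/mediawiki-config revert --no-edit a1b2c3d
```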

### Branching Strategy

We use a Gitflow branching strategy with dedicated branches for development, testing, and production. Deployment of configuration changes is automated using Continuous Integration/Continuous Deployment (CI/CD).

## Rollback Procedures

In the event of data corruption or accidental changes, the following rollback procedures are followed:

1. **Database:** Restore from the most recent verified backup, or fail over to a slave server.
2. **File Uploads:** Restore files from a filesystem snapshot.
3. **Configuration Files:** Revert to a previous commit in the Git repository.

Detailed rollback procedures are documented in the Incident Response Guide. Always consult with the Security Team before initiating a rollback.

## Future Considerations

We are currently evaluating the implementation of more advanced data versioning technologies, such as:
