# Change Management Process

This article describes the Change Management Process for our MediaWiki server infrastructure. It is intended for new server engineers and system administrators who will make changes to the production environment. Following this process is critical to maintaining system stability and minimizing disruption to our users, so everyone with access to the server environment must understand the process and its associated tools. Please also review our Disaster Recovery Plan and Security Policy for related information.

Overview

The Change Management Process is a structured approach to controlling changes made to the MediaWiki server environment. It ensures that all changes are planned, tested, approved, and implemented in a controlled manner. It covers changes to the PHP configuration, Apache web server settings, database schema, extensions, and core MediaWiki code. Failure to follow this process can lead to unexpected downtime, data loss, or security vulnerabilities. The process is closely tied to our Monitoring System and Incident Response Procedures.

Change Request Process

All proposed changes, regardless of size, must be submitted as a Change Request (CR). The CR should include a detailed description of the change, the reason for the change, the potential impact, the rollback plan, and the testing plan.
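
The required CR contents can be checked mechanically before a ticket goes to review. The following is a minimal sketch, not part of our Jira setup: the `ChangeRequest` class, its field names, and the `missing_fields` helper are hypothetical and only mirror the five items listed above. The example values are purely illustrative.

```python
from dataclasses import dataclass, fields


@dataclass
class ChangeRequest:
    """Hypothetical record holding the five items a CR must include."""
    description: str
    reason: str
    potential_impact: str
    rollback_plan: str
    testing_plan: str


def missing_fields(cr: ChangeRequest) -> list[str]:
    """Return the names of required fields that are empty or whitespace-only."""
    return [f.name for f in fields(cr) if not getattr(cr, f.name).strip()]


# Illustrative CR with the rollback plan left blank.
example = ChangeRequest(
    description="Raise PHP memory_limit",
    reason="Out-of-memory errors during large imports",
    potential_impact="Higher memory use per PHP worker",
    rollback_plan="",
    testing_plan="Run the import job on staging",
)
print(missing_fields(example))  # ['rollback_plan']
```

A check like this could run before submission so that incomplete CRs never reach the CAB.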

The process flow is as follows:

1. **Submission:** Submit the CR via our Jira ticketing system.
2. **Review:** The CR will be reviewed by the Change Advisory Board (CAB), consisting of senior server engineers and the Database Administrator.
3. **Approval:** Approved CRs will be scheduled for implementation.
4. **Implementation:** The change will be implemented during a scheduled maintenance window, if necessary.
5. **Verification:** After implementation, the change will be verified to ensure it functions as expected.
6. **Documentation:** All changes must be documented in our Wiki Documentation System and the Configuration Management Database.

Risk Assessment & Impact Analysis

Before any change is approved, a thorough risk assessment and impact analysis must be conducted. This involves identifying potential risks associated with the change, assessing the likelihood and impact of those risks, and developing mitigation strategies.

Here's a breakdown of risk levels:

| Risk Level | Description | Action |
|---|---|---|
| High | Significant potential for disruption or data loss. | Requires extensive testing, rollback plan, and CAB approval. May require postponement. |
| Medium | Moderate potential for disruption. | Requires thorough testing and a rollback plan. |
| Low | Minimal potential for disruption. | Requires basic testing and documentation. |
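
The actions in the risk table above can be expressed as a simple lookup, which makes it easy to see what is still outstanding for a given CR. This is an illustrative sketch only: the `RiskLevel` enum, the `REQUIREMENTS` mapping, and the requirement strings are assumptions transcribed from the table, not an existing tool, and the sketch does not encode the "may require postponement" note.

```python
from enum import Enum


class RiskLevel(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


# Requirements transcribed from the risk table; the strings are illustrative labels.
REQUIREMENTS = {
    RiskLevel.HIGH: {"extensive testing", "rollback plan", "CAB approval"},
    RiskLevel.MEDIUM: {"thorough testing", "rollback plan"},
    RiskLevel.LOW: {"basic testing", "documentation"},
}


def outstanding(level: RiskLevel, completed: set[str]) -> set[str]:
    """Return the requirements for `level` that are not yet in `completed`."""
    return REQUIREMENTS[level] - completed


# A medium-risk change that has been tested but has no rollback plan yet.
print(sorted(outstanding(RiskLevel.MEDIUM, {"thorough testing"})))  # ['rollback plan']
```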

Impact analysis determines which systems and services may be affected by the change. This includes evaluating dependencies and identifying potential conflicts with existing configurations. We use a Dependency Mapping Tool to assist with this process.

Maintenance Windows

Changes that may impact users should be implemented during scheduled maintenance windows. These windows are typically scheduled during off-peak hours to minimize disruption.

Current Maintenance Window Schedule:

| Day | Time (UTC) | Duration |
|---|---|---|
| Saturday | 02:00 - 06:00 | 4 hours |
| Sunday | 02:00 - 06:00 | 4 hours |
| Wednesday | 02:00 - 04:00 | 2 hours |

Emergency changes may be implemented outside of scheduled maintenance windows, but require explicit approval from the System Administrator and the CAB.
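
A deployment script can guard against running outside the scheduled windows. The sketch below assumes the schedule listed above and treats the end of each window as exclusive, which is an assumption rather than stated policy. The `WINDOWS` table and the `in_maintenance_window` function are hypothetical names, and the sketch does not model emergency approvals.

```python
from datetime import datetime, time, timezone

# Windows transcribed from the schedule: weekday (Monday=0) -> (start, end) in UTC.
WINDOWS = {
    5: (time(2, 0), time(6, 0)),  # Saturday
    6: (time(2, 0), time(6, 0)),  # Sunday
    2: (time(2, 0), time(4, 0)),  # Wednesday
}


def in_maintenance_window(moment: datetime) -> bool:
    """Return True if `moment` falls inside a scheduled window (end exclusive)."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    window = WINDOWS.get(utc.weekday())
    if window is None:
        return False
    start, end = window
    return start <= utc.time() < end


# Check the current time before starting a change.
print(in_maintenance_window(datetime.now(timezone.utc)))
```

Requiring a timezone-aware `datetime` avoids silently interpreting a local timestamp as UTC.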

Rollback Plan

Every change must have a detailed rollback plan in case of failure. The rollback plan should outline the steps required to restore the system to its previous state. This may involve restoring backups, reverting configuration changes, or disabling the new functionality. Test the rollback plan before implementation to confirm that it works. Consider using the Version Control System for configuration files.
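
For a single configuration file, reverting a change can be as simple as keeping a copy of the previous version and restoring it when verification fails. The following is a minimal sketch under that assumption: `apply_with_rollback`, the `.bak` suffix, and the caller-supplied `verify` callback are hypothetical, and a real rollback for database or extension changes would need more than a file copy.

```python
import shutil
from pathlib import Path
from typing import Callable


def apply_with_rollback(config: Path, new_text: str, verify: Callable[[], bool]) -> bool:
    """Write new_text to config; restore the previous contents if verify() fails.

    Returns True if the change was kept, False if it was rolled back.
    """
    backup = config.with_name(config.name + ".bak")
    shutil.copy2(config, backup)  # keep the previous state
    try:
        config.write_text(new_text)
        kept = verify()
    except Exception:
        shutil.copy2(backup, config)  # roll back, then surface the error
        raise
    if not kept:
        shutil.copy2(backup, config)  # verification failed: roll back
    return kept
```

The `verify` callback is where a post-change check would go, for example a request against a health endpoint on staging.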

Testing Procedures

Before implementing any change in the production environment, it must be thoroughly tested in a staging environment that mirrors the production environment as closely as possible. Testing should include:
