# Data Processing Server Configuration

This article describes the configuration of our data processing servers, which handle the large volumes of data generated by wiki edits, user activity, and search indexing. It is aimed at newcomers to the server administration team and outlines the key hardware and software components. Understanding this configuration is essential for keeping the wiki fast and stable.

## Overview

The data processing servers do more than serve web pages. They run background jobs, including database updates, search index maintenance, and report generation. They are distinct from the web servers, which respond directly to user requests, and from the database servers, which store the wiki’s content. Effective data processing keeps the wiki responsive and reliable for all users. The system is designed to scale: more processing nodes can be added as the wiki grows. We use a distributed processing model in which message queues hold pending work, so a job is not lost if the worker handling it fails.
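To make the resilience claim concrete, here is a minimal sketch of the acknowledge-or-requeue pattern that message queues rely on. It uses Python's in-process `queue.Queue` purely as a stand-in for the real broker, and the job names, the `process_jobs` helper, and the `max_attempts` limit are all hypothetical illustrations rather than part of our actual setup.

```python
import queue


def process_jobs(jobs, handler, max_attempts=3):
    """Drain a queue, putting a job back whenever its handler raises.

    Each queue item is a (job, attempts) tuple. A job only leaves the
    queue for good once the handler succeeds (the "ack" case) or once
    it has failed max_attempts times (it is then reported as dead).
    """
    completed, dead = [], []
    while True:
        try:
            job, attempts = jobs.get_nowait()
        except queue.Empty:
            break
        try:
            handler(job)
            completed.append(job)              # success: acknowledge the job
        except Exception:
            if attempts + 1 < max_attempts:
                jobs.put((job, attempts + 1))  # failure: requeue for a retry
            else:
                dead.append(job)               # give up after repeated failures
    return completed, dead


# Hypothetical handler that fails once for one job, then succeeds on retry.
failed_once = set()


def flaky_handler(job):
    if job == "reindex-page-42" and job not in failed_once:
        failed_once.add(job)
        raise RuntimeError("simulated worker crash")


jobs = queue.Queue()
for name in ["update-links", "reindex-page-42", "build-report"]:
    jobs.put((name, 0))

completed, dead = process_jobs(jobs, flaky_handler)
print(completed)  # ['update-links', 'build-report', 'reindex-page-42']
print(dead)       # []
```

The failed job is retried after the others rather than being dropped. A real broker adds durability and network delivery on top of this idea, which this sketch does not attempt to model.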

## Hardware Configuration

The current data processing servers are built with the following specifications:

| Component | Specification |
|---|---|
| Processor | Intel Xeon Gold 6248R (24 cores/48 threads) |
| RAM | 128 GB DDR4 ECC Registered |
| Storage (OS) | 500 GB NVMe SSD |
| Storage (Data) | 4 x 4 TB SATA HDD (RAID 10) |
| Network Interface | Dual 10 Gigabit Ethernet |
| Power Supply | Redundant 800W Platinum |

This hardware configuration is reviewed regularly and updated to meet the growing demands on the data processing infrastructure. The RAID 10 data array provides both performance and redundancy: data is striped across mirrored pairs of drives, so the array survives the failure of any single drive without data loss. We monitor server performance closely to identify potential bottlenecks.
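As a quick sanity check on the storage figures, the short sketch below works out the usable capacity of the data array. It assumes the standard RAID 10 layout, in which every block is mirrored once, so usable space is half the raw total; the function name is an illustrative choice.

```python
def raid10_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a RAID 10 array in TB.

    Every block is stored on two drives, so only half of the raw
    capacity is available for data.
    """
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return drive_count * drive_size_tb / 2


usable = raid10_usable_tb(4, 4.0)
print(f"Usable capacity: {usable} TB")  # Usable capacity: 8.0 TB
```

So the four 4 TB drives listed above give about 8 TB of usable space out of 16 TB raw, before filesystem overhead.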

## Software Stack

The software stack on the data processing servers is carefully chosen to ensure compatibility with our MediaWiki installation and to provide the necessary functionality for handling background tasks.

| Software | Version | Purpose |
|---|---|---|
| Operating System | CentOS Linux 8 | Server OS; provides the foundation for all other software. |
| PHP | 7.4 | Executes the MediaWiki PHP scripts for background jobs. |
| RabbitMQ | 3.8 | Message broker for asynchronous task processing. |
| Redis | 6.0 | In-memory data store for caching and session management. |
| Cron | v4.1 | Scheduled task runner. |
| Supervisor | 4.0 | Process control system; ensures services are running. |

RabbitMQ lets us decouple background tasks from the web servers, so long-running processes do not degrade the user experience. Redis provides a fast caching layer that reduces the load on the database servers. PHP must also be tuned for background work, since its configuration directly affects job performance. Software updates are applied regularly to address security vulnerabilities.
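The way a caching layer takes load off the database can be illustrated with a small cache-aside sketch. The `TTLCache` class is a hypothetical in-memory stand-in for Redis, not a Redis client, and `load_from_db`, the key name, and the 60-second expiry are assumptions made for the example.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a cache with per-key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def cached_lookup(cache, key, loader):
    """Cache-aside read: try the cache first, fall back to the loader."""
    value = cache.get(key)
    if value is None:
        value = loader(key)    # cache miss: do the expensive lookup
        cache.set(key, value)  # store the result for later requests
    return value


db_calls = []


def load_from_db(key):
    # Hypothetical expensive database query; records each call it receives.
    db_calls.append(key)
    return f"rendered:{key}"


cache = TTLCache(ttl_seconds=60)
first = cached_lookup(cache, "Main_Page", load_from_db)
second = cached_lookup(cache, "Main_Page", load_from_db)
print(first == second, len(db_calls))  # True 1
```

The second lookup is answered from the cache, so the database is queried only once. The expiry bounds how stale a cached value can get; the right value depends on how often the underlying data changes.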

## Key Processes and Configuration Details

Several key processes run on the data processing servers, each with its specific configuration.

### Job Queue Processing

RabbitMQ is the central component of our job queue system. MediaWiki extensions and core code post various tasks to the queue, such as:
