Data processing
Data Processing Server Configuration
This article describes the configuration of our data processing servers, which handle the large volumes of data generated by wiki edits, user activity, and search indexing. It is aimed at newcomers to the server administration team and outlines the key hardware and software components. Understanding this configuration is essential for maintaining the wiki's performance and stability.
Overview
The data processing servers handle work beyond serving web pages: background jobs such as database updates, search index maintenance, and report generation. They are distinct from the web servers that respond directly to user requests and from the database servers that store the wiki's content. Effective data processing keeps the wiki responsive and reliable for all users. The system is designed to scale by adding processing nodes as the wiki grows. We use a distributed processing model in which tasks pass through message queues, so work waits in the queue until a worker picks it up rather than being lost if a node fails.
Hardware Configuration
The current data processing servers are built with the following specifications:
Component | Specification |
---|---|
Processor | Intel Xeon Gold 6248R (24 cores/48 threads) |
RAM | 128 GB DDR4 ECC Registered |
Storage (OS) | 500 GB NVMe SSD |
Storage (Data) | 4 x 4 TB SATA HDD (RAID 10) |
Network Interface | Dual 10 Gigabit Ethernet |
Power Supply | Redundant 800W Platinum |
This hardware configuration is reviewed regularly and updated as demand on the data processing infrastructure grows. The RAID 10 data array stripes across two mirrored pairs, so the four 4 TB drives provide 8 TB of usable space. It combines striped performance with redundancy: the array survives the failure of any single drive, and of a second drive only if that drive belongs to the other mirrored pair. We monitor server performance closely to identify potential bottlenecks.
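The capacity and fault tolerance of the RAID 10 array can be checked with a little arithmetic. The following is a minimal sketch; the function names and the assumed layout (drives 0 and 1 form one mirror, drives 2 and 3 the other) are illustrative and not taken from the actual controller configuration.

```python
from itertools import combinations


def raid10_usable_tb(drive_count, drive_size_tb):
    """Usable capacity of a RAID 10 array built from two-way mirrors."""
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    # Every block is stored twice, so half of the raw capacity is usable.
    return drive_count * drive_size_tb / 2


def array_survives(failed_drives, mirror_pairs):
    """The array survives as long as no mirror pair has lost both members."""
    failed = set(failed_drives)
    return all(not set(pair) <= failed for pair in mirror_pairs)


# Assumed layout for illustration only.
MIRROR_PAIRS = [(0, 1), (2, 3)]

usable = raid10_usable_tb(4, 4)
two_drive_failures = list(combinations(range(4), 2))
survivable = [f for f in two_drive_failures if array_survives(f, MIRROR_PAIRS)]

print(usable)                                     # 8.0
print(len(survivable), len(two_drive_failures))   # 4 6
```

Under this layout, four of the six possible two-drive failures are survivable; the two fatal cases are the ones that take out both halves of the same mirror.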
Software Stack
The software stack on the data processing servers is carefully chosen to ensure compatibility with our MediaWiki installation and to provide the necessary functionality for handling background tasks.
Software | Version | Purpose |
---|---|---|
Operating System | CentOS Linux 8 | Server OS, provides the foundation for all other software. |
PHP | 7.4 | Executing the MediaWiki PHP scripts for background jobs. |
RabbitMQ | 3.8 | Message broker for asynchronous task processing. |
Redis | 6.0 | In-memory data store for caching and session management. |
Cron | 4.1 | Scheduled task runner. |
Supervisor | 4.0 | Process control system, ensures services are running. |
The use of RabbitMQ allows us to decouple tasks from the web servers, preventing long-running processes from impacting user experience. Redis provides a fast caching layer, reducing the load on the database servers. PHP configuration is crucial for optimizing performance. Regular software updates are applied to address security vulnerabilities.
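The decoupling described above can be illustrated with a toy in-memory broker. This is only a sketch of the pattern (enqueue quickly, process later, requeue on failure); `InMemoryBroker`, its methods, and the `max_attempts` limit are hypothetical names for illustration and are not the RabbitMQ API or our production code.

```python
from collections import deque


class InMemoryBroker:
    """Toy stand-in for a message broker. This is not the RabbitMQ API."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.ready = deque()   # jobs waiting for a worker
        self.failed = []       # jobs that exhausted their attempts

    def publish(self, job_type, payload):
        # The web request only enqueues and returns; no heavy work happens here.
        self.ready.append({"type": job_type, "payload": payload, "attempts": 0})

    def consume_one(self, handler):
        """Hand one job to `handler`. Returns False when the queue is empty."""
        if not self.ready:
            return False
        job = self.ready.popleft()
        job["attempts"] += 1
        try:
            handler(job)  # returning normally acts as the acknowledgement
        except Exception:
            if job["attempts"] < self.max_attempts:
                self.ready.append(job)    # put it back so the work is not lost
            else:
                self.failed.append(job)   # keep for inspection instead of dropping
        return True


broker = InMemoryBroker()
broker.publish("search_index_update", {"page_id": 42})

done = []


def flaky_handler(job):
    # Simulate a worker that crashes on the first attempt only.
    if job["attempts"] == 1:
        raise RuntimeError("simulated worker crash")
    done.append(job["payload"]["page_id"])


while broker.consume_one(flaky_handler):
    pass

print(done)  # [42]
```

The job survives the simulated crash because it is returned to the queue instead of being discarded, which is the property that makes a queue-based design resilient.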
Key Processes and Configuration Details
Several key processes run on the data processing servers, each with its specific configuration.
Job Queue Processing
RabbitMQ is the central component of our job queue system. MediaWiki extensions and core code post various tasks to the queue, such as:
- Article validation: Checking for broken links and other issues in articles.
- Search index updates: Maintaining the search index for fast and accurate search results.
- Category membership updates: Recalculating category membership after edits.
- Revision deletion: Handling requests for revision deletion.
Worker processes, written in PHP, consume these tasks from the queue and execute them. The number of worker processes is dynamically adjusted based on the queue length and server load.
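One way to picture the dynamic adjustment is a function that maps queue length and load to a target worker count. The sketch below is an assumption about how such a policy could look, not the logic actually deployed; every threshold (`jobs_per_worker`, `load_ceiling`, the minimum and maximum worker counts) is a hypothetical value chosen for illustration.

```python
import math


def target_worker_count(queue_length, load_per_core, current_workers,
                        min_workers=2, max_workers=48,
                        jobs_per_worker=100, load_ceiling=0.9):
    """Suggest how many workers to run (all thresholds are hypothetical)."""
    # Size the pool to the backlog: one worker per `jobs_per_worker` queued jobs.
    wanted = math.ceil(queue_length / jobs_per_worker)
    # If the host is already saturated, never grow the pool, even with a backlog.
    if load_per_core >= load_ceiling:
        wanted = min(wanted, current_workers)
    # Clamp to the configured bounds.
    return max(min_workers, min(max_workers, wanted))


print(target_worker_count(0, 0.10, current_workers=4))        # 2
print(target_worker_count(1000, 0.20, current_workers=4))     # 10
print(target_worker_count(1000, 0.95, current_workers=4))     # 4
print(target_worker_count(100000, 0.20, current_workers=4))   # 48
```

Capping growth when load is high prevents a long queue from starting more workers than the host can actually run.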
Redis Configuration
Redis is used for several purposes, including:
- Caching frequently accessed data.
- Storing user session data.
- Rate limiting to prevent abuse.
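The rate limiting use case in the list above can be sketched as a fixed-window counter. This example keeps its counters in a Python dictionary rather than in Redis, and the class name, limit, and window size are hypothetical; a Redis-backed version would typically keep one counter key per client and window and let it expire.

```python
class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client in each time window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._state = {}  # client_id -> (window_index, count)

    def allow(self, client_id, now):
        """Return True if the request at time `now` (seconds) is permitted."""
        window_index = int(now // self.window_seconds)
        stored_index, count = self._state.get(client_id, (window_index, 0))
        if stored_index != window_index:
            count = 0  # a new window has started, so the counter resets
        count += 1
        self._state[client_id] = (window_index, count)
        return count <= self.limit


limiter = FixedWindowRateLimiter(limit=2, window_seconds=60)
results = [
    limiter.allow("client-a", now=0),    # first request in window 0
    limiter.allow("client-a", now=10),   # second request in window 0
    limiter.allow("client-a", now=20),   # third request exceeds the limit
    limiter.allow("client-a", now=61),   # window 1 starts, counter resets
]
print(results)  # [True, True, False, True]
```

Passing the current time in as an argument keeps the example deterministic; a real caller would supply the current clock time.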
The following table outlines the key Redis configuration parameters:
Parameter | Value | Description |
---|---|---|
`maxmemory` | 64GB | Maximum memory usage for Redis. |
`maxmemory-policy` | `allkeys-lru` | Eviction policy when `maxmemory` is reached (Least Recently Used). |
`timeout` | 300 | Seconds of client inactivity after which Redis closes the connection (not a session lifetime). |
`databases` | 16 | Number of logical databases. |
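Proper Redis configuration is essential for maintaining performance and preventing out-of-memory errors. Note that `allkeys-lru` can evict any key once `maxmemory` is reached, so session data held in the same instance as cache entries may also be evicted under memory pressure. Monitoring Redis memory usage and eviction counts helps to catch such problems early.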
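To make the least-recently-used eviction policy concrete, here is a small exact LRU cache. It is a simplification under stated assumptions: it limits the number of entries rather than bytes of memory, and Redis itself uses a sampled approximation of LRU rather than the exact ordering shown here. The class and method names are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU cache bounded by entry count (a simplification of Redis)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # reading a key marks it as recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        # Evict least recently used entries until we are back under the limit.
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)


cache = LRUCache(max_entries=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now more recently used than "b"
cache.set("c", 3)    # exceeds the limit, so "b" is evicted

print(cache.get("a"), cache.get("b"), cache.get("c"))  # 1 None 3
```

The example shows why a key that is read often tends to stay cached, while a key written once and never read again is the first to go.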
Monitoring and Maintenance
Regular monitoring and maintenance are critical for keeping the data processing servers running smoothly. We use a combination of tools, including:
- Nagios: For monitoring server health and availability.
- Prometheus: For collecting and analyzing metrics.
- Grafana: For visualizing metrics.
- Log analysis tools: For identifying errors and issues.
Routine maintenance tasks include:
- Applying security patches.
- Updating software packages.
- Checking disk space.
- Monitoring job queue length.
- Analyzing server logs.
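Two of the routine checks above, disk space and job queue length, can be expressed as simple threshold checks. The sketch below follows the Nagios plugin convention of status codes 0 (OK), 1 (WARNING), and 2 (CRITICAL); the thresholds and function names are hypothetical, and the queue length is passed in as a number because how it is read depends on the broker tooling in use.

```python
import shutil

# Nagios plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2


def classify(value, warn_at, crit_at):
    """Map a measured value to a Nagios-style status code."""
    if value >= crit_at:
        return CRITICAL
    if value >= warn_at:
        return WARNING
    return OK


def disk_used_percent(path):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def run_checks(path, queue_length):
    # Thresholds are illustrative, not the values configured in production.
    return {
        "disk": classify(disk_used_percent(path), warn_at=80, crit_at=90),
        "queue": classify(queue_length, warn_at=5000, crit_at=20000),
    }


print(classify(50, 80, 90))   # 0
print(classify(85, 80, 90))   # 1
print(classify(95, 80, 90))   # 2
print(run_checks(".", queue_length=120)["queue"])  # 0
```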
Regular backup procedures are in place to ensure data recovery in the event of a disaster. Database maintenance also matters here: although these servers do not host the database, the background jobs they run read from and write to it.