AI in West Midlands: Server Configuration Guide
This article details the server configuration supporting the "AI in West Midlands" project. It is aimed at newcomers to the wiki and provides a technical overview of the hardware and software in use. Understanding this setup is essential for contributing to the project's development and maintenance. Please refer to our Deployment Guidelines for general contribution information.
Overview
The "AI in West Midlands" project utilizes a distributed server infrastructure to handle the computational demands of machine learning model training, inference, and data processing. This infrastructure is primarily hosted within a secure data centre in Birmingham, with some edge deployments for real-time applications. The system architecture is based on a microservices approach, detailed in our Microservices Architecture Document. We leverage a combination of bare-metal servers and virtual machines (VMs) for flexibility and scalability. This design allows for efficient resource allocation and caters to the varying needs of different AI tasks. See also Scalability Considerations.
Hardware Configuration
The core of the AI infrastructure consists of several server nodes, categorized by their primary function. Below are detailed specifications for each type.
Primary Training Servers
These servers are dedicated to training large AI models. They require significant computational power and memory.
| Specification | Value |
|---|---|
| CPU | 2 x AMD EPYC 7763 (64-core) |
| RAM | 512 GB DDR4 ECC Registered |
| GPU | 8 x NVIDIA A100 80GB PCIe |
| Storage | 4 x 8TB NVMe SSD (RAID 0) |
| Network | 2 x 100GbE |
| Operating System | Ubuntu 22.04 LTS |
These servers utilize GPU virtualization to maximise resource usage and are monitored through our Monitoring Dashboard.
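When bringing up a training node, a quick sanity check can confirm that all eight A100s are visible to the framework. The following is a minimal sketch using PyTorch; the expected device count simply mirrors the hardware table above.

```python
import torch

def check_training_node(expected_gpus: int = 8) -> None:
    """Verify that the expected number of GPUs is visible to PyTorch."""
    if not torch.cuda.is_available():
        raise RuntimeError("CUDA is not available on this node")
    found = torch.cuda.device_count()
    if found != expected_gpus:
        raise RuntimeError(f"Expected {expected_gpus} GPUs, found {found}")
    for i in range(found):
        props = torch.cuda.get_device_properties(i)
        # total_memory is reported in bytes; the A100 80GB cards should show ~80 GB.
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GB")

if __name__ == "__main__":
    check_training_node()
```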
Inference Servers
These servers are optimized for low-latency inference, serving predictions from trained models.
| Specification | Value |
|---|---|
| CPU | 2 x Intel Xeon Gold 6338 (32-core) |
| RAM | 256 GB DDR4 ECC Registered |
| GPU | 4 x NVIDIA T4 16GB PCIe |
| Storage | 2 x 4TB NVMe SSD (RAID 1) |
| Network | 2 x 25GbE |
| Operating System | CentOS Stream 9 |
The inference servers are deployed using Kubernetes for orchestration and autoscaling. See the Inference Pipeline Documentation for details on model deployment.
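For orientation, the sketch below shows the general shape of a model-serving endpoint as it might run inside an inference pod. It assumes FastAPI and a TorchScript artifact at a hypothetical path (model.pt); the project's actual endpoints, schemas, and manifests are in the Inference Pipeline Documentation.

```python
# Illustrative inference endpoint; model path and input schema are placeholders.
import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = torch.jit.load("model.pt")  # hypothetical TorchScript artifact
model.eval()

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(req: PredictRequest) -> dict:
    # Run a single forward pass without tracking gradients.
    with torch.no_grad():
        x = torch.tensor(req.features).unsqueeze(0)
        y = model(x)
    return {"prediction": y.squeeze(0).tolist()}
```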
Data Storage Servers
These servers provide centralized storage for datasets and model artifacts.
| Specification | Value |
|---|---|
| CPU | 2 x Intel Xeon Silver 4310 (12-core) |
| RAM | 128 GB DDR4 ECC Registered |
| Storage | 32 x 16TB SAS HDD (RAID 6) |
| Network | 2 x 40GbE |
| Operating System | Red Hat Enterprise Linux 8 |
Data is accessed via the Network File System (NFS) and secured with strict access controls, as described in the Security Policy.
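To a training or processing job, datasets on these servers appear as ordinary directories under the cluster's NFS mount point. The path below is a placeholder; the real mount layout is defined in the Data Storage Strategy.

```python
from pathlib import Path

# Hypothetical NFS mount point; the actual path is site-specific.
DATASETS = Path("/mnt/datasets")

def list_datasets() -> list[str]:
    """Return the top-level dataset directories on the NFS share."""
    return sorted(p.name for p in DATASETS.iterdir() if p.is_dir())
```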
Software Stack
The software stack is built around open-source technologies, ensuring flexibility and cost-effectiveness.
- Operating Systems: Ubuntu 22.04 LTS, CentOS Stream 9, Red Hat Enterprise Linux 8 – see the OS Compatibility Matrix
- Containerization: Docker and Kubernetes – detailed in the Containerization Guide
- Machine Learning Frameworks: TensorFlow, PyTorch, scikit-learn – refer to the Framework Selection Criteria
- Programming Languages: Python, C++, Java – guidelines available in the Coding Standards
- Data Storage: NFS, object storage (MinIO) – see the Data Storage Strategy; a short access sketch follows this list
- Monitoring & Logging: Prometheus, Grafana, Elasticsearch, Kibana – see the Monitoring and Alerting System
- Version Control: Git – use the Git Workflow for all code changes
- CI/CD: Jenkins – the automated build and deployment pipeline is detailed in the CI/CD Pipeline Documentation
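Because MinIO exposes an S3-compatible API, artifacts in object storage can be fetched with any standard S3 client. The sketch below uses boto3 with placeholder endpoint, bucket, and credential values; the real ones are documented in the Data Storage Strategy.

```python
import boto3

# Placeholder endpoint and credentials; real values are environment-specific.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.internal:9000",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Download a model artifact from a hypothetical bucket and key.
s3.download_file("model-artifacts", "classifier/v3/model.pt", "/tmp/model.pt")
```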
Network Topology
The server infrastructure is connected via a high-speed, low-latency network, segmented into virtual LANs (VLANs) for improved security; a dedicated VLAN carries inter-server traffic. Firewall rules restrict access to only the necessary ports and services. Please consult the Network Diagram for a visual representation.
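When debugging connectivity across VLANs, a simple reachability probe can confirm whether a firewall rule permits traffic on a given port. This is an illustrative check only; the host name and port below are placeholders.

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: probe a hypothetical inference node on a hypothetical service port.
print(port_reachable("inference-01.internal", 8080))
```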
Security Considerations
Security is paramount. All servers are protected by firewalls, intrusion detection systems, and regular security audits. Access to servers is restricted to authorized personnel only, governed by the Access Control List. Data is encrypted both in transit and at rest. See the Security Best Practices document for detailed security guidelines.
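As an illustration of encryption at rest at the application level, the snippet below uses the cryptography library's Fernet recipe (symmetric, authenticated encryption). This is a generic example, not the project's actual key-management scheme; see Security Best Practices for that.

```python
from cryptography.fernet import Fernet

# In production the key would come from a key-management service,
# not be generated ad hoc like this.
key = Fernet.generate_key()
f = Fernet(key)

ciphertext = f.encrypt(b"sensitive training data")
assert f.decrypt(ciphertext) == b"sensitive training data"
```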
Intel-Based Server Configurations
| Configuration | Specifications | CPU Benchmark |
|---|---|---|
| Core i7-6700K/7700 Server | 64 GB DDR4, 2 x 512 GB NVMe SSD | 8046 |
| Core i7-8700 Server | 64 GB DDR4, 2 x 1 TB NVMe SSD | 13124 |
| Core i9-9900K Server | 128 GB DDR4, 2 x 1 TB NVMe SSD | 49969 |
| Core i9-13900 Server (64 GB) | 64 GB RAM, 2 x 2 TB NVMe SSD | |
| Core i9-13900 Server (128 GB) | 128 GB RAM, 2 x 2 TB NVMe SSD | |
| Core i5-13500 Server (64 GB) | 64 GB RAM, 2 x 500 GB NVMe SSD | |
| Core i5-13500 Server (128 GB) | 128 GB RAM, 2 x 500 GB NVMe SSD | |
| Core i5-13500 Workstation | 64 GB DDR5, 2 x NVMe SSD, NVIDIA RTX 4000 | |
AMD-Based Server Configurations
| Configuration | Specifications | CPU Benchmark |
|---|---|---|
| Ryzen 5 3600 Server | 64 GB RAM, 2 x 480 GB NVMe | 17849 |
| Ryzen 7 7700 Server | 64 GB DDR5, 2 x 1 TB NVMe | 35224 |
| Ryzen 9 5950X Server | 128 GB RAM, 2 x 4 TB NVMe | 46045 |
| Ryzen 9 7950X Server | 128 GB DDR5 ECC, 2 x 2 TB NVMe | 63561 |
| EPYC 7502P Server (128 GB / 1 TB) | 128 GB RAM, 1 TB NVMe | 48021 |
| EPYC 7502P Server (128 GB / 2 TB) | 128 GB RAM, 2 TB NVMe | 48021 |
| EPYC 7502P Server (128 GB / 4 TB) | 128 GB RAM, 2 x 2 TB NVMe | 48021 |
| EPYC 7502P Server (256 GB / 1 TB) | 256 GB RAM, 1 TB NVMe | 48021 |
| EPYC 7502P Server (256 GB / 4 TB) | 256 GB RAM, 2 x 2 TB NVMe | 48021 |
| EPYC 9454P Server | 256 GB RAM, 2 x 2 TB NVMe | |
*Note: all benchmark scores are approximate and may vary with the exact configuration.*