AI in London
This article details the server configuration supporting Artificial Intelligence (AI) workloads in our London data center. It is aimed at newcomers to this wiki and provides a technical overview of the hardware and software employed. Please refer to the Server Room Access Policy before accessing any of these systems.
Overview
The "AI in London" project utilizes a cluster of high-performance servers dedicated to machine learning, deep learning, and natural language processing tasks. This infrastructure is designed for scalability, redundancy, and efficient resource utilization. The primary goal is to provide a robust platform for our Data Science Team to develop and deploy AI models. We maintain detailed Server Inventory records.
Hardware Specifications
The core of the AI infrastructure consists of the following server configurations. The servers are housed in Rack 7, Bays 1-12. See the Data Center Map for precise location details.
Server Role | Model | CPU | RAM | Storage | GPU |
---|---|---|---|---|---|
Master Node (Data Processing) | Dell PowerEdge R750xa | 2 x Intel Xeon Gold 6348 (28 cores each) | 512GB DDR4 ECC REG | 4 x 4TB NVMe SSD (RAID 10) | NVIDIA RTX A6000 (48GB) |
Worker Nodes 1-4 (Training) | Dell PowerEdge R750xa | 2 x Intel Xeon Gold 6338 (32 cores each) | 256GB DDR4 ECC REG | 2 x 4TB NVMe SSD (RAID 1) | NVIDIA A100 (80GB) |
Worker Nodes 5-8 (Inference) | Supermicro SYS-2029U-TR4 | 2 x AMD EPYC 7763 (64 cores each) | 128GB DDR4 ECC REG | 2 x 2TB NVMe SSD (RAID 1) | NVIDIA Tesla T4 (16GB) |
Storage Node (Data Repository) | Dell PowerEdge R740xd | 2 x Intel Xeon Gold 6248R (24 cores each) | 128GB DDR4 ECC REG | 16 x 16TB SAS HDD (RAID 6) | None |
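The RAID levels in the table above trade raw capacity for redundancy. As a rough sketch (the `raid_usable` helper is illustrative, not a tool used on the cluster), usable capacity can be estimated like this:

```python
# Sketch: estimate usable capacity for the RAID levels in the hardware table.
# Drive counts and sizes are taken from the table; real-world usable space
# will be slightly lower after filesystem and controller overhead.

def raid_usable(drives: int, size_tb: float, level: str) -> float:
    """Return approximate usable capacity in TB for common RAID levels."""
    if level in ("raid1", "raid10"):   # mirroring: half the raw capacity
        return drives * size_tb / 2
    if level == "raid6":               # double parity: two drives reserved
        return (drives - 2) * size_tb
    raise ValueError(f"unsupported level: {level}")

print(raid_usable(4, 4, "raid10"))   # master node, 4 x 4TB RAID 10 -> 8.0 TB
print(raid_usable(16, 16, "raid6"))  # storage node, 16 x 16TB RAID 6 -> 224 TB
```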
All servers are connected via a 100Gbps InfiniBand network. Please review the Network Topology Diagram. Power redundancy is provided by dual power supplies and UPS systems. Check the UPS Status Page for current status.
Software Stack
The servers run a customized version of Ubuntu Server 22.04 LTS. The core software components are detailed below. See the Software Licensing Information for compliance details.
Component | Version | Purpose |
---|---|---|
Operating System | Ubuntu Server 22.04 LTS | Base operating system |
CUDA Toolkit | 12.2 | GPU programming toolkit |
cuDNN | 8.9.2 | Deep neural network library |
TensorFlow | 2.13 | Machine learning framework |
PyTorch | 2.0.1 | Machine learning framework |
Docker | 24.0.5 | Containerization platform |
Kubernetes | 1.27 | Container orchestration |
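As a sketch of how this stack fits together, a Kubernetes workload can request a GPU through the NVIDIA device plugin. The manifest below is illustrative only; the pod name and image tag are placeholders, not the cluster's actual manifests.

```yaml
# Illustrative pod spec: requests one GPU via the NVIDIA device plugin
# and runs nvidia-smi as a smoke test. Names and image are examples.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.2.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```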
All code is managed using Git and stored in our internal Code Repository. We utilize Jenkins for continuous integration and continuous deployment (CI/CD).
Networking Configuration
The AI cluster utilizes a dedicated VLAN (192.168.100.0/24) for internal communication. Access to the cluster from external networks is restricted to authorized personnel via a secure VPN connection. Refer to the VPN Configuration Guide for instructions.
Parameter | Value |
---|---|
VLAN ID | 100 |
Subnet Mask | 255.255.255.0 |
Gateway | 192.168.100.1 |
DNS Servers | 8.8.8.8, 8.8.4.4 |
Firewall Rules | See Firewall Configuration |
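The subnet parameters above can be checked programmatically. A minimal sketch using Python's standard-library `ipaddress` module (the `in_cluster_vlan` helper is hypothetical, for illustration only):

```python
# Sketch: validate that a host address belongs to the cluster VLAN
# (192.168.100.0/24, per the networking table above).
import ipaddress

CLUSTER_NET = ipaddress.ip_network("192.168.100.0/24")

def in_cluster_vlan(addr: str) -> bool:
    """Return True if addr falls inside the dedicated AI cluster VLAN."""
    return ipaddress.ip_address(addr) in CLUSTER_NET

print(in_cluster_vlan("192.168.100.42"))  # True: inside the /24
print(in_cluster_vlan("192.168.101.5"))   # False: outside the VLAN
```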
Monitoring & Alerting
The entire infrastructure is monitored using Prometheus and Grafana. Alerts are configured for critical metrics such as CPU utilization, memory usage, disk space, and GPU temperature. See the Monitoring Dashboard for a live view of the system status. The Incident Response Plan outlines procedures for handling alerts and outages.
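As an example of the alerting described above, a Prometheus rule for GPU temperature might look like the following. This is a sketch, not the cluster's actual rules: the metric name assumes the NVIDIA DCGM exporter is deployed, and the threshold is an example value.

```yaml
# Illustrative Prometheus alerting rule for GPU temperature.
# DCGM_FI_DEV_GPU_TEMP is exported by NVIDIA's dcgm-exporter.
groups:
  - name: ai-cluster-gpu
    rules:
      - alert: GPUTemperatureHigh
        expr: DCGM_FI_DEV_GPU_TEMP > 85
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "GPU temperature above 85C on {{ $labels.instance }}"
```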
Future Expansion
Plans are underway to expand the AI cluster with additional GPUs and storage capacity. We are evaluating the use of NVMe over Fabrics to further improve I/O performance. The Capacity Planning Document details the projected growth and resource requirements.
See Also
- Server Maintenance Schedule
- Backup and Recovery Procedures
- Security Audit Reports
Intel-Based Server Configurations
Configuration | Specifications | Benchmark |
---|---|---|
Core i7-6700K/7700 Server | 64 GB DDR4, 2 x 512 GB NVMe SSD | CPU Benchmark: 8046 |
Core i7-8700 Server | 64 GB DDR4, 2 x 1 TB NVMe SSD | CPU Benchmark: 13124 |
Core i9-9900K Server | 128 GB DDR4, 2 x 1 TB NVMe SSD | CPU Benchmark: 49969 |
Core i9-13900 Server (64GB) | 64 GB RAM, 2 x 2 TB NVMe SSD | |
Core i9-13900 Server (128GB) | 128 GB RAM, 2 x 2 TB NVMe SSD | |
Core i5-13500 Server (64GB) | 64 GB RAM, 2 x 500 GB NVMe SSD | |
Core i5-13500 Server (128GB) | 128 GB RAM, 2 x 500 GB NVMe SSD | |
Core i5-13500 Workstation | 64 GB DDR5 RAM, 2 x NVMe SSD, NVIDIA RTX 4000 | |
AMD-Based Server Configurations
Configuration | Specifications | Benchmark |
---|---|---|
Ryzen 5 3600 Server | 64 GB RAM, 2 x 480 GB NVMe | CPU Benchmark: 17849 |
Ryzen 7 7700 Server | 64 GB DDR5 RAM, 2 x 1 TB NVMe | CPU Benchmark: 35224 |
Ryzen 9 5950X Server | 128 GB RAM, 2 x 4 TB NVMe | CPU Benchmark: 46045 |
Ryzen 9 7950X Server | 128 GB DDR5 ECC, 2 x 2 TB NVMe | CPU Benchmark: 63561 |
EPYC 7502P Server (128GB/1TB) | 128 GB RAM, 1 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (128GB/2TB) | 128 GB RAM, 2 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (128GB/4TB) | 128 GB RAM, 2 x 2 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (256GB/1TB) | 256 GB RAM, 1 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (256GB/4TB) | 256 GB RAM, 2 x 2 TB NVMe | CPU Benchmark: 48021 |
EPYC 9454P Server | 256 GB RAM, 2 x 2 TB NVMe | |
*Note: all benchmark scores are approximate and may vary based on configuration.*