AI in West Midlands: Server Configuration Guide

This article details the server configuration supporting the "AI in West Midlands" project. It's aimed at newcomers to the wiki and provides a technical overview of the hardware and software used. Understanding this setup is crucial for contributing to the project's development and maintenance. Please refer to our Deployment Guidelines for general contribution information.

Overview

The "AI in West Midlands" project utilizes a distributed server infrastructure to handle the computational demands of machine learning model training, inference, and data processing. This infrastructure is primarily hosted within a secure data centre in Birmingham, with some edge deployments for real-time applications. The system architecture is based on a microservices approach, detailed in our Microservices Architecture Document. We leverage a combination of bare-metal servers and virtual machines (VMs) for flexibility and scalability. This design allows for efficient resource allocation and caters to the varying needs of different AI tasks. See also Scalability Considerations.

Hardware Configuration

The core of the AI infrastructure consists of several server nodes, categorized by their primary function. Below are detailed specifications for each type.

Primary Training Servers

These servers are dedicated to training large AI models. They require significant computational power and memory.

Specification      Value
CPU                2 x AMD EPYC 7763 (64-core)
RAM                512 GB DDR4 ECC Registered
GPU                8 x NVIDIA A100 80GB PCIe
Storage            4 x 8TB NVMe SSD (RAID 0)
Network            2 x 100GbE
Operating System   Ubuntu 22.04 LTS

These servers utilize GPU virtualization to maximise resource usage and are monitored through our Monitoring Dashboard.
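The article does not name the virtualization mechanism; on A100 GPUs a common option is NVIDIA Multi-Instance GPU (MIG), which partitions one physical GPU into isolated instances. A minimal sketch, assuming MIG on GPU index 0 (profile IDs vary by card, so check the listing step first):

```shell
# Enable MIG mode on GPU 0 (requires root; a GPU reset may be needed afterwards)
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this card supports, with their profile IDs
sudo nvidia-smi mig -lgip

# Create two GPU instances from a chosen profile ID (9 = 3g.40gb on an A100 80GB),
# and create the matching compute instances in one step (-C)
sudo nvidia-smi mig -i 0 -cgi 9,9 -C

# Verify: MIG devices now appear alongside the parent GPU
nvidia-smi -L
```

Each MIG instance can then be handed to a separate training or inference job with its own dedicated memory and compute slice.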

Inference Servers

These servers are optimized for low-latency inference, serving predictions from trained models.

Specification      Value
CPU                2 x Intel Xeon Gold 6338 (32-core)
RAM                256 GB DDR4 ECC Registered
GPU                4 x NVIDIA T4 16GB PCIe
Storage            2 x 4TB NVMe SSD (RAID 1)
Network            2 x 25GbE
Operating System   CentOS Stream 9

The inference servers are deployed using Kubernetes for orchestration and autoscaling. See the Inference Pipeline Documentation for details on model deployment.
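The project's actual manifests live in the Inference Pipeline Documentation; as an illustrative sketch only (the deployment name and manifest file below are hypothetical), autoscaling a model server with Kubernetes' built-in Horizontal Pod Autoscaler looks like this:

```shell
# Deploy the inference service from its manifest (hypothetical file name)
kubectl apply -f model-server-deployment.yaml

# Autoscale between 2 and 8 replicas, targeting 70% average CPU utilisation
kubectl autoscale deployment model-server --min=2 --max=8 --cpu-percent=70

# Inspect the autoscaler's current state and replica count
kubectl get hpa model-server
```

For GPU-backed inference, scaling on a custom metric (e.g. request latency or queue depth) is usually preferable to CPU utilisation, but that requires a metrics adapter and is beyond this sketch.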

Data Storage Servers

These servers provide centralized storage for datasets and model artifacts.

Specification      Value
CPU                2 x Intel Xeon Silver 4310 (12-core)
RAM                128 GB DDR4 ECC Registered
Storage            32 x 16TB SAS HDD (RAID 6)
Network            2 x 40GbE
Operating System   Red Hat Enterprise Linux 8

Data is accessed via a Network File System (NFS) and secured with strict access controls described in the Security Policy.
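As an illustrative sketch (the hostname and export path below are assumptions, not the project's real values), a compute node would mount the shared NFS storage like so:

```shell
# One-off mount of the NFS export (hostname and path are illustrative)
sudo mount -t nfs storage01.example.internal:/export/datasets /mnt/datasets

# Or persist the mount in /etc/fstab (NFSv4, read-only for inference nodes)
echo 'storage01.example.internal:/export/datasets  /mnt/datasets  nfs4  ro,noatime  0  0' | sudo tee -a /etc/fstab
```

Which hosts may mount the export, and with what options, is governed by the access controls in the Security Policy.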

Software Stack

The software stack is built around open-source technologies, ensuring flexibility and cost-effectiveness.
