
# AI in Basingstoke: Server Configuration

This article details the server configuration supporting Artificial Intelligence (AI) initiatives within the Basingstoke data centre. It's aimed at new engineers joining the team and provides a comprehensive overview of the hardware and software setup. Understanding this configuration is crucial for maintaining system stability and facilitating future expansion. This document assumes familiarity with basic Linux server administration and networking concepts.

## Overview

The Basingstoke AI infrastructure is designed for high-throughput processing of large datasets, used primarily for machine learning model training and natural language processing. We utilize a cluster of high-performance servers, interconnected via a low-latency network. The core operating system is Ubuntu Server 22.04 LTS, chosen for its stability and extensive package repository. All data is stored on a Network File System (NFS) share, providing centralized access and simplifying data management. The system relies heavily on Docker for containerization and Kubernetes for orchestration.
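To make the storage layout concrete, the central NFS share is typically mounted on each compute node at boot via `/etc/fstab`. The entry below is an illustrative sketch only; the hostname, export path, mount point, and options are assumptions, not the production values:

```
# /etc/fstab entry on a compute node (illustrative; hostname and paths are assumptions)
# "hard" retries indefinitely on server outages; large rsize/wsize suit the 100GbE link
nfs-storage.basingstoke.internal:/export/ai-data  /mnt/ai-data  nfs  rw,hard,noatime,rsize=1048576,wsize=1048576  0 0
```

After adding the entry, `sudo mount -a` applies it without a reboot.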

## Hardware Specifications

The primary compute nodes are based on a standardized configuration, detailed below. There are currently 24 nodes in the cluster, with plans for expansion in Q4 2024. A dedicated monitoring server collects performance metrics.

| Component | Specification |
| --- | --- |
| CPU | AMD EPYC 7763 (64 cores, 128 threads) |
| RAM | 512GB DDR4 ECC Registered (3200MHz) |
| Storage (OS) | 500GB NVMe SSD |
| Storage (Data) | Access via 100GbE to central NFS server |
| Network Interface | Dual 100GbE Mellanox ConnectX-6 |
| GPU | 4 x NVIDIA A100 (80GB) |

The NFS server itself is a separate, highly-available system.

| Component | Specification |
| --- | --- |
| CPU | Dual Intel Xeon Platinum 8380 (40 cores each) |
| RAM | 1TB DDR4 ECC Registered (3200MHz) |
| Storage (OS) | 2 x 4TB NVMe SSD (RAID 1) |
| Storage (Data) | 12 x 16TB SAS HDD (RAID 6) |
| Network Interface | Dual 100GbE Mellanox ConnectX-6 |

Finally, the Kubernetes master node requires specific resources:

| Component | Specification |
| --- | --- |
| CPU | Intel Xeon Gold 6338 (32 cores) |
| RAM | 256GB DDR4 ECC Registered (3200MHz) |
| Storage | 1TB NVMe SSD |
| Network Interface | Dual 10GbE Intel X710 |

## Software Stack

The software stack is built around a containerized environment. We utilize Python 3.9 as the primary programming language for AI development. Libraries such as TensorFlow, PyTorch, and scikit-learn are pre-installed in the base Docker images.
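A simple sanity check when building or debugging a node is to confirm that the expected AI libraries are importable inside a container. The sketch below is a minimal example of such a check; the package list is an assumption based on the libraries named above, not an official manifest:

```python
import importlib.util

# Hypothetical list of packages expected in the base Docker image
# (assumed from the libraries mentioned above, not a production manifest).
REQUIRED_PACKAGES = ["tensorflow", "torch", "sklearn"]


def missing_packages(packages):
    """Return the subset of packages that cannot be imported."""
    return [p for p in packages if importlib.util.find_spec(p) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing:
        print(f"Missing packages: {', '.join(missing)}")
    else:
        print("All required AI libraries are present.")
```

Running this inside a freshly built image catches a broken or incomplete install before a training job fails hours in.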
