AI in Cumbria: Server Configuration

This document details the server configuration for the "AI in Cumbria" project, a distributed computing initiative focused on analyzing environmental data collected within the county of Cumbria, England. This guide is intended for new system administrators and developers contributing to the project. It outlines the hardware, software, and network configurations necessary for optimal performance and reliability, covering hardware specifications, the software stack, network topology, and security considerations. Please refer to the Main Page for a project overview and associated documentation.

Hardware Overview

The "AI in Cumbria" project utilizes a hybrid server infrastructure, combining on-premise servers at the University of Cumbria with cloud-based resources from a dedicated AWS Virtual Private Cloud (VPC). This design allows for both high-performance processing and scalability. The on-premise servers handle initial data ingestion and pre-processing, while the cloud resources handle the computationally intensive machine learning tasks. See Data Flow Diagram for a visual representation.

The core on-premise server specifications are detailed below:

| Component         | Specification                                  | Quantity         |
|-------------------|------------------------------------------------|------------------|
| CPU               | Intel Xeon Gold 6248R (24 cores / 48 threads)  | 3                |
| RAM               | 256 GB DDR4 ECC Registered                     | 3                |
| Storage (OS)      | 500 GB NVMe SSD                                | 3                |
| Storage (Data)    | 16 TB SAS HDD (RAID 6)                         | 1 (shared array) |
| Network Interface | Dual 10GbE                                     | 3                |
| Power Supply      | Redundant 1200 W Platinum                      | 3                |
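RAID 6 dedicates two drives' worth of capacity to parity, so usable space is (N − 2) × drive size. The table lists only the 16 TB array total, so the drive count and per-drive size below are illustrative assumptions, not the actual layout:

```python
def raid6_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a RAID 6 array: two drives are consumed by parity."""
    if drive_count < 4:
        raise ValueError("RAID 6 requires at least 4 drives")
    return (drive_count - 2) * drive_size_tb

# Hypothetical layout: ten 2 TB SAS drives would yield the 16 TB quoted above.
print(raid6_usable_tb(10, 2.0))  # → 16.0
```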

The AWS infrastructure consists of EC2 compute instances plus S3 object storage, configured as follows:

| Resource          | Quantity | Configuration                                   |
|-------------------|----------|-------------------------------------------------|
| p3.8xlarge (EC2)  | 10       | 4× NVIDIA V100 GPUs, 32 vCPUs, 244 GB RAM       |
| r5.large (EC2)    | 5        | 2 vCPUs, 16 GB RAM (control plane & monitoring) |
| Amazon S3         | N/A      | 100 TB object storage                           |
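The aggregate GPU and vCPU capacity of the training fleet follows directly from the table; a quick sketch (per-instance specs hard-coded from the table above, where each p3.8xlarge carries four V100 GPUs):

```python
# Per-instance specs taken from the table above.
FLEET = {
    "p3.8xlarge": {"count": 10, "gpus": 4, "vcpus": 32},
    "r5.large":   {"count": 5,  "gpus": 0, "vcpus": 2},
}

def fleet_totals(fleet: dict) -> dict:
    """Sum each resource across all instances of every type."""
    totals = {"gpus": 0, "vcpus": 0}
    for spec in fleet.values():
        for key in totals:
            totals[key] += spec["count"] * spec[key]
    return totals

print(fleet_totals(FLEET))  # → {'gpus': 40, 'vcpus': 330}
```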

Refer to Hardware Maintenance Procedures for detailed information on server maintenance.

Software Stack

The software stack is designed to facilitate data ingestion, processing, and model deployment. We utilize a Linux-based operating system (Ubuntu Server 22.04 LTS) across all servers. Containerization is achieved using Docker and orchestration via Kubernetes. See Software Installation Guide for detailed installation instructions.

Here’s a breakdown of the key software components:

| Software      | Version   | Purpose                           |
|---------------|-----------|-----------------------------------|
| Ubuntu Server | 22.04 LTS | Operating system                  |
| Python        | 3.9       | Primary programming language      |
| TensorFlow    | 2.12      | Machine learning framework        |
| PyTorch       | 2.0       | Machine learning framework        |
| Docker        | 24.0      | Containerization platform         |
| Kubernetes    | 1.27      | Container orchestration           |
| PostgreSQL    | 15        | Database for metadata and results |
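Because mismatched TensorFlow, PyTorch, and Kubernetes versions are a common source of breakage, it can help to assert the pinned versions at deploy time. A stdlib-only sketch (the pins mirror the table above; `meets_pin` is a hypothetical helper, not part of the project's tooling):

```python
def parse_version(v: str) -> tuple:
    """Turn '2.12' or '1.27.3' into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

# Minimum versions pinned by the project (from the table above).
PINS = {"python": "3.9", "tensorflow": "2.12", "pytorch": "2.0",
        "docker": "24.0", "kubernetes": "1.27", "postgresql": "15"}

def meets_pin(component: str, installed: str) -> bool:
    """True if the installed version is at least the pinned minimum."""
    return parse_version(installed) >= parse_version(PINS[component])

print(meets_pin("tensorflow", "2.12.1"))  # → True
print(meets_pin("kubernetes", "1.26"))    # → False
```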

We also employ Prometheus for monitoring and Grafana for visualization. The API Documentation provides details on interacting with the AI models.
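Prometheus scrapes metrics in a plain-text exposition format, so a custom exporter for the ingestion pipeline only needs to emit lines of `name{labels} value`. A minimal sketch of generating that format (the metric name and site labels are hypothetical examples, not the project's actual metrics):

```python
def to_exposition(name: str, samples: dict, help_text: str = "") -> str:
    """Render {labels-tuple: value} samples as Prometheus exposition text."""
    lines = []
    if help_text:
        lines.append(f"# HELP {name} {help_text}")
    lines.append(f"# TYPE {name} gauge")
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"

# Hypothetical ingestion metric, labelled per sensor site.
text = to_exposition(
    "cumbria_ingest_records_total",
    {(("site", "keswick"),): 1432, (("site", "kendal"),): 987},
    "Records ingested per site",
)
print(text)
```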

Network Configuration

The network topology consists of a dedicated VLAN for the "AI in Cumbria" project, both on-premise and within the AWS VPC. On-premise servers connect to the university network via 10GbE switches, and the AWS VPC connects to the university network through an OpenVPN site-to-site tunnel, allowing secure data transfer between the on-premise and cloud resources.
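When the OpenVPN tunnel is up, cloud services should be reachable on their ports from the on-premise VLAN. A small stdlib sketch of a reachability check (the host and port below are placeholders, not real project endpoints):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: is a (placeholder) PostgreSQL endpoint reachable over the tunnel?
if port_open("10.0.1.10", 5432):
    print("PostgreSQL reachable")
else:
    print("PostgreSQL unreachable -- check the VPN tunnel")
```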
