AI in Birmingham

AI in Birmingham: Server Configuration

This article details the server configuration supporting the "AI in Birmingham" project, a local initiative focused on applying artificial intelligence to urban challenges. This document is intended for new system administrators and developers joining the project. It outlines the hardware, software, and networking components used to deliver the AI services.

Overview

The "AI in Birmingham" project uses a distributed server architecture to handle the computational demands of machine learning models and real-time data processing. The infrastructure is housed within the Birmingham Science City data center and is designed for scalability and high availability. We take a hybrid cloud approach, combining on-premise physical servers with cloud-based resources from Amazon Web Services (AWS); this lets us balance cost, performance, and data security. Understanding the interplay between these components is essential for effective system maintenance and development. The initial project goals are set out in the Project Charter.

Hardware Specification

Our core on-premise infrastructure consists of three primary server clusters: the Data Ingestion Cluster, the Model Training Cluster, and the Inference Cluster. Each cluster is built using high-performance server hardware.

| Component | Specification | Quantity |
|---|---|---|
| CPU | Intel Xeon Gold 6248R (24 cores, 3.0 GHz) | 36 |
| RAM | 256 GB DDR4 ECC Registered | 36 |
| Storage (Data Ingestion) | 4 x 4 TB NVMe SSD (RAID 0) | 3 |
| Storage (Model Training) | 8 x 8 TB SAS HDD (RAID 6) | 6 |
| Storage (Inference) | 2 x 2 TB NVMe SSD (RAID 1) | 6 |
| Network Interface | 100 Gbps Ethernet | 36 |
| GPU (Model Training) | NVIDIA Tesla V100 (32 GB) | 12 |
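The usable capacity implied by each storage row can be sanity-checked with standard RAID arithmetic. The sketch below is purely illustrative and ignores filesystem and formatting overhead:

```python
def usable_tb(level: str, disks: int, size_tb: float) -> float:
    """Approximate usable capacity for common RAID levels."""
    if level == "raid0":
        return disks * size_tb          # striping: no redundancy
    if level == "raid1":
        return size_tb                  # mirroring: capacity of one disk
    if level == "raid6":
        return (disks - 2) * size_tb    # double parity: two disks reserved
    raise ValueError(f"unsupported RAID level: {level}")

# Per-server usable capacity for the three clusters in the table above:
ingestion = usable_tb("raid0", 4, 4)   # 4 x 4 TB striped  -> 16 TB
training  = usable_tb("raid6", 8, 8)   # 8 x 8 TB, RAID 6  -> 48 TB
inference = usable_tb("raid1", 2, 2)   # 2 x 2 TB mirrored ->  2 TB
print(ingestion, training, inference)
```

Note the trade-off the cluster roles imply: RAID 0 maximizes ingest throughput at the cost of redundancy, while RAID 6 protects the long-lived training data.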

Further details on hardware procurement can be found in the Hardware Inventory. The cloud resources are provisioned via Infrastructure as Code using Terraform.

Software Stack

The software stack is built around a Linux operating system and various open-source tools. We’ve standardized on Ubuntu Server 22.04 LTS across all servers to simplify management and ensure compatibility.

| Component | Version | Purpose |
|---|---|---|
| Operating System | Ubuntu Server 22.04 LTS | Base operating system |
| Programming Language | Python 3.9 | Primary development language |
| Machine Learning Framework | TensorFlow 2.10 | Model training and inference |
| Data Storage | PostgreSQL 14 | Relational database for metadata |
| Message Queue | RabbitMQ 3.9 | Asynchronous task processing |
| Containerization | Docker 20.10 | Application packaging and deployment |
| Orchestration | Kubernetes 1.24 | Container orchestration |

All software packages are managed with the APT package manager. Detailed installation guides for each component are in the Software Documentation. We maintain a centralized Logging System based on Elasticsearch, Logstash, and Kibana (the ELK stack) for monitoring and troubleshooting. Security protocols are outlined in the Security Policy.
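A common pattern for feeding application logs into an ELK pipeline is to emit one JSON object per line, which Logstash can parse directly. The sketch below uses only the Python standard library; the field names are illustrative, not the project's actual log schema:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line for Logstash ingestion."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("inference")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.info("model loaded")  # emits a JSON line on stderr
```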

Networking Configuration

The server infrastructure is connected to the Birmingham Science City network via a dedicated 10 Gbps fiber optic link. Internal network communication between servers is facilitated by a private VLAN.

| Component | Configuration | Notes |
|---|---|---|
| Network Topology | Star | Centralized management and security |
| Internal VLAN | 192.168.10.0/24 | Isolated network for internal communication |
| External IP Addresses | Static, assigned by Birmingham Science City | Used for public access to AI services |
| Firewall | pfSense 2.5 | Network security and access control |
| DNS | BIND 9 | Internal DNS resolution |
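Whether an address belongs to the internal VLAN can be checked with Python's standard-library ipaddress module. This is a small sketch; the host addresses below are examples, not real cluster nodes:

```python
import ipaddress

# The internal VLAN from the table above.
INTERNAL_VLAN = ipaddress.ip_network("192.168.10.0/24")

def is_internal(host: str) -> bool:
    """True if the host address falls inside the internal /24 VLAN."""
    return ipaddress.ip_address(host) in INTERNAL_VLAN

print(is_internal("192.168.10.42"))   # internal cluster node
print(is_internal("203.0.113.7"))     # external / public address
```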

We use HAProxy as a load balancer to distribute traffic across the Inference Cluster. Network diagrams and firewall rules are documented in the Network Documentation. Regular network performance monitoring is conducted with Nagios.
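HAProxy's default round-robin scheduling can be illustrated in a few lines of Python. This is a sketch of the policy only, not the project's HAProxy configuration, and the backend names are made up:

```python
from itertools import cycle

# Hypothetical inference backends behind the load balancer.
backends = ["inference-01:8080", "inference-02:8080", "inference-03:8080"]
rotation = cycle(backends)

def next_backend() -> str:
    """Return the next backend in round-robin order."""
    return next(rotation)

# Six requests land evenly across the three backends:
assignments = [next_backend() for _ in range(6)]
print(assignments)
```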


Data Flow

Data ingested from various sources (e.g., city sensors and public datasets) is first processed by the Data Ingestion Cluster and then stored in a data lake on the Hadoop Distributed File System (HDFS). The Model Training Cluster uses this data to train machine learning models. Once trained, the models are deployed to the Inference Cluster for real-time predictions, which are then consumed by applications and services across the city.
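The ingestion → training → inference flow above can be sketched as three stages handing records along. Every function body here is a placeholder to show the shape of the pipeline, not the project's actual code:

```python
def ingest(raw_records):
    """Data Ingestion Cluster: validate and normalise incoming readings."""
    return [r.strip().lower() for r in raw_records if r.strip()]

def train(records):
    """Model Training Cluster: stand-in for a real TensorFlow training job.

    A real job would fit a model on HDFS-resident data; we build a
    trivial lookup 'model' purely for illustration.
    """
    return {rec: len(rec) for rec in records}

def predict(model, record):
    """Inference Cluster: serve a prediction for one incoming record."""
    return model.get(record.strip().lower(), -1)

raw = ["  Traffic-Sensor-A ", "", "AirQuality-B"]
model = train(ingest(raw))
print(predict(model, "traffic-sensor-a"))
```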


Future Considerations

We are currently evaluating the integration of Apache Spark for faster data processing and the adoption of GPU virtualization to improve resource utilization. Expansion of the cloud infrastructure is planned for Q4 2024 to accommodate growing data volumes and model complexity.




