AI in Birmingham: Server Configuration
This article describes the server configuration supporting the "AI in Birmingham" project, a local initiative applying artificial intelligence to urban challenges. It is intended for new system administrators and developers joining the project, and outlines the hardware, software, and networking components used to deliver the AI services.
Overview
The "AI in Birmingham" project utilizes a distributed server architecture to handle the computational demands of machine learning models and real-time data processing. The infrastructure is housed within the Birmingham Science City data center and is designed for scalability and high availability. We primarily leverage a hybrid cloud approach, utilizing both on-premise physical servers and cloud-based resources from Amazon Web Services. This allows us to balance cost, performance, and data security. Understanding the interplay between these components is crucial for effective system maintenance and development. Initial project goals were outlined in the Project Charter.
Hardware Specification
Our core on-premise infrastructure consists of three primary server clusters: the Data Ingestion Cluster, the Model Training Cluster, and the Inference Cluster. Each cluster is built using high-performance server hardware.
| Component | Specification | Quantity |
|---|---|---|
| CPU | Intel Xeon Gold 6248R (24 cores, 3.0 GHz) | 36 |
| RAM | 256 GB DDR4 ECC Registered | 36 |
| Storage (Data Ingestion) | 4 x 4 TB NVMe SSD (RAID 0) | 3 |
| Storage (Model Training) | 8 x 8 TB SAS HDD (RAID 6) | 6 |
| Storage (Inference) | 2 x 2 TB NVMe SSD (RAID 1) | 6 |
| Network Interface | 100 Gbps Ethernet | 36 |
| GPU (Model Training) | NVIDIA Tesla V100 (32 GB) | 12 |
Further details on hardware procurement can be found in the Hardware Inventory. Cloud resources are provisioned as Infrastructure as Code using Terraform.
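Assuming one CPU per server, as the matching CPU, RAM, and NIC quantities in the table suggest, the aggregate on-premise capacity works out as follows (a quick sketch; the figures are taken directly from the hardware table above):

```python
# Aggregate on-premise capacity, derived from the hardware table.
SERVERS = 36            # CPU/RAM/NIC quantity in the table
CORES_PER_CPU = 24      # Intel Xeon Gold 6248R
RAM_GB_PER_SERVER = 256
GPUS = 12               # NVIDIA Tesla V100, Model Training Cluster only

total_cores = SERVERS * CORES_PER_CPU
total_ram_gb = SERVERS * RAM_GB_PER_SERVER
print(f"{total_cores} cores, {total_ram_gb} GB RAM, {GPUS} GPUs")
# 864 cores, 9216 GB RAM, 12 GPUs
```

These totals are a useful sanity check when capacity planning against the Terraform-provisioned cloud resources.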
Software Stack
The software stack is built around a Linux operating system and various open-source tools. We’ve standardized on Ubuntu Server 22.04 LTS across all servers to simplify management and ensure compatibility.
| Component | Version | Purpose |
|---|---|---|
| Operating System | Ubuntu Server 22.04 LTS | Base operating system |
| Programming Language | Python 3.9 | Primary development language |
| Machine Learning Framework | TensorFlow 2.10 | Model training and inference |
| Data Storage | PostgreSQL 14 | Relational database for metadata |
| Message Queue | RabbitMQ 3.9 | Asynchronous task processing |
| Containerization | Docker 20.10 | Application packaging and deployment |
| Orchestration | Kubernetes 1.24 | Container orchestration |
All software packages are managed with the APT package manager. Detailed installation guides for each component are in the Software Documentation. We maintain centralized logging with the ELK stack (Elasticsearch, Logstash, and Kibana) for monitoring and troubleshooting. Security protocols are outlined in the Security Policy.
Networking Configuration
The server infrastructure is connected to the Birmingham Science City network via a dedicated 10 Gbps fiber optic link. Internal network communication between servers is facilitated by a private VLAN.
| Component | Configuration | Notes |
|---|---|---|
| Network Topology | Star | Centralized management and security |
| Internal VLAN | 192.168.10.0/24 | Isolated network for internal communication |
| External IP Addresses | Static, assigned by Birmingham Science City | Used for public access to AI services |
| Firewall | pfSense 2.5 | Network security and access control |
| DNS | Bind9 | Internal DNS resolution |
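Internal traffic must stay on the 192.168.10.0/24 VLAN listed above. A quick membership check can be done with Python's standard ipaddress module (a minimal sketch; the addresses tested are illustrative):

```python
import ipaddress

# The project's internal VLAN, from the networking table.
INTERNAL_VLAN = ipaddress.ip_network("192.168.10.0/24")

def is_internal(addr: str) -> bool:
    """Return True if addr falls inside the internal VLAN."""
    return ipaddress.ip_address(addr) in INTERNAL_VLAN

print(is_internal("192.168.10.42"))  # True
print(is_internal("10.0.0.5"))       # False
```

Checks like this are handy in deployment scripts to ensure a service binds only to internal interfaces.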
We utilize a load balancer (HAProxy) to distribute traffic across the Inference Cluster. Further details regarding network diagrams and firewall rules can be found in the Network Documentation. Regular network performance monitoring is conducted using Nagios.
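HAProxy's default balance algorithm, roundrobin, cycles incoming requests across the inference backends in turn. In outline (a sketch only; the backend names are hypothetical, and real HAProxy also accounts for server weights and health checks):

```python
import itertools

# Hypothetical inference backends behind the load balancer.
BACKENDS = ["inference-01", "inference-02", "inference-03"]

def round_robin(backends):
    # Yield backends in a repeating cycle, mirroring HAProxy's
    # default roundrobin balance algorithm (weights/health ignored).
    yield from itertools.cycle(backends)

rr = round_robin(BACKENDS)
assigned = [next(rr) for _ in range(5)]
print(assigned)
# ['inference-01', 'inference-02', 'inference-03', 'inference-01', 'inference-02']
```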
Data Flow
Data ingested from various sources (e.g., city sensors, public datasets) is first processed by the Data Ingestion Cluster and then stored in a data lake on the Hadoop Distributed File System (HDFS). The Model Training Cluster uses this data to train machine learning models, which are then deployed to the Inference Cluster for real-time predictions. Predictions are consumed by applications and services across the city.
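The four stages above can be sketched end-to-end. The functions below are illustrative stand-ins for the real services (ingestion writes to HDFS, training reads from the data lake, and the Inference Cluster serves the trained model); none of the names come from the project codebase:

```python
def ingest(raw_records):
    # Data Ingestion Cluster: normalise incoming sensor records.
    return [r.strip().lower() for r in raw_records]

def store_in_data_lake(records):
    # Stand-in for an HDFS write; returns a fake dataset handle.
    return {"path": "/datalake/sensors", "records": records}

def train_model(dataset):
    # Model Training Cluster: a trivial stand-in "model" that
    # just remembers which record types it has seen.
    vocabulary = set(dataset["records"])
    return lambda x: x.lower() in vocabulary

def predict(model, value):
    # Inference Cluster: real-time prediction for city services.
    return model(value)

model = train_model(store_in_data_lake(ingest(["Air-Quality ", "Traffic"])))
print(predict(model, "traffic"))  # True
```

Each function boundary corresponds to a cluster boundary in the real system, which is what makes the pipeline scale independently at each stage.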
Future Considerations
We are currently evaluating the integration of Apache Spark for faster data processing and the adoption of GPU virtualization to improve resource utilization. Expansion of the cloud infrastructure is planned for Q4 2024 to accommodate growing data volumes and model complexity.