AI in Chelmsford: Server Configuration
This article details the server configuration supporting the “AI in Chelmsford” project, a local initiative focused on applying artificial intelligence to improve city services. This guide is intended for new system administrators and developers contributing to the project. It covers hardware, software, and networking aspects of the infrastructure.
Overview
The "AI in Chelmsford" project relies on a distributed server architecture to handle the computational demands of machine learning models and data processing. The servers are hosted in a secure data center within Chelmsford city limits. The core infrastructure consists of three primary server roles: Data Ingestion, Model Training, and Inference, each optimized for its specific task; Server Roles explains these in more detail. We use a combination of physical servers and virtual machines for flexibility and scalability. Security is paramount; see Security Protocols for comprehensive details.
Hardware Configuration
The hardware is divided among the three server roles. The following table outlines the specifications for each.
Server Role | CPU | RAM | Storage | Network Interface |
---|---|---|---|---|
Data Ingestion | Intel Xeon Gold 6248R (24 cores) | 64 GB DDR4 ECC | 2 x 4TB NVMe SSD (RAID 1) | 10 Gbps Ethernet |
Model Training | 2 x AMD EPYC 7763 (64 cores total) | 256 GB DDR4 ECC | 4 x 8TB SAS HDD (RAID 5) + 1TB NVMe SSD (OS) | 25 Gbps Ethernet |
Inference | Intel Xeon Silver 4210 (10 cores) | 32 GB DDR4 ECC | 1 x 2TB SATA SSD | 1 Gbps Ethernet |
These specifications are subject to change based on project needs and hardware availability. Hardware Upgrades details the process for requesting and implementing hardware upgrades. Considerations surrounding Power Consumption are also crucial for server room management.
Software Configuration
The operating system across all servers is Ubuntu Server 22.04 LTS. This provides a stable and well-supported platform for our applications. We utilize Docker containers for application deployment and isolation. Docker Configuration provides detailed instructions on setting up and managing containers. Key software packages include:
- Python 3.10
- TensorFlow 2.12
- PyTorch 2.0
- PostgreSQL 14
- Redis 6
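When provisioning a new host or container image, it is easy for installed package versions to drift from the pins above. A minimal sanity-check sketch using only the standard library (the pin list here is copied from the packages above; `importlib.metadata` reports the installed version of a pip-installed package, or raises `PackageNotFoundError` if it is absent):

```python
from importlib import metadata

# Version pins taken from the project's software stack (major.minor prefixes).
PINNED = {
    "tensorflow": "2.12",
    "torch": "2.0",
}

def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)} for each pin."""
    results = {}
    for pkg, want in pins.items():
        try:
            have = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            have = None
        results[pkg] = (have, have is not None and have.startswith(want))
    return results
```

Running `check_pins(PINNED)` on a fresh host quickly shows which packages are missing or at the wrong version before any workloads are scheduled onto it.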
The following table details the specific software stacks deployed on each server role.
Server Role | Primary Software | Supporting Software |
---|---|---|
Data Ingestion | Apache Kafka, Logstash | Fluentd, PostgreSQL |
Model Training | TensorFlow, PyTorch, Jupyter Notebook | CUDA Toolkit, cuDNN, Horovod |
Inference | TensorFlow Serving, TorchServe | Nginx, Prometheus |
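For developers calling the Inference role, TensorFlow Serving exposes its standard REST predict endpoint (by convention on port 8501, with gRPC on 8500). A hedged sketch that only builds the request body; the model name `traffic_model` and the host address are illustrative assumptions, not the project's actual deployment values, and nothing is sent here:

```python
import json

# Illustrative model name and endpoint; substitute the real deployment values.
MODEL = "traffic_model"
URL = f"http://192.168.30.10:8501/v1/models/{MODEL}:predict"

def build_predict_request(instances):
    """Serialize feature rows into TensorFlow Serving's REST predict payload."""
    return json.dumps({"instances": instances})

body = build_predict_request([[0.2, 0.7, 0.1]])
```

The resulting body can be POSTed to the `:predict` URL with any HTTP client; TensorFlow Serving responds with a JSON object containing a `predictions` array.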
Software Dependencies outlines the relationships between different software packages and version compatibility. Regular Software Updates are essential for maintaining security and stability.
Networking Configuration
The servers are connected via a dedicated Ethernet network within the data center; per-role link speeds are listed in the hardware table above. The network is segmented into three VLANs, one for each server role, to enhance security and isolate traffic. A separate 10 Gbps link connects the data center to the Chelmsford city network for data transfer and remote access.
VLAN ID | Server Role | Subnet | Gateway |
---|---|---|---|
10 | Data Ingestion | 192.168.10.0/24 | 192.168.10.1 |
20 | Model Training | 192.168.20.0/24 | 192.168.20.1 |
30 | Inference | 192.168.30.0/24 | 192.168.30.1 |
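The subnet plan above can also be checked programmatically, which is handy when debugging misplaced hosts. A small sketch using the standard-library `ipaddress` module (VLAN IDs, roles, and subnets are copied directly from the table):

```python
import ipaddress

# VLAN plan from the table above: VLAN ID -> (role, subnet).
VLANS = {
    10: ("Data Ingestion", ipaddress.ip_network("192.168.10.0/24")),
    20: ("Model Training", ipaddress.ip_network("192.168.20.0/24")),
    30: ("Inference", ipaddress.ip_network("192.168.30.0/24")),
}

def vlan_for(ip):
    """Return (vlan_id, role) for an address, or None if it is outside the plan."""
    addr = ipaddress.ip_address(ip)
    for vlan_id, (role, net) in VLANS.items():
        if addr in net:
            return vlan_id, role
    return None
```

For example, `vlan_for("192.168.20.15")` resolves to VLAN 20 (Model Training), while an address outside the three subnets returns `None`.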
Firewall rules are configured using `iptables` to restrict access to only necessary ports and services. Network Security provides a detailed overview of the network architecture and security measures. DNS Configuration explains how internal and external DNS resolution is handled.
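To keep the firewall auditable, per-role rules can be generated from the same subnet plan rather than maintained by hand. A hedged sketch that only renders `iptables` command strings; the allowed ports per role are illustrative assumptions (Kafka's default 9092, TensorFlow Serving's 8500/8501), not the project's actual policy, which lives in Network Security:

```python
# Illustrative allowed TCP ports per subnet; substitute the real policy.
ALLOWED_PORTS = {
    "192.168.10.0/24": [9092],        # Data Ingestion: Kafka (assumed)
    "192.168.30.0/24": [8500, 8501],  # Inference: TF Serving gRPC/REST (assumed)
}

def iptables_rules(policy):
    """Render one ACCEPT rule per (subnet, port), plus a default-DROP policy."""
    rules = [
        f"iptables -A INPUT -s {subnet} -p tcp --dport {port} -j ACCEPT"
        for subnet, ports in policy.items()
        for port in ports
    ]
    rules.append("iptables -P INPUT DROP")
    return rules
```

Generating the rules from one source of truth means a subnet change in the VLAN plan propagates to the firewall in a single place.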
Future Considerations
We are planning to migrate to a Kubernetes-based orchestration platform to further improve scalability and resilience. Kubernetes Implementation outlines the roadmap for this transition. The use of GPUs for inference is also being explored to reduce latency and improve performance; GPU Acceleration details potential options and challenges.
Monitoring Tools are deployed to track server performance and identify potential issues. Disaster Recovery Plan outlines procedures for handling server failures and data loss.
See Also
- Main Page
- Data Storage
- Backup Procedures
- Troubleshooting Guide
- Contact Information