AI in Sheffield: Server Configuration
This article details the server configuration powering the "AI in Sheffield" project, a local initiative focused on applying artificial intelligence to urban challenges. This document is intended for new system administrators and developers contributing to the project. It outlines the hardware, software, and networking components, offering a comprehensive overview of the infrastructure.
Overview
The "AI in Sheffield" project runs on a cluster of servers located in a secure data center within the University of Sheffield. These servers handle data ingestion, model training, inference, and serving a web-based user interface. The system is designed for scalability, reliability, and security. We use a hybrid cloud approach: most components are hosted on-premises, and cloud services absorb peak demand. The System Architecture page provides a high-level diagram of the entire system.
Hardware Configuration
The core of the AI infrastructure consists of four primary server types: data storage servers, compute servers, database servers, and web servers. Each server type has a specific role and configuration optimized for its tasks.
Server Type | Quantity | CPU | RAM | Storage / Accelerators | Network Interface |
---|---|---|---|---|---|
Data Storage Server | 2 | Intel Xeon Gold 6248R (24 cores) | 256 GB DDR4 ECC | 16 x 16TB SAS HDD (RAID 6) | 10 GbE |
Compute Server (GPU) | 4 | AMD EPYC 7763 (64 cores) | 512 GB DDR4 ECC | 2 x NVIDIA A100 (80GB) | 100 GbE |
Database Server | 2 | Intel Xeon Silver 4210 (10 cores) | 128 GB DDR4 ECC | 2 x 1TB NVMe SSD (RAID 1) | 1 GbE |
Web Server | 3 | Intel Core i7-10700K (8 cores) | 64 GB DDR4 | 1TB NVMe SSD | 1 GbE |
Detailed specifications for each server, including serial numbers and asset tags, are maintained in the Hardware Inventory. Power consumption is carefully monitored using a dedicated Power Distribution Unit (PDU).
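As a rough worked example, the figures in the hardware table can be turned into headline capacity numbers. The sketch below assumes that RAID 6 reserves two drives' worth of parity per array, that the two storage servers hold independent data rather than mirroring each other, and that filesystem overhead is ignored. The class and variable names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageServer:
    drives: int          # number of drives in the array
    drive_tb: int        # capacity per drive in TB (decimal, as marketed)
    parity_drives: int   # RAID 6 reserves two drives' worth of parity


def usable_tb(server: StorageServer) -> int:
    """Usable capacity of one RAID array, ignoring filesystem overhead."""
    return (server.drives - server.parity_drives) * server.drive_tb


# Figures taken from the hardware table.
storage = StorageServer(drives=16, drive_tb=16, parity_drives=2)
storage_servers = 2
gpu_servers, gpus_per_server, gpu_mem_gb = 4, 2, 80

per_server_tb = usable_tb(storage)
# Assumption: the two storage servers are not replicas of each other.
cluster_tb = per_server_tb * storage_servers
total_gpu_mem_gb = gpu_servers * gpus_per_server * gpu_mem_gb

print(f"Usable storage per server: {per_server_tb} TB")      # 224 TB
print(f"Usable storage, both servers: {cluster_tb} TB")      # 448 TB
print(f"Total GPU memory: {total_gpu_mem_gb} GB")            # 640 GB
```

If the storage servers replicate each other, the usable total is the per-server figure rather than the sum.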
Software Configuration
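The software stack is built on Ubuntu Server 22.04 LTS. Applications run in Docker containers, and Kubernetes orchestrates their deployment and scaling. Note that Kubernetes 1.24 removed the built-in dockershim, so nodes that use Docker Engine as the container runtime need the cri-dockerd adapter; images built with Docker also run unchanged on containerd.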
Component | Version | Purpose |
---|---|---|
Operating System | Ubuntu Server 22.04 LTS | Base OS for all servers |
Docker Engine | 20.10.21 | Containerization platform |
Kubernetes | 1.24.0 | Container orchestration |
PostgreSQL | 14.5 | Primary database for metadata and application data |
TensorFlow | 2.9.1 | Machine learning framework |
PyTorch | 1.12.1 | Machine learning framework |
Nginx | 1.21.6 | Web server and reverse proxy |
Exact package versions are tracked in the Software Bill of Materials (SBOM). All code is version controlled with Git and hosted on a private GitLab instance. Security updates are applied regularly through Ansible, which also provides automated configuration management.
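One way to catch version drift between a host and the pinned versions is a small comparison script. The sketch below is an illustration, not the project's actual tooling: the `PINNED` values are copied from the software table, while the `sample` strings are hypothetical command output invented for the example.

```python
import re

# Pinned versions copied from the software table.
PINNED = {
    "docker": "20.10.21",
    "kubernetes": "1.24.0",
    "postgresql": "14.5",
    "tensorflow": "2.9.1",
    "pytorch": "1.12.1",
    "nginx": "1.21.6",
}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted numeric version found in a string."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def find_drift(reported: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {component: (pinned, reported)} for every mismatch."""
    drift = {}
    for name, pinned in PINNED.items():
        found = reported.get(name)
        if found is None or parse_version(found) != parse_version(pinned):
            drift[name] = (pinned, found or "missing")
    return drift


# Hypothetical output collected from one host, for illustration only.
sample = {
    "docker": "Docker version 20.10.21, build 0000000",
    "kubernetes": "v1.24.0",
    "postgresql": "psql (PostgreSQL) 14.5",
    "tensorflow": "2.9.1",
    "pytorch": "1.12.1+cu116",
    "nginx": "nginx version: nginx/1.22.0",
}
print(find_drift(sample))
```

With this sample, only the Nginx entry is reported, because its version differs from the pinned 1.21.6. In practice the reported strings would come from whatever inventory mechanism the SBOM process already uses.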
Networking Configuration
The server cluster is connected to the University of Sheffield’s network via a dedicated VLAN. A firewall protects the cluster from external threats. Internally, a software-defined network (SDN) manages traffic flow between servers.
Parameter | Value |
---|---|
VLAN ID | 1001 |
Subnet Mask | 255.255.255.0 |
Gateway | 192.168.1.1 |
DNS Servers | 8.8.8.8, 8.8.4.4 |
Firewall | pfSense 2.5.2 |
SDN Controller | ONOS |
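The gateway and subnet mask in the table imply a single /24 network. A minimal sketch using Python's standard `ipaddress` module derives that network and checks whether an address belongs to it. The helper name `in_cluster_subnet` and the sample host address are illustrative assumptions.

```python
import ipaddress

# Gateway and mask from the networking table.
GATEWAY = "192.168.1.1"
NETMASK = "255.255.255.0"

# strict=False derives the network from a host address plus mask
# instead of rejecting the address for having host bits set.
network = ipaddress.ip_network(f"{GATEWAY}/{NETMASK}", strict=False)


def in_cluster_subnet(address: str) -> bool:
    """True if the address falls inside the cluster's subnet."""
    return ipaddress.ip_address(address) in network


usable_hosts = network.num_addresses - 2  # minus network and broadcast addresses

print(network)                             # 192.168.1.0/24
print(usable_hosts)                        # 254
print(in_cluster_subnet("192.168.1.42"))   # True (hypothetical host address)
print(in_cluster_subnet("8.8.8.8"))        # False: the DNS servers are external
```

The last check shows that the listed DNS servers sit outside the cluster subnet, so DNS queries must be allowed out through the gateway and firewall.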
Network diagrams and configuration details are available in the Network Documentation. Remote access to the servers uses SSH keys. All network traffic is logged to the ELK Stack (Elasticsearch, Logstash, Kibana) for auditing and security analysis. A load balancer distributes incoming traffic across the three web servers.
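The source does not state which balancing algorithm is used. Assuming plain round-robin, the simplest common choice, traffic distribution across the three web servers can be sketched as follows. The host names are hypothetical.

```python
import itertools
from collections import Counter

# Hypothetical host names; the hardware table only says there are three web servers.
WEB_SERVERS = ["web-1", "web-2", "web-3"]


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self) -> str:
        return next(self._cycle)


balancer = RoundRobinBalancer(WEB_SERVERS)
assignments = [balancer.next_backend() for _ in range(9)]
print(assignments)
print(Counter(assignments))  # each backend receives 3 of the 9 requests
```

Round-robin ignores backend load and health. A production balancer would normally add health checks and may weight backends or prefer the one with the fewest open connections.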
Security Considerations
Security is paramount. Access to the server cluster is restricted to authorized personnel, and multi-factor authentication is enforced for all access methods. Regular security audits are conducted to identify and address vulnerabilities. All data is encrypted both in transit and at rest. The Security Policy document describes the complete set of security measures in place.
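The source does not say which second factor is used. Assuming time-based one-time passwords (TOTP, RFC 6238), a widely used option, the verification step can be sketched with the standard library alone. This is an illustration of the algorithm, not the project's implementation. The function names and the one-step tolerance window are assumptions.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238) using HMAC-SHA1."""
    return hotp(secret, unix_time // step, digits)


def verify(secret: bytes, submitted: str, unix_time: int,
           step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current time step and `window` steps either side."""
    for drift in range(-window, window + 1):
        t = unix_time + drift * step
        if t < 0:
            continue  # no valid time step before the epoch
        if hmac.compare_digest(totp(secret, t, step), submitted):
            return True
    return False


# Test secret published in the RFCs; never use it for real accounts.
rfc_secret = b"12345678901234567890"
print(totp(rfc_secret, 59, digits=8))  # 94287082, the RFC 6238 SHA-1 test vector
```

The small tolerance window absorbs clock skew between the server and the user's device. `hmac.compare_digest` is used so that comparison time does not leak how many digits matched.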
Future Expansion
We anticipate expanding the cluster to accommodate growing data volumes and increasing computational demands. Planned changes include additional GPU servers and a distributed file system. Further details can be found in the Capacity Planning Document. We are also investigating federated learning as a way to make use of data held by other institutions.