AI in Glasgow


AI in Glasgow: Server Configuration

This article details the server configuration supporting the "AI in Glasgow" project, a research initiative applying artificial intelligence to urban challenges within the city of Glasgow. It is intended for new system administrators and developers joining the project and outlines the hardware, software, and network setup.

Overview

The "AI in Glasgow" project relies on a distributed server infrastructure to handle the demands of data ingestion, model training, and real-time inference. The system is designed for scalability and resilience, utilizing a combination of on-premise hardware and cloud resources. We utilize a hybrid approach to balance cost, security, and performance. This document will cover the core on-premise infrastructure. For details on the cloud component, please see the Cloud Integration Guide.

Hardware Configuration

The core on-premise infrastructure consists of three primary server types: Data Ingestion Servers, Training Servers, and Inference Servers. Each server type is built with specific hardware configurations optimized for its role.

Server Type | CPU | RAM | Storage | Network Interface
Data Ingestion Server | Intel Xeon Gold 6248R (24 cores) | 128GB DDR4 ECC | 8TB RAID 10 SSD | 10GbE
Training Server | 2 x AMD EPYC 7763 (64 cores each) | 512GB DDR4 ECC | 32TB RAID 6 NVMe SSD | 100GbE
Inference Server | Intel Xeon Silver 4210 (10 cores) | 64GB DDR4 ECC | 2TB NVMe SSD | 1GbE

These servers are housed in a dedicated rack within the University of Glasgow's Data Centre. Power and cooling are managed by the data centre's infrastructure. Detailed rack diagrams are available on the Data Centre Wiki. Regular hardware maintenance schedules are outlined in the Maintenance Procedures document.

Software Stack

The software stack is built around a Linux foundation, utilizing Ubuntu Server 22.04 LTS as the operating system. Key software components include:

  • Operating System: Ubuntu Server 22.04 LTS
  • Containerization: Docker and Kubernetes are used for application deployment and orchestration.
  • Data Storage: PostgreSQL is used for relational data, and MinIO provides object storage (a connectivity sketch follows this list).
  • Machine Learning Frameworks: TensorFlow, PyTorch, and scikit-learn are the primary frameworks used for model development and training.
  • Monitoring: Prometheus and Grafana are used for system monitoring and alerting.
  • Version Control: All code is managed using Git and hosted on GitLab.
  • Networking: NGINX is used as a reverse proxy and load balancer.
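
For new administrators, the short Python sketch below checks connectivity to the two data stores named in the list. It is a minimal illustration rather than project code: the hostnames, database name, credentials, and bucket handling are placeholders, and the real connection details live in the project's secrets management, not in this article.

  # Minimal connectivity check for the data layer (illustrative sketch).
  # Hostnames and credentials below are placeholders, not project values.
  import psycopg2          # PostgreSQL client
  from minio import Minio  # MinIO object storage client

  # Relational store: open a connection and run a trivial query.
  pg_conn = psycopg2.connect(
      host="postgres.example.internal",  # placeholder hostname
      dbname="ai_glasgow",               # placeholder database name
      user="readonly",
      password="change-me",
  )
  with pg_conn.cursor() as cur:
      cur.execute("SELECT version();")
      print("PostgreSQL:", cur.fetchone()[0])
  pg_conn.close()

  # Object store: list buckets to confirm the endpoint and credentials work.
  minio_client = Minio(
      "minio.example.internal:9000",     # placeholder endpoint
      access_key="change-me",
      secret_key="change-me",
      secure=True,                       # TLS, in line with encryption in transit
  )
  for bucket in minio_client.list_buckets():
      print("MinIO bucket:", bucket.name)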

Network Configuration

The server infrastructure is connected to the University of Glasgow network via a dedicated VLAN. Static IP addresses are assigned to each server. Firewall rules are configured using iptables to restrict access to necessary ports only.

Server Role | IP Address | Subnet Mask | Gateway
Data Ingestion Server 1 | 192.168.1.10 | 255.255.255.0 | 192.168.1.1
Data Ingestion Server 2 | 192.168.1.11 | 255.255.255.0 | 192.168.1.1
Training Server 1 | 192.168.1.20 | 255.255.255.0 | 192.168.1.1
Training Server 2 | 192.168.1.21 | 255.255.255.0 | 192.168.1.1
Inference Server 1 | 192.168.1.30 | 255.255.255.0 | 192.168.1.1
Inference Server 2 | 192.168.1.31 | 255.255.255.0 | 192.168.1.1
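
The iptables rules described above are intended to leave only required ports reachable on each address in the table. The Python sketch below is one way to spot-check this from another host on the VLAN; the port list (SSH, plus HTTPS fronted by NGINX) is an assumption for illustration, and the authoritative policy remains the iptables configuration on each server.

  # Spot-check which TCP ports answer on each server (illustrative sketch).
  # The port list is an assumption; the iptables rules are authoritative.
  import socket

  SERVERS = ["192.168.1.10", "192.168.1.11", "192.168.1.20",
             "192.168.1.21", "192.168.1.30", "192.168.1.31"]
  PORTS = [22, 443]  # SSH, and HTTPS fronted by NGINX

  for host in SERVERS:
      for port in PORTS:
          with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
              sock.settimeout(2.0)
              # connect_ex returns 0 when the TCP handshake succeeds.
              status = "open" if sock.connect_ex((host, port)) == 0 else "closed/filtered"
              print(f"{host}:{port} {status}")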

DNS resolution is handled by the University's internal DNS servers. Access to the servers from outside the University network is restricted and requires VPN access, as detailed in the Security Policy.

Security Considerations

Security is paramount. All servers are regularly patched with the latest security updates. Access to the servers is controlled via SSH keys and strong passwords. Data is encrypted both in transit and at rest. Regular security audits are conducted by the IT Security Team. Intrusion detection systems are in place to monitor for malicious activity. See the Incident Response Plan for details on handling security incidents.
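
The article does not specify the encryption mechanisms in use; disk-level encryption managed by the operating system is the usual choice for data at rest, with TLS for data in transit. Purely as an illustration of application-level encryption at rest, the sketch below round-trips a payload with the cryptography library's Fernet recipe. Key handling here is simplified; real key management is governed by the Security Policy.

  # Illustrative application-level encryption at rest using Fernet.
  # Key handling is simplified for the sketch; real keys are managed
  # according to the Security Policy and never stored alongside the data.
  from cryptography.fernet import Fernet

  key = Fernet.generate_key()        # in practice, load from a secrets store
  fernet = Fernet(key)

  plaintext = b"sensor reading: 42"
  token = fernet.encrypt(plaintext)  # ciphertext safe to write to disk
  restored = fernet.decrypt(token)

  assert restored == plaintext
  print("round trip ok, ciphertext length:", len(token))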

Future Expansion

The infrastructure is designed to be scalable. Future expansion plans include adding more Training Servers to handle increasing model complexity and data volumes. We are also exploring the use of GPU acceleration to further improve training performance. The Capacity Planning Document outlines the projected growth and resource requirements.

Component | Current Capacity | Projected Capacity (1 year)
Storage | 40TB | 80TB
Training Server CPU Cores | 128 CPU Cores | 256 CPU Cores
Inference Server CPU Cores | 20 CPU Cores | 40 CPU Cores
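
GPU acceleration is still being evaluated, as noted under Future Expansion. The PyTorch sketch below shows the device-selection pattern a training job would follow to use a GPU when one is present and fall back to CPU otherwise; the model and data are stand-ins, not project code.

  # Device-selection pattern for a training step (illustrative sketch).
  # The model and data are stand-ins; real training code is in GitLab.
  import torch
  import torch.nn as nn

  # Use a GPU when available, otherwise fall back to the CPU.
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

  model = nn.Linear(16, 1).to(device)  # toy model as a placeholder
  optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
  loss_fn = nn.MSELoss()

  # One training step on random data to show the device handling.
  inputs = torch.randn(32, 16, device=device)
  targets = torch.randn(32, 1, device=device)

  optimizer.zero_grad()
  loss = loss_fn(model(inputs), targets)
  loss.backward()
  optimizer.step()
  print(f"device={device}, loss={loss.item():.4f}")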

Related Documentation

  • Cloud Integration Guide
  • Data Centre Wiki
  • Maintenance Procedures
  • Capacity Planning Document
  • Security Policy
  • Incident Response Plan