AI in Edinburgh
AI in Edinburgh: Server Configuration
This article details the server configuration powering the "AI in Edinburgh" project, a research initiative focused on advancements in Artificial Intelligence within the Edinburgh academic community. It is intended as a guide for new system administrators and developers contributing to the project. This document provides an overview of the hardware, software, and networking aspects of the infrastructure.
Overview
The "AI in Edinburgh" project relies on a distributed server cluster designed for high-performance computing and large-scale data processing. The cluster is used primarily for training machine learning models, running simulations, and analyzing large datasets. The servers are housed in a dedicated data center within the University of Edinburgh and are interconnected via a high-speed network. A combination of bare-metal servers and virtual machines is used to maximize resource utilization and flexibility. Careful system administration is essential to keeping this environment stable.
Hardware Configuration
The cluster consists of three main types of servers: Master Nodes, Compute Nodes, and Storage Nodes.
Server Type | Quantity | CPU | Memory (RAM) | Storage | Network Interface |
---|---|---|---|---|---|
Master Nodes | 2 | 2 x Intel Xeon Gold 6248R (24 cores/48 threads) | 256 GB DDR4 ECC REG | 2 x 1 TB NVMe SSD (RAID 1) | 10 Gbps Ethernet |
Compute Nodes | 20 | 2 x AMD EPYC 7763 (64 cores/128 threads) | 512 GB DDR4 ECC REG | 4 x 4 TB SATA HDD (RAID 10) + 1 x 500 GB NVMe SSD (local scratch) | 100 Gbps InfiniBand |
Storage Nodes | 4 | 2 x Intel Xeon Silver 4210 (10 cores/20 threads) | 128 GB DDR4 ECC REG | 16 x 16 TB SATA HDD (RAID 6) | 40 Gbps Ethernet |
These specifications represent the standard configuration. Individual servers may vary slightly depending on specific research needs. A hardware inventory is maintained separately.
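To get a feel for the aggregate capacity implied by the table, the per-node figures can be multiplied out. The following is a minimal sketch in Python; the `NodeType` class and the helper names are illustrative and not part of any project tooling, and the numbers are simply copied from the standard configuration above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    quantity: int          # number of servers of this type
    sockets: int           # CPUs per server
    cores_per_socket: int  # physical cores per CPU
    ram_gb: int            # memory per server, in GB


# Values copied from the hardware table (standard configuration only).
NODE_TYPES = [
    NodeType("Master Nodes", 2, 2, 24, 256),
    NodeType("Compute Nodes", 20, 2, 64, 512),
    NodeType("Storage Nodes", 4, 2, 10, 128),
]


def total_cores(node: NodeType) -> int:
    """Physical cores across all servers of one type."""
    return node.quantity * node.sockets * node.cores_per_socket


def total_ram_gb(node: NodeType) -> int:
    """Memory across all servers of one type, in GB."""
    return node.quantity * node.ram_gb


if __name__ == "__main__":
    for node in NODE_TYPES:
        print(f"{node.name}: {total_cores(node)} cores, {total_ram_gb(node)} GB RAM")
    print(
        "Cluster:",
        sum(total_cores(n) for n in NODE_TYPES), "cores,",
        sum(total_ram_gb(n) for n in NODE_TYPES), "GB RAM",
    )
```

Under the standard configuration this works out to 2,560 physical cores and 10,240 GB of RAM on the compute nodes alone. Servers that deviate from the standard configuration would need their own entries.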
Software Configuration
The servers run a customized version of Ubuntu Server 22.04 LTS. The core software stack includes:
- Operating System: Ubuntu Server 22.04 LTS
- Containerization: Docker and Kubernetes for application deployment and management. The Kubernetes documentation is a vital resource.
- Programming Languages: Python 3.10, R 4.2.0, and C++ are the primary development languages.
- Machine Learning Frameworks: TensorFlow, PyTorch, and scikit-learn.
- Data Processing: Apache Spark and Hadoop for large-scale data processing. Big data analytics is a key area of research.
- Version Control: Git, hosted on a private GitLab instance.
- Monitoring: Prometheus and Grafana for system monitoring and alerting. System monitoring tools are essential for proactive maintenance.
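New contributors may want a quick way to confirm that a node or container image provides the Python side of this stack. The sketch below is one possible check, not an official project script. It assumes that Python 3.10 is treated as a minimum version, and it only tests whether the listed frameworks are importable, not which versions are installed.

```python
import importlib.util
import sys
from typing import Dict, List

# Assumption: the Python 3.10 named in the software list is a minimum version.
REQUIRED_PYTHON = (3, 10)

# Import names for the listed frameworks (scikit-learn is imported as "sklearn").
FRAMEWORK_MODULES = {
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
    "scikit-learn": "sklearn",
}


def python_ok(version_info=sys.version_info) -> bool:
    """True if the interpreter's major.minor version meets the minimum."""
    return tuple(version_info[:2]) >= REQUIRED_PYTHON


def missing_frameworks(modules: Dict[str, str] = FRAMEWORK_MODULES) -> List[str]:
    """Return the labels of frameworks whose top-level module cannot be found."""
    return [
        label
        for label, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing frameworks:", missing_frameworks() or "none")
```

`importlib.util.find_spec` locates a module without importing it, so the check stays fast even when the frameworks are large.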
Networking Configuration
The network infrastructure is designed for high bandwidth and low latency.
Network Component | Specification | Purpose |
---|---|---|
Core Switch | Arista 7050X Series | Provides high-speed switching between servers and the external network. |
InfiniBand Network | Mellanox ConnectX-6 adapters | Interconnects the compute nodes for high-performance communication. |
Ethernet Network | Cisco Catalyst 9300 Series | Provides connectivity for master and storage nodes, as well as external access. |
Firewall | pfSense | Protects the cluster from unauthorized access. Network Security protocols are strictly enforced. |
The network is segmented into three zones: a public zone for external access, a private zone for internal communication, and a storage zone for data access. Network topology diagrams are available on the project wiki. Remote access is secured through virtual private networks (VPNs).
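The three-zone layout can be illustrated with Python's standard `ipaddress` module. This is only a sketch: the article does not give the zones' address ranges, so the subnets below are placeholders chosen for the example, and `zone_of` is a hypothetical helper.

```python
import ipaddress
from typing import Optional

# Placeholder subnets: the zones are named in the article, the ranges are NOT.
ZONES = {
    "public": ipaddress.ip_network("203.0.113.0/24"),  # RFC 5737 documentation range
    "private": ipaddress.ip_network("10.10.0.0/16"),   # assumed internal range
    "storage": ipaddress.ip_network("10.20.0.0/16"),   # assumed storage range
}


def zone_of(address: str) -> Optional[str]:
    """Return the name of the zone containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in ZONES.items():
        if ip in network:
            return name
    return None


if __name__ == "__main__":
    for addr in ("10.10.3.7", "10.20.0.5", "203.0.113.9", "192.0.2.1"):
        print(addr, "->", zone_of(addr))
```

A lookup like this can be handy when auditing firewall rules or log entries against the intended segmentation, once the real subnets from the topology diagrams are substituted.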
Storage Configuration
Data storage is critical for the "AI in Edinburgh" project. We utilize a combination of local and network storage.
Storage Type | Capacity | Technology | Access Protocol |
---|---|---|---|
Network File System (NFS) | 256 TB | RAID 6 (SATA HDD) | NFSv4 |
Object Storage | 1 PB | Distributed object storage (Ceph) | S3-compatible API |
Local Scratch Space | 500 GB per compute node | NVMe SSD | Direct file system access |
Data is regularly backed up to offsite storage for disaster recovery. Backup procedures are documented on the internal wiki, and data redundancy strategies are in place to minimize data loss.
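The RAID levels named in the hardware and storage tables determine how much of the raw disk capacity is usable. The following sketch applies the standard capacity rules for RAID 1, RAID 10, and RAID 6 to the drive counts in the hardware table. The `usable_tb` function is illustrative, and the results are nominal figures that ignore filesystem overhead and hot spares.

```python
def usable_tb(level: str, drives: int, drive_tb: float) -> float:
    """Nominal usable capacity of one array in TB, before filesystem overhead."""
    if level == "RAID 1":
        # Every drive holds the same data: capacity of a single drive.
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_tb
    if level == "RAID 10":
        # Striped two-way mirrors: half of the raw capacity.
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives * drive_tb / 2
    if level == "RAID 6":
        # Two drives' worth of capacity is used for parity.
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) * drive_tb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    # Drive counts and sizes copied from the hardware table.
    print("Master node  :", usable_tb("RAID 1", 2, 1), "TB")
    print("Compute node :", usable_tb("RAID 10", 4, 4), "TB")
    print("Storage node :", usable_tb("RAID 6", 16, 16), "TB")
    print("All 4 storage nodes:", 4 * usable_tb("RAID 6", 16, 16), "TB")
```

With the table's figures, each storage node provides 224 TB of nominal usable space, or 896 TB across the four nodes. The article does not state how the 256 TB NFS export and the 1 PB object store map onto these arrays, so that mapping should be checked against the hardware inventory.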
Security Considerations
Security is a top priority for the "AI in Edinburgh" project. We employ a multi-layered security approach, including:
- Firewall protection
- Intrusion detection and prevention systems
- Regular security audits
- Strong password policies
- Two-factor authentication
- Data encryption
- Access control lists
Security Best Practices are regularly reviewed and updated. All personnel involved in the project are required to complete security training.
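The list above names two-factor authentication without saying which mechanism the project uses. As one possible illustration, assuming time-based one-time passwords (TOTP, RFC 6238, built on HOTP from RFC 4226), the sketch below shows how such codes are generated and checked using only the Python standard library. The function names and the one-step tolerance window are choices made for this example, not project policy.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low 4 bits of the last byte
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, candidate: str, at=None, step: int = 30, window: int = 1) -> bool:
    """Accept a code from the current step or up to `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, now + i * step, step), candidate)
        for i in range(-window, window + 1)
    )


if __name__ == "__main__":
    demo_secret = b"12345678901234567890"  # test secret from the RFCs, not a real key
    print("Current code:", totp(demo_secret))
```

In practice a maintained authentication service or library would normally be used instead of hand-written code. The sketch is meant only to show what the second factor computes.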
Server maintenance is scheduled weekly. A troubleshooting guide is available for common issues, and contact information for the system administrators is listed on the project homepage.