AI in Poole: Server Configuration
This document describes the server configuration for the "AI in Poole" project, covering hardware, software, and networking. It is intended as a guide for new administrators and developers working with the system. The project uses a distributed computing model to meet the processing requirements of large language models. Refer to the Main Page for a project overview.
Overview
The "AI in Poole" infrastructure consists of a cluster of servers located in a dedicated data center in Poole, UK. The primary function of these servers is to host and operate large language models and to provide API access to them for client applications. The cluster is designed for high availability, scalability, and performance. See System Architecture for a diagram of the overall system. The project relies heavily on Docker containers for environment isolation and reproducibility. Regular backups are performed as described in Backup Procedures.
Hardware Specifications
The server cluster comprises three main types of nodes: Master Nodes, Compute Nodes, and Storage Nodes. Each node type has specific hardware requirements.
Node Type | CPU | RAM | Storage | Network Interface |
---|---|---|---|---|
Master Node | 2x Intel Xeon Gold 6338 | 128 GB DDR4 ECC | 2x 1 TB NVMe SSD (RAID 1) | 10 Gbps Ethernet |
Compute Node | 2x AMD EPYC 7763 | 256 GB DDR4 ECC | 4x 4 TB NVMe SSD (RAID 0) | 100 Gbps InfiniBand |
Storage Node | 2x Intel Xeon Silver 4310 | 64 GB DDR4 ECC | 8x 16 TB SATA HDD (RAID 6) | 10 Gbps Ethernet |
These specifications are subject to change as the project evolves. Refer to Hardware Inventory for the most up-to-date listing of individual server details. Power consumption is monitored via Power Monitoring System.
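The RAID levels in the table determine how much of the raw disk space is actually usable. The following is a minimal sketch of that arithmetic, not a description of how the cluster is provisioned. The function and dictionary names are hypothetical, and the mapping of table rows to Master, Compute, and Storage nodes is assumed from the order in which the node types are listed above.

```python
def usable_capacity_tb(drive_count: int, drive_size_tb: float, raid_level: int) -> float:
    """Return the usable capacity in TB of a simple RAID 0, 1, or 6 array."""
    if raid_level == 0:
        # Striping: every drive contributes capacity, with no redundancy.
        return drive_count * drive_size_tb
    if raid_level == 1:
        # Mirroring: all drives hold the same data, so capacity is one drive.
        return drive_size_tb
    if raid_level == 6:
        # Dual parity: the equivalent of two drives is reserved for parity.
        if drive_count < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drive_count - 2) * drive_size_tb
    raise ValueError(f"unsupported RAID level: {raid_level}")


# (drive count, drive size in TB, RAID level), taken from the hardware table.
# The node-type labels are an assumption based on row order.
NODE_STORAGE = {
    "Master": (2, 1, 1),
    "Compute": (4, 4, 0),
    "Storage": (8, 16, 6),
}


def capacity_report() -> dict:
    """Map each node type to its usable capacity in TB."""
    return {node: usable_capacity_tb(*spec) for node, spec in NODE_STORAGE.items()}


if __name__ == "__main__":
    for node, tb in capacity_report().items():
        print(f"{node} node: {tb} TB usable")
```

Under these assumptions the RAID 0 array on the compute nodes exposes all 16 TB but tolerates no drive failure, while the RAID 6 array on the storage nodes gives up two drives' worth of space (96 TB usable of 128 TB raw) in exchange for surviving two simultaneous drive failures.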
Software Stack
The "AI in Poole" servers run a customized Linux distribution based on Ubuntu 22.04 LTS. Key software components include:
- Operating System: Ubuntu 22.04 LTS
- Containerization: Docker 24.0.5 and Docker Compose
- Orchestration: Kubernetes 1.27
- Programming Languages: Python 3.10, CUDA (for GPU acceleration)
- Machine Learning Frameworks: TensorFlow 2.12, PyTorch 2.0
- Database: PostgreSQL 15 with TimescaleDB extension for time-series data. See Database Schema.
- Monitoring: Prometheus and Grafana. Consult Monitoring Dashboard.
Software Component | Version | Purpose |
---|---|---|
Docker | 24.0.5 | Containerization platform |
Kubernetes | 1.27 | Container orchestration |
TensorFlow | 2.12 | Machine learning framework |
PyTorch | 2.0 | Machine learning framework |
PostgreSQL | 15 | Database management system |
Prometheus | 2.45 | Monitoring system |
All software is managed through automated configuration management using Ansible. See Ansible Playbooks for details.
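Pinned versions like those listed above are easy to check for drift. The sketch below is one possible approach and is not part of the project's Ansible tooling. The names `EXPECTED_VERSIONS`, `matches`, and `find_drift` are hypothetical, the installed versions in the usage example are made up, and version 2.45 is assumed to refer to Prometheus.

```python
from typing import Optional

# Versions documented for the cluster. Attributing 2.45 to Prometheus is an assumption.
EXPECTED_VERSIONS = {
    "docker": "24.0.5",
    "kubernetes": "1.27",
    "tensorflow": "2.12",
    "pytorch": "2.0",
    "postgresql": "15",
    "prometheus": "2.45",
}


def matches(expected: str, installed: str) -> bool:
    """True if the leading components of `installed` equal those of `expected`.

    Comparing component by component means "1.27" matches "1.27.4"
    but not "1.270.0" or "1.28.0".
    """
    want = expected.split(".")
    have = installed.split(".")
    return have[: len(want)] == want


def find_drift(installed: dict) -> dict:
    """Return {component: (expected, installed_or_None)} for every mismatch."""
    drift = {}
    for name, expected in EXPECTED_VERSIONS.items():
        actual: Optional[str] = installed.get(name)
        if actual is None or not matches(expected, actual):
            drift[name] = (expected, actual)
    return drift


if __name__ == "__main__":
    # Hypothetical inventory for one host; real values would come from the host itself.
    sample = {
        "docker": "24.0.5",
        "kubernetes": "1.27.4",
        "tensorflow": "2.13.0",
        "pytorch": "2.0.1",
        "postgresql": "15.3",
    }
    for name, (expected, actual) in find_drift(sample).items():
        print(f"{name}: expected {expected}, found {actual}")
```

Matching on leading version components rather than exact strings lets patch releases pass while still flagging a minor-version change or a missing component.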
Networking Configuration
The server cluster is connected to the internet via a redundant 10 Gbps fiber connection. Internal communication between nodes is primarily handled through a dedicated 100 Gbps InfiniBand network for low-latency, high-bandwidth data transfer. A separate 10 Gbps Ethernet network is used for storage access and management.
Interface | IP Address Range | Purpose |
---|---|---|
10 Gbps fiber uplink | 192.168.1.0/24 | Internet connectivity |
100 Gbps InfiniBand | 10.0.0.0/8 | Inter-node communication (Compute Nodes) |
10 Gbps Ethernet | 172.16.0.0/16 | Storage access |
DNS resolution is handled by an internal BIND server. Firewall rules are managed using `iptables`. Further network details can be found in Network Diagram and Firewall Rules. Secure Shell (SSH) access is restricted to authorized personnel via key-based authentication. Please review Security Policies.
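When reading logs or firewall rules, it helps to know which of the three address ranges a given IP belongs to. The following minimal sketch uses Python's standard `ipaddress` module for that lookup. The `NETWORKS` mapping and the `classify` function are hypothetical names, and the sample addresses are invented for illustration.

```python
import ipaddress
from typing import Optional

# Address ranges and purposes as listed in the networking table.
NETWORKS = {
    "Internet connectivity": ipaddress.ip_network("192.168.1.0/24"),
    "Inter-node communication": ipaddress.ip_network("10.0.0.0/8"),
    "Storage access": ipaddress.ip_network("172.16.0.0/16"),
}


def classify(address: str) -> Optional[str]:
    """Return the purpose of the network containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    for purpose, network in NETWORKS.items():
        if ip in network:
            return purpose
    return None


if __name__ == "__main__":
    # Made-up addresses, one per range plus one outside all of them.
    for addr in ("10.12.0.7", "172.16.4.20", "192.168.1.15", "8.8.8.8"):
        print(addr, "->", classify(addr))
```

Because the three ranges do not overlap, the order of the lookup does not matter. An address outside all of them returns `None`.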
Security Considerations
Security is paramount. All servers are hardened according to CIS benchmarks. Regular security audits are conducted. Intrusion detection systems (IDS) are in place to monitor for malicious activity. Data encryption is used both in transit and at rest. Access control is strictly enforced using role-based access control (RBAC). See Security Documentation for comprehensive details.
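As a rough illustration of the role-based access control mentioned above, the sketch below maps roles to permissions and checks a request against them. It is not the project's actual policy: the role names, the permission names, and the `is_allowed` helper are all hypothetical, and the real definitions are in Security Documentation.

```python
# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "administrator": {"read_metrics", "deploy_model", "manage_users"},
    "developer": {"read_metrics", "deploy_model"},
    "viewer": {"read_metrics"},
}


def is_allowed(roles, permission: str) -> bool:
    """True if any of the user's roles grants `permission`.

    Unknown roles grant nothing, so access is denied by default.
    """
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


if __name__ == "__main__":
    print(is_allowed(["developer"], "deploy_model"))
    print(is_allowed(["viewer"], "deploy_model"))
```

The key property of this design is that permissions are attached to roles rather than to individual users, so changing what a role may do updates every user holding it.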
Related pages: Main Page, System Architecture, Hardware Inventory, Software Versions, Database Schema, Monitoring Dashboard, Backup Procedures, Ansible Playbooks, Network Diagram, Firewall Rules, Security Policies, Security Documentation, Troubleshooting Guide, API Documentation, Deployment Procedures, Contact Information