AI in Bristol: Server Configuration
This article details the server configuration supporting the "AI in Bristol" project, a collaborative initiative focused on advancing Artificial Intelligence research within the Bristol area. It is intended for newcomers to our MediaWiki site and provides a technical overview of the infrastructure, covering hardware specifications, the software stack, networking considerations, and ongoing maintenance procedures. Understanding these components is essential for contributing to the project, troubleshooting issues, and proposing improvements.
Overview
The "AI in Bristol" project relies on a distributed server infrastructure to handle the computational demands of machine learning model training, data analysis, and deployment of AI services. The system is designed for scalability, reliability, and efficient resource utilization. We use a hybrid cloud approach, combining on-premise hardware with cloud-based resources from AWS; this lets us balance cost, performance, and control. See System Architecture for a high-level diagram.
Hardware Specifications
Our core on-premise infrastructure consists of several dedicated servers. Details are provided in the table below:
Server Name | CPU | RAM | Storage | GPU | Network Interface |
---|---|---|---|---|---|
ai-bristol-01 | 2 x Intel Xeon Gold 6248R (48 cores total) | 256 GB DDR4 ECC | 4 x 4TB NVMe SSD (RAID 10) | 2 x NVIDIA A100 (80GB) | 100 Gbps Ethernet |
ai-bristol-02 | 2 x AMD EPYC 7763 (128 cores total) | 512 GB DDR4 ECC | 8 x 8TB SATA SSD (RAID 6) | 4 x NVIDIA RTX 3090 (24GB) | 100 Gbps Ethernet |
ai-bristol-03 | 1 x Intel Xeon Platinum 8280 (28 cores) | 128 GB DDR4 ECC | 2 x 2TB NVMe SSD (RAID 1) | 1 x NVIDIA Tesla V100 (32GB) | 10 Gbps Ethernet |
These servers are housed in a secure data center with redundant power and cooling. See Data Center Details for more information. We also utilize cloud instances for burst capacity and specialized tasks. Cloud Resource Management details our AWS configuration.
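For capacity planning it can be useful to tally the table programmatically. The figures below are transcribed from the hardware table above; this is a minimal illustrative sketch, not an inventory tool we run in production:

```python
# RAM and GPU capacity transcribed from the hardware table above.
servers = {
    "ai-bristol-01": {"ram_gb": 256, "gpus": 2, "gpu_mem_gb": 80},
    "ai-bristol-02": {"ram_gb": 512, "gpus": 4, "gpu_mem_gb": 24},
    "ai-bristol-03": {"ram_gb": 128, "gpus": 1, "gpu_mem_gb": 32},
}

total_ram = sum(s["ram_gb"] for s in servers.values())
total_gpu_mem = sum(s["gpus"] * s["gpu_mem_gb"] for s in servers.values())
print(total_ram, total_gpu_mem)  # 896 GB RAM, 288 GB GPU memory
```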
Software Stack
The software environment is built around a Linux base, specifically Ubuntu Server 22.04. We utilize containerization technology using Docker and orchestration with Kubernetes to ensure portability and scalability. Key software components include:
- Operating System: Ubuntu Server 22.04
- Containerization: Docker 24.0
- Orchestration: Kubernetes 1.27
- Programming Languages: Python 3.9, R 4.2.0
- Machine Learning Frameworks: TensorFlow 2.12, PyTorch 2.0, scikit-learn 1.2
- Data Storage: PostgreSQL 14, Object Storage (MinIO)
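Because the stack pins specific interpreter versions, deployment scripts can assert them at start-up rather than failing later with an obscure import error. A minimal sketch, assuming the Python 3.9 pin from the list above (the helper name is illustrative):

```python
import sys

REQUIRED_PYTHON = (3, 9)  # mirrors the version pinned in the stack list

def check_python_version(required=REQUIRED_PYTHON):
    """Return True if the running interpreter meets the pinned version."""
    return sys.version_info[:2] >= required

if not check_python_version():
    sys.exit(f"Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}+ required, "
             f"found {sys.version_info.major}.{sys.version_info.minor}")
```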
Networking Configuration
The server network is segmented into several VLANs to isolate different types of traffic and enhance security. A summary is shown below:
VLAN ID | Purpose | Subnet | Gateway |
---|---|---|---|
10 | Management Network | 192.168.10.0/24 | 192.168.10.1 |
20 | Data Transfer Network | 10.0.0.0/16 | 10.0.20.1 |
30 | Machine Learning Compute | 172.16.30.0/24 | 172.16.30.1 |
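When debugging connectivity, it helps to map an address back to its VLAN. A stdlib sketch using the subnets from the table above (`strict=False` normalises a network written with host bits set, e.g. 10.0.20.0/16, to its canonical form 10.0.0.0/16):

```python
import ipaddress

# Subnets transcribed from the VLAN table above.
VLANS = {
    10: ("Management Network", ipaddress.ip_network("192.168.10.0/24")),
    20: ("Data Transfer Network", ipaddress.ip_network("10.0.20.0/16", strict=False)),
    30: ("Machine Learning Compute", ipaddress.ip_network("172.16.30.0/24")),
}

def vlan_for(address):
    """Return (vlan_id, purpose) for an IP address, or None if unmatched."""
    ip = ipaddress.ip_address(address)
    for vlan_id, (purpose, net) in VLANS.items():
        if ip in net:
            return vlan_id, purpose
    return None

print(vlan_for("172.16.30.42"))  # (30, 'Machine Learning Compute')
```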
All communication between servers is encrypted using TLS 1.3. Firewall rules are managed using iptables and regularly audited for security vulnerabilities. See Network Security Policies for detailed configurations.
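A client that must refuse anything below TLS 1.3 can enforce this with the standard library. This is a minimal sketch of the idea, not our production configuration:

```python
import ssl

def make_tls13_context():
    """Create a client SSL context that refuses anything below TLS 1.3."""
    ctx = ssl.create_default_context()  # certificate + hostname checks on
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx

ctx = make_tls13_context()
```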
Monitoring and Maintenance
System monitoring is performed using Prometheus and Grafana, providing real-time insights into server performance and resource utilization. Alerts are configured to notify administrators of critical issues. Regular maintenance tasks include:
- Software Updates: Weekly security updates applied using apt.
- Backup and Recovery: Daily backups of critical data stored in Object Storage (MinIO).
- Log Analysis: Centralized log analysis using Elasticsearch, Logstash, and Kibana (ELK stack).
- Performance Tuning: Regular performance testing and optimization of machine learning models.
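Daily backups accumulate quickly, so stale snapshots are pruned from object storage on a retention schedule. A simplified stdlib sketch of a keep-last-N-days rule (the 7-day window and function name are illustrative, not our actual policy):

```python
from datetime import date, timedelta

def backups_to_delete(backup_dates, keep_days=7, today=None):
    """Return backup dates older than the retention window, oldest first."""
    today = today or date.today()
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in backup_dates if d < cutoff)

dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(10)]
stale = backups_to_delete(dates, keep_days=7, today=date(2024, 1, 10))
# stale contains the backups from before the 2024-01-03 cutoff
```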
Detailed documentation on maintenance procedures can be found at Maintenance Procedures.
Cloud Integration Details
We utilize AWS for specific tasks, primarily model deployment and handling fluctuating workloads. The following table outlines the current AWS resources:
Resource Type | Instance Type | Region | Purpose |
---|---|---|---|
EC2 Instance | g5.2xlarge | eu-west-1 | Model Serving |
S3 Bucket | Standard | eu-west-1 | Data Storage |
Lambda Function | Python 3.9 | eu-west-1 | API Gateway Integration |
Access to AWS resources is managed through IAM roles and adheres to the principle of least privilege. See AWS Configuration Guide for more details.
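Least privilege in practice means scoping each role to the single resource and actions it needs. The sketch below shows the shape of such a policy document; the bucket name `ai-bristol-data` is a hypothetical placeholder, not one of our real resources:

```python
import json

# Hypothetical least-privilege policy: read-only access to a single bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::ai-bristol-data",    # bucket itself (ListBucket)
                "arn:aws:s3:::ai-bristol-data/*",  # objects in it (GetObject)
            ],
        }
    ],
}
print(json.dumps(policy, indent=2))
```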
Future Expansion
Future plans include expanding the on-premise GPU cluster with additional NVIDIA H100 GPUs and integrating a dedicated high-performance storage system utilizing NVMe over Fabrics. We also intend to explore federated learning techniques to leverage distributed datasets without compromising data privacy. Roadmap details the long-term vision for the "AI in Bristol" project.