AI in Guildford

From Server rental store
AI in Guildford: Server Configuration

This article details the server configuration powering the "AI in Guildford" project, a local initiative exploring applications of Artificial Intelligence within the borough. This guide is intended for newcomers to the MediaWiki platform and provides a technical overview of the hardware and software implemented. Understanding this setup is crucial for anyone contributing to the project’s development or troubleshooting issues.

Overview

The “AI in Guildford” project runs on a clustered server environment hosted in a dedicated data center in Guildford. The architecture is designed for scalability, high availability, and efficient processing of large datasets. Its primary goal is to provide a platform for machine learning model training, inference, and data storage. The project takes a hybrid cloud approach, using on-premise hardware alongside cloud services for specific tasks, which allows for cost optimization and flexibility. We frequently use search to locate relevant server logs.

Hardware Configuration

The core of the infrastructure consists of three primary server nodes, each dedicated to a specific role: data ingestion & preprocessing, model training, and model serving.

Server Role | Hardware Specifications | Operating System | Network Interface
Data ingestion & preprocessing | CPU: 2 x Intel Xeon Gold 6248R; RAM: 256GB DDR4 ECC; Storage: 4 x 8TB SAS 12Gbps 7.2K RPM HDD (RAID 10) + 2 x 1TB NVMe SSD | Ubuntu Server 22.04 LTS | 10GbE
Model training | CPU: 2 x AMD EPYC 7763; RAM: 512GB DDR4 ECC; Storage: 8 x 4TB SAS 12Gbps 7.2K RPM HDD (RAID 0) + 4 x 2TB NVMe SSD; GPU: 4 x NVIDIA A100 (40GB) | CentOS Stream 9 | 100GbE
Model serving | CPU: 2 x Intel Xeon Silver 4310; RAM: 128GB DDR4 ECC; Storage: 2 x 2TB NVMe SSD | Debian 11 | 1GbE

These servers are connected via a dedicated internal network that uses a spine-leaf architecture to minimize latency. Power distribution units (PDUs) and uninterruptible power supplies (UPS) ensure reliable power delivery, and the data center's cooling system keeps the hardware within its operating temperature range. The list of blocked IP addresses is reviewed regularly as a security measure.
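
The RAID levels listed for the HDD arrays determine how much of the raw capacity is actually usable. The following is a minimal sketch of that arithmetic for the two arrays in the hardware table (4 x 8TB in RAID 10 and 8 x 4TB in RAID 0). The function name `usable_capacity_tb` is an illustrative assumption, and the figures ignore filesystem and controller overhead.

```python
def usable_capacity_tb(disks: int, disk_tb: float, raid_level: int) -> float:
    """Return the usable capacity in TB for RAID 0 or RAID 10."""
    if raid_level == 0:
        # Striping only: all raw capacity is usable, but there is no redundancy.
        return disks * disk_tb
    if raid_level == 10:
        # Striped mirrors: every disk has a mirror, so half the raw capacity is usable.
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks * disk_tb / 2
    raise ValueError(f"unsupported RAID level: {raid_level}")


raid10_tb = usable_capacity_tb(disks=4, disk_tb=8, raid_level=10)
raid0_tb = usable_capacity_tb(disks=8, disk_tb=4, raid_level=0)
print(f"4 x 8TB RAID 10: {raid10_tb:.0f} TB usable")
print(f"8 x 4TB RAID 0:  {raid0_tb:.0f} TB usable")
```

Note that the RAID 0 array offers no protection against a disk failure, so it is best suited to data that can be regenerated or re-copied from another node.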

Software Stack

The software stack is designed for modularity and ease of maintenance. We utilize containerization technologies to ensure consistent deployments across different environments. A central configuration management system (Ansible) automates server provisioning and configuration.

Component | Version | Purpose | Notes
Operating system | See Hardware Configuration table | Base OS for all servers | Customized kernel parameters for performance.
Docker | 24.0.5 | Application packaging and deployment | Used in conjunction with Docker Compose.
Kubernetes | 1.27 | Container orchestration and scaling | Managed by a dedicated cluster administrator.
Python, R | 3.9 (Python), 4.2.1 (R) | Data science and machine learning | Used for data preprocessing, model training and inference.
TensorFlow, PyTorch | 2.12 (TensorFlow), 2.0 (PyTorch) | Model development and training | Framework selection depends on the specific task.
PostgreSQL | 15 | Data storage and management | Regular database reports are generated.

Data is ingested using a combination of custom Python scripts and Apache Kafka. Model training uses distributed training techniques across multiple GPUs. Model serving is handled by a dedicated inference server that uses gRPC for efficient communication. Site statistics are used to monitor performance.
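
The article does not say which distributed training technique is used. Assuming data parallelism, one common arrangement is to give each GPU a disjoint slice of the dataset. The sketch below shows round-robin sharding of sample indices in plain Python; `shard_indices` and `WORLD_SIZE` are hypothetical names, and real frameworks typically add shuffling and padding on top of this idea.

```python
def shard_indices(num_samples: int, world_size: int, rank: int) -> list[int]:
    """Return the sample indices assigned to one worker (round-robin)."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    # Worker `rank` takes samples rank, rank + world_size, rank + 2 * world_size, ...
    return list(range(rank, num_samples, world_size))


WORLD_SIZE = 4  # assumption: one worker per A100 GPU in the hardware table

for rank in range(WORLD_SIZE):
    print(f"GPU {rank}: {shard_indices(10, WORLD_SIZE, rank)}")
```

Every sample lands on exactly one worker, so the shards together cover the full dataset without overlap.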

Networking and Security

The network is segmented into different zones for enhanced security. A firewall protects the servers from external threats. Access control is managed using Role-Based Access Control (RBAC). All network traffic is encrypted using TLS/SSL.

Network Zone | Description | Access Control
(not specified) | Internet-facing services (e.g. web interface) | Restricted to specific IP addresses and ports.
(not specified) | Hosting services requiring limited external access | Firewall rules and intrusion detection system.
(not specified) | Core server infrastructure | Strict access control based on RBAC.
(not specified) | Server administration and monitoring tools | Multi-factor authentication required.
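
Restricting a service to specific IP addresses and ports amounts to an allow-list check. The sketch below illustrates the idea with Python's standard `ipaddress` module. The networks and ports in `ALLOWED` are placeholders taken from reserved documentation ranges, not the project's actual rules, and in practice this policy would normally be enforced by the firewall rather than in application code.

```python
import ipaddress

# Hypothetical allow-list of (network, port) pairs.
# The networks are reserved documentation ranges, not real project addresses.
ALLOWED = [
    (ipaddress.ip_network("192.0.2.0/24"), 443),
    (ipaddress.ip_network("203.0.113.16/28"), 22),
]


def is_allowed(source_ip: str, port: int) -> bool:
    """Return True only if the source address and port match an allow-list entry."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network and port == allowed_port
               for network, allowed_port in ALLOWED)


print(is_allowed("192.0.2.10", 443))  # address and port both match an entry
print(is_allowed("192.0.2.10", 22))   # port is not allowed for this network
```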

Regular security audits are conducted to identify and address vulnerabilities, and recent changes to the system configuration are closely monitored. We also run a dedicated intrusion detection system (IDS) and intrusion prevention system (IPS). The project adheres to the principle of least privilege. Reviewing the user rights logs helps maintain security.
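
To illustrate how RBAC and least privilege fit together, here is a minimal deny-by-default sketch. The role names and permissions in `ROLE_PERMISSIONS` are hypothetical and do not reflect the project's actual role definitions.

```python
# Hypothetical role-to-permission mapping; each role gets only what it needs.
ROLE_PERMISSIONS = {
    "viewer": {"read_metrics"},
    "data_scientist": {"read_metrics", "submit_training_job"},
    "admin": {"read_metrics", "submit_training_job", "change_config"},
}


def is_permitted(roles: set[str], permission: str) -> bool:
    """Deny by default: allow only if an assigned role grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_permitted({"data_scientist"}, "submit_training_job"))
print(is_permitted({"data_scientist"}, "change_config"))
```

Unknown roles grant nothing, so a typo or a stale role assignment fails closed rather than open.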


Future Considerations

Future upgrades include expanding the GPU cluster for increased training capacity, implementing a more robust monitoring system, and integrating with additional cloud services. We are also exploring the use of federated learning techniques to enable collaborative model training without sharing sensitive data. This documentation is subject to change as the project evolves. Be sure to check all pages for updates.
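
As a rough illustration of the federated learning idea, the sketch below implements weighted federated averaging in plain Python: each participant trains locally and shares only its model weights and sample count, never the raw data. This is one well-known aggregation scheme and not necessarily the one the project will adopt; `federated_average` and the example values are illustrative assumptions.

```python
def federated_average(client_updates: list[tuple[list[float], int]]) -> list[float]:
    """Average client weight vectors, weighted by each client's sample count."""
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("need at least one client with a positive sample count")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            # Clients with more local data contribute proportionally more.
            averaged[i] += w * n / total_samples
    return averaged


# Two hypothetical clients: (locally trained weights, number of local samples).
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
print(federated_average(updates))
```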


Intel-Based Server Configurations

Configuration | Specifications | Benchmark
Core i7-6700K/7700 Server | 64 GB DDR4, NVMe SSD 2 x 512 GB | CPU Benchmark: 8046
Core i7-8700 Server | 64 GB DDR4, NVMe SSD 2x1 TB | CPU Benchmark: 13124
Core i9-9900K Server | 128 GB DDR4, NVMe SSD 2 x 1 TB | CPU Benchmark: 49969
Core i9-13900 Server (64GB) | 64 GB RAM, 2x2 TB NVMe SSD |
Core i9-13900 Server (128GB) | 128 GB RAM, 2x2 TB NVMe SSD |
Core i5-13500 Server (64GB) | 64 GB RAM, 2x500 GB NVMe SSD |
Core i5-13500 Server (128GB) | 128 GB RAM, 2x500 GB NVMe SSD |
Core i5-13500 Workstation | 64 GB DDR5 RAM, 2 NVMe SSD, NVIDIA RTX 4000 |

AMD-Based Server Configurations

Configuration | Specifications | Benchmark
Ryzen 5 3600 Server | 64 GB RAM, 2x480 GB NVMe | CPU Benchmark: 17849
Ryzen 7 7700 Server | 64 GB DDR5 RAM, 2x1 TB NVMe | CPU Benchmark: 35224
Ryzen 9 5950X Server | 128 GB RAM, 2x4 TB NVMe | CPU Benchmark: 46045
Ryzen 9 7950X Server | 128 GB DDR5 ECC, 2x2 TB NVMe | CPU Benchmark: 63561
EPYC 7502P Server (128GB/1TB) | 128 GB RAM, 1 TB NVMe | CPU Benchmark: 48021
EPYC 7502P Server (128GB/2TB) | 128 GB RAM, 2 TB NVMe | CPU Benchmark: 48021
EPYC 7502P Server (128GB/4TB) | 128 GB RAM, 2x2 TB NVMe | CPU Benchmark: 48021
EPYC 7502P Server (256GB/1TB) | 256 GB RAM, 1 TB NVMe | CPU Benchmark: 48021
EPYC 7502P Server (256GB/4TB) | 256 GB RAM, 2x2 TB NVMe | CPU Benchmark: 48021
EPYC 9454P Server | 256 GB RAM, 2x2 TB NVMe |


Note: All benchmark scores are approximate and may vary based on configuration. Server availability subject to stock.