Latest revision as of 07:06, 16 April 2025
AI in Monaco: Server Configuration
This document describes the server configuration behind the "AI in Monaco" project. It is intended for new engineers onboarding to the infrastructure team and covers both the hardware and the software components. The project runs on a distributed system designed for high-throughput, low-latency inference of large language models. See also System Architecture Overview for broader context.
Overview
The "AI in Monaco" project runs a suite of AI models on a cluster of dedicated servers. The models perform real-time data analysis and prediction, which requires significant computational resources. The servers run Ubuntu 22.04 LTS, and workloads are deployed as containers using Docker and Kubernetes. Efficient resource management is crucial, as detailed in the Resource Allocation Policy. The primary goal of this configuration is a scalable and reliable platform for AI model deployment and execution. Understanding the networking setup is also vital; refer to Network Topology.
Hardware Specifications
Every node in the server cluster is built to the specification below.
| Component | Specification | 
|---|---|
| CPU | Dual Intel Xeon Gold 6338 (32 Cores / 64 Threads per CPU) | 
| RAM | 512GB DDR4 ECC Registered 3200MHz | 
| Storage (OS) | 1TB NVMe SSD (PCIe Gen4) | 
| Storage (Models) | 8 x 8TB SAS HDD (RAID 6) | 
| Network Interface | Dual 100GbE QSFP28 | 
| GPU | 4 x NVIDIA A100 80GB | 
These specifications were chosen to balance compute power, memory capacity, and storage throughput, as discussed in the Hardware Selection Rationale. Regular hardware monitoring is performed using Nagios.
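As a quick sanity check on the table above, the per-node totals can be derived with a few lines of Python. This is only an illustrative sketch: the `NodeSpec` class and the function names are hypothetical helpers, not part of any project tooling, and the storage figure is raw RAID 6 capacity (two disks' worth of parity subtracted) before any filesystem overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures copied from the hardware specification table."""
    cpu_sockets: int = 2
    cores_per_cpu: int = 32
    threads_per_cpu: int = 64
    ram_gb: int = 512
    gpus: int = 4
    gpu_memory_gb: int = 80
    model_disks: int = 8
    model_disk_tb: int = 8


def raid6_usable_tb(disks: int, disk_tb: int) -> int:
    """RAID 6 spends two disks' worth of capacity on parity."""
    if disks < 4:
        raise ValueError("RAID 6 needs at least four disks")
    return (disks - 2) * disk_tb


def node_totals(spec: NodeSpec) -> dict:
    """Aggregate the per-component figures into per-node totals."""
    return {
        "cores": spec.cpu_sockets * spec.cores_per_cpu,
        "threads": spec.cpu_sockets * spec.threads_per_cpu,
        "ram_gb": spec.ram_gb,
        "gpu_memory_gb": spec.gpus * spec.gpu_memory_gb,
        "model_storage_usable_tb": raid6_usable_tb(
            spec.model_disks, spec.model_disk_tb
        ),
    }


if __name__ == "__main__":
    for name, value in node_totals(NodeSpec()).items():
        print(f"{name}: {value}")
```

With the values from the table, each node offers 64 cores (128 threads), 320 GB of GPU memory, and 48 TB of usable model storage.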
Software Stack
The software stack is built around a containerized environment, allowing for portability and scalability.
| Software | Version | Purpose | 
|---|---|---|
| Operating System | Ubuntu 22.04 LTS | Base OS for all servers | 
| Docker | 24.0.7 | Containerization platform | 
| Kubernetes | 1.27.4 | Container orchestration | 
| NVIDIA Driver | 535.104.05 | GPU driver for CUDA and TensorRT | 
| CUDA Toolkit | 12.2 | NVIDIA's parallel computing platform | 
| TensorRT | 8.6.1 | NVIDIA's inference optimizer and runtime | 
| Prometheus | 2.46.0 | Monitoring and alerting | 
The specific versions were selected for compatibility and performance, as documented in the Software Version Control. Regular software updates are managed via Ansible.
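Because the stack depends on specific pinned versions, it can be useful to detect drift between what a node reports and what the table specifies. The sketch below assumes the reported versions have already been collected into a dictionary (for example by an Ansible fact-gathering step); the component keys and the `find_drift` helper are hypothetical, and only plain dotted numeric version strings are handled.

```python
# Versions copied from the software stack table.
PINNED = {
    "docker": "24.0.7",
    "kubernetes": "1.27.4",
    "nvidia-driver": "535.104.05",
    "cuda": "12.2",
    "tensorrt": "8.6.1",
    "prometheus": "2.46.0",
}


def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '1.27.4' into (1, 27, 4)."""
    return tuple(int(part) for part in version.split("."))


def find_drift(reported: dict) -> dict:
    """Return {component: (expected, actual)} for every mismatch.

    A component missing from `reported` is listed with actual=None.
    """
    drift = {}
    for component, expected in PINNED.items():
        actual = reported.get(component)
        if actual is None or parse_version(actual) != parse_version(expected):
            drift[component] = (expected, actual)
    return drift


if __name__ == "__main__":
    # Hypothetical report from one node: Kubernetes was upgraded out of band.
    node_report = dict(PINNED, kubernetes="1.28.0")
    print(find_drift(node_report))
```

An empty result means the node matches the pinned stack; any entry points at a component to review before the next Ansible run.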
Networking Configuration
The network infrastructure is designed for high bandwidth and low latency communication between servers.
| Parameter | Value | 
|---|---|
| Network Topology | Clos network | 
| Inter-Node Bandwidth | 100GbE | 
| Load Balancer | HAProxy | 
| DNS | Bind9 | 
| Firewall | iptables | 
| Internal Network | 10.0.0.0/16 | 
Detailed network diagrams are available at Network Diagrams. Security considerations regarding network access are outlined in the Security Policy. Network issues can be investigated with tcpdump.
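The internal range from the table can be reasoned about with Python's standard `ipaddress` module. The following is a minimal sketch: the `is_internal` helper is hypothetical, and carving the /16 into /24 blocks is purely an assumption for illustration, since the actual subnet plan is documented in Network Diagrams.

```python
import ipaddress

# Internal network from the networking table.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/16")


def is_internal(address: str) -> bool:
    """True if the address falls inside the cluster's internal range."""
    return ipaddress.ip_address(address) in INTERNAL_NET


def carve_subnets(prefix: int = 24) -> list:
    """Split the internal range into equal subnets (assumed /24 blocks)."""
    return list(INTERNAL_NET.subnets(new_prefix=prefix))


if __name__ == "__main__":
    print("addresses in range:", INTERNAL_NET.num_addresses)
    print("10.0.12.7 internal?", is_internal("10.0.12.7"))
    print("10.1.0.1 internal?", is_internal("10.1.0.1"))
    print("first /24 block:", carve_subnets()[0])
```

A /16 provides 65,536 addresses, which splits into 256 blocks of /24 if that granularity is wanted.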
Security Considerations
Security is a paramount concern. All servers are behind a firewall and access is restricted to authorized personnel only. Regular security audits are conducted, as detailed in the Security Audit Schedule. SSL/TLS is used for all communication between servers and clients. User access is managed through LDAP. Intrusion detection systems are in place, and logs are monitored daily. Refer to Incident Response Plan for emergency procedures.
Future Scalability
The architecture is designed for future scalability. Adding new nodes to the Kubernetes cluster is a straightforward process. The storage infrastructure can be expanded by adding more SAS HDDs or migrating to a distributed file system like Ceph. The network infrastructure can be upgraded to 200GbE or 400GbE as needed. Considerations for scaling are detailed in the Scalability Plan.
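When planning how many nodes to add, a rough estimate can start from GPU memory, since each node provides 4 x 80 GB. The sketch below is an assumption-laden illustration rather than the method used in the Scalability Plan: `nodes_needed` and the 20% headroom default are hypothetical, and real sizing would also account for throughput, RAM, and storage.

```python
import math

# Per-node GPU figures from the hardware specification table.
GPUS_PER_NODE = 4
GPU_MEMORY_GB = 80


def nodes_needed(required_gpu_memory_gb: float, headroom: float = 0.2) -> int:
    """Estimate the node count needed to hold a given amount of GPU memory.

    `headroom` is the fraction of each node's GPU memory kept free
    (an assumed default, not a documented policy).
    """
    if required_gpu_memory_gb <= 0:
        raise ValueError("required_gpu_memory_gb must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable_per_node = GPUS_PER_NODE * GPU_MEMORY_GB * (1 - headroom)
    return math.ceil(required_gpu_memory_gb / usable_per_node)


if __name__ == "__main__":
    # Hypothetical workload needing 1000 GB of GPU memory in total.
    print(nodes_needed(1000))
```

Keeping the headroom explicit makes it easy to see how the estimate changes as the reserve is tightened or relaxed.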