AI in Monaco: Server Configuration
This document details the server configuration supporting the "AI in Monaco" project. This guide is intended for new engineers onboarding to the infrastructure team and provides a comprehensive overview of the hardware and software components. This project utilizes a distributed system designed for high throughput and low latency inference of large language models. See also System Architecture Overview for a broader context.
Overview
The "AI in Monaco" project leverages a cluster of dedicated servers to run a suite of AI models. These models are used for real-time data analysis and prediction, and require significant computational resources. The server environment is based on Ubuntu 22.04 LTS and uses a containerized deployment strategy with Docker and Kubernetes. Efficient resource management is crucial, as detailed in the Resource Allocation Policy. The primary goal of this configuration is to provide a scalable and reliable platform for AI model deployment and execution. Understanding the networking setup is vital; refer to Network Topology.
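As a rough illustration of the containerized deployment strategy, the sketch below writes a minimal Kubernetes Deployment manifest for an inference service. The image name, Deployment name, replica count, and GPU request are all assumptions for the example, not values taken from the actual cluster.

```shell
# Hypothetical manifest for one inference service; image name, labels, and
# resource requests are illustrative, not the project's actual values.
cat > inference-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-inference
spec:
  replicas: 2
  selector:
    matchLabels:
      app: llm-inference
  template:
    metadata:
      labels:
        app: llm-inference
    spec:
      containers:
      - name: inference
        image: registry.local/llm-inference:latest   # illustrative image name
        resources:
          limits:
            nvidia.com/gpu: 1   # one A100 per replica
EOF
# Deploy with: kubectl apply -f inference-deployment.yaml
grep -q 'nvidia.com/gpu' inference-deployment.yaml && echo "manifest written"
```

The `nvidia.com/gpu` resource limit assumes the NVIDIA device plugin is installed on the cluster, which is the usual way to schedule pods onto GPU nodes.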
Hardware Specifications
The server cluster consists of the following hardware components. Each node adheres to the specification below.
| Component | Specification |
|---|---|
| CPU | Dual Intel Xeon Gold 6338 (32 cores / 64 threads per CPU) |
| RAM | 512GB DDR4 ECC Registered 3200MHz |
| Storage (OS) | 1TB NVMe SSD (PCIe Gen4) |
| Storage (Models) | 8 x 8TB SAS HDD (RAID 6) |
| Network Interface | Dual 100GbE QSFP28 |
| GPU | 4 x NVIDIA A100 80GB |
These specifications were chosen to balance compute power, memory capacity, and storage throughput, as discussed in the Hardware Selection Rationale. Regular hardware monitoring is performed using Nagios.
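The usable capacity of the model store follows directly from the RAID 6 layout above: RAID 6 reserves two drives' worth of capacity for parity, so eight 8TB drives yield 48TB usable. A quick arithmetic check:

```shell
# RAID 6 keeps two drives' worth of parity, so usable capacity is (N - 2) * size.
DRIVES=8
SIZE_TB=8
USABLE_TB=$(( (DRIVES - 2) * SIZE_TB ))
echo "${USABLE_TB}TB usable of $(( DRIVES * SIZE_TB ))TB raw"   # prints "48TB usable of 64TB raw"
```

This also means the array tolerates any two simultaneous drive failures without data loss.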
Software Stack
The software stack is built around a containerized environment, allowing for portability and scalability.
| Software | Version | Purpose |
|---|---|---|
| Operating System | Ubuntu 22.04 LTS | Base OS for all servers |
| Docker | 24.0.7 | Containerization platform |
| Kubernetes | 1.27.4 | Container orchestration |
| NVIDIA Driver | 535.104.05 | GPU driver for CUDA and TensorRT |
| CUDA Toolkit | 12.2 | NVIDIA's parallel computing platform |
| TensorRT | 8.6.1 | NVIDIA's inference optimizer and runtime |
| Prometheus | 2.46.0 | Monitoring and alerting |
The specific versions were selected for compatibility and performance, as documented in the Software Version Control. Regular software updates are managed via Ansible.
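Because the stack pins exact versions, it is useful to verify them on each host. The sketch below checks the installed Docker release against the pin from the table; the `sed` pattern assumes the usual `Docker version X.Y.Z, build ...` output format, and the same pattern can be adapted for the other tools.

```shell
# Compare the installed Docker version against the pinned version from the table.
PINNED="24.0.7"
if command -v docker >/dev/null 2>&1; then
  INSTALLED=$(docker --version | sed -E 's/.*version ([0-9.]+).*/\1/')
  if [ "$INSTALLED" = "$PINNED" ]; then
    echo "docker OK ($INSTALLED)"
  else
    echo "docker MISMATCH: have $INSTALLED, want $PINNED"
  fi
else
  echo "docker not installed on this host"
fi
```

In practice this kind of check would live in the Ansible playbooks that manage updates, rather than being run by hand.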
Networking Configuration
The network infrastructure is designed for high bandwidth and low latency communication between servers.
| Parameter | Value |
|---|---|
| Network Topology | Clos network |
| Inter-Node Bandwidth | 100GbE |
| Load Balancer | HAProxy |
| DNS | BIND9 |
| Firewall | iptables |
| Internal Network | 10.0.0.0/16 |
Detailed network diagrams are available at Network Diagrams. Security considerations regarding network access are outlined in the Security Policy. Network issues can be diagnosed with tcpdump.
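To make the load-balancing layer concrete, here is a minimal sketch of what an HAProxy frontend/backend pair for the inference API might look like. The ports, certificate path, backend addresses, and section names are assumptions for illustration, not the live configuration.

```shell
# Illustrative HAProxy config; all names, ports, and addresses are assumed
# for the example, not taken from the live setup.
cat > haproxy.cfg.example <<'EOF'
frontend inference_fe
    bind *:443 ssl crt /etc/haproxy/certs/inference.pem
    default_backend inference_be

backend inference_be
    balance leastconn
    server node1 10.0.1.1:8000 check
    server node2 10.0.1.2:8000 check
EOF
grep -c '^    server ' haproxy.cfg.example   # prints 2
```

`balance leastconn` is a reasonable default for long-lived inference requests, since it steers new connections to the least-loaded backend rather than round-robining blindly.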
Security Considerations
Security is a paramount concern. All servers are behind a firewall and access is restricted to authorized personnel only. Regular security audits are conducted, as detailed in the Security Audit Schedule. SSL/TLS is used for all communication between servers and clients. User access is managed through LDAP. Intrusion detection systems are in place, and logs are monitored daily. Refer to Incident Response Plan for emergency procedures.
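A default-deny firewall posture like the one described above can be expressed in iptables-restore format. The sketch below writes an example ruleset; the allowed ports and the use of the internal 10.0.0.0/16 range as the management source are assumptions for illustration, not the audited policy.

```shell
# Sketch of a default-deny ruleset in iptables-restore format; the allowed
# ports and management source range are assumptions for illustration.
cat > baseline.rules.example <<'EOF'
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -s 10.0.0.0/16 -p tcp --dport 22 -j ACCEPT
-A INPUT -p tcp --dport 443 -j ACCEPT
COMMIT
EOF
# Apply (as root) with: iptables-restore < baseline.rules.example
echo "example ruleset written"
```

Dropping by default on INPUT and FORWARD, then allowlisting loopback, established flows, internal SSH, and public TLS is a common baseline shape for this kind of cluster.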
Future Scalability
The architecture is designed for future scalability. Adding new nodes to the Kubernetes cluster is a straightforward process. The storage infrastructure can be expanded by adding more SAS HDDs or migrating to a distributed file system like Ceph. The network infrastructure can be upgraded to 200GbE or 400GbE as needed. Considerations for scaling are detailed in the Scalability Plan.
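Since each node carries four A100 80GB GPUs (per the hardware table), aggregate accelerator capacity scales linearly with node count. A quick back-of-the-envelope sketch, with the node counts chosen purely for illustration:

```shell
# Aggregate GPU capacity as the cluster grows (4 x A100 80GB per node, per spec).
GPUS_PER_NODE=4
GPU_MEM_GB=80
for NODES in 4 8 16; do
  echo "$NODES nodes -> $(( NODES * GPUS_PER_NODE )) GPUs, $(( NODES * GPUS_PER_NODE * GPU_MEM_GB ))GB GPU memory"
done
```

Estimates like this help decide when the 100GbE fabric, rather than GPU count, becomes the scaling bottleneck.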