AI in the Gobi Desert

From Server rental store

This article details the server configuration used for the "AI in the Gobi Desert" project, a research initiative focused on deploying and maintaining artificial intelligence models in an extremely challenging environment. It is intended for new members of the operations team, provides a comprehensive overview of the hardware and software choices, and assumes a basic understanding of Linux server administration and networking concepts.

Project Overview

The "AI in the Gobi Desert" project aims to leverage machine learning for environmental monitoring, specifically focusing on desertification patterns and wildlife tracking. The servers are deployed in a remote location, presenting unique challenges related to power availability, cooling, and network connectivity. The core requirements are high computational power for model training and inference, data storage for large datasets, and robust reliability due to limited on-site maintenance capabilities. We rely heavily on automation for server management.

Hardware Configuration

The server infrastructure consists of three primary server types: Edge Servers, Central Processing Servers, and Data Storage Servers. Each type is detailed below.

Edge Servers

Edge servers are located closest to the data collection points (sensor arrays and camera traps). They perform initial data processing and filtering, reducing the amount of data transmitted to the central servers.

Component             | Specification
Processor             | Intel Xeon Silver 4310 (8 cores, 2.1 GHz)
RAM                   | 64 GB DDR4 ECC 3200 MHz
Storage (OS)          | 256 GB NVMe SSD
Storage (Data Buffer) | 1 TB SATA SSD
Network Interface     | 2 x 10 Gigabit Ethernet
Power Supply          | 48 V DC input, redundant power supplies

These servers use a custom power management system to minimize energy consumption. The operating system is a minimal Ubuntu Server 20.04 LTS installation.
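As an illustration of the kind of filtering the edge servers perform before transmitting data upstream, the sketch below drops implausible sensor readings and collapses consecutive duplicates. The field name `temp_c` and the bounds are placeholders, not the project's actual schema:

```python
# Hypothetical edge-side filter: discard readings outside plausible bounds
# and collapse consecutive identical values before they reach the data buffer.

def filter_readings(readings, lo=-40.0, hi=60.0):
    """Keep in-range readings and drop consecutive duplicates."""
    kept = []
    for r in readings:
        if not (lo <= r["temp_c"] <= hi):
            continue                          # sensor glitch, discard
        if kept and kept[-1]["temp_c"] == r["temp_c"]:
            continue                          # no change since last sample
        kept.append(r)
    return kept

sample = [{"temp_c": 21.5}, {"temp_c": 21.5}, {"temp_c": 999.0}, {"temp_c": 22.0}]
print(filter_readings(sample))  # two readings survive
```

A real pipeline would filter per sensor channel and batch the survivors for transmission; the structure of the decision, though, is the same.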

Central Processing Servers

Central processing servers are responsible for running the complex AI models and performing the bulk of the data analysis. They are housed in a hardened, climate-controlled shelter.

Component         | Specification
Processor         | 2 x AMD EPYC 7763 (64 cores, 2.45 GHz)
RAM               | 256 GB DDR4 ECC 3200 MHz
GPU               | 4 x NVIDIA A100 80 GB
Storage (OS)      | 512 GB NVMe SSD
Storage (Data)    | 8 TB NVMe SSD (RAID 0)
Network Interface | 2 x 40 Gigabit Ethernet
Power Supply      | Redundant 208 V AC power supplies

These servers run Kubernetes for container orchestration and Prometheus for monitoring.
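Prometheus scrapes metrics in a plain-text exposition format. As a minimal illustration (the metric name below is hypothetical, and a real exporter would normally use the official client library), this sketch renders gauge metrics in that format:

```python
# Sketch only: format metrics as Prometheus text exposition (HELP/TYPE/value
# lines). A production exporter would use prometheus_client instead.

def to_prometheus(metrics):
    """Render {name: (help_text, value)} as Prometheus text format."""
    lines = []
    for name, (help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

print(to_prometheus({"gobi_shelter_temp_celsius": ("Shelter air temperature.", 38.2)}))
```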

Data Storage Servers

Data storage servers provide long-term storage for the collected data and model artifacts.

Component         | Specification
Processor         | Intel Xeon Gold 6338 (32 cores, 2.0 GHz)
RAM               | 128 GB DDR4 ECC 3200 MHz
Storage           | 64 TB HDD (RAID 6)
Network Interface | 2 x 10 Gigabit Ethernet
File System       | ZFS
Power Supply      | Redundant 208 V AC power supplies

These servers are configured with Ceph for distributed storage and data redundancy.
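The usable capacity behind these figures follows from simple arithmetic: RAID 6 reserves two drives' worth of parity, and Ceph's default 3-way replication divides raw capacity by the replica count. The drive counts below are illustrative, not the actual layout:

```python
# Capacity arithmetic only; drive counts and sizes are illustrative.

def raid6_usable(drive_count, drive_tb):
    """RAID 6 keeps two drives' worth of parity, so N-2 drives hold data."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drive_count - 2) * drive_tb

def ceph_usable(raw_tb, replicas=3):
    """With simple replication, usable space is raw capacity / replica count."""
    return raw_tb / replicas

# e.g. eight 8 TB drives per storage server (hypothetical layout):
print(raid6_usable(8, 8))   # 48 TB usable per server
print(ceph_usable(64))      # raw 64 TB, 3-way replicated across servers
```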


Software Configuration

All servers run a common software stack to ensure consistency and ease of management.

Software is updated regularly through automated patching integrated with Ansible, and strict access control policies are enforced.


Networking Infrastructure

The network infrastructure consists of a satellite link for primary connectivity and a secondary terrestrial radio link as a backup. The servers are organized into a private network with firewalls and intrusion detection systems. Network segmentation is implemented to isolate the different server types. A dedicated VPN connection is used for remote access.
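A failover between the satellite and radio links reduces to a reachability probe plus a small decision rule. This is a hedged sketch: the gateway hostnames are placeholders, and the real deployment likely handles failover at the routing layer rather than in application code:

```python
import socket

# Illustrative uplink failover check. Gateway addresses, port, and timeout
# are placeholders, not the project's real topology.

def reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def choose_link(primary_up, backup_up):
    """Pure decision logic: which uplink should carry traffic right now?"""
    if primary_up:
        return "satellite"
    if backup_up:
        return "radio"
    return "offline"

# choose_link(reachable("gw-sat.example"), reachable("gw-radio.example"))
```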

Cooling System

Due to the extreme temperatures in the Gobi Desert, a specialized cooling system is essential. The central processing servers are housed in a climate-controlled shelter with redundant cooling units. Edge servers rely on passive cooling and are designed to operate across a wide temperature range. Temperature sensors on all servers are monitored continuously.
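The constant temperature monitoring comes down to comparing sensor readings against warning and shutdown thresholds. The thresholds below are placeholders; the hardware's rated operating range would supply the real values:

```python
# Illustrative alerting logic for temperature monitoring.
# Thresholds are placeholders, not the deployed values.

def classify_temp(celsius, warn=45.0, critical=60.0):
    """Map a temperature reading to an alert level."""
    if celsius >= critical:
        return "critical"   # e.g. begin graceful shutdown
    if celsius >= warn:
        return "warning"    # e.g. throttle workloads, raise an alert
    return "ok"

print(classify_temp(52.0))  # warning
```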

Future Considerations

Future upgrades will focus on increasing the efficiency of the power management system and exploring the use of renewable energy sources to reduce our carbon footprint. We also plan to implement federated learning to reduce the amount of data that needs to be transmitted over the network.
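Federated learning would let each edge server train locally and ship only model weights to the central servers, which average them weighted by local sample counts (the FedAvg scheme). A pure-Python sketch of that averaging step, not the project's actual code:

```python
# Sketch of federated averaging (FedAvg): the central server combines weight
# vectors from edge servers, weighted by each server's local sample count.

def fed_avg(updates):
    """updates: list of (weights, n_samples); returns the weighted mean."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

edge_a = ([1.0, 2.0], 100)   # 100 local samples (illustrative)
edge_b = ([3.0, 0.0], 300)   # 300 local samples (illustrative)
print(fed_avg([edge_a, edge_b]))  # [2.5, 0.5]
```

Only the weight vectors cross the network, which is the point: raw sensor data stays on the edge servers.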


Intel-Based Server Configurations

Configuration                  | Specifications                                | Benchmark
Core i7-6700K/7700 Server      | 64 GB DDR4, 2 x 512 GB NVMe SSD               | CPU Benchmark: 8046
Core i7-8700 Server            | 64 GB DDR4, 2 x 1 TB NVMe SSD                 | CPU Benchmark: 13124
Core i9-9900K Server           | 128 GB DDR4, 2 x 1 TB NVMe SSD                | CPU Benchmark: 49969
Core i9-13900 Server (64 GB)   | 64 GB RAM, 2 x 2 TB NVMe SSD                  | -
Core i9-13900 Server (128 GB)  | 128 GB RAM, 2 x 2 TB NVMe SSD                 | -
Core i5-13500 Server (64 GB)   | 64 GB RAM, 2 x 500 GB NVMe SSD                | -
Core i5-13500 Server (128 GB)  | 128 GB RAM, 2 x 500 GB NVMe SSD               | -
Core i5-13500 Workstation      | 64 GB DDR5 RAM, 2 x NVMe SSD, NVIDIA RTX 4000 | -

AMD-Based Server Configurations

Configuration                  | Specifications                  | Benchmark
Ryzen 5 3600 Server            | 64 GB RAM, 2 x 480 GB NVMe      | CPU Benchmark: 17849
Ryzen 7 7700 Server            | 64 GB DDR5 RAM, 2 x 1 TB NVMe   | CPU Benchmark: 35224
Ryzen 9 5950X Server           | 128 GB RAM, 2 x 4 TB NVMe       | CPU Benchmark: 46045
Ryzen 9 7950X Server           | 128 GB DDR5 ECC, 2 x 2 TB NVMe  | CPU Benchmark: 63561
EPYC 7502P Server (128GB/1TB)  | 128 GB RAM, 1 TB NVMe           | CPU Benchmark: 48021
EPYC 7502P Server (128GB/2TB)  | 128 GB RAM, 2 TB NVMe           | CPU Benchmark: 48021
EPYC 7502P Server (128GB/4TB)  | 128 GB RAM, 2 x 2 TB NVMe       | CPU Benchmark: 48021
EPYC 7502P Server (256GB/1TB)  | 256 GB RAM, 1 TB NVMe           | CPU Benchmark: 48021
EPYC 7502P Server (256GB/4TB)  | 256 GB RAM, 2 x 2 TB NVMe       | CPU Benchmark: 48021
EPYC 9454P Server              | 256 GB RAM, 2 x 2 TB NVMe       | -


Note: All benchmark scores are approximate and may vary based on configuration. Server availability is subject to stock.