AI in the Rocky Mountains: Server Configuration
This article details the server configuration for our "AI in the Rocky Mountains" project, focused on analyzing wildlife patterns using machine learning. It's designed for newcomers to our MediaWiki site and provides a technical overview of the hardware and software powering this initiative. This document assumes a basic understanding of server administration and Linux concepts.
Project Overview
The "AI in the Rocky Mountains" project aims to process data from a network of remote cameras deployed across the Rocky Mountain region. This data includes images and video of wildlife, which are then analyzed using machine learning models to identify species, track movements, and monitor population trends. The servers are located in a secure, climate-controlled data center in Boulder, Colorado, chosen for its proximity to the study area and reliable power infrastructure. See Data Acquisition for details on camera setup.
Hardware Configuration
The server infrastructure consists of three primary server types: Input Servers, Processing Servers, and Storage Servers. Each type is configured with specific hardware to optimize its role. We utilize a clustered architecture for redundancy and scalability, as detailed in our Clustering Guide.
Input Servers
These servers receive data streams from the remote camera network. They perform initial validation and buffering before forwarding data to the Processing Servers.
| Component | Specification |
|---|---|
| CPU | 2 x Intel Xeon Silver 4310 (12 cores, 2.1 GHz) |
| RAM | 64 GB DDR4 ECC Registered |
| Storage (temporary) | 2 x 1 TB NVMe SSD (RAID 1) – for buffering incoming data |
| Network Interface | 2 x 10 Gbps Ethernet |
| Operating System | Ubuntu Server 22.04 LTS |
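To make the validate-and-buffer step concrete, the sketch below shows one way an Input Server might check an incoming camera upload and move it onto the NVMe buffer. It is illustrative only: the directory paths, checksum-sidecar convention, and accepted file types are assumptions, not the project's actual pipeline.

```python
"""Illustrative validate-and-buffer step for an Input Server.

The directory paths, checksum sidecar convention, and accepted file types
are placeholders for illustration, not the project's actual values.
"""
import hashlib
import shutil
from pathlib import Path

INCOMING_DIR = Path("/srv/ingest/incoming")   # hypothetical upload landing zone
BUFFER_DIR = Path("/srv/ingest/buffer")       # NVMe RAID 1 buffer (see table above)
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".mp4"}  # assumed camera output formats


def sha256sum(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_and_buffer(upload: Path) -> bool:
    """Basic validation: known file type, non-empty, checksum sidecar matches."""
    if upload.suffix.lower() not in ALLOWED_SUFFIXES or upload.stat().st_size == 0:
        return False
    sidecar = upload.with_suffix(upload.suffix + ".sha256")
    if sidecar.exists() and sidecar.read_text().strip() != sha256sum(upload):
        return False  # corrupted in transit; the camera gateway would retry
    BUFFER_DIR.mkdir(parents=True, exist_ok=True)
    shutil.move(str(upload), BUFFER_DIR / upload.name)
    return True


if __name__ == "__main__":
    if not INCOMING_DIR.is_dir():
        raise SystemExit(f"no incoming directory at {INCOMING_DIR}")
    for item in sorted(INCOMING_DIR.glob("*")):
        if item.is_file() and not item.name.endswith(".sha256"):
            print(item.name, "buffered" if validate_and_buffer(item) else "rejected")
```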
Processing Servers
These are the workhorses of the project, running the machine learning models to analyze the incoming data. They require significant computational power. For more information on the AI models used, see Machine Learning Models.
| Component | Specification |
|---|---|
| CPU | 4 x AMD EPYC 7763 (64 cores, 2.45 GHz) |
| RAM | 256 GB DDR4 ECC Registered |
| GPU | 4 x NVIDIA A100 (80 GB HBM2e) |
| Storage (local) | 4 x 4 TB NVMe SSD (RAID 0) – for rapid model loading and temporary data processing |
| Network Interface | 2 x 25 Gbps Ethernet |
| Operating System | CentOS Stream 9 |
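As a rough illustration of the inference workload these servers run, the following sketch loads a TorchScript-exported classifier with PyTorch and scores one batch of camera frames on a GPU. The model path, frame directory, and label set are placeholders; the actual models and deployment process are described in Machine Learning Models and the Model Deployment Pipeline.

```python
"""Illustrative batch-inference loop for a Processing Server.

Assumes a TorchScript-exported classifier; the model path, frame directory,
and label list are placeholders, not the project's actual artifacts.
"""
from pathlib import Path

import torch
from PIL import Image
from torchvision import transforms

MODEL_PATH = "/models/species_classifier.pt"  # hypothetical TorchScript export
FRAME_DIR = Path("/data/frames")              # hypothetical decoded video frames
LABELS = ["elk", "mule_deer", "black_bear", "mountain_lion", "empty"]  # placeholder classes

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.jit.load(MODEL_PATH, map_location=device).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


@torch.no_grad()
def classify_batch(paths):
    """Run one batch of camera frames through the model and return top-1 labels."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths]).to(device)
    probs = torch.softmax(model(batch), dim=1)
    conf, idx = probs.max(dim=1)
    return [(LABELS[i], float(c)) for i, c in zip(idx.tolist(), conf.tolist())]


if __name__ == "__main__":
    frames = sorted(FRAME_DIR.glob("*.jpg"))[:32]  # one small batch for illustration
    for path, (label, confidence) in zip(frames, classify_batch(frames)):
        print(f"{path.name}: {label} ({confidence:.2f})")
```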
Storage Servers
These servers provide persistent storage for the raw data, processed data, and model artifacts. Data archiving is critical, as discussed in Data Archiving Strategy.
| Component | Specification |
|---|---|
| CPU | 2 x Intel Xeon Gold 6338 (32 cores, 2.0 GHz) |
| RAM | 128 GB DDR4 ECC Registered |
| Storage | 12 x 16 TB SAS HDD (RAID 6) – total usable storage: ~144 TB |
| Network Interface | 2 x 10 Gbps Ethernet |
| Operating System | Red Hat Enterprise Linux 8 |
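Because Ceph backs this tier (see Software Configuration below), processed artifacts can be archived with any S3-compatible client, assuming the cluster exposes a RADOS Gateway. The sketch below uses boto3; the endpoint URL, bucket name, and credential variables are placeholders rather than our actual settings.

```python
"""Illustrative archive upload to the Ceph cluster via an S3-compatible gateway.

Assumes a RADOS Gateway endpoint is enabled; the endpoint URL, bucket name,
and credential environment variables are placeholders.
"""
import os

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://ceph-rgw.internal:7480",         # hypothetical RGW endpoint
    aws_access_key_id=os.environ["ARCHIVE_ACCESS_KEY"],    # placeholder credential vars
    aws_secret_access_key=os.environ["ARCHIVE_SECRET_KEY"],
)

BUCKET = "wildlife-archive"  # hypothetical bucket for processed results


def archive_file(local_path: str, key: str) -> None:
    """Upload one processed artifact and confirm it landed in the bucket."""
    s3.upload_file(local_path, BUCKET, key)
    head = s3.head_object(Bucket=BUCKET, Key=key)
    print(f"archived {key}: {head['ContentLength']} bytes")


if __name__ == "__main__":
    archive_file("/data/processed/2024-06-01/detections.parquet",
                 "detections/2024/06/01/detections.parquet")
```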
Software Configuration
The software stack is built around open-source technologies for maximum flexibility and cost-effectiveness. See Software Licensing for details on our licensing policy.
- Operating Systems: As detailed above, we utilize Ubuntu Server, CentOS Stream, and Red Hat Enterprise Linux. Each OS is chosen based on the specific server role and compatibility with the required software.
- Machine Learning Framework: TensorFlow and PyTorch are both used, depending on the specific model. The Model Deployment Pipeline outlines the process of deploying new models.
- Data Storage: Ceph is used for distributed storage across the Storage Servers, providing scalability and redundancy. Refer to Ceph Configuration for specific details.
- Containerization: Docker and Kubernetes are used to containerize and orchestrate the machine learning models and other applications; a brief cluster-query example appears after this list. The Kubernetes Cluster Setup document explains our cluster configuration.
- Monitoring: Prometheus and Grafana are used for monitoring server performance and application health; a metrics-export sketch also follows the list. See Monitoring Dashboard for access to the dashboards.
- Networking: We employ a VLAN-based network architecture for security and segmentation. The Network Diagram provides a visual representation of the network.
- Security: Firewalld is used to manage network traffic, and regular security audits are performed. See Security Protocols for more information.
- Version Control: Git is used for all code and configuration management, stored on a private GitLab instance.
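The minimal sketch below, referenced from the Containerization entry, queries pod status using the official Kubernetes Python client. It assumes a kubeconfig with read access to the cluster; the namespace name is a placeholder, not the project's actual namespace.

```python
"""Illustrative pod health check using the official Kubernetes Python client.

The namespace name is a placeholder; the script assumes a local kubeconfig
with read access to the cluster.
"""
from kubernetes import client, config

NAMESPACE = "wildlife-ml"  # hypothetical namespace for the inference workloads

config.load_kube_config()  # use config.load_incluster_config() when run inside a pod
v1 = client.CoreV1Api()

for pod in v1.list_namespaced_pod(namespace=NAMESPACE).items:
    ready = all(c.ready for c in (pod.status.container_statuses or []))
    print(f"{pod.metadata.name}: phase={pod.status.phase}, ready={ready}")
```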
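For the Monitoring entry, the following sketch shows one way a service could expose metrics for Prometheus to scrape (and Grafana to chart) using the prometheus_client library. The metric names, values, and port are illustrative placeholders.

```python
"""Illustrative metrics export with prometheus_client.

Metric names, values, and the scrape port are placeholders; Prometheus would
scrape the /metrics endpoint this script serves, and Grafana would chart it.
"""
import random
import time

from prometheus_client import Counter, Gauge, start_http_server

FRAMES_PROCESSED = Counter("ingest_frames_processed_total",
                           "Camera frames accepted by the input servers")
BUFFER_DEPTH = Gauge("ingest_buffer_depth",
                     "Files currently waiting in the NVMe buffer")

if __name__ == "__main__":
    start_http_server(8000)  # hypothetical scrape port; /metrics is served here
    while True:
        FRAMES_PROCESSED.inc()               # stand-in for real ingest events
        BUFFER_DEPTH.set(random.randint(0, 50))
        time.sleep(5)
```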
Network Topology
The servers are interconnected via a dedicated 100 Gbps backbone network within the data center. The Input Servers connect to the remote camera network via a secure VPN. All external access is restricted to authorized personnel only, as outlined in the Access Control Policy.
Future Expansion
We anticipate needing to expand the server infrastructure as the project grows and the volume of data increases. Future plans include adding more Processing Servers with even more powerful GPUs and exploring the use of cloud-based services for burst capacity. Details will be documented in the Capacity Planning Document.
Related Pages
- Data Flow Diagram
- Troubleshooting Guide
- Server Maintenance Schedule
- Contact Information