AI in Armenia: Server Configuration and Infrastructure
This article details the server configuration supporting Artificial Intelligence (AI) initiatives within Armenia. It is aimed at newcomers to our MediaWiki site and provides a technical overview of the hardware, software, and networking components involved. This information is intended for anyone contributing to, maintaining, or expanding the AI infrastructure.
Overview
Armenia has seen increasing investment in AI development, driven by a growing tech sector and governmental support. This infrastructure is designed to support a range of AI applications, including machine learning, natural language processing, and computer vision. The core of this capability resides in a cluster of servers located in Yerevan, with redundant connections and robust security measures. We utilize a hybrid cloud approach, combining on-premise hardware with cloud resources for scalability and cost-effectiveness. See Data Center Locations for more details.
Hardware Specifications
The primary server cluster comprises high-performance computing (HPC) nodes optimized for AI workloads. These nodes are interconnected via a low-latency network. The following table details the primary server specifications:
Component | Specification
---|---
CPU | Dual Intel Xeon Platinum 8380 (40 cores / 80 threads per CPU)
RAM | 512 GB DDR4 ECC Registered RAM (3200 MHz)
GPU | 8 x NVIDIA A100 80GB GPUs
Storage | 4 x 8TB NVMe SSD (RAID 0) for OS and active data; 16 x 18TB SATA HDD (RAID 6) for long-term storage
Network Interface | Dual 100GbE Network Interface Cards (NICs)
Power Supply | Redundant 2000W Platinum Power Supplies
We also utilize several dedicated edge servers for real-time data processing, detailed in Edge Computing Deployment. These servers have less stringent requirements but are critical for certain applications. The storage configuration is managed by our Storage Management Protocol.
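A quick way to confirm that a node matches the GPU specification above is to enumerate its devices through NVIDIA's management library. The following is a minimal sketch, assuming the nvidia-ml-py (pynvml) bindings are installed on the node; it is illustrative, not part of our standard tooling.

```python
# Minimal sketch: enumerate GPUs on a node and report memory, to check the
# node against the 8 x A100 80GB specification listed above.
# Assumes the nvidia-ml-py package (pynvml) is installed (an assumption,
# not a documented part of this cluster's tooling).
import pynvml

pynvml.nvmlInit()
try:
    count = pynvml.nvmlDeviceGetCount()
    print(f"GPUs detected: {count}")
    for i in range(count):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        # Older pynvml releases return bytes, newer ones return str.
        name = name.decode() if isinstance(name, bytes) else name
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"  GPU {i}: {name}, {mem.total / 1024**3:.0f} GiB total memory")
finally:
    pynvml.nvmlShutdown()
```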
Software Stack
The software stack is built around a Linux distribution (Ubuntu Server 22.04 LTS) and leverages containerization technology for deployment and management. Key software components include:
- CUDA Toolkit: Version 12.2. Provides the libraries required for GPU-accelerated computing. CUDA Documentation
- cuDNN: Version 8.9. Provides optimized primitives for deep neural networks. cuDNN Release Notes
- TensorFlow: Version 2.13. A popular open-source machine learning framework. TensorFlow Website
- PyTorch: Version 2.1. A competing machine learning framework, offering flexibility and dynamic computation graphs. PyTorch Documentation
- Docker: Version 24.0. For containerization and application deployment. Docker Hub
- Kubernetes: Version 1.28. For container orchestration and scaling. Kubernetes Documentation
- NCCL: Version 2.16. Optimized communication library for multi-GPU training. NCCL Documentation
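To illustrate how these components fit together on a node, the short sketch below reports the CUDA, cuDNN, and NCCL support that PyTorch was built against, along with the number of visible GPUs. It assumes a PyTorch build with CUDA support; the reported versions should correspond to the stack listed above.

```python
# Minimal sketch: report the GPU software stack visible to PyTorch on a node.
# Assumes a CUDA-enabled PyTorch build; versions reflect that build and should
# line up with the CUDA 12.2 / cuDNN 8.9 / NCCL 2.16 components listed above.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())
print("CUDA (build):   ", torch.version.cuda)
print("cuDNN (build):  ", torch.backends.cudnn.version())
print("NCCL available: ", torch.distributed.is_nccl_available())
print("Visible GPUs:   ", torch.cuda.device_count())
```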
All software is regularly updated and patched according to our Security Patching Schedule.
Networking Infrastructure
The server cluster is connected via a high-speed, low-latency network. This network is critical for efficient data transfer between nodes during training and inference.
Component | Specification |
---|---|
Network Topology | Clos Network |
Switches | Arista 7508 Series |
Interconnect | 100 Gb/s InfiniBand |
Firewall | Palo Alto Networks NGFW |
Load Balancer | HAProxy |
Network monitoring is performed using Network Monitoring Tools, including Prometheus and Grafana, to ensure optimal performance and identify potential bottlenecks. The network architecture is documented in Network Diagram.
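Metrics collected by Prometheus can also be pulled programmatically through its HTTP query API, for example to spot-check per-node network throughput. The sketch below is illustrative only; the endpoint URL and the node_exporter metric/label values are hypothetical placeholders, not our production configuration.

```python
# Minimal sketch: query Prometheus' HTTP API for per-node network throughput.
# The endpoint, metric, and label values are hypothetical placeholders;
# substitute those used by the actual monitoring deployment.
import requests

PROMETHEUS_URL = "http://prometheus.example.local:9090"  # placeholder endpoint
QUERY = 'rate(node_network_receive_bytes_total{device="eth0"}[5m])'

resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query",
                    params={"query": QUERY}, timeout=10)
resp.raise_for_status()
for result in resp.json()["data"]["result"]:
    instance = result["metric"].get("instance", "unknown")
    _timestamp, value = result["value"]
    print(f"{instance}: {float(value) / 1e6:.2f} MB/s inbound")
```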
Data Storage and Management
Data storage is tiered to balance performance and cost. Hot data (actively used for training and inference) is stored on NVMe SSDs, while cold data (archived datasets) is stored on SATA HDDs. A centralized data management system is used to track and manage all datasets.
Storage Tier | Capacity | Performance | Cost |
---|---|---|---|
Tier 1 (NVMe SSD) | 32 TB | Very High | High |
Tier 2 (SATA HDD) | 288 TB | Moderate | Low |
Tier 3 (Cloud Storage) | Scalable | Variable | Variable |
Data backups are performed daily and stored offsite. See Data Backup Procedures for details.
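The tiering policy itself can be as simple as demoting files that have not been accessed recently from the hot NVMe tier to the cold HDD tier. The sketch below illustrates the idea; the mount points and the 30-day threshold are hypothetical examples, not values taken from our Storage Management Protocol.

```python
# Minimal sketch of an access-time based tiering pass: move files not accessed
# within THRESHOLD_DAYS from the hot NVMe tier to the cold HDD tier.
# Mount points and threshold are hypothetical illustration values.
import shutil
import time
from pathlib import Path

HOT_TIER = Path("/mnt/nvme/datasets")   # placeholder Tier 1 mount point
COLD_TIER = Path("/mnt/hdd/datasets")   # placeholder Tier 2 mount point
THRESHOLD_DAYS = 30

cutoff = time.time() - THRESHOLD_DAYS * 86400
for path in HOT_TIER.rglob("*"):
    if path.is_file() and path.stat().st_atime < cutoff:
        target = COLD_TIER / path.relative_to(HOT_TIER)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))
        print(f"Demoted {path} -> {target}")
```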
Security Considerations
Security is paramount. The following measures are implemented:
- Firewall: A robust firewall protects the network from unauthorized access.
- Intrusion Detection System (IDS): An IDS monitors network traffic for malicious activity.
- Access Control: Strict access control policies limit access to sensitive data and systems.
- Regular Security Audits: Regular security audits are conducted to identify and address vulnerabilities. See Security Audit Reports.
- Data Encryption: All data is encrypted at rest and in transit.
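As an illustration of the last point, encryption at rest, the sketch below encrypts a dataset file with a symmetric key using the cryptography package's Fernet interface. This is an illustrative example under that assumption, not the specific mechanism mandated by our security policy.

```python
# Minimal sketch: encrypt a dataset file at rest with Fernet symmetric
# encryption from the `cryptography` package. Illustrative only; in practice
# key handling must follow the access-control policies listed above.
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice, load from a key-management system
fernet = Fernet(key)

with open("dataset.bin", "rb") as f:        # placeholder input file
    plaintext = f.read()

with open("dataset.bin.enc", "wb") as f:    # encrypted copy written alongside
    f.write(fernet.encrypt(plaintext))

# Round-trip check: decryption recovers the original bytes.
with open("dataset.bin.enc", "rb") as f:
    assert fernet.decrypt(f.read()) == plaintext
```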
Future Expansion
We are planning to expand the server cluster in Q1 2024 to accommodate growing AI workloads. This expansion will involve adding additional GPU nodes and upgrading the network infrastructure. Details can be found in Future Infrastructure Plans.
See Also
- AI Model Deployment
- Server Maintenance Procedures
- Troubleshooting Guide
- Contact Information
- Frequently Asked Questions
Intel-Based Server Configurations
Configuration | Specifications | Benchmark |
---|---|---|
Core i7-6700K/7700 Server | 64 GB DDR4, NVMe SSD 2 x 512 GB | CPU Benchmark: 8046 |
Core i7-8700 Server | 64 GB DDR4, NVMe SSD 2x1 TB | CPU Benchmark: 13124 |
Core i9-9900K Server | 128 GB DDR4, NVMe SSD 2 x 1 TB | CPU Benchmark: 49969 |
Core i9-13900 Server (64GB) | 64 GB RAM, 2x2 TB NVMe SSD | |
Core i9-13900 Server (128GB) | 128 GB RAM, 2x2 TB NVMe SSD | |
Core i5-13500 Server (64GB) | 64 GB RAM, 2x500 GB NVMe SSD | |
Core i5-13500 Server (128GB) | 128 GB RAM, 2x500 GB NVMe SSD | |
Core i5-13500 Workstation | 64 GB DDR5 RAM, 2 NVMe SSD, NVIDIA RTX 4000 |
AMD-Based Server Configurations
Configuration | Specifications | Benchmark |
---|---|---|
Ryzen 5 3600 Server | 64 GB RAM, 2x480 GB NVMe | CPU Benchmark: 17849 |
Ryzen 7 7700 Server | 64 GB DDR5 RAM, 2x1 TB NVMe | CPU Benchmark: 35224 |
Ryzen 9 5950X Server | 128 GB RAM, 2x4 TB NVMe | CPU Benchmark: 46045 |
Ryzen 9 7950X Server | 128 GB DDR5 ECC, 2x2 TB NVMe | CPU Benchmark: 63561 |
EPYC 7502P Server (128GB/1TB) | 128 GB RAM, 1 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (128GB/2TB) | 128 GB RAM, 2 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (128GB/4TB) | 128 GB RAM, 2x2 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (256GB/1TB) | 256 GB RAM, 1 TB NVMe | CPU Benchmark: 48021 |
EPYC 7502P Server (256GB/4TB) | 256 GB RAM, 2x2 TB NVMe | CPU Benchmark: 48021 |
EPYC 9454P Server | 256 GB RAM, 2x2 TB NVMe |
Note: All benchmark scores are approximate and may vary based on configuration.