AI in Oceania: Server Configuration
This article details the server configuration used to host the "AI in Oceania" project, a collaborative research initiative focused on artificial intelligence applications relevant to the Pacific Island region. It is a guide for new system administrators and developers contributing to the project, since maintenance, troubleshooting, and future scaling all depend on understanding this configuration. It assumes basic familiarity with Linux server administration and networking concepts.
Overview
The "AI in Oceania" project requires significant computational resources for model training, inference, and data storage. The infrastructure is primarily hosted in a secure data center in Auckland, New Zealand, chosen for its reliable power, network connectivity, and geographical proximity to many Pacific Island nations. We use a hybrid cloud approach: dedicated hardware carries the baseline workload, and cloud-based services add capacity when needed, which keeps the setup flexible and controls cost. The core services are detailed in Special:Search/Core Services.
Hardware Specifications
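The primary compute cluster consists of eight dedicated servers built for machine learning workloads, in two tiers: four Intel Xeon nodes with V100 GPUs (Server-01 to Server-04) and four AMD EPYC nodes with A100 GPUs (Server-05 to Server-08). The following table lists the specifications of each server: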
Server ID | CPU | RAM | GPU | Storage |
---|---|---|---|---|
Server-01 | Intel Xeon Gold 6248R (24 cores) | 256 GB DDR4 ECC | NVIDIA Tesla V100 (32GB) | 4 x 4TB NVMe SSD (RAID 10) |
Server-02 | Intel Xeon Gold 6248R (24 cores) | 256 GB DDR4 ECC | NVIDIA Tesla V100 (32GB) | 4 x 4TB NVMe SSD (RAID 10) |
Server-03 | Intel Xeon Gold 6248R (24 cores) | 256 GB DDR4 ECC | NVIDIA Tesla V100 (32GB) | 4 x 4TB NVMe SSD (RAID 10) |
Server-04 | Intel Xeon Gold 6248R (24 cores) | 256 GB DDR4 ECC | NVIDIA Tesla V100 (32GB) | 4 x 4TB NVMe SSD (RAID 10) |
Server-05 | AMD EPYC 7763 (64 cores) | 512 GB DDR4 ECC | NVIDIA A100 (80GB) | 8 x 8TB NVMe SSD (RAID 10) |
Server-06 | AMD EPYC 7763 (64 cores) | 512 GB DDR4 ECC | NVIDIA A100 (80GB) | 8 x 8TB NVMe SSD (RAID 10) |
Server-07 | AMD EPYC 7763 (64 cores) | 512 GB DDR4 ECC | NVIDIA A100 (80GB) | 8 x 8TB NVMe SSD (RAID 10) |
Server-08 | AMD EPYC 7763 (64 cores) | 512 GB DDR4 ECC | NVIDIA A100 (80GB) | 8 x 8TB NVMe SSD (RAID 10) |
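The aggregate capacity implied by the table can be worked out with a short script. This is a minimal sketch, not a project tool: the `NodeSpec` class and function names are made up for illustration, it assumes one GPU per server (the table lists a single GPU model per row), and it ignores filesystem overhead. RAID 10 mirrors every drive, so usable space is half the raw total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """One hardware tier from the table (values copied from the table)."""
    count: int        # number of identical servers in the tier
    cores: int        # CPU cores per server
    ram_gb: int       # RAM per server, in GB
    gpu_mem_gb: int   # GPU memory per server, in GB (assumes one GPU each)
    drives: int       # NVMe drives per server
    drive_tb: int     # capacity of each drive, in TB


def raid10_usable_tb(drives: int, drive_tb: int) -> float:
    """RAID 10 stripes across mirrored pairs: usable space is half of raw."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return drives * drive_tb / 2


def cluster_totals(specs):
    """Sum cores, RAM, GPU memory, and usable storage across all servers."""
    return {
        "cores": sum(s.count * s.cores for s in specs),
        "ram_gb": sum(s.count * s.ram_gb for s in specs),
        "gpu_mem_gb": sum(s.count * s.gpu_mem_gb for s in specs),
        "usable_tb": sum(
            s.count * raid10_usable_tb(s.drives, s.drive_tb) for s in specs
        ),
    }


CLUSTER = [
    # Server-01 to Server-04
    NodeSpec(count=4, cores=24, ram_gb=256, gpu_mem_gb=32, drives=4, drive_tb=4),
    # Server-05 to Server-08
    NodeSpec(count=4, cores=64, ram_gb=512, gpu_mem_gb=80, drives=8, drive_tb=8),
]

if __name__ == "__main__":
    print(cluster_totals(CLUSTER))
```

Under these assumptions the cluster totals 352 CPU cores, 3072 GB of RAM, 448 GB of GPU memory, and 160 TB of usable local storage.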
Software Stack
The servers run Ubuntu Server 22.04 LTS. The core software stack includes:
- Python 3.10: The primary programming language for AI development.
- TensorFlow 2.12: A popular machine learning framework.
- PyTorch 1.13: Another widely used machine learning framework.
- CUDA 11.8: NVIDIA's parallel computing platform and API.
- Docker 20.10: Containerization platform for reproducible environments.
- Kubernetes 1.26: Container orchestration system for managing deployments.
- PostgreSQL 14: Relational database for metadata storage. A discussion about database performance is available on Special:Search/Database Performance.
- Object Storage (MinIO): A distributed object storage system.
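Because the stack is pinned to specific releases, it can help to check a node for version drift. The sketch below is illustrative only: the `EXPECTED` keys, the function names, and the way the installed versions are gathered are assumptions, not part of the project's tooling. It compares each reported version against the pinned prefix from the list above.

```python
# Pinned versions copied from the software stack list.
EXPECTED = {
    "python": "3.10",
    "tensorflow": "2.12",
    "torch": "1.13",
    "cuda": "11.8",
    "docker": "20.10",
    "kubernetes": "1.26",
    "postgresql": "14",
}


def matches(expected: str, found: str) -> bool:
    """True if `found` has the same leading dotted components as `expected`."""
    want = expected.split(".")
    have = found.split("+")[0].split(".")  # drop local tags such as "+cu117"
    return have[:len(want)] == want


def version_drift(found: dict) -> dict:
    """Return {component: (expected, found)} for each mismatch or missing entry."""
    drift = {}
    for name, expected in EXPECTED.items():
        actual = found.get(name)
        if actual is None or not matches(expected, actual):
            drift[name] = (expected, actual)
    return drift


if __name__ == "__main__":
    import platform

    # Only the Python version is probed here; the other components would be
    # reported by their own tools and are listed as missing in this demo.
    print(version_drift({"python": platform.python_version()}))
```

Comparing dotted components rather than raw string prefixes avoids treating "3.1.0" as a match for "3.10".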
Networking & Security
The servers are connected via a dedicated 10 Gigabit Ethernet network. The following security measures are in place:
Security Feature | Description | Status |
---|---|---|
Firewall | UFW (Uncomplicated Firewall) configured with strict rules. | Enabled |
Intrusion Detection System (IDS) | Suricata IDS monitoring network traffic for malicious activity. | Enabled |
VPN Access | Secure VPN access for remote administration. | Enabled |
Regular Security Audits | Penetration testing and vulnerability scanning performed quarterly. | Scheduled |
Data Encryption | All data at rest and in transit is encrypted. | Enabled |
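The actual firewall rules are not listed here. As a rough illustration of what "strict rules" combined with VPN-only administration might look like, the sketch below generates a default-deny UFW rule set that admits administrative traffic only from a VPN subnet. The subnet `10.8.0.0/24`, the port list, and the function name are hypothetical and would need to be replaced with the project's real values.

```python
import ipaddress


def ufw_commands(vpn_subnet: str, admin_ports=(22,)):
    """Build UFW commands: deny inbound by default, allow admin ports from the VPN."""
    # ip_network raises ValueError for a malformed subnet or one with host bits set.
    net = ipaddress.ip_network(vpn_subnet)
    cmds = [
        "ufw default deny incoming",
        "ufw default allow outgoing",
    ]
    for port in admin_ports:
        if not 0 < port < 65536:
            raise ValueError(f"invalid port: {port}")
        cmds.append(f"ufw allow from {net} to any port {port} proto tcp")
    cmds.append("ufw --force enable")  # --force skips the interactive prompt
    return cmds


if __name__ == "__main__":
    # Hypothetical VPN subnet; printed for review rather than executed.
    for cmd in ufw_commands("10.8.0.0/24"):
        print(cmd)
```

Printing the commands instead of running them keeps the script safe to try on any machine; an administrator would review the output before applying it with root privileges.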
Cloud Integration
We utilize Amazon Web Services (AWS) for supplemental compute and storage. Specifically:
Service | Purpose | Configuration |
---|---|---|
AWS S3 | Long-term archival storage of datasets. | Standard storage class, versioning enabled. |
AWS EC2 | On-demand compute instances for burst capacity during peak training loads. | p3.8xlarge instances with NVIDIA V100 GPUs. |
AWS SageMaker | Managed machine learning service for experimentation and deployment. | Utilized for prototyping new models. |
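The burst-capacity arrangement can be expressed as simple arithmetic: requests that exceed the free local GPUs spill over to EC2, rounded up to whole instances. The sketch below assumes one GPU per local server (eight in total) and uses the fact that a p3.8xlarge instance provides four V100 GPUs. The function name and the `max_instances` cap are hypothetical, not an existing project policy.

```python
import math

LOCAL_GPUS = 8           # assumes one GPU in each of the eight dedicated servers
GPUS_PER_P3_8XLARGE = 4  # a p3.8xlarge instance provides 4 NVIDIA V100 GPUs


def burst_instances(gpus_requested: int, local_gpus_busy: int,
                    max_instances: int = 4) -> int:
    """Number of p3.8xlarge instances needed to cover demand beyond local capacity."""
    free_local = max(LOCAL_GPUS - local_gpus_busy, 0)
    overflow = max(gpus_requested - free_local, 0)
    needed = math.ceil(overflow / GPUS_PER_P3_8XLARGE)
    return min(needed, max_instances)  # cap spending at a fixed instance count


if __name__ == "__main__":
    # 6 GPUs requested while 4 local GPUs are busy: 2 overflow GPUs, 1 instance.
    print(burst_instances(gpus_requested=6, local_gpus_busy=4))
```

Note that the V100 GPUs in p3.8xlarge instances have 16 GB of memory each, less than the 32 GB V100s in the local cluster, so a job sized for local hardware may need a smaller batch size when it bursts to EC2.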
Monitoring and Logging
Comprehensive monitoring and logging are essential for maintaining system stability and performance. We use:
- Prometheus: For collecting and storing metrics.
- Grafana: For visualizing metrics and creating dashboards. See Special:Search/Grafana Dashboards for examples.
- ELK Stack (Elasticsearch, Logstash, Kibana): For centralized log management.
- Nagios: For alerting on critical system events.
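Prometheus collects metrics by scraping plain-text endpoints. As an illustration of what it reads, the sketch below renders a gauge in the Prometheus text exposition format using only the standard library. The metric name `gpu_utilization_percent` and the sample value are invented for the example; in practice an existing exporter or the official client library would normally produce this output.

```python
def escape_label(value: str) -> str:
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, samples) -> str:
    """Render one gauge; `samples` is an iterable of (labels_dict, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label(str(val))}"'
            for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {float(value)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric and value, for illustration only.
    print(render_gauge(
        "gpu_utilization_percent",
        "GPU utilization per server",
        [({"server": "Server-05"}, 73)],
    ), end="")
```

Served over HTTP, text in this shape can be scraped by Prometheus and then charted in Grafana.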
Future Considerations
Planned upgrades include transitioning to newer GPU architectures (NVIDIA H100) and expanding the object storage capacity. We are also evaluating the implementation of a more robust disaster recovery plan. Further information on project goals can be found on Project:AI in Oceania.