AI in the Tropics

AI in the Tropics: Server Configuration

This document describes the server configuration for the "AI in the Tropics" project, a research initiative that applies artificial intelligence to climate modeling and biodiversity analysis in tropical environments. It is written for new system administrators and developers contributing to the project and covers the hardware, software, and networking aspects of the setup. It assumes a baseline understanding of Linux server administration and MediaWiki syntax.

Overview

The "AI in the Tropics" project requires a robust and scalable server infrastructure to handle large datasets, complex model training, and real-time data analysis. The current setup utilizes a hybrid approach, combining on-premise servers for sensitive data and computationally intensive tasks with cloud resources for scalability and redundancy. We prioritize data security, high availability, and efficient resource utilization. See also Server room best practices.

Hardware Configuration

The core on-premise servers are housed in a climate-controlled server room at the research facility. The following table details the specifications of the primary servers:

| Server Name | Role | CPU | RAM | Storage | Network Interface |
| --- | --- | --- | --- | --- | --- |
| ai-core-01 | Primary AI Training & Model Serving | 2 x Intel Xeon Gold 6338 | 512 GB DDR4 ECC REG | 2 x 8 TB NVMe SSD (RAID 1) + 20 TB HDD (Data Archive) | 10 Gbps Ethernet |
| ai-data-01 | Data Storage & Preprocessing | 2 x AMD EPYC 7763 | 256 GB DDR4 ECC REG | 8 x 16 TB SATA HDD (RAID 6) | 10 Gbps Ethernet |
| ai-web-01 | Web Interface & API Gateway | Intel Core i7-12700K | 64 GB DDR5 | 1 TB NVMe SSD | 1 Gbps Ethernet |
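
As a rough cross-check of the storage column, the nominal usable capacity of the two RAID arrays can be worked out from the drive counts. The sketch below is illustrative only: the helper name `usable_capacity_tb` is hypothetical, and the figures ignore filesystem overhead and the difference between TB and TiB.

```python
def usable_capacity_tb(level: str, drives: int, drive_tb: float) -> float:
    """Return the nominal usable capacity (in TB) of a simple RAID array."""
    if level == "RAID 1":
        # Mirroring: every drive holds a full copy, so capacity is one drive.
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_tb
    if level == "RAID 6":
        # Double parity: the equivalent of two drives is spent on parity.
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) * drive_tb
    raise ValueError(f"unsupported RAID level: {level}")


# Arrays taken from the hardware table: (level, drive count, drive size in TB).
arrays = {
    "ai-core-01 NVMe": ("RAID 1", 2, 8),
    "ai-data-01 HDD": ("RAID 6", 8, 16),
}

for name, (level, drives, drive_tb) in arrays.items():
    capacity = usable_capacity_tb(level, drives, drive_tb)
    print(f"{name} ({level}): {capacity} TB usable")
```

Under these assumptions the mirrored NVMe pair on ai-core-01 offers 8 TB of usable space, and the RAID 6 array on ai-data-01 offers 96 TB while tolerating two simultaneous drive failures.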

Additional servers are utilized for database management (see Database administration for details) and specialized tasks like image processing. A detailed inventory is maintained on the Server inventory page. Power redundancy is provided by dual power supplies and an Uninterruptible Power Supply (UPS) system.

Software Configuration

All servers run Ubuntu Server 22.04 LTS with a customized kernel optimized for machine learning workloads. The following software components are essential:

  • Operating System: Ubuntu Server 22.04 LTS
  • Containerization: Docker and Kubernetes are used for application deployment and orchestration.
  • Machine Learning Frameworks: TensorFlow, PyTorch, and scikit-learn are the primary frameworks.
  • Programming Languages: Python is the primary language, with supporting libraries for data science and machine learning.
  • Database: PostgreSQL is used for structured data storage.
  • Web Server: Nginx serves as the web server and reverse proxy.
  • Monitoring: Prometheus and Grafana are used for system monitoring and alerting.
  • Version Control: Git is used for code version control and collaboration.

Software updates are applied regularly using `apt update && apt upgrade`, following a strict change management process documented on the Change management wiki.

Network Configuration

The server infrastructure is connected to the research facility's network via a dedicated VLAN. The following table outlines the network addressing scheme:

| Server Name | IP Address | Subnet Mask | Gateway |
| --- | --- | --- | --- |
| ai-core-01 | 192.168.10.10 | 255.255.255.0 | 192.168.10.1 |
| ai-data-01 | 192.168.10.11 | 255.255.255.0 | 192.168.10.1 |
| ai-web-01 | 192.168.10.12 | 255.255.255.0 | 192.168.10.1 |
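
The addressing scheme above can be sanity-checked with Python's standard `ipaddress` module. This is a minimal sketch, not part of the project's tooling: the function name `check_addressing` is hypothetical, and it only verifies that each host lies in the gateway's subnet, does not collide with the gateway, and is not assigned twice.

```python
import ipaddress

# Values copied from the addressing table.
HOSTS = {
    "ai-core-01": "192.168.10.10",
    "ai-data-01": "192.168.10.11",
    "ai-web-01": "192.168.10.12",
}
NETMASK = "255.255.255.0"
GATEWAY = "192.168.10.1"


def check_addressing(hosts, netmask, gateway):
    """Return the derived network and a list of human-readable problems."""
    # strict=False lets us derive the network from a host address plus mask.
    network = ipaddress.ip_network(f"{gateway}/{netmask}", strict=False)
    gateway_ip = ipaddress.ip_address(gateway)
    problems = []
    seen = {}
    for name, addr in hosts.items():
        ip = ipaddress.ip_address(addr)
        if ip not in network:
            problems.append(f"{name}: {ip} is outside {network}")
        if ip == gateway_ip:
            problems.append(f"{name}: {ip} collides with the gateway")
        if ip in seen:
            problems.append(f"{name}: {ip} already assigned to {seen[ip]}")
        seen[ip] = name
    return network, problems


network, problems = check_addressing(HOSTS, NETMASK, GATEWAY)
print(f"Network: {network}")
print("Problems:", problems if problems else "none")
```

For the table as given, the derived network is 192.168.10.0/24 and no problems are reported.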

Firewall rules are configured using `ufw` to restrict access to essential services. All external access is routed through a secure VPN connection. See Network security guidelines for details. DNS resolution is handled by an internal DNS server.
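
To illustrate what a restrictive `ufw` policy might look like, the sketch below generates a default-deny rule set and prints the commands rather than running them. The specific services (SSH on port 22 and HTTPS on port 443, both reachable only from the project VLAN) are assumptions made for the example, not the project's actual rules, which are described in the Network security guidelines. The `Rule` class and `ufw_commands` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One allowed inbound service (illustrative, not the real rule set)."""
    port: int
    proto: str
    source: str  # CIDR range allowed to connect


def ufw_commands(rules):
    """Build ufw command strings: deny inbound by default, then allow each rule."""
    commands = [
        "ufw default deny incoming",
        "ufw default allow outgoing",
    ]
    for rule in rules:
        commands.append(
            f"ufw allow from {rule.source} to any port {rule.port} proto {rule.proto}"
        )
    return commands


# Assumed example services, limited to the project VLAN from the table above.
VLAN = "192.168.10.0/24"
example_rules = [
    Rule(port=22, proto="tcp", source=VLAN),   # SSH for administrators
    Rule(port=443, proto="tcp", source=VLAN),  # HTTPS to the web/API gateway
]

for command in ufw_commands(example_rules):
    print(command)
```

Printing the commands first makes it possible to review them under the change management process before anyone runs them with root privileges.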

Cloud Integration

For scalability and disaster recovery, the "AI in the Tropics" project utilizes Amazon Web Services (AWS). Specifically, we leverage:

  • AWS S3: For long-term data archiving and storage.
  • AWS EC2: For on-demand compute resources during peak load.
  • AWS Lambda: For serverless functions and event-driven processing.
  • AWS RDS: For a managed PostgreSQL database replica.

Data synchronization between on-premise servers and AWS is performed using `rsync` and automated scripts. The cloud integration architecture is documented on the Cloud integration documentation page.
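
A synchronization script of this kind could assemble its `rsync` invocation as shown below. This is a sketch under stated assumptions: the source path, the remote user, and the host `sync-target.example` are placeholders, and the destination is assumed to be an SSH-reachable host such as an EC2 instance, since `rsync` cannot write to S3 directly. The function name `build_rsync_command` is hypothetical.

```python
import shlex


def build_rsync_command(source, destination, dry_run=True):
    """Return an rsync argument list for a one-way sync.

    dry_run defaults to True so nothing is transferred until the
    command has been reviewed.
    """
    command = [
        "rsync",
        "--archive",   # preserve permissions, timestamps, and symlinks
        "--compress",  # compress data in transit
        "--partial",   # keep partially transferred files for resuming
    ]
    if dry_run:
        command.append("--dry-run")
    command += [source, destination]
    return command


# Placeholder paths and host, chosen only for illustration.
command = build_rsync_command(
    "/srv/data/archive/",
    "sync@sync-target.example:/srv/archive/",
)

# Print the command for review; a real script could pass the list to
# subprocess.run(command, check=True) once dry_run is set to False.
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.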

Security Considerations

Security is paramount. The following measures are in place:

  • Regular security audits are conducted by the Security team.
  • All servers are hardened according to CIS benchmarks.
  • Intrusion detection and prevention systems are deployed.
  • Data encryption is used both in transit and at rest.
  • Access control is strictly enforced based on the principle of least privilege. See Access control policies.

Future Expansion

Planned expansions include adding a dedicated GPU server for accelerated machine learning and implementing a more robust disaster recovery plan. The Roadmap for server infrastructure details the planned upgrades and improvements.

Planned outages are listed in the Server maintenance schedule. For troubleshooting, refer to the Troubleshooting guide.

