AI in the Borneo Rainforest: Server Configuration

This document details the server configuration for the “AI in the Borneo Rainforest” project, which processes data from remote sensor networks deployed throughout the region. The setup balances performance, reliability, and power efficiency, a balance that is crucial for a remote, environmentally sensitive location. This guide is aimed at new system administrators joining the project.

Overview

The project utilizes a distributed server architecture. A central server cluster located in Kuching, Sarawak (Malaysia) receives, processes, and stores data transmitted from various edge devices within the rainforest. These edge devices include acoustic sensors, camera traps, and environmental sensors, all contributing to a real-time AI-powered monitoring system. The primary goal is to identify and track endangered species, monitor deforestation, and assess the overall health of the rainforest ecosystem. We leverage machine learning models for object detection, sound classification, and anomaly detection. This document focuses on the central cluster configuration.

Central Server Cluster Architecture

The central cluster comprises three primary server roles: data ingestion, processing, and storage. These roles are physically separated across dedicated server nodes for improved performance and fault tolerance. A load balancer distributes incoming data streams across the ingestion servers. A high-speed network interconnect (100GbE) links all nodes within the cluster.

Data Ingestion Servers

These servers are responsible for receiving data from the edge devices, performing initial validation, and queuing it for processing. They utilize a message queue (RabbitMQ) to decouple the ingestion process from the more computationally intensive processing stage.
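As a sketch of what the initial validation step might look like, the snippet below checks an incoming reading before it is queued. The field names, the 5-minute clock-skew tolerance, and the routing-key scheme are illustrative assumptions, not the project's actual sensor schema:

```python
import json
import time

# Hypothetical required fields for an edge-device reading; the real
# schema is defined by the sensor firmware and is not reproduced here.
REQUIRED_FIELDS = {"sensor_id", "sensor_type", "timestamp", "payload"}

def validate_reading(raw: bytes):
    """Parse and validate one incoming sensor message.

    Returns the decoded dict on success, or None if the message is
    malformed and should be rejected before it reaches the queue.
    """
    try:
        msg = json.loads(raw)
    except ValueError:
        return None
    if not REQUIRED_FIELDS.issubset(msg):
        return None
    # Reject readings stamped too far in the future; clock skew on edge
    # devices is common, and 5 minutes is an arbitrary tolerance.
    ts = msg["timestamp"]
    if not isinstance(ts, (int, float)) or ts > time.time() + 300:
        return None
    return msg

def routing_key(msg: dict) -> str:
    """Derive a RabbitMQ routing key such as 'sensor.acoustic' so that
    downstream consumers can subscribe per sensor type."""
    return f"sensor.{msg['sensor_type']}"

# Publishing is then a few lines with the pika client, roughly:
#   channel.basic_publish(exchange="sensors",
#                         routing_key=routing_key(msg),
#                         body=json.dumps(msg))
```

Validating before publishing keeps malformed messages from ever occupying queue or GPU time on the processing nodes.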

Processing Servers

These servers execute the machine learning models. They are equipped with powerful GPUs to accelerate model inference. The processing servers pull data from the message queue, perform analysis, and store the results in the storage servers.
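One detail worth sketching is how a worker might group queued readings before inference: each model should receive a homogeneous batch, since batching like with like is what keeps GPU utilization high. The sensor-type-to-model mapping below is a hypothetical example, not the project's actual model registry:

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical mapping from sensor type to the model that handles it;
# the real models are TensorFlow/PyTorch networks running on the A100s.
MODEL_FOR = {
    "camera": "object_detection",
    "acoustic": "sound_classification",
    "environmental": "anomaly_detection",
}

def batch_by_model(messages: List[dict]) -> Dict[str, List[dict]]:
    """Group queued sensor readings so that each model receives one
    homogeneous batch; readings with an unknown sensor type are dropped."""
    batches: Dict[str, List[dict]] = defaultdict(list)
    for msg in messages:
        model = MODEL_FOR.get(msg.get("sensor_type"))
        if model is not None:
            batches[model].append(msg)
    return dict(batches)
```

A worker would drain a chunk of messages from RabbitMQ, call `batch_by_model`, and hand each batch to the matching model in a single inference call.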

Storage Servers

These servers provide persistent storage for raw sensor data, processed data, and model outputs. They employ a distributed file system (Ceph) to ensure high availability and scalability.
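A consistent object-naming convention matters at this layer, because it is what keeps listing by sensor and by day cheap. The hierarchical layout below is an assumed project convention for illustration, not something Ceph requires:

```python
from datetime import datetime, timezone

def object_key(sensor_id: str, timestamp: float, kind: str = "raw") -> str:
    """Build a hierarchical object name such as
    'raw/cam-07/2024/06/01/1717200000.json'.

    'kind' distinguishes raw sensor data, processed data, and model
    outputs; the UTC date path makes per-day prefix listings cheap.
    """
    t = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return f"{kind}/{sensor_id}/{t:%Y/%m/%d}/{int(timestamp)}.json"
```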

Server Hardware Specifications

The following tables detail the hardware specifications for each server role. All servers run Ubuntu Server 22.04 LTS.

Server Role    | CPU                               | RAM             | Storage                    | GPU                     | Network Interface
Data Ingestion | Intel Xeon Silver 4310 (12 cores) | 64 GB DDR4 ECC  | 2 x 1 TB NVMe SSD (RAID 1) | -                       | 10 GbE
Processing     | AMD EPYC 7763 (64 cores)          | 128 GB DDR4 ECC | 1 x 2 TB NVMe SSD (OS)     | 4 x NVIDIA A100 (40 GB) | 100 GbE
Storage        | Intel Xeon Gold 6338 (32 cores)   | 128 GB DDR4 ECC | 8 x 16 TB SAS HDD (RAID 6) | -                       | 100 GbE

Software Stack

The software stack is carefully chosen to maximize performance, scalability, and maintainability.

  • Operating System: Ubuntu Server 22.04 LTS
  • Message Queue: RabbitMQ 3.9.x
  • Machine Learning Framework: TensorFlow 2.10.x, PyTorch 1.13.x
  • Distributed File System: Ceph Octopus
  • Database: PostgreSQL 14
  • Programming Languages: Python 3.9, C++
  • Load Balancer: HAProxy 2.4.x
  • Monitoring: Prometheus, Grafana - See Monitoring and Alerting for further details.
  • Containerization: Docker - See Docker Configuration for further details.
  • Configuration Management: Ansible - See Ansible Playbooks for further details.

Network Configuration

The network is segmented into four VLANs: one for each of the three server roles and one for management traffic. This enhances security and isolates potential failures.

VLAN ID | Subnet          | Description
10      | 192.168.10.0/24 | Data Ingestion Servers
20      | 192.168.20.0/24 | Processing Servers
30      | 192.168.30.0/24 | Storage Servers
40      | 192.168.40.0/24 | Management Network

All servers have static IP addresses assigned within their respective VLANs. The load balancer has a public IP address and routes traffic to the ingestion servers based on configured rules. DNS resolution is handled by an internal DNS server running on a dedicated virtual machine. See Network Diagram for a visual representation.
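A minimal HAProxy backend definition for the ingestion tier might look like the fragment below. The listen port (5671, i.e. AMQP over TLS) and the specific host addresses within the VLAN 10 subnet are illustrative assumptions; the real hosts are defined in the project's inventory:

```
frontend sensor_ingest
    bind :5671
    mode tcp
    default_backend ingestion

backend ingestion
    mode tcp
    balance leastconn
    # Example addresses from the VLAN 10 subnet above.
    server ingest1 192.168.10.11:5671 check
    server ingest2 192.168.10.12:5671 check
    server ingest3 192.168.10.13:5671 check
```

`leastconn` suits long-lived sensor streams better than round-robin, since it steers new connections to the least-loaded ingestion server.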

Security Considerations

Security is paramount, given the sensitive nature of the data and the remote location.

  • Firewall: UFW is enabled on all servers, with strict rules governing inbound and outbound traffic. See Firewall Configuration for details.
  • SSH Access: SSH access is restricted to authorized personnel only, using key-based authentication.
  • Data Encryption: Data is encrypted at rest and in transit. TLS/SSL is used for all network communication.
  • Regular Security Audits: Regular security audits are conducted to identify and address potential vulnerabilities. Refer to Security Audit Procedures.
  • Intrusion Detection System (IDS): An IDS is deployed to monitor network traffic for malicious activity. See IDS Configuration.

Backup and Disaster Recovery

A comprehensive backup and disaster recovery plan is in place to ensure data durability and system resilience.

Backup Type            | Frequency | Destination              | Retention Policy
Full Backup            | Weekly    | Offsite Storage (AWS S3) | 6 Months
Incremental Backup     | Daily     | Local Storage (RAID 6)   | 1 Month
Transaction Log Backup | Hourly    | Local Storage (RAID 6)   | 7 Days

Regular disaster recovery drills are conducted to test the effectiveness of the plan. See Disaster Recovery Plan for complete details.
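The retention policy in the table above can be enforced by a small pruning helper. The sketch below only decides which backups have expired; the deletion itself (an S3 lifecycle rule or a local cleanup job) is out of scope, and the day counts are an assumed translation of "6 months" and "1 month":

```python
from datetime import datetime, timedelta

# Retention windows from the backup table above.
RETENTION = {
    "full": timedelta(days=182),        # ~6 months
    "incremental": timedelta(days=30),  # 1 month
    "txlog": timedelta(days=7),         # 7 days
}

def expired(backups, now):
    """Return the backups whose age exceeds the retention policy.

    Each backup is a (kind, created_at) pair, where kind mirrors the
    table above: 'full', 'incremental', or 'txlog'.
    """
    return [(kind, created) for kind, created in backups
            if now - created > RETENTION[kind]]
```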

Future Considerations

  • Scaling: As the project grows, the cluster may need to be scaled horizontally by adding more servers.
  • Edge Computing: Moving some processing to the edge devices could reduce latency and bandwidth requirements. Refer to Edge Computing Integration.
  • Model Optimization: Continuously optimizing the machine learning models to improve performance and accuracy is crucial. See Model Training and Deployment.
  • Power Efficiency: Investigating more power-efficient hardware and cooling solutions is important for reducing the environmental impact. See Power Consumption Analysis.

