AI in the New Caledonian Rainforest: Server Configuration

This article details the server configuration powering the “AI in the New Caledonian Rainforest” project, a research initiative that uses artificial intelligence to monitor and analyze biodiversity in the New Caledonian rainforest. The guide is aimed at newcomers to the MediaWiki platform and gives a detailed overview of the technical infrastructure; understanding this setup is essential for anyone contributing to the project's data processing or system maintenance. See the Main Page for a project overview.

Project Overview

The project focuses on real-time analysis of audio and visual data collected from remote sensors deployed within the rainforest. This data is processed using machine learning models to identify species, track population movements, and detect potential threats to the ecosystem. The servers manage data ingestion, model training, and real-time inference. See Data Acquisition for details on sensor deployment.

Server Infrastructure

The infrastructure comprises three primary server roles: Data Ingestion, Processing & Model Training, and API & Visualization. These roles are distributed across a cluster of dedicated hardware, with Linux as the operating system on all servers.

Data Ingestion Servers

These servers are responsible for receiving data streams from the rainforest sensors. They perform initial data validation and storage. Crucially, they handle data buffering to prevent loss during network interruptions. For details on networking see Network Topology.
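The buffering stage can be sketched as a FIFO queue that spills each record to an on-disk spool file, so readings survive a process restart during an outage. This is a minimal illustration only; the spool path, record format, and class name below are assumptions, not the project's actual implementation:

```python
import json
import os
import tempfile
from collections import deque

class SensorBuffer:
    """In-memory FIFO that mirrors records to an on-disk spool so
    sensor readings survive a restart during a network interruption."""

    def __init__(self, spool_path):
        self.spool_path = spool_path
        self.queue = deque()
        # Reload any records spooled before a previous shutdown.
        if os.path.exists(spool_path):
            with open(spool_path) as f:
                for line in f:
                    self.queue.append(json.loads(line))

    def push(self, record):
        self.queue.append(record)
        # Append-only spool: one JSON record per line.
        with open(self.spool_path, "a") as f:
            f.write(json.dumps(record) + "\n")

    def drain(self):
        """Yield and clear buffered records once the uplink is back."""
        while self.queue:
            yield self.queue.popleft()
        # Truncate the spool after a successful drain.
        open(self.spool_path, "w").close()

# Usage: buffer readings while offline, drain when connectivity returns.
buf = SensorBuffer(os.path.join(tempfile.mkdtemp(), "sensor.spool"))
buf.push({"sensor": "mic-07", "db": 42.5})
records = list(buf.drain())
```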

Server Name    | Role                    | CPU                              | RAM             | Storage          | Network Interface
kali-ingest-01 | Data Ingestion          | Intel Xeon Gold 6248R (24 cores) | 128 GB DDR4 ECC | 10 TB RAID 6 HDD | 10 Gbps Ethernet
kali-ingest-02 | Data Ingestion (Backup) | Intel Xeon Gold 6248R (24 cores) | 128 GB DDR4 ECC | 10 TB RAID 6 HDD | 10 Gbps Ethernet

The two ingest servers replicate data between themselves with rsync, providing redundancy. Data is initially stored in a raw format before being transferred to the processing servers. See Data Formats for further details.
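The replication step reduces to an rsync invocation between the two ingest nodes. The sketch below only builds the command line; the source and destination paths are illustrative assumptions, and only the hostnames come from the table above:

```python
import shlex

def build_rsync_command(src, dest_host, dest_path):
    """Build an rsync command for mirroring raw sensor data to the
    backup ingest node. Flags: -a (archive mode, preserves metadata),
    -z (compress in transit), --partial (resume interrupted transfers
    on a flaky link)."""
    args = ["rsync", "-az", "--partial", src, f"{dest_host}:{dest_path}"]
    return shlex.join(args)

# Hypothetical paths, mirroring kali-ingest-01 to kali-ingest-02.
cmd = build_rsync_command("/data/raw/", "kali-ingest-02", "/data/raw/")
# In practice this would run via subprocess.run(shlex.split(cmd), check=True).
```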

Processing & Model Training Servers

These servers handle the computationally intensive tasks of data pre-processing, feature extraction, model training, and model evaluation. They are equipped with high-performance GPUs. We use Python and TensorFlow for model building.

Server Name     | Role                  | CPU                      | RAM             | GPU                 | Storage               | Network Interface
kali-process-01 | Processing & Training | AMD EPYC 7763 (64 cores) | 256 GB DDR4 ECC | NVIDIA A100 (80 GB) | 20 TB RAID 0 NVMe SSD | 40 Gbps InfiniBand
kali-process-02 | Processing & Training | AMD EPYC 7763 (64 cores) | 256 GB DDR4 ECC | NVIDIA A100 (80 GB) | 20 TB RAID 0 NVMe SSD | 40 Gbps InfiniBand
kali-process-03 | Processing & Training | AMD EPYC 7763 (64 cores) | 256 GB DDR4 ECC | NVIDIA A100 (80 GB) | 20 TB RAID 0 NVMe SSD | 40 Gbps InfiniBand

Distributed training is performed using Horovod. Model weights are stored in object storage.
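In Horovod's data-parallel scheme, each worker computes gradients on its own shard and an allreduce averages them before the optimizer step. A framework-free sketch of just that averaging semantics (the worker count and gradient values are toy numbers, not project data):

```python
def allreduce_mean(worker_grads):
    """Average per-parameter gradients across workers, mimicking the
    result of Horovod's allreduce after each backward pass.
    worker_grads: one equal-length gradient vector per worker."""
    n_workers = len(worker_grads)
    n_params = len(worker_grads[0])
    return [
        sum(g[i] for g in worker_grads) / n_workers
        for i in range(n_params)
    ]

# Three workers (like kali-process-01..03), two parameters each.
grads = [[0.2, -0.4], [0.4, -0.2], [0.6, 0.0]]
avg = allreduce_mean(grads)  # every worker applies this same update
```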

API & Visualization Servers

These servers provide an API for accessing processed data and models, as well as a web-based visualization interface. They are responsible for serving predictions to end-users and displaying results. We use Flask as the web framework.

Server Name | Role                         | CPU                               | RAM            | Storage       | Network Interface
kali-api-01 | API & Visualization          | Intel Xeon Silver 4210 (10 cores) | 64 GB DDR4 ECC | 2 TB NVMe SSD | 1 Gbps Ethernet
kali-api-02 | API & Visualization (Backup) | Intel Xeon Silver 4210 (10 cores) | 64 GB DDR4 ECC | 2 TB NVMe SSD | 1 Gbps Ethernet

The API documentation is available at API Documentation. The visualization interface provides interactive maps and charts of species distribution. See Visualization Tools.
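The project serves predictions through Flask, but the request/response shape can be sketched with only the standard library's WSGI support. The endpoint path, payload fields, and example species below are assumptions for illustration, not the project's actual API:

```python
import json
from wsgiref.util import setup_testing_defaults

def prediction_app(environ, start_response):
    """Minimal WSGI app: GET /predictions returns a JSON list of
    recent species detections (hard-coded here for illustration)."""
    if environ["PATH_INFO"] == "/predictions":
        body = json.dumps([
            {"species": "Rhynochetos jubatus", "confidence": 0.97},
        ]).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

# Exercise the app directly with a synthetic WSGI environment.
environ = {}
setup_testing_defaults(environ)
environ["PATH_INFO"] = "/predictions"
status_holder = {}

def start_response(status, headers):
    status_holder["status"] = status

response = b"".join(prediction_app(environ, start_response))
```

A Flask version would express the same handler as a function decorated with `@app.route("/predictions")` returning the same JSON body.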

Software Stack

The following software components, each introduced above, are crucial to the operation of the system:

* Linux – operating system on all servers
* rsync – data replication between the ingest servers
* Python and TensorFlow – model building
* Horovod – distributed model training
* Flask – web framework for the API and visualization servers
