# AI in the Baltic Sea: Server Configuration

This article details the server configuration utilized for the "AI in the Baltic Sea" project, a research initiative focused on real-time data analysis and predictive modeling of environmental conditions within the Baltic Sea region. This guide is intended for newcomers to our server environment and provides a comprehensive overview of the hardware and software stack.

## Project Overview

The "AI in the Baltic Sea" project ingests data from a network of underwater sensors, satellite imagery, and historical datasets. Machine learning algorithms process this data to predict algal blooms, monitor water quality, and track marine life migration patterns. The server infrastructure is designed for high throughput, low latency, and scalability. The pipeline runs in stages: data acquisition collects raw readings, data preprocessing cleans and transforms them, and machine learning models generate predictions, all deployed on a distributed system to handle the large data volumes. See also Project Goals for a high-level overview.
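The pipeline stages described above (acquisition, preprocessing, model inference) can be sketched as plain functions. The function names and the reading schema (`sensor_id`, `value`, `timestamp`) are illustrative assumptions for this article, not the project's actual API:

```python
# Minimal sketch of the pipeline stages. Schema and thresholds are
# illustrative assumptions, not the project's real interfaces.

def ingest(raw_readings):
    """Data acquisition: keep only readings that carry a value."""
    return [r for r in raw_readings if r.get("value") is not None]

def preprocess(readings):
    """Data preprocessing: min-max normalize values into the 0-1 range."""
    values = [r["value"] for r in readings]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    return [{**r, "value": (r["value"] - lo) / span} for r in readings]

def predict(readings):
    """Stand-in for model inference: flag sensors above a threshold."""
    return [r["sensor_id"] for r in readings if r["value"] > 0.8]
```

In the real system each stage would run on a separate tier; here they are chained in-process only to show the data flow.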

## Hardware Infrastructure

The server infrastructure consists of three primary tiers: Data Ingestion, Processing, and Storage. Each tier is built with redundancy and scalability in mind.

### Data Ingestion Tier

This tier handles the reception of data from various sources. It’s designed for high availability and rapid data transfer.

| Component | Specification | Quantity |
|---|---|---|
| Server Type | Dell PowerEdge R750 | 2 |
| CPU | Intel Xeon Gold 6338 (32 cores) | 2 per server |
| RAM | 256 GB DDR4 ECC REG | 2 per server |
| Network Interface | 100 Gbps Ethernet | 2 per server |
| Storage (Temporary) | 2 x 1 TB NVMe SSD (RAID 1) | 2 per server |

These servers receive data over network protocols such as MQTT and HTTP/S. Security considerations are paramount in this tier.
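Whichever protocol delivers a message, the ingestion tier has to validate it before passing it downstream. A hedged sketch of that step, assuming a JSON payload; the field names (`sensor_id`, `salinity`, `temperature`) are illustrative, not the project's actual schema:

```python
import json

# Fields assumed for illustration; the real sensor schema may differ.
REQUIRED_FIELDS = {"sensor_id", "timestamp", "salinity", "temperature"}

def parse_payload(raw: bytes) -> dict:
    """Decode and validate one sensor message arriving via MQTT or HTTP/S.

    Raises ValueError for malformed JSON or missing fields, so the
    caller can reject bad messages without crashing the ingest loop.
    """
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(msg, dict):
        raise ValueError("payload must be a JSON object")
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return msg
```

Rejecting malformed messages at the edge keeps the processing tier free of per-record error handling.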

### Processing Tier

This tier performs the computationally intensive tasks of data cleaning, transformation, and model training/inference.

| Component | Specification | Quantity |
|---|---|---|
| Server Type | Supermicro SYS-2029U-TR4 | 4 |
| CPU | AMD EPYC 7763 (64 cores) | 2 per server |
| GPU | NVIDIA A100 (80 GB) | 2 per server |
| RAM | 512 GB DDR4 ECC REG | 2 per server |
| Storage (Local) | 4 x 4 TB NVMe SSD (RAID 10) | 4 per server |

GPU acceleration is essential for our deep learning frameworks, specifically TensorFlow and PyTorch. We employ containerization with Docker and Kubernetes for efficient resource management.
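On Kubernetes, a training workload claims one of the A100s through the standard `nvidia.com/gpu` device-plugin resource. A minimal sketch of such a manifest; the pod name and container image are placeholders, not the project's actual manifests:

```yaml
# Illustrative Pod spec requesting one NVIDIA GPU via the device plugin.
# Name and image are placeholders for this article.
apiVersion: v1
kind: Pod
metadata:
  name: model-training
spec:
  containers:
    - name: trainer
      image: pytorch/pytorch:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1           # schedules the pod onto a GPU node
```

Requesting the GPU as a resource limit lets the scheduler pack training and inference jobs across the four processing servers without manual node assignment.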

### Storage Tier

The Storage Tier provides persistent storage for raw data, processed data, and model artifacts.

| Component | Specification | Capacity |
|---|---|---|
| Storage System | Dell EMC PowerScale F600 | 1 PB (scalable to 5 PB) |
| File System | Lustre | N/A |
| Network Connectivity | 200 Gbps InfiniBand | N/A |
| Redundancy | Triple-parity RAID | N/A |

Data backup strategies are crucial for data integrity and disaster recovery. We use a tiered storage approach: faster storage for frequently accessed data and slower, cheaper storage for archival purposes.
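The tiering rule can be reduced to a single decision on last-access age. A minimal sketch, assuming a 30-day cutoff (the cutoff and tier names are illustrative, not the project's actual policy):

```python
import time

# Illustrative policy: data untouched for 30 days moves to archive.
ARCHIVE_AFTER_SECONDS = 30 * 24 * 3600

def choose_tier(last_access_ts, now=None):
    """Return 'fast' for recently accessed data, 'archive' otherwise.

    `now` can be injected for testing; it defaults to the current time.
    """
    now = time.time() if now is None else now
    return "archive" if now - last_access_ts > ARCHIVE_AFTER_SECONDS else "fast"
```

A periodic job would apply this function to file metadata and migrate anything that falls into the archive tier.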

## Software Stack

The software stack is designed to support the entire data pipeline, from ingestion to model deployment.
