# Machine Learning Server Configuration

This article describes the recommended server configuration for running machine learning workloads within our infrastructure. It is aimed at newcomers to the system and gives a technical overview of the required hardware and software components. It assumes a basic understanding of Server Administration and the Linux Command Line.

## Introduction

Machine learning (ML) tasks demand significant computational resources. Effective deployment requires careful consideration of CPU, GPU, memory, and storage. This document outlines a recommended configuration, focusing on balancing cost and performance. We’ll cover hardware specifications, software requirements, and essential configuration steps. This server will primarily be used for Model Training and Inference Serving.

## Hardware Specifications

The following table lists the recommended hardware components. Treat these as *minimum* specifications; scaling up to match workload demands is strongly encouraged. Further details on Hardware Procurement can be found on the internal wiki.

| Component | Specification | Notes |
|---|---|---|
| CPU | Intel Xeon Gold 6338 (32 cores) or AMD EPYC 7763 (64 cores) | Higher core counts are beneficial for parallel processing. |
| RAM | 256 GB DDR4 ECC Registered | Crucial for handling large datasets and complex models. |
| GPU | NVIDIA A100 80GB or AMD Instinct MI250X | The GPU is the most critical component for ML workloads. |
| Storage (OS) | 500 GB NVMe SSD | For fast boot times and system responsiveness. |
| Storage (Data) | 4 TB NVMe SSD RAID 0 or 8 TB SATA SSD RAID 10 | Fast storage is essential for data loading and processing. RAID configuration impacts performance and redundancy. |
| Network Interface | 100 GbE | High bandwidth is needed for data transfer. |
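
As a rough illustration of how the CPU and RAM minimums in the table above might be checked on a Linux host, here is a minimal Python sketch. The function names (`mem_total_gib`, `check_minimums`, `report_local_host`) are hypothetical and not part of any existing tooling. The 5% RAM tolerance is an assumption, added because the kernel reserves some memory and a 256 GB machine typically reports slightly less. Note also that `os.cpu_count()` returns logical CPUs, so with SMT enabled a 32-core part would usually report 64.

```python
import os


def mem_total_gib(meminfo_text: str) -> float:
    """Return MemTotal from /proc/meminfo-style text, converted from kiB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # value is in kiB despite the "kB" label
            return kib / (1024 ** 2)
    raise ValueError("MemTotal line not found")


def check_minimums(logical_cpus: int, ram_gib: float,
                   min_cpus: int = 32, min_ram_gib: float = 256.0,
                   ram_tolerance: float = 0.05) -> list[str]:
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    if logical_cpus < min_cpus:
        problems.append(f"CPU: {logical_cpus} logical CPUs, expected at least {min_cpus}")
    # Allow a small shortfall (assumed 5%) for memory reserved by firmware and kernel.
    if ram_gib < min_ram_gib * (1 - ram_tolerance):
        problems.append(f"RAM: {ram_gib:.1f} GiB, expected about {min_ram_gib:.0f} GiB")
    return problems


def report_local_host() -> list[str]:
    """Check the machine this script runs on (Linux only, reads /proc/meminfo)."""
    with open("/proc/meminfo") as f:
        ram_gib = mem_total_gib(f.read())
    return check_minimums(os.cpu_count() or 0, ram_gib)
```

Calling `report_local_host()` on the target server and printing the result would give a quick pass/fail summary. GPU, storage, and network checks are deliberately left out of this sketch.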

## Software Configuration

The operating system of choice is Ubuntu Server 22.04 LTS. This provides a stable and well-supported platform. The following software packages are required:
