# AI in Saba: Server Configuration

This article details the server configuration supporting the Artificial Intelligence (AI) initiatives within the Saba learning platform. This guide is intended for newcomers to the Saba server administration team and provides a technical overview of the hardware and software required to run AI-powered features. Familiarity with Linux server administration and basic networking concepts is recommended.

## Overview

The Saba platform has been undergoing a transformation to integrate AI capabilities, primarily focusing on personalized learning recommendations, automated content tagging, and intelligent assessment. This requires a significant investment in server infrastructure capable of handling the computational demands of machine learning models. The current architecture utilizes a distributed system, separating data storage, model training, and inference services. We will outline the core components below.
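For newcomers, the division of labor between the training and inference services can be illustrated with a toy example: the training tier produces embeddings, and the inference tier scores them per learner. The sketch below is illustrative only; the function names, content IDs, and vectors are hypothetical, not Saba's actual model.

```python
# Toy dot-product scorer: how an inference service might rank content
# for a learner given embeddings produced by the training tier.
# All names and numbers here are made-up examples.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_content(user_vec, content_vecs):
    """Return content IDs sorted by descending similarity to the user."""
    scores = {cid: dot(user_vec, vec) for cid, vec in content_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical embeddings
user = [0.9, 0.1, 0.4]
catalog = {
    "course-101": [1.0, 0.0, 0.5],
    "course-202": [0.0, 1.0, 0.1],
}
print(rank_content(user, catalog))  # course-101 ranks first for this user
```

In production the catalog embeddings would come from the model-training servers and be cached (for example in Redis) so the inference tier never touches the training hardware.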

## Hardware Specifications

The following table details the hardware specifications for the three primary server roles: Data Storage, Model Training, and Inference.

| Server Role | CPU | RAM | Storage | Network Interface |
|---|---|---|---|---|
| Data Storage | 2 × Intel Xeon Gold 6248R (24 cores/CPU) | 512 GB DDR4 ECC REG | 100 TB NVMe SSD (RAID 10) | 100 GbE |
| Model Training | 2 × AMD EPYC 7763 (64 cores/CPU) | 1 TB DDR4 ECC REG | 2 × 8 TB NVMe SSD (RAID 1) + 20 TB HDD (data backup) | 100 GbE |
| Inference | 4 × Intel Xeon Silver 4210 (10 cores/CPU) | 256 GB DDR4 ECC REG | 4 TB NVMe SSD | 25 GbE |

These specifications are subject to change based on evolving AI model complexity and user load. Regular monitoring of server performance is crucial.
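As a starting point for that monitoring, a standard-library-only health check can flag disk and CPU pressure. This is a minimal sketch, not Saba's actual monitoring stack; the thresholds and mount point are assumptions.

```python
# Minimal health-check sketch (standard library only).
# Thresholds and the "/" mount point are illustrative assumptions.
import os
import shutil

def disk_usage_percent(path="/"):
    """Percentage of disk space in use at `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def load_per_core():
    """1-minute load average divided by available cores (POSIX only)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)

def health_report(disk_threshold=85.0, load_threshold=1.0):
    """Return a list of warning strings; an empty list means healthy."""
    warnings = []
    if disk_usage_percent() > disk_threshold:
        warnings.append("disk usage high")
    if load_per_core() > load_threshold:
        warnings.append("CPU load high")
    return warnings

print(health_report())
```

In practice a dedicated agent (Prometheus node exporter or similar) would replace a script like this, but the same signals apply.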

## Software Stack

The software stack is built upon a foundation of Ubuntu Server 22.04 LTS. Specific software versions are maintained via our internal package repository to ensure consistency and compatibility.

| Component | Version | Purpose |
|---|---|---|
| Operating System | Ubuntu Server 22.04 LTS | Base operating system for all servers. |
| Python | 3.10.6 | Primary language for AI model development and deployment. |
| TensorFlow | 2.12.0 | Machine learning framework. |
| PyTorch | 2.0.1 | Alternative machine learning framework. |
| PostgreSQL | 14.7 | Database for storing training data and model metadata. See also Database Administration. |
| Redis | 7.0.12 | In-memory data store for caching and fast data access. |
| Docker | 20.10.21 | Containerization platform for deploying AI services. |
| Kubernetes | 1.26.3 | Container orchestration platform. See also Kubernetes Deployment. |

All code is managed using Git version control and hosted on our internal GitLab instance. CI/CD pipelines are used to automate the build, testing, and deployment process.
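One job such a pipeline might run is a version-drift check against the pinned versions above. The sketch below is hypothetical (the helper name and report format are not part of Saba's actual pipeline); the expected versions are copied from the table in this article.

```python
# Hypothetical CI helper: compare versions reported by a host against
# the pins documented above. A sketch, not Saba's real pipeline code.
EXPECTED = {
    "python": "3.10.6",
    "tensorflow": "2.12.0",
    "pytorch": "2.0.1",
    "postgresql": "14.7",
    "redis": "7.0.12",
}

def find_drift(installed):
    """Return {name: (expected, installed)} for every mismatch."""
    return {
        name: (want, installed.get(name))
        for name, want in EXPECTED.items()
        if installed.get(name) != want
    }

# Example: one host reports an out-of-date TensorFlow
report = {"python": "3.10.6", "tensorflow": "2.11.1",
          "pytorch": "2.0.1", "postgresql": "14.7", "redis": "7.0.12"}
print(find_drift(report))  # {'tensorflow': ('2.12.0', '2.11.1')}
```

A non-empty result would fail the pipeline stage, forcing the host back onto the internal package repository's pinned versions.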

## Network Configuration

The AI servers are deployed within a dedicated VLAN to isolate traffic and enhance security. The network architecture follows the conventional three-tier model: an access layer connecting individual servers, an aggregation (distribution) layer consolidating rack traffic, and a core layer providing high-speed routing between tiers.
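A common sanity check when working in an isolated VLAN is verifying that a host address actually falls inside the VLAN's subnet. A minimal sketch using the standard library's `ipaddress` module follows; the `10.50.0.0/24` range is a made-up example, not Saba's actual addressing plan.

```python
# Illustrative VLAN membership check. The subnet below is a
# hypothetical example, not Saba's real addressing plan.
import ipaddress

AI_VLAN = ipaddress.ip_network("10.50.0.0/24")

def in_ai_vlan(addr):
    """True if `addr` belongs to the dedicated AI VLAN."""
    return ipaddress.ip_address(addr) in AI_VLAN

print(in_ai_vlan("10.50.0.17"))   # True
print(in_ai_vlan("192.168.1.5"))  # False
```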
