AI in the World Bank: Server Configuration & Architecture
This article details the server configuration supporting Artificial Intelligence (AI) initiatives within the World Bank. It is geared towards new system administrators and developers onboarding to the platform. We will cover hardware, software, and network considerations crucial for maintaining a robust and scalable AI infrastructure. Understanding these details is paramount for successful deployment and maintenance of AI-driven solutions at the World Bank. This document assumes basic familiarity with Linux server administration and networking concepts. See Server Administration Basics for introductory material.
Overview
The World Bank leverages AI for a variety of applications, including risk assessment, fraud detection, project monitoring, and economic forecasting. These applications demand significant computational resources and specialized software. Our current architecture is a hybrid model, utilizing a combination of on-premise servers and cloud resources (primarily Amazon Web Services). This allows for flexibility and scalability while maintaining control over sensitive data. This document focuses specifically on the on-premise components. For information on the cloud infrastructure, please refer to the Cloud Services Documentation.
Hardware Infrastructure
The core of our on-premise AI infrastructure consists of a cluster of high-performance servers. The following table details the specifications of a typical server node:
Component | Specification
---|---
CPU | 2 x Intel Xeon Gold 6338 (32 cores per CPU, 64 total)
RAM | 512 GB DDR4 ECC Registered
Storage | 4 x 4 TB NVMe SSD (RAID 0) for OS and active data; 8 x 16 TB SAS HDD (RAID 6) for long-term storage
GPU | 4 x NVIDIA A100 (80 GB HBM2e)
Network Interface | 2 x 100 GbE Mellanox ConnectX-6
Power Supply | 2 x 2000 W redundant power supplies
These servers are housed in a dedicated, climate-controlled server room with redundant power and cooling. Details on the physical security of the server room can be found in the Physical Security Protocol. The network backbone utilizes a high-speed InfiniBand network to minimize latency between nodes during model training.
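When provisioning or validating a node, it helps to sanity-check the usable capacity implied by the two storage tiers above. The sketch below applies the standard RAID 0 and RAID 6 capacity rules to the drive counts and sizes from the specification table (the function names are ours, for illustration only):

```python
def raid0_capacity(drives: int, size_tb: float) -> float:
    # RAID 0 stripes across all drives: full capacity, no redundancy.
    return drives * size_tb

def raid6_capacity(drives: int, size_tb: float) -> float:
    # RAID 6 reserves two drives' worth of space for dual parity.
    return (drives - 2) * size_tb

# Per-node tiers from the specification table above.
fast_tier = raid0_capacity(4, 4)    # 4 x 4 TB NVMe -> 16 TB
bulk_tier = raid6_capacity(8, 16)   # 8 x 16 TB SAS -> 96 TB usable

print(f"NVMe tier: {fast_tier} TB, HDD tier: {bulk_tier} TB")
```

Note that RAID 0 on the NVMe tier trades redundancy for speed: a single drive failure loses the array, so only the OS and reproducible active data belong there.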
Software Stack
The software stack is built around a Linux operating system (CentOS 7) and a variety of AI/ML frameworks. Key components include:
- Operating System: CentOS 7, hardened according to Security Hardening Guidelines
- Containerization: Docker and Kubernetes are used for application deployment and orchestration. See Docker Deployment Guide and Kubernetes Cluster Management.
- AI/ML Frameworks: TensorFlow, PyTorch, scikit-learn, and XGBoost are the primary frameworks used for model development and training.
- Data Storage: Hadoop Distributed File System (HDFS) is used for large-scale data storage and processing.
- Database: PostgreSQL is used as the primary database for metadata and model management. Refer to the Database Administration Guide.
- Monitoring: Prometheus and Grafana are used for system monitoring and alerting.
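On this stack, a training workload typically reaches the A100s as a Kubernetes pod that requests GPUs through the `nvidia.com/gpu` extended resource. The manifest below is a hypothetical sketch built in Python (Kubernetes accepts JSON as well as YAML); the pod name, image path, and command are illustrative placeholders, not the Bank's actual registry entries:

```python
import json

# Hypothetical pod manifest requesting one A100. The image and names
# are illustrative placeholders, not actual cluster values.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "train-example"},
    "spec": {
        "restartPolicy": "Never",
        "containers": [{
            "name": "trainer",
            "image": "registry.example.com/ml/trainer:latest",
            "command": ["python", "train.py"],
            # The NVIDIA device plugin exposes GPUs as a schedulable resource.
            "resources": {"limits": {"nvidia.com/gpu": 1}},
        }],
    },
}

# Can be applied with: kubectl apply -f pod.json
print(json.dumps(pod, indent=2))
```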
The following table details the software versions currently in use:
Software | Version
---|---
CentOS | 7.9.2009
Docker | 20.10.7
Kubernetes | 1.23.4
TensorFlow | 2.9.1
PyTorch | 1.12.1
Hadoop | 3.3.1
PostgreSQL | 13.7
Prometheus | 2.35.0
Grafana | 8.4.5
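When auditing a node against this table, note that comparing dotted version strings lexically is a common pitfall ("2.9.1" sorts after "2.35.0" as a string). A minimal sketch of a numeric comparison (the helper names are ours; the baseline map mirrors a subset of the table):

```python
def parse_version(v: str) -> tuple:
    # Split a dotted version string into an integer tuple so that
    # comparisons are numeric, not lexicographic.
    return tuple(int(part) for part in v.split("."))

# Documented baselines, taken from the version table above.
EXPECTED = {
    "docker": "20.10.7",
    "kubernetes": "1.23.4",
    "prometheus": "2.35.0",
}

def meets_baseline(name: str, installed: str) -> bool:
    # True if the installed version is at least the documented baseline.
    return parse_version(installed) >= parse_version(EXPECTED[name])

print(meets_baseline("prometheus", "2.40.1"))  # True
print(meets_baseline("docker", "19.3.8"))      # False
```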
Network Configuration
The server cluster is connected to the World Bank's internal network via a dedicated VLAN. The network is segmented to isolate the AI infrastructure from other systems. The following table outlines the key network parameters:
Parameter | Value
---|---
VLAN ID | 100
Subnet Mask | 255.255.255.0
Gateway | 192.168.100.1
DNS Servers | 192.168.1.10, 192.168.1.11
Firewall | iptables, configured according to Firewall Ruleset
All traffic to and from the AI servers is monitored and logged for security purposes. See the Network Security Policy for detailed information. Access to the AI servers is restricted to authorized personnel and requires multi-factor authentication. Connection details for accessing the servers can be found in the Access Control List.
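The gateway and subnet mask above imply that the AI VLAN is the 192.168.100.0/24 network. Python's standard `ipaddress` module can validate a host assignment against it (a minimal sketch; the candidate host address is an example, not a real allocation):

```python
import ipaddress

# 255.255.255.0 with gateway 192.168.100.1 implies 192.168.100.0/24.
ai_vlan = ipaddress.ip_network("192.168.100.0/255.255.255.0")

gateway = ipaddress.ip_address("192.168.100.1")
candidate = ipaddress.ip_address("192.168.100.42")  # example host

print(ai_vlan)                      # 192.168.100.0/24
print(gateway in ai_vlan)           # True
print(candidate in ai_vlan)         # True
print(ai_vlan.num_addresses - 2)    # 254 usable host addresses
```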
Future Considerations
We are continually evaluating new technologies to improve the performance and scalability of our AI infrastructure. Future plans include:
- Upgrading to newer generation GPUs (NVIDIA H100).
- Implementing a distributed training framework (Horovod).
- Expanding our use of cloud resources for burst capacity.
- Exploring the use of specialized hardware accelerators (e.g., TPUs).
- Further integration with the Data Lake Initiative.
Related Documents
- Server Administration Basics
- Cloud Services Documentation
- Physical Security Protocol
- InfiniBand network
- Security Hardening Guidelines
- Docker Deployment Guide
- Kubernetes Cluster Management
- Database Administration Guide
- Firewall Ruleset
- Network Security Policy
- Access Control List
- Data Lake Initiative
- Backup and Recovery Procedures
- Disaster Recovery Plan
- Change Management Process