Croatian National Cloud

From Server rental store
Revision as of 00:28, 29 August 2025 by Admin (talk | contribs) (Automated server configuration article)

Croatian National Cloud: Technical Documentation

This document details the hardware configuration of the Croatian National Cloud (HNC), a high-performance computing and cloud infrastructure designed to serve Croatian research institutions, government agencies, and private-sector companies. It covers hardware specifications, performance characteristics, recommended use cases, comparisons with similar configurations, and essential maintenance considerations.

1. Hardware Specifications

The Croatian National Cloud uses a hyperconverged infrastructure (HCI) approach, built around a dense cluster of high-performance servers. The core of the infrastructure consists of approximately 128 compute nodes, plus a dedicated storage cluster and network fabric. The sections below detail the specifications of a single compute node, the storage node configuration, and the networking components.

1.1 Compute Node Specifications

Each compute node is designed for maximum computational density and efficiency.

Component | Specification
CPU | Dual Intel Xeon Platinum 8380 (40 cores/80 threads per CPU); 80 cores/160 threads total
CPU Clock Speed | 2.3 GHz base, 3.4 GHz Turbo Boost Max 3.0
Chipset | Intel C621A
RAM | 512 GB DDR4-3200 ECC Registered DIMMs (16 x 32 GB)
RAM Configuration | 8 channels per CPU, 16 channels total
Storage – Local Boot | 480 GB NVMe PCIe Gen4 SSD
Network Interface Card (NIC) | Dual-Port 100GbE Mellanox ConnectX-6 Dx
Power Supply Unit (PSU) | 2000W 80+ Platinum redundant power supplies
Motherboard | Supermicro X12DPG-QT6
Form Factor | 2U rack server
RAID Controller | Integrated Intel RSTe RAID controller

Each node also includes a dedicated BMC (Baseboard Management Controller) for remote management and monitoring, utilizing an IPMI 2.0 interface. The BMC is accessed via a dedicated out-of-band network. See BMC Configuration for more details.
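As an illustration of how the IPMI interface can be scripted, the sketch below parses `ipmitool sensor`-style output and flags temperature readings above a limit. The sensor names, sample readings, and 35 °C threshold are illustrative assumptions, not values from the actual BMC configuration:

```python
# Sketch: parse output in the style of
#   ipmitool -I lanplus -H <bmc-ip> -U <user> -P <pass> sensor
# and flag temperature sensors above a threshold.
# Sensor names and the 35 degC limit are illustrative assumptions.

def overheating_sensors(ipmitool_output: str, limit_c: float = 35.0):
    """Return (sensor, reading) pairs whose temperature exceeds limit_c."""
    hot = []
    for line in ipmitool_output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3 or fields[2] != "degrees C":
            continue  # skip non-temperature (e.g. discrete) sensors
        try:
            reading = float(fields[1])
        except ValueError:
            continue  # skip "na" readings from absent sensors
        if reading > limit_c:
            hot.append((fields[0], reading))
    return hot

sample = """\
Inlet Temp       | 23.000     | degrees C  | ok
CPU1 Temp        | 61.000     | degrees C  | ok
PSU1 Status      | 0x1        | discrete   | 0x0180
Exhaust Temp     | 38.000     | degrees C  | ok
"""

print(overheating_sensors(sample))  # sensors above 35 degC
```

A helper like this can feed the alerting pipeline described in section 5.5; in production the output would come from the out-of-band network rather than a hard-coded sample.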

1.2 Storage Node Specifications

The storage infrastructure is built around a distributed, scale-out architecture based on software-defined storage.

Component | Specification
CPU | Dual Intel Xeon Gold 6338 (32 cores/64 threads per CPU)
RAM | 256 GB DDR4-3200 ECC Registered DIMMs (8 x 32 GB)
Storage – Data Drives | 16 x 16 TB SAS 12 Gbps 7.2K RPM HDD (Seagate Exos X16) per node; 256 TB raw per node
Storage – SSD Cache | 8 x 1.92 TB NVMe PCIe Gen4 SSD (Samsung PM1733) per node; 15.36 TB cache per node
RAID Controller | Hardware RAID controller with write-back caching (Broadcom MegaRAID SAS 9361-8i)
Network Interface Card (NIC) | Dual-Port 100GbE Mellanox ConnectX-6 Dx
Power Supply Unit (PSU) | 1600W 80+ Platinum redundant power supplies
Motherboard | Supermicro X12DPi-N6
Form Factor | 2U rack server

There are 32 storage nodes, providing a total raw storage capacity of 8.192 Petabytes. The storage is deployed using a distributed file system – see Distributed File System Architecture for further details. Data redundancy is achieved through erasure coding, providing equivalent protection to RAID 6 with improved efficiency.
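The efficiency gain of erasure coding over replication can be sketched numerically. The 8+2 split below (k=8 data, m=2 parity shards) is an illustrative assumption; the document does not specify the actual data/parity ratio:

```python
# Usable capacity under k+m erasure coding. The 8+2 split is an
# illustrative assumption -- it tolerates two simultaneous failures,
# matching RAID 6, while wasting far less capacity than replication.

def usable_capacity(raw_tb: float, k: int, m: int) -> float:
    """Usable TB after reserving m parity shards for every k data shards."""
    return raw_tb * k / (k + m)

raw_tb = 32 * 256                              # 32 storage nodes x 256 TB raw
ec_usable = usable_capacity(raw_tb, k=8, m=2)  # 6553.6 TB usable
mirror_usable = raw_tb / 3                     # ~2730.7 TB under 3x replication

print(f"Erasure-coded usable:  {ec_usable:.1f} TB")
print(f"3x-replicated usable:  {mirror_usable:.1f} TB")
```

Under these assumptions, 8+2 erasure coding yields roughly 80% of raw capacity as usable space, versus about 33% for triple replication with the same failure tolerance.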

1.3 Networking Infrastructure

The network infrastructure is a critical component of the HNC, providing high bandwidth and low latency connectivity between compute nodes, storage nodes, and external networks.

Component | Specification
Core Switches | Arista 7050X Series (48 x 100GbE, 8 x 400GbE)
Top-of-Rack (ToR) Switches | Arista 7280E Series (48 x 25GbE, 6 x 100GbE)
Interconnect Technology | InfiniBand HDR (200 Gbps) for inter-node communication
External Connectivity | Multiple 100GbE connections to the Croatian Academic and Research Network (CARNET) and international peering points. See Network Topology for a diagram.
Load Balancing | HAProxy with dynamic routing

The network is segmented into multiple VLANs for security and isolation. Firewalling is provided by dedicated hardware firewalls, integrated with intrusion detection and prevention systems. See Network Security Policy for detailed information.
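As an illustration of the HAProxy load-balancing layer, a minimal frontend/backend pair might look like the following; the addresses, ports, and backend names are hypothetical, not the HNC's actual configuration:

```ini
# Illustrative HAProxy sketch -- backend addresses and names are
# placeholders, not the HNC's real topology.
frontend https_in
    bind *:443
    mode tcp
    default_backend web_nodes

backend web_nodes
    mode tcp
    balance roundrobin
    server node1 10.0.10.11:443 check
    server node2 10.0.10.12:443 check
```

The `check` keyword enables periodic health checks, so traffic is routed away from a node that stops responding.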


2. Performance Characteristics

The HNC has been subjected to extensive benchmarking to characterize its performance capabilities. Results are presented below.

2.1 Compute Node Benchmarks

  • **SPEC CPU 2017:** SPECrate scores of approximately 150 (integer) and 300 (floating point) per node. These scores indicate strong performance in both integer and floating-point workloads. Detailed results can be found in SPEC CPU 2017 Results.
  • **High Performance Linpack (HPL):** Achieved a sustained performance of 1.8 PFLOPS on the entire cluster. See HPL Benchmark Report for details.
  • **STREAM Triad:** Memory bandwidth of 750 GB/s, demonstrating excellent memory performance.
  • **IOzone:** Local NVMe SSDs achieved speeds of up to 7 GB/s for sequential reads and writes.
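The Triad kernel behind the STREAM figure is simple enough to sketch. The NumPy version below measures whatever host runs it, not an HNC node; the array size and repeat count are arbitrary choices:

```python
# Minimal STREAM-Triad-style kernel: a = b + scalar * c.
# Reports achievable bandwidth on the machine running this script;
# array size (10M doubles) and repeat count are arbitrary choices.
import time
import numpy as np

def triad_bandwidth_gbs(n: int = 10_000_000, repeats: int = 5) -> float:
    b = np.random.rand(n)
    c = np.random.rand(n)
    scalar = 3.0
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        a = b + scalar * c
        best = min(best, time.perf_counter() - t0)
    # Triad touches three 8-byte-double arrays: two reads plus one write.
    return 3 * n * 8 / best / 1e9

print(f"Triad bandwidth: {triad_bandwidth_gbs():.1f} GB/s")
```

Taking the best of several repeats follows the convention of the original STREAM benchmark, which reports the fastest pass to minimize timing noise.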

2.2 Storage Cluster Benchmarks

  • **FIO (Flexible I/O Tester):** Sustained aggregate IOPS of 3 million with 4KB random reads, and 1.5 million IOPS with 4KB random writes.
  • **IOR:** Aggregate bandwidth of 120 GB/s for sequential reads and writes.
  • **NFS Benchmarks:** Average throughput of 80 GB/s when accessed over NFS. See Storage Performance Analysis for more details.
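A 4 KB random-read job of the kind behind the FIO figures can be expressed as a short job file. The device path, queue depth, and job count below are illustrative assumptions; the actual job files used for these results are not published here:

```ini
# Illustrative fio job -- device path, iodepth, and numjobs are assumptions.
[global]
ioengine=libaio
direct=1
bs=4k
runtime=60
time_based=1
group_reporting=1

[randread-4k]
rw=randread
filename=/dev/nvme0n1
iodepth=32
numjobs=8
```

Setting `direct=1` bypasses the page cache so the drive, not host memory, is what gets measured.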

2.3 Real-World Performance

  • **Molecular Dynamics Simulations (GROMACS):** Significantly reduced simulation times compared to previous infrastructure, enabling larger and more complex simulations.
  • **Machine Learning Training (TensorFlow):** Accelerated training times for deep learning models, particularly for large datasets.
  • **Data Analytics (Spark):** Improved query performance for large-scale data analytics tasks. See Spark Performance Optimization for details.



3. Recommended Use Cases

The Croatian National Cloud is suitable for a wide range of applications, including:

  • **Scientific Computing:** Research in fields such as physics, chemistry, biology, and astronomy. Ideal for computationally intensive simulations and data analysis.
  • **Machine Learning and Artificial Intelligence:** Training and deployment of machine learning models for various applications, including image recognition, natural language processing, and predictive analytics.
  • **Big Data Analytics:** Processing and analyzing large datasets to extract valuable insights.
  • **Genomics Research:** Analyzing genomic data to identify disease markers and develop personalized medicine.
  • **Financial Modeling:** Running complex financial models and simulations.
  • **Weather Forecasting:** Running high-resolution weather forecasting models.
  • **Digital Archiving:** Long-term storage and preservation of digital assets.
  • **Disaster Recovery:** Providing a secure and reliable disaster recovery solution.
  • **High-Performance Web Applications:** Hosting and scaling demanding web applications.


4. Comparison with Similar Configurations

The HNC’s configuration is comparable to other national-level cloud infrastructures. The following table compares it with two similar systems: the EuroHPC supercomputers and the German National Cloud (DFNCloud).

Feature | Croatian National Cloud (HNC) | EuroHPC Supercomputers (Example: LUMI) | DFNCloud
Compute Node CPU | Dual Intel Xeon Platinum 8380 | AMD EPYC Milan 7763 (64 cores/128 threads) | Intel Xeon Scalable (varying models)
Compute Node RAM | 512 GB DDR4-3200 | 2 TB DDR4 | 128-512 GB DDR4
Compute Node Storage | 480 GB NVMe SSD | N/A (compute nodes rely primarily on shared storage) | N/A (compute nodes rely primarily on shared storage)
Storage Capacity | 8.192 PB | >10 PB | >10 PB
Interconnect | InfiniBand HDR 200 Gbps | Slingshot-11 | InfiniBand, Ethernet
Network Bandwidth | 100/400 GbE external | 400 GbE external | 100 GbE external
Primary Use Case | Broad range, incl. scientific computing, AI/ML | High-performance scientific computing | Cloud services, data storage, research infrastructure

The HNC balances compute power, storage capacity, and network performance, making it versatile across a wider range of applications than the more specialized EuroHPC systems. Compared to the DFNCloud, the HNC offers greater peak performance thanks to its more powerful CPUs and faster interconnect. See Competitive Analysis for a more in-depth comparison.

5. Maintenance Considerations

Maintaining the HNC requires careful planning and execution.

5.1 Cooling

The high density of the servers necessitates a robust cooling system. The data center utilizes a hot aisle/cold aisle containment strategy with in-row cooling units. Ambient temperature is maintained at 22°C ± 2°C. Regular monitoring of server inlet temperatures is critical to prevent overheating. See Data Center Cooling System for more information.

5.2 Power Requirements

The entire infrastructure consumes approximately 800 kW of power. The data center is equipped with redundant power supplies and UPS (Uninterruptible Power Supply) systems to ensure continuous operation. Power usage effectiveness (PUE) is maintained at 1.4. See Power Management Plan for details.
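The PUE figure fixes the total facility draw, since PUE is defined as total facility power divided by IT equipment power. A quick sketch of the arithmetic, using the 800 kW and 1.4 figures quoted above:

```python
# Facility power from IT load and PUE, where
# PUE = total facility power / IT equipment power.

def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility draw implied by an IT load and a PUE ratio."""
    return it_load_kw * pue

total = facility_power_kw(800, 1.4)   # 1120.0 kW total facility draw
overhead = total - 800                # 320.0 kW cooling/distribution overhead
print(f"Total: {total:.0f} kW, non-IT overhead: {overhead:.0f} kW")
```

In other words, every kW of IT load costs an additional 0.4 kW in cooling and power-distribution losses at this PUE.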

5.3 Software Updates

Regular software updates are crucial for security and stability. A centralized patch management system is used to deploy updates to all servers. Updates are tested in a staging environment before being rolled out to production. See Software Update Procedure.

5.4 Hardware Maintenance

Preventative maintenance is performed on all hardware components on a scheduled basis. This includes cleaning, inspection, and replacement of components as needed. A detailed hardware inventory is maintained to track all assets. See Hardware Inventory Management.

5.5 Monitoring and Alerting

A comprehensive monitoring system is in place to track the health and performance of all components. Alerts are configured to notify administrators of any issues. The monitoring system integrates with the incident management system. See System Monitoring Configuration. Centralized logging is implemented using the ELK stack (Elasticsearch, Logstash, Kibana).


Related Documentation

  • Distributed File System Architecture
  • BMC Configuration
  • Network Topology
  • Network Security Policy
  • SPEC CPU 2017 Results
  • HPL Benchmark Report
  • Storage Performance Analysis
  • Spark Performance Optimization
  • Competitive Analysis
  • Data Center Cooling System
  • Power Management Plan
  • Software Update Procedure
  • Hardware Inventory Management
  • System Monitoring Configuration
  • Incident Management System
  • Data Backup and Recovery Plan
  • Disaster Recovery Procedures


