Compute Services
Canopy Wave uses virtualization technology to deliver world-leading GPU performance for AI training and inference
Massively Parallel Processing Power
AI workloads require performing millions (or billions) of mathematical operations simultaneously
GPUs have thousands of cores designed for parallel computation, ideal for training and running neural networks efficiently
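As a minimal illustration (assuming PyTorch with CUDA support is installed; the matrix sizes are arbitrary), a single matrix multiplication fans out across thousands of GPU cores at once:

```python
import torch

# A minimal sketch, assuming PyTorch with CUDA support is installed.
device = "cuda" if torch.cuda.is_available() else "cpu"

# One large matrix multiply is executed across thousands of GPU cores.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

c = a @ b  # launched as a parallel GPU kernel
if device == "cuda":
    torch.cuda.synchronize()  # wait for the asynchronous kernel to finish
print(c.shape, c.device)
```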
Canopy Wave on-demand GPU Cluster
NVIDIA GPUs
Featuring access to NVIDIA HGX H100 and HGX H200 platforms, connected with NVLink and 400G RoCEv2 or InfiniBand networking
Multi-GPU instances
Train and fine-tune AI models on the instance type that best suits your needs: 1x, 2x, 4x, 8x, and up to 64 NVIDIA GPU instances, truly on-demand and billed by the minute
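For the multi-GPU case, the standard pattern is one process per GPU with synchronized gradients; a minimal sketch using PyTorch DistributedDataParallel (the model and sizes are placeholders):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Minimal multi-GPU training skeleton; launch with one process per GPU, e.g.
#   torchrun --nproc_per_node=8 train.py
def main():
    dist.init_process_group("nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])  # syncs gradients across GPUs
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    x = torch.randn(32, 1024, device=local_rank)
    loss = model(x).pow(2).mean()
    loss.backward()  # gradient all-reduce happens here
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```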
Canopy Wave private cloud
The best GPU cluster performance in the industry, with 99.99% uptime. With all your GPUs in the same datacenter, your workloads and privacy are protected
Leadership in AI-Optimized H100 and H200
- ▶
High-end GPU platforms custom-built for AI, featuring large numbers of Tensor Cores, NVLink, and the Transformer Engine
- ▶
Tailored for modern AI workloads and benchmark leaders in training and inference performance
NVIDIA HGX H200
Unmatched Memory Bandwidth & Capacity for Large AI Models
141 GB of HBM3e memory
Large language models (LLMs) and generative AI systems must process huge datasets and massive parameter matrices. Speed and scale depend heavily on how much memory the GPU can access and how fast it can access it
4.8 TB/s memory bandwidth
The fastest memory bandwidth of any NVIDIA GPU to date
Optimized for memory-bound workloads (see the sizing sketch after this list), including:
- • Large transformer models
- • Retrieval-augmented generation (RAG)
- • Generative vision-language models
- • Inference on massive context windows (e.g. greater than 100k tokens)
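A rough sizing exercise shows why these workloads are memory-bound (the model size, layer count, and context length below are illustrative assumptions, not a specific benchmark):

```python
# Back-of-the-envelope memory sizing for long-context LLM inference.
params_b    = 70       # assumed model size: 70B parameters
bytes_per_p = 2        # FP16/BF16 weights
layers      = 80       # assumed transformer layer count
kv_heads    = 8        # assumed KV heads (grouped-query attention)
head_dim    = 128
ctx_tokens  = 100_000  # long context window

weights_gb = params_b * 1e9 * bytes_per_p / 1e9
# KV cache: 2 (K and V) * layers * kv_heads * head_dim * tokens * bytes
kv_gb = 2 * layers * kv_heads * head_dim * ctx_tokens * bytes_per_p / 1e9

print(f"weights ~{weights_gb:.0f} GB, KV cache ~{kv_gb:.0f} GB")
# ~140 GB of weights plus ~33 GB of KV cache is already on the order of
# a 141 GB H200, and every generated token must stream a large fraction
# of it, which is why bandwidth dominates inference throughput.
```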
NVIDIA HGX H100
Transformer Engine: Purpose-Built for Training and Running Large AI Models
Higher Accuracy
Transformer Engine uses FP8 precision (8-bit floating point) with dynamic range scaling
Better Performance
Delivers up to 9x faster training and 30x faster inference versus prior-generation GPUs such as the A100
Flexible Configurations
Dynamically switches between FP8 and FP16/FP32 for an optimal balance of accuracy and speed (see the FP8 sketch below)
Better Access to Compute
Includes 80 billion transistors and 80 GB of HBM3 memory, and supports NVLink and PCIe 5.0 for fast interconnects
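A minimal sketch of FP8 execution using NVIDIA's Transformer Engine Python API (transformer_engine.pytorch; requires the library and an H100/H200-class GPU, and the layer sizes here are arbitrary):

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# FP8 with delayed scaling: E4M3 forward, E5M2 backward (HYBRID format).
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

model = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(32, 1024, device="cuda")

# Inside this context, supported ops run in FP8 with dynamic range
# scaling; unsupported ops fall back to higher precision automatically.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = model(x)

y.sum().backward()  # backward also follows the FP8 recipe
```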
Why NVIDIA
CUDA is the de facto standard for AI/ML workloads, with deep integration into frameworks like TensorFlow and PyTorch. It's not just the hardware but the surrounding ecosystem that delivers broad compatibility
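That integration is visible in how little code it takes to target the GPU from a framework; a trivial PyTorch sketch:

```python
import torch

# Frameworks reach CUDA directly; no kernel code is required.
print(torch.cuda.is_available())      # True on a CUDA-capable machine
print(torch.cuda.get_device_name(0))  # e.g. an H100 or H200

model = torch.nn.Linear(10, 10).to("cuda")  # cuBLAS/cuDNN used underneath
x = torch.randn(4, 10, device="cuda")
print(model(x).device)  # cuda:0
```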
CPU Nodes
Our CPU instances are optimized for general-purpose, compute-heavy, and memory-bound applications, providing flexibility and performance at scale
Processor
Instances run on the latest 6th-Gen Intel Xeon Scalable processors, offering up to 64 vCPUs per instance
Memory
Each instance supports up to 256 GB of DIMM memory, delivering high throughput for compute-intensive workloads
Intel Xeon Scalable Processors (6th Gen)
The latest generation utilizes a disaggregated design with multiple compute and I/O chiplets interconnected via EMIB (Embedded Multi-Die Interconnect Bridge)
Core count & frequency
Engineering samples (ES1) of Granite Rapids feature up to 56 cores (1.1-2.7 GHz base/turbo), with production models expected to reach 84-90 cores
Memory support
12-channel DDR5-6400 with MCR DIMMs, delivering up to 1.6x higher bandwidth than previous generations (12 channels × 6400 MT/s × 8 bytes ≈ 614 GB/s per socket)
Cache & interconnect
Each compute tile includes 2MB L2 cache and 4MB L3 cache, while the platform supports PCIe Gen5 (136 lanes) and CXL 2.0 for GPU/FPGA acceleration
Enhanced GPU cluster performance
Canopy Wave pairs its GPU clusters with powerful, efficient CPUs to raise cluster utilization and performance: CPUs handle general-purpose computing, freeing GPUs to focus on high-intensity tasks, as sketched below
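This division of labor shows up in an ordinary training loop, where CPU worker processes decode and batch data while the GPU computes (a sketch with a toy in-memory dataset):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset; in practice the workers would decode/augment real data.
ds = TensorDataset(torch.randn(10_000, 512), torch.randint(0, 10, (10_000,)))

# CPU worker processes prepare batches in parallel and pin host memory,
# keeping the GPU fed with compute instead of waiting on I/O.
loader = DataLoader(ds, batch_size=256, num_workers=8, pin_memory=True)

model = torch.nn.Linear(512, 10).cuda()
for x, y in loader:
    x = x.cuda(non_blocking=True)  # overlap host-to-device copies
    y = y.cuda(non_blocking=True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
```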
Parallel processing & AI acceleration
Modern CPU servers leverage AVX-512 and VNNI (Vector Neural Network Instructions) to boost AI inference throughput by 2-4x compared to older architectures
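Frameworks reach those instructions through INT8 kernels; a hedged sketch using PyTorch dynamic quantization, whose fbgemm CPU backend dispatches to AVX-512/VNNI where the hardware supports it:

```python
import torch
import torch.ao.quantization as quant

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 10),
)

# Dynamic INT8 quantization: weights stored as int8, matmuls dispatched
# to fbgemm kernels that use AVX-512/VNNI on supporting CPUs.
qmodel = quant.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

x = torch.randn(32, 1024)
print(qmodel(x).shape)  # same interface, lower-precision compute
```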
Multi-threading
Hyper-Threading enables 112 threads on a 56-core CPU, optimizing multi-tasking efficiency for virtualization and HPC workloads
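The logical-core count is what the OS reports; a small sketch that sizes a worker pool to it (the counts depend on the instance type):

```python
import os
from concurrent.futures import ProcessPoolExecutor

# os.cpu_count() reports logical threads: 112 on a 56-core CPU
# with Hyper-Threading enabled.
n = os.cpu_count()

def work(i: int) -> int:
    return sum(j * j for j in range(100_000))  # CPU-bound placeholder task

# One worker per logical CPU for throughput-oriented workloads.
if __name__ == "__main__":
    print(f"logical CPUs: {n}")
    with ProcessPoolExecutor(max_workers=n) as ex:
        results = list(ex.map(work, range(n)))
```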
Energy efficiency
Intel's Dynamic Voltage and Frequency Scaling (DVFS) and RAPL (Running Average Power Limit) reduce idle power consumption by 30%, while TCO improvements reach 68% through server consolidation (5-10:1 replacement ratio)
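RAPL counters are exposed on Linux through the powercap interface; a hedged sketch that samples package energy to estimate power draw (the sysfs path and required permissions vary by system):

```python
import time

# RAPL exposes a cumulative package-energy counter in microjoules;
# reading it usually requires root, and the counter wraps periodically.
RAPL = "/sys/class/powercap/intel-rapl:0/energy_uj"

def read_uj() -> int:
    with open(RAPL) as f:
        return int(f.read())

e0, t0 = read_uj(), time.time()
time.sleep(1.0)
e1, t1 = read_uj(), time.time()

watts = (e1 - e0) / 1e6 / (t1 - t0)  # microjoules -> joules -> watts
print(f"package power: {watts:.1f} W")
```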
Bare metal GPU cluster in private cloud
A private, secure GPU cluster for large AI deployments. Short- or long-term contracts for 256 to 2,000 GPUs with InfiniBand or RoCEv2 networking
Get the latest and greatest NVIDIA GPUs
Canopy Wave provides the best-performing GPU clusters, with 99.99% uptime and 24/7 support to maximize reliability. We apply the highest security standards to ensure data security
NVIDIA HGX B200
The NVIDIA HGX B200, powered by eight NVIDIA Blackwell GPUs and fifth-generation NVLink™, delivers up to 3× faster training and 15× faster inference compared to previous generations, making it the ideal unified AI platform for businesses at any stage
NVIDIA HGX H200
The first GPU featuring HBM3e memory, the H200 sets new standards for generative AI and HPC workloads with unprecedented memory capacity and bandwidth, significantly accelerating LLM training and inference performance
NVIDIA HGX H100
Built on the NVIDIA Hopper™ architecture with dedicated Transformer Engine, the H100 accelerates LLMs by up to 30×, setting new benchmarks for conversational AI and efficiently powering trillion-parameter language models
Get full visibility of your Cluster
The Canopy Wave DCIM platform provides full visibility into your cluster: see utilization, health, and uptime in a single dashboard and keep the cluster fully under control
Our DCIM platform detects possible failures early and issues corresponding work orders, minimizing interruption and maintaining industry-leading performance and uptime
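As an illustration of the raw telemetry such a dashboard aggregates (a sketch using NVIDIA's NVML bindings via the pynvml package, not Canopy Wave's actual DCIM API):

```python
import pynvml

# Sample per-GPU utilization and temperature, the kind of signals a
# DCIM-style dashboard rolls up. Requires the NVIDIA driver.
pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        h = pynvml.nvmlDeviceGetHandleByIndex(i)
        util = pynvml.nvmlDeviceGetUtilizationRates(h)
        temp = pynvml.nvmlDeviceGetTemperature(h, pynvml.NVML_TEMPERATURE_GPU)
        name = pynvml.nvmlDeviceGetName(h)
        print(f"GPU {i} {name}: {util.gpu}% util, {temp} C")
finally:
    pynvml.nvmlShutdown()
```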
Ready to get started?
Create your Canopy Wave cloud account to launch GPU clusters immediately, or contact us to reserve a long-term contract