Compute Services

Canopy Wave uses virtualization technology to provide world-leading
GPU performance for AI training & inference

Processing Power

Massive Parallel Processing Power

AI workloads require performing millions (or billions) of mathematical operations simultaneously

GPUs have thousands of cores designed for parallel computation, ideal for training and running neural networks efficiently
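
A minimal sketch of what that parallelism looks like in practice, using PyTorch (the matrix sizes here are illustrative, not benchmarks):

    # The same matrix multiply on CPU and GPU with PyTorch
    import torch

    a = torch.randn(8192, 8192)
    b = torch.randn(8192, 8192)

    c_cpu = a @ b                      # spread across a handful of CPU cores

    if torch.cuda.is_available():
        a_gpu, b_gpu = a.cuda(), b.cuda()
        c_gpu = a_gpu @ b_gpu          # fanned out across thousands of CUDA cores
        torch.cuda.synchronize()       # kernels run asynchronously; wait to finish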

Canopy Wave on-demand GPU Cluster

NVIDIA GPUs

Featuring access to NVIDIA HGX H100 and HGX H200, connected with NVLink and 400G RoCEv2 or InfiniBand networking

Multi-GPU Instances

Train and fine-tune AI models on the instance type that best suits your needs: 1x, 2x, 4x, 8x, and up to 64x NVIDIA GPU instances, truly on-demand and billed by the minute
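
As a hedged sketch of how such an instance is typically driven, here is a minimal PyTorch DistributedDataParallel loop; the model, shapes, and hyperparameters are placeholders, and the script assumes a torchrun launch:

    # Minimal multi-GPU training sketch; launch: torchrun --nproc_per_node=8 train.py
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    dist.init_process_group("nccl")              # NCCL rides NVLink / RoCE / InfiniBand
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    model = DDP(torch.nn.Linear(4096, 4096).cuda())       # placeholder model
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):                       # illustrative steps only
        x = torch.randn(32, 4096, device="cuda")
        loss = model(x).square().mean()          # dummy loss
        opt.zero_grad()
        loss.backward()                          # gradients all-reduce across GPUs
        opt.step()

    dist.destroy_process_group()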

Private Cloud

Canopy Wave private cloud

The best GPU cluster performance in the industry, with 99.99% uptime. With all your GPUs in the same data center, your workloads and privacy are protected

Leadership in AI-Optimized H100 and H200

  • High-end GPU platforms custom-built for AI, with large numbers of Tensor Cores, NVLink, and the Transformer Engine

  • Tailored for modern AI workloads, and benchmark leaders in training and inference performance

NVIDIA HGX H200

Unmatched Memory Bandwidth & Capacity for Large AI Models

Memory

141 GB of HBM3e memory

Large language models (LLMs) and generative AI systems need to process huge datasets and massive parameter matrices, so speed and scale depend heavily on how much memory the GPU can access, and how quickly
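
A back-of-envelope example of why that capacity matters (the 70B parameter count below is illustrative, not a specific model):

    # Memory needed just to hold model weights, before activations or KV cache
    params = 70e9                         # illustrative 70B-parameter model
    fp16_bytes = params * 2               # FP16/BF16 uses 2 bytes per parameter
    print(f"{fp16_bytes / 1e9:.0f} GB")   # ~140 GB -> fits on a single 141 GB H200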

Bandwidth

4.8 TB/s memory bandwidth

The fastest of any NVIDIA GPU to date
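
A rough estimate of why that bandwidth matters for inference: at batch size 1, decoding reads every weight once per generated token, so the token rate is bounded by bandwidth divided by model size (illustrative numbers, ignoring KV-cache traffic and other overheads):

    model_bytes = 140e9                   # ~70B params in FP16, as in the sketch above
    bandwidth = 4.8e12                    # H200 memory bandwidth, bytes/s
    print(f"{bandwidth / model_bytes:.0f} tokens/s")   # ~34 tokens/s, an upper bound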

Workloads

Optimized for memory-bound workloads, including:

  • Large transformer models
  • Retrieval-augmented generation (RAG)
  • Generative vision-language models
  • Inference on massive context windows, e.g. greater than 100k tokens (see the sketch below)
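
A back-of-envelope sketch of that last point: the key-value cache alone grows linearly with context length (the layer and head counts below are illustrative, roughly a 70B-class model with grouped-query attention):

    # KV-cache size for a 100k-token context, FP16
    layers, kv_heads, head_dim = 80, 8, 128    # assumed architecture
    tokens, bytes_per = 100_000, 2             # FP16 = 2 bytes per value
    kv = 2 * layers * kv_heads * head_dim * bytes_per * tokens   # 2 = keys + values
    print(f"{kv / 1e9:.1f} GB")                # ~32.8 GB of KV cache alone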

NVIDIA HGX H100

Transformer Engine: Purpose-Built for Training and Running Large AI Models

Higher Accuracy

Transformer Engine uses FP8 precision (8-bit floating point) with dynamic range scaling

Better Performance

Delivers up to 9x faster training and 30x faster inference vs. previous-generation GPUs such as the A100

Flexible Configurations

Dynamically switches between FP8 and FP16/FP32 for the best balance of accuracy and speed
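
A minimal sketch of this in code, using NVIDIA's open-source Transformer Engine library for PyTorch (assumes the transformer-engine package and an H100/H200-class GPU; shapes are illustrative):

    import torch
    import transformer_engine.pytorch as te

    layer = te.Linear(4096, 4096).cuda()    # drop-in replacement for nn.Linear
    x = torch.randn(32, 4096, device="cuda")

    with te.fp8_autocast(enabled=True):     # matmuls run in FP8 with dynamic scaling
        y = layer(x)
    y.sum().backward()                      # backward runs outside the autocast region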

Better Access to Compute

Includes 80 billion transistors and 80 GB of HBM3 memory, and supports NVLink and PCIe 5.0 for fast interconnects

Why NVIDIA

CUDA is the de facto standard for AI/ML workloads, with deep integration into frameworks like TensorFlow and PyTorch. It’s not just the hardware but also this ecosystem that delivers broad compatibility
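
That integration shows up in everyday framework code; a sketch of the standard device-agnostic PyTorch pattern:

    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(1024, 1024).to(device)    # placeholder model
    x = torch.randn(32, 1024, device=device)
    y = model(x)    # on NVIDIA GPUs this dispatches to cuBLAS/cuDNN CUDA kernels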

CPU Servers

CPU Nodes

Our CPU instances are optimized for general-purpose, compute-heavy, and memory-bound applications, providing flexibility and performance at scale

Processor

Powered by the latest 6th-Gen Intel Xeon Scalable processors, offering up to 64 vCPUs per instance

Memory

Each instance supports up to 256 GB of DIMM memory, delivering high throughput for memory-intensive workloads

Intel Xeon Scalable Processors (6th Gen)

The latest generation utilizes a disaggregated design with multiple compute and I/O chiplets interconnected via EMIB (Embedded Multi-Die Interconnect Bridge)

Core count & frequency

Engineering samples (ES1) of Granite Rapids feature up to 56 cores (1.1-2.7 GHz base/turbo), with production models expected to reach 84-90 cores

Memory support

12-channel DDR5-6400 with MCR DIMMs, delivering up to 1.6x higher bandwidth than previous generations
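
As a rough check on that claim, the theoretical peak follows from simple arithmetic (assuming standard 64-bit channels at DDR5-6400, before MCR gains and real-world overheads):

    channels, transfers_per_sec, channel_bytes = 12, 6400e6, 8   # DDR5-6400, 64-bit
    peak = channels * transfers_per_sec * channel_bytes
    print(f"{peak / 1e9:.0f} GB/s")    # ~614 GB/s per socket, theoretical peak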

Cache & interconnect

Each compute tile includes 2 MB of L2 cache and 4 MB of L3 cache, while the platform supports PCIe Gen5 (136 lanes) and CXL 2.0 for GPU/FPGA acceleration

Enhanced GPU cluster performance

Canopy Wave pairs its GPU clusters with powerful, efficient CPUs to drive higher utilization and performance: CPUs handle general-purpose computing, freeing GPUs to focus on high-intensity tasks

Parallel processing & AI acceleration

Modern CPU servers leverage AVX-512 and VNNI (Vector Neural Network Instructions) to boost AI inference throughput by 2-4x compared to older architectures
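
A quick way to confirm those instruction sets are exposed on a Linux host (a sketch; flag names are as reported by the kernel in /proc/cpuinfo):

    # Check CPU feature flags on Linux
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags = set(line.split(":")[1].split())
                break
    print("avx512f" in flags, "avx512_vnni" in flags)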

Multi-threading

Hyper-Threading enables 112 threads on a 56-core CPU, optimizing multi-tasking efficiency for virtualization and HPC workloads
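
A small sketch of how that logical/physical split shows up to software (psutil is an assumed third-party dependency):

    import os
    import psutil                            # assumed: pip install psutil

    print(os.cpu_count())                    # logical CPUs, e.g. 112 with Hyper-Threading
    print(psutil.cpu_count(logical=False))   # physical cores, e.g. 56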

Energy efficiency

Intel’s Dynamic Voltage and Frequency Scaling (DVFS) and RAPL (Running Average Power Limit) reduce idle power consumption by 30%, while TCO improvements reach 68% through server consolidation (5-10:1 replacement ratios)

Bare metal GPU cluster in private cloud

Private, secure GPU clusters for large AI deployments. Short- or long-term contracts for 256 to 2,000 GPUs on InfiniBand or RoCEv2 networking

Bare Metal

Get the latest and greatest NVIDIA GPUs

Canopy Wave provides the best-performing GPU clusters, with 99.99% uptime and 24/7 support to maximize reliability. We apply the highest security standards to protect your data

NVIDIA HGX B200

The NVIDIA HGX B200, powered by eight NVIDIA Blackwell GPUs and fifth-generation NVLink™, delivers up to 3× faster training and 15× faster inference compared to previous generations, making it the ideal unified AI platform for businesses at any stage

NVIDIA HGX H200

The first GPU featuring HBM3e memory, the H200 sets new standards for generative AI and HPC workloads with unprecedented memory capacity and bandwidth, significantly accelerating LLM training and inference performance

NVIDIA HGX H100

Built on the NVIDIA Hopper™ architecture with dedicated Transformer Engine, the H100 accelerates LLMs by up to 30×, setting new benchmarks for conversational AI and efficiently powering trillion-parameter language models

Get full visibility of your Cluster

The Canopy Wave DCIM platform provides full visibility into your cluster: see utilization, health, and uptime in a single dashboard and keep your cluster fully under control

Our DCIM platform helps detect potential failures early and issues the corresponding work orders, minimizing interruptions and maintaining industry-leading performance and uptime

Ready to get started?

Create your Canopy Wave cloud account to launch GPU clusters immediately, or contact us to reserve a long-term contract