Kimi Linear 48B A3B Instruct API
Code, LLM

All You Need To Know About Kimi Linear 48B A3B Instruct API

Overview

Model Provider: Moonshot AI
Model Type: CODE/LLM
State: Ready

Key Specs

Quantization: BF16
Parameters: 48B
Context: 128k
Pricing: $0.10 input / $0.40 output

Introduction

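Kimi Linear is a hybrid linear attention architecture that Moonshot AI reports outperforms traditional full attention across short-context, long-context, and reinforcement learning scaling regimes. It is built on Kimi Delta Attention (KDA), a refinement of Gated DeltaNet whose finer-grained gating makes better use of finite-state RNN memory, improving both hardware efficiency and performance.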

For long-context tasks of up to 1 million tokens, Kimi Linear reduces KV cache requirements by up to 75% and boosts decoding throughput by up to 6x relative to full attention. The core KDA kernel and the model checkpoints, trained on 5.7T tokens, are open source. Note that this endpoint is listed with a 128k context window.
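To make the KV cache figure concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified hybrid stack in which only one layer in four keeps a per-token key/value cache while the other three use a fixed-size linear-attention state; that ratio, the layer count, and the head dimensions are illustrative assumptions, not specifications taken from this page. The fixed-size recurrent state is ignored because it does not grow with sequence length.

```python
def kv_cache_bytes(num_cached_layers, seq_len, num_kv_heads, head_dim,
                   bytes_per_value=2):
    """Bytes needed to cache keys and values for layers that keep a per-token cache.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 corresponds to BF16.
    """
    return 2 * num_cached_layers * seq_len * num_kv_heads * head_dim * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
TOTAL_LAYERS = 32
FULL_ATTENTION_LAYERS = TOTAL_LAYERS // 4  # assumed: 1 in 4 layers keeps a KV cache
SEQ_LEN = 1_000_000
NUM_KV_HEADS = 8
HEAD_DIM = 128

# Baseline: every layer uses full attention and caches keys and values.
baseline = kv_cache_bytes(TOTAL_LAYERS, SEQ_LEN, NUM_KV_HEADS, HEAD_DIM)
# Hybrid: only the full-attention layers cache keys and values.
hybrid = kv_cache_bytes(FULL_ATTENTION_LAYERS, SEQ_LEN, NUM_KV_HEADS, HEAD_DIM)

reduction = 1 - hybrid / baseline
print(f"baseline KV cache: {baseline / 2**30:.1f} GiB")
print(f"hybrid KV cache:   {hybrid / 2**30:.1f} GiB")
print(f"reduction:         {reduction:.0%}")
```

Under these assumptions the saving depends only on the fraction of layers that still cache keys and values, not on the sequence length or head sizes, which is why the reduction is quoted as a single percentage.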

Kimi Linear 48B A3B Instruct API Usage

Model: moonshotai/Kimi-Linear-48B-A3B-Instruct

Endpoint:


curl -X POST https://inference.canopywave.io/v1 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $CANOPYWAVE_API_KEY" \
  -d '{
    "model": "moonshotai/Kimi-Linear-48B-A3B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Please tell me a story."}
    ]
  }'
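The same request can be sketched in Python using only the standard library. This is a minimal illustration, not official client code: the page shows the URL only up to `/v1`, so the `/chat/completions` suffix below is an assumption based on the OpenAI-style request body, and the names `build_chat_request`, `send`, and `main` are hypothetical helpers. The response schema is not documented here, so the sketch prints the decoded JSON as-is.

```python
import json
import os
import urllib.request

# Base URL as shown in the curl example.
BASE_URL = "https://inference.canopywave.io/v1"
# Assumption: an OpenAI-style chat route under the base URL (not confirmed by this page).
CHAT_URL = BASE_URL + "/chat/completions"
MODEL = "moonshotai/Kimi-Linear-48B-A3B-Instruct"


def build_chat_request(api_key, user_prompt, url=CHAT_URL):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the response body decoded from JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    """Read the key from the environment, send one prompt, print the raw reply."""
    api_key = os.environ["CANOPYWAVE_API_KEY"]
    request = build_chat_request(api_key, "Please tell me a story.")
    print(json.dumps(send(request), indent=2))
```

Calling `main()` with `CANOPYWAVE_API_KEY` set sends the request. Keeping request construction separate from sending makes it easy to inspect the headers and payload before any network call is made, and to swap in a different `url` if the route differs.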
      