sherpa-qwen3-asr

Qwen3-ASR 0.6B int8 speech recognition API.
Powered by Qwen3-ASR (ONNX int8) + sherpa-onnx.
Runs inference on CPU only or on a CUDA GPU, and supports 52 languages.

Endpoints

POST /api/v1/recognize

Upload an audio file and get its transcription.

curl -X POST http://localhost:8000/api/v1/recognize \
  -F "file=@audio.wav"

POST /v1/audio/transcriptions

OpenAI-compatible transcription endpoint.

curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -F "model=Qwen/Qwen3-ASR-0.6B"

GET /api/v1/health

Check service health and model status.
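
A health probe can be scripted with the Python standard library alone. This is a minimal sketch, not the service's official client: the base URL is taken from the curl examples above, the helper names `health_url` and `check_health` are hypothetical, and the response schema is not documented here, so the body is returned as parsed JSON when possible and as raw text otherwise.

```python
import json
import urllib.error
import urllib.request


def health_url(base_url):
    """Build the health endpoint URL from a base URL (trailing slash tolerated)."""
    return base_url.rstrip("/") + "/api/v1/health"


def check_health(base_url="http://localhost:8000", timeout=5.0):
    """Return (ok, payload); ok is True when the endpoint answers successfully.

    Assumption: the reply is JSON. If it is not, the raw text is returned.
    """
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
            raw = response.read().decode("utf-8")
    except (urllib.error.URLError, OSError) as exc:
        # Covers connection errors, timeouts, and HTTP 4xx/5xx (HTTPError).
        return False, str(exc)
    try:
        return True, json.loads(raw)
    except json.JSONDecodeError:
        return True, raw
```

Calling `check_health()` before sending audio is one way to wait for the model to finish loading, assuming the service reports that state in the health response.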

Parameters

Param     Type    Default    Description
file      file    required   Audio file (wav, mp3, flac, ogg, m4a)
language  string  "" (auto)  Force language: "Chinese", "English", "Korean", etc.
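
The parameters above can also be sent from Python without third-party packages. The sketch below assumes that `language` is passed as a multipart form field alongside `file` (the curl examples only show `file`), that the server runs at the address used in those examples, and that the reply is JSON. The helpers `build_multipart` and `recognize` are hypothetical names introduced for illustration.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://localhost:8000"  # assumption: same address as the curl examples


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f'--{boundary}\r\n'
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f'{value}\r\n'
            ).encode("utf-8")
        )
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: {file_type}\r\n\r\n'
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def recognize(path, language="", base_url=BASE_URL):
    """POST an audio file to /api/v1/recognize.

    An empty language leaves the field out, which corresponds to the
    documented default of "" (auto).
    """
    audio = Path(path)
    fields = {"language": language} if language else {}
    body, content_type = build_multipart(fields, "file", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/api/v1/recognize",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the service replies with JSON; the schema is not documented here.
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `recognize("audio.wav", language="English")` would force English, while `recognize("audio.wav")` leaves language detection to the service.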

Supported Formats

WAV, MP3, FLAC, OGG, M4A, AAC, OPUS, WEBM. Audio is automatically resampled to 16 kHz mono.

Docs: Swagger UI | ReDoc