Qwen3-ASR 0.6B int8 speech recognition API.
Powered by Qwen3-ASR (ONNX int8) + sherpa-onnx.
Runs on CPU only or on a CUDA GPU; supports 52 languages.
/api/v1/recognize
Upload an audio file and get its transcription.
curl -X POST http://localhost:8000/api/v1/recognize \
  -F "file=@audio.wav"
/v1/audio/transcriptions
OpenAI-compatible transcription endpoint.
curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -F "model=Qwen/Qwen3-ASR-0.6B"
/api/v1/health
Check service health and model status.
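A client that starts alongside the service may want to wait until the model has loaded before sending audio. The following is a minimal sketch using only the Python standard library. It assumes the health endpoint answers a plain GET with HTTP 200 once the service is ready; the response body is not documented here, so the sketch does not inspect it. The names `is_healthy` and `wait_until_healthy` are illustrative, not part of the service.

```python
import time
import urllib.request


def is_healthy(base_url="http://localhost:8000", timeout=5.0):
    """Return True if GET /api/v1/health answers with HTTP 200.

    Assumption: a 200 status means the service is ready. The body,
    which may carry model status, is ignored in this sketch.
    """
    url = base_url.rstrip("/") + "/api/v1/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx), refused connections and
        # timeouts are all OSError subclasses.
        return False


def wait_until_healthy(check=is_healthy, attempts=30, delay=1.0):
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

Passing the check as a parameter keeps the retry loop independent of the network call, so it can be exercised with a stub.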
| Param | Type | Default | Description |
|---|---|---|---|
| file | file | required | Audio file (wav, mp3, flac, ogg, m4a) |
| language | string | "" (auto) | Force language: "Chinese", "English", "Korean", etc. |
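The same upload can be made from Python without third-party packages. The sketch below encodes the `file` and `language` parameters as multipart/form-data and posts them to `/api/v1/recognize`, assuming that is the endpoint the parameters above apply to. The helper names `encode_multipart` and `recognize` are hypothetical, and because the response schema is not documented here, the raw response text is returned unparsed.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    )
    parts.append(file_head.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def recognize(path, language="", base_url="http://localhost:8000", timeout=120):
    """POST an audio file to /api/v1/recognize and return the raw response text."""
    path = Path(path)
    # An empty language means auto-detect, so the field is simply omitted.
    fields = {"language": language} if language else {}
    body, content_type = encode_multipart(
        fields, "file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/recognize",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `recognize("audio.wav", language="English")` would force English, while `recognize("audio.wav")` leaves language detection to the service.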
Supported formats: WAV, MP3, FLAC, OGG, M4A, AAC, OPUS, WEBM. Audio is automatically resampled to 16 kHz mono.
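A client can reject obviously unsupported files before uploading them. This is a small sketch that checks the file extension against the formats listed above. It assumes the extension reflects the actual container, which the service may or may not rely on; `SUPPORTED_EXTENSIONS` and `is_supported_audio` are illustrative names only.

```python
from pathlib import Path

# Extensions taken from the supported-format list above.
SUPPORTED_EXTENSIONS = {
    ".wav", ".mp3", ".flac", ".ogg", ".m4a", ".aac", ".opus", ".webm",
}


def is_supported_audio(path):
    """Return True if the file extension (case-insensitive) is a listed format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

Since the service resamples to 16 kHz mono itself, no client-side conversion is needed for files that pass this check.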
Docs: Swagger UI | ReDoc