
Whisper API Server (Go)

⚠️ This project is a work in progress (WIP).

This API server enables audio transcription using the OpenAI Whisper models.

Setup

  • Download the desired model from Hugging Face
  • Update the model path in main.go
  • Download Whisper.dll from GitHub (Library.zip) and place it in the project's root directory
  • Build the project: go build . (only the Go compiler is required; gcc is not needed)

Usage example

Make a request to the server using the following command:

curl http://localhost:3000/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3"

Receive a response in JSON format:

{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}

Roadmap

  • Implement automatic model downloading from Hugging Face
  • Implement automatic Whisper.dll downloading from GitHub releases
  • Provide prebuilt binaries for Windows
  • Include instructions for running on Linux with Wine (likely possible)

Credits