Whisper API Server (Go)

⚠️ This project is a work in progress (WIP).

This API server enables audio transcription using the OpenAI Whisper models.

Setup

  • Download the desired model from Hugging Face
  • Update the model path in the main.go file (see the sketch after this list)
  • Download Whisper.dll from GitHub (Library.zip) and place it in the project's root directory
  • Build the project: go build . (only the Go compiler is needed; GCC is not required)
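
For orientation, the model-path update in the second setup step might look roughly like the sketch below. The variable name, default value, and exact placement inside main.go are assumptions for illustration and may differ in this repository.

// Hypothetical sketch only; the real identifier and default in main.go may differ.
const modelPath = "./tmp/ggml-base.bin" // path to the model downloaded from Hugging Face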

Usage example

Make a request to the server using the following command:

curl http://localhost:3000/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3"

Receive a response in JSON format:

{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
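
If you prefer calling the endpoint from Go instead of curl, a minimal client might look like the sketch below. It assumes only what is shown above: the /v1/audio/transcriptions endpoint, a multipart "file" field, and a JSON response with a "text" field; the local file path is just an example.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

// transcriptionResponse mirrors the JSON response shown above.
type transcriptionResponse struct {
	Text string `json:"text"`
}

func main() {
	// Open the audio file to transcribe (example path).
	f, err := os.Open("audio.mp3")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Build a multipart/form-data body with a single "file" field,
	// matching the curl example above.
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, err := w.CreateFormFile("file", "audio.mp3")
	if err != nil {
		panic(err)
	}
	if _, err := io.Copy(part, f); err != nil {
		panic(err)
	}
	w.Close()

	// Send the request to the transcription endpoint.
	resp, err := http.Post("http://localhost:3000/v1/audio/transcriptions", w.FormDataContentType(), &body)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Decode the JSON response and print the recognized text.
	var out transcriptionResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Text)
}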

Roadmap

  • Implement automatic model downloading from Hugging Face
  • Implement automatic Whisper.dll downloading from GitHub releases
  • Provide prebuilt binaries for Windows
  • Include instructions for running on Linux with Wine (likely possible)

Credits