Whisper API Server (Go)

⚠️ This project is a work in progress (WIP).

This API server enables audio transcription using the OpenAI Whisper models.

Setup

  • Download the .exe file from Releases
  • Just run it!

Build from source

Download the sources and use go build. For example, you can build using the following command:

go build -ldflags "-s -w" -o server.exe main.go

Usage example

Make a request to the server using the following command:

curl http://localhost:3000/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3"

Receive a response in JSON format:

{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}

Usage with Obsidian

To integrate this server with the Obsidian voice recognition plugin, follow these steps:

  1. Open the plugin's settings.
  2. Set the following values:
    • API KEY: sk-1
    • API URL: http://localhost:3000/v1/audio/transcriptions
    • Model: whisper-1

Roadmap

  • Implement automatic model downloading from Hugging Face
  • Implement automatic Whisper.dll downloading from GitHub releases
  • Provide prebuilt binaries for Windows
  • Include instructions for running on Linux with Wine (likely possible)
  • Use flags to override the model path
  • Use flags to override the model type (when downloading the model)
  • Use flags to override the port

Credits