zeldon/whisper-api-server


mirror of https://github.com/xzeldon/whisper-api-server.git synced 2025-07-01 21:08:16 +03:00


xzeldon 482616fb4c

Ensure the existence of the temp directory in the binary root

2023-10-05 23:42:04 +03:00

.github/workflows

add mingw to workflow

2023-10-05 22:40:41 +03:00

internal

Ensure the existence of the temp directory in the binary root

2023-10-05 23:42:04 +03:00

pkg/whisper

initial commit

2023-10-04 01:09:38 +03:00

tmp

initial commit

2023-10-04 01:09:38 +03:00

.gitignore

Deploy configuration

2023-10-05 21:34:34 +03:00

.goreleaser.yaml

goreleaser: disable archives

2023-10-05 22:51:13 +03:00

go.mod

enable cors

2023-10-05 23:31:16 +03:00

go.sum

enable cors

2023-10-05 23:31:16 +03:00

LICENSE

initial commit

2023-10-04 01:09:38 +03:00

main.go

enable cors

2023-10-05 23:31:16 +03:00

README.md

update readme

2023-10-05 23:29:43 +03:00

README.md

Whisper API Server (Go)

⚠️ This project is a work in progress (WIP).

This API server enables audio transcription using the OpenAI Whisper models.

Setup

Download .exe from Releases
Just run it!

Build from source

Download the sources and use go build. For example, you can build using the following command:

go build -ldflags "-s -w" -o server.exe main.go

Usage example

Make a request to the server using the following command:

curl http://localhost:3000/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3"

Receive a response in JSON format:

{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
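The same request can be made from Go. The following is a minimal client sketch, not part of this repository: it assumes the server accepts the audio under the `file` form field and answers with `{"text": ...}` as shown in the curl example, and that a successful request returns HTTP 200. The helper names (`buildTranscriptionBody`, `parseTranscription`, `transcribe`) are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"
)

// endpoint is the URL used in the curl example.
const endpoint = "http://localhost:3000/v1/audio/transcriptions"

// buildTranscriptionBody wraps the audio bytes in a multipart form under the
// "file" field, mirroring `-F file=@...` in the curl example.
func buildTranscriptionBody(filename string, audio []byte) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	w := multipart.NewWriter(body)
	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(audio); err != nil {
		return nil, "", err
	}
	// Close writes the trailing boundary; the body is incomplete without it.
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body, w.FormDataContentType(), nil
}

// parseTranscription extracts the "text" field from the JSON response.
func parseTranscription(data []byte) (string, error) {
	var out struct {
		Text string `json:"text"`
	}
	if err := json.Unmarshal(data, &out); err != nil {
		return "", err
	}
	return out.Text, nil
}

// transcribe uploads the file at path and returns the transcribed text.
func transcribe(url, path string) (string, error) {
	audio, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	body, contentType, err := buildTranscriptionBody(filepath.Base(path), audio)
	if err != nil {
		return "", err
	}
	resp, err := http.Post(url, contentType, body)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	// Assumption: the server signals success with HTTP 200.
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("server returned %s: %s", resp.Status, data)
	}
	return parseTranscription(data)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: client <audio-file>")
		os.Exit(1)
	}
	text, err := transcribe(endpoint, os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "transcription failed:", err)
		os.Exit(1)
	}
	fmt.Println(text)
}
```

Run it with `go run client.go /path/to/file/audio.mp3` while the server is listening on port 3000. The content type comes from `FormDataContentType`, so the multipart boundary in the header always matches the one in the body.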

Usage with Obsidian

To integrate this server with the Obsidian voice recognition plugin, follow these steps:

Open the plugin's settings.
Set the following values:
- API KEY: sk-1
- API URL: http://localhost:3000/v1/audio/transcriptions
- Model: whisper-1

Roadmap

Implement automatic model downloading from Hugging Face
Implement automatic Whisper.dll downloading from GitHub releases
Provide prebuilt binaries for Windows
Include instructions for running on Linux with Wine (likely possible)
Use flags to override the model path
Use flags to override the model type (when downloading the model)
Use flags to override the port

Credits

Const-me/Whisper project
goConstmeWhisper for the remarkable Go bindings for Const-me/Whisper
Georgi Gerganov for GGML models