# Whisper API Server (Go)
## ⚠️ This project is a work in progress (WIP).
This API server enables audio transcription using the OpenAI Whisper models.
# Setup
- Download the `.exe` from [Releases](https://github.com/xzeldon/whisper-api-server/releases/latest)
- Just run it!
# Build from source (Windows)
## Prerequisites
- A GCC compiler available in your PATH (you can get one from [mingw-builds-binaries](https://github.com/niXman/mingw-builds-binaries))
- Go ([installation instructions](https://go.dev/doc/install))
Before building, make sure the **CGO_ENABLED** environment variable is set to **1** (PowerShell):
```
$env:CGO_ENABLED = "1"
```
You can check the current value by looking for `CGO_ENABLED` in the output of:
```
go env
```
You also need a 64-bit (x64) GCC toolchain installed, for example via MSYS2.
Download the sources and build them with `go build`.
For example, you can build using the following command:
```bash
go build -ldflags "-s -w" -o server.exe main.go
```
# Usage example
Make a request to the server using the following command:
```sh
curl http://localhost:3000/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3"
```
Receive a response in JSON format:
```json
{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
```
# Usage with [Obsidian](https://obsidian.md/)
1. Install the [Obsidian voice recognition plugin](https://github.com/nikdanilov/whisper-obsidian-plugin).
2. Open the plugin's settings.
3. Set the following values:
   - API KEY: `sk-1`
   - API URL: `http://localhost:3000/v1/audio/transcriptions`
   - Model: `whisper-1`
# Roadmap
- [x] Implement automatic model downloading from [Hugging Face](https://huggingface.co/ggerganov/whisper.cpp/tree/main)
- [x] Implement automatic `Whisper.dll` downloading from [GitHub releases](https://github.com/Const-me/Whisper/releases)
- [x] Provide prebuilt binaries for Windows
- [ ] Include instructions for running on Linux with Wine (likely possible).
- [x] Use flags to override the model path
- [x] Use flags to override the port
# Credits
- [Const-me/Whisper](https://github.com/Const-me/Whisper) project
- [goConstmeWhisper](https://github.com/jaybinks/goConstmeWhisper) for the remarkable Go bindings for [Const-me/Whisper](https://github.com/Const-me/Whisper)
- [Georgi Gerganov](https://github.com/ggerganov) for GGML models