stt-tts-service Skill
description: Lightweight local speech-to-text and text-to-speech service for OpenClaw
Why use this skill
stt-tts-service is most useful when you want an agent workflow that is more structured than an ad-hoc prompt. Instead of restating the same expectations every time, a dedicated SKILL.md file gives the assistant a repeatable brief. The core value here is clarity: the repo files this skill under utility skills, and the skill source gives you a portable starting point you can evaluate, adapt, and reuse. The inferred platform for this skill is Generic Skills, which helps you judge whether it will feel native in your current agent ecosystem or is better treated as a general reference.
That matters because AI assistants are better when the operating context is explicit. A good skill turns hidden team expectations into visible instructions. It can name preferred tools, describe failure modes, define what “done” looks like, and reduce the amount of corrective prompting you need after the first draft. For developers exploring the wider SKILL.md ecosystem, this page helps answer the practical question: is this skill specific and maintained enough to be worth trying?
How to evaluate and use it
Start with the source repo and the preview below. The preview tells you whether the instructions are actionable or just aspirational. Strong skills usually describe triggers, recommended tools, steps, and known pitfalls. Weak skills tend to stay generic. This one lives in diegosouzapw/awesome-omni-skill, which gives you concrete repo context, an update history, and a direct ownership trail.
Once you confirm the scope looks right, test it on a small task before making it part of a larger workflow. If it improves consistency, keep it. If it is too broad, outdated, or conflicts with your own process, treat it as a reference rather than a drop-in rule. That is the healthiest way to use directory-discovered skills: not as magic plugins, but as reusable operational knowledge that still deserves judgment.
SKILL.md preview
Previewing the source is one of the fastest ways to judge whether a skill is truly useful. This snippet comes from the public file in the linked repository.
---
name: stt-tts-service
description: Lightweight local speech-to-text and text-to-speech service for OpenClaw
version: 1.0.0
author: community
tags:
- speech
- audio
- transcription
- synthesis
- voice
---
# STT-TTS Service
A lightweight, local speech-to-text (STT) and text-to-speech (TTS) service that runs on any device connected to your OpenClaw server. Perfect for voice-enabled workflows and flexible resource allocation.
## Features
- **Speech-to-Text**: Transcribe audio using faster-whisper (up to 4x faster than OpenAI Whisper, depending on model and hardware)
- **Text-to-Speech**: Generate natural speech using piper-tts or pyttsx3 fallback
- **100% Local**: No cloud APIs, works offline after initial model download
- **Flexible Deployment**: Run on any device - Raspberry Pi, laptop, or GPU server
- **HTTP API**: Simple REST endpoints for easy integration
## Quick Start
### Installation
```bash
# Clone or download this skill
cd stt-tts-service
# Install dependencies
pip install -r requirements.txt
# Start the service
python main.py
```
### Docker Deployment
```bash
docker build -t stt-tts-service .
docker run -p 8765:8765 stt-tts-service
```
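Because the container can take a moment to start (and models may need to download on first run), a script that calls the API right after `docker run` may fail. The following is a minimal sketch of a readiness check that polls the published port, 8765 in the command above. The helper name `wait_for_port` is hypothetical and not part of this skill; it only confirms that something is listening, not that the models are loaded.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; it checks the socket only,
    not whether the STT/TTS models have finished loading.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a listener is bound to the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False


# Example usage (assumes the service was started with the commands above):
# ready = wait_for_port("localhost", 8765, timeout=60.0)
```

Polling the TCP port avoids assuming a health endpoint, since the preview does not document one.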
## API Endpoints
### POST /stt - Speech to Text
Transcribe audio files to text.
```bash
curl -X POST http://localhost:8765/stt \
-F "audio=@recording.wav"
```
**Response:**
```json
{
  "text": "Hello, this is the transcribed text.",
  "language": "en",
  "duration": 3.5
}
```
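The same request can be made from Python using only the standard library. This is a sketch, not the skill's own client: it assumes the form field is named `audio` (as in the curl example) and that the response carries the `text`, `language`, and `duration` keys shown above. The function names `build_multipart`, `parse_stt_response`, and `transcribe` are hypothetical.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Optional, Tuple


def build_multipart(field: str, filename: str, data: bytes,
                    boundary: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_stt_response(raw: bytes) -> dict:
    """Extract the fields shown in the example response (assumed schema)."""
    payload = json.loads(raw.decode("utf-8"))
    return {
        "text": payload["text"],
        "language": payload.get("language"),
        "duration": payload.get("duration"),
    }


def transcribe(path: str, base_url: str = "http://localhost:8765") -> dict:
    """POST an audio file to /stt and return the parsed result."""
    audio = Path(path)
    # "audio" is the field name used by the curl example above.
    body, content_type = build_multipart("audio", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/stt",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_stt_response(response.read())


# Example usage (requires a running service and a local recording.wav):
# print(transcribe("recording.wav")["text"])
```

Building the multipart body by hand keeps the sketch dependency-free. With the `requests` package installed, passing the file via its `files=` argument would do the same job in fewer lines.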
### POST /tts - Text to Speech
Convert text to audio.
```bash
curl -X POST http://localhost:8765/tts \
-H "Content-Type: application/json" \
-d '{"text": "Hello w
...