Convert text to speech audio using mb voice CLI. Use when the user asks you to speak, say something aloud, generate audio, or produce a voice recording.
Scanned 5/28/2026
Install via CLI
openskills install xvirobotics/metabot
---
name: voice
description: Convert text to speech audio using mb voice CLI. Use when the user asks you to speak, say something aloud, generate audio, or produce a voice recording.
---
## Text-to-Speech (Voice Output)
Generate MP3 audio from text using the `mb voice` CLI.
### Quick Commands
```bash
# Generate MP3, prints file path to stdout
mb voice "Hello, this is a test"
# Generate and play immediately
mb voice "Hello" --play
# Save to specific file
mb voice "Hello" -o greeting.mp3
# Override provider and voice
mb voice "Hello" --provider doubao --voice zh_female_wanqudashu_moon_bigtts
# Pipe text (useful for long content)
echo "Long text here" | mb voice
echo "Long text" | mb voice -o output.mp3
```
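Because `mb voice` prints the generated file's path to stdout, a script can capture that path for later steps. The following is a minimal sketch, assuming the path is the only thing written to stdout and that a failed run exits with a non-zero status. `speak_to_path` is a hypothetical helper name, not part of the CLI.

```bash
# Hypothetical helper: run `mb voice` and echo back the path of the MP3 it produced.
# Assumes stdout holds only the file path and that a failed run exits non-zero.
speak_to_path() {
  local text="$1"
  local path
  path=$(mb voice "$text") || return 1
  # Guard against empty output or a path that does not exist on disk.
  if [ -z "$path" ] || [ ! -f "$path" ]; then
    echo "mb voice did not produce a file" >&2
    return 1
  fi
  printf '%s\n' "$path"
}
```

A caller might then write `audio=$(speak_to_path "Hello, this is a test") && echo "Saved to $audio"`.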
### When to Use
- User asks you to "say", "speak", "read aloud", or "generate audio/voice"
- User wants a voice recording or audio version of text
- User requests TTS (text-to-speech) output
### Available Providers & Voices
**Edge TTS (default, free, no API key needed):**
- `zh-CN-XiaoyiNeural` (default) — Female Chinese
- `zh-CN-YunxiNeural` — Male Chinese
- `zh-CN-XiaoxiaoNeural` — Female Chinese
- `en-US-JennyNeural` — Female English
**Doubao (default when Volcengine keys are configured):**
- `zh_female_wanqudashu_moon_bigtts` (default) — Female Chinese
- Other Volcengine voice IDs from the TTS console
**OpenAI (when `OPENAI_API_KEY` is set):**
- `alloy` (default), `echo`, `fable`, `onyx`, `nova`, `shimmer`
**ElevenLabs (when `ELEVENLABS_API_KEY` is set):**
- Voice IDs from the ElevenLabs console
### Text Limits
- Doubao: ~300 Chinese characters (longer text is auto-truncated)
- OpenAI / ElevenLabs / Edge: ~4000 characters
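Since Doubao truncates long input automatically, it can help to check the length before generating audio. The sketch below is one way to do that, using the approximate limits listed above. `within_tts_limit` is a hypothetical helper, and its first argument is only a label for this function, not necessarily a valid `--provider` value. Treating every provider other than `doubao` as having the ~4000 limit is an assumption.

```bash
# Hypothetical pre-flight check against the approximate limits listed above.
# Characters are counted with `wc -m`, so the result depends on the current
# locale (use a UTF-8 locale when the text contains Chinese characters).
within_tts_limit() {
  local provider="$1" text="$2"
  local limit count
  case "$provider" in
    doubao) limit=300 ;;   # ~300 Chinese characters; longer text is auto-truncated
    *)      limit=4000 ;;  # assumed for all others (OpenAI / ElevenLabs / Edge: ~4000)
  esac
  count=$(printf '%s' "$text" | wc -m | tr -d '[:space:]')
  [ "$count" -le "$limit" ]
}
```

For example, `within_tts_limit doubao "$text" || echo "Text may be truncated" >&2` warns before a long passage is sent to Doubao.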
### Guidelines
- For short text (greetings, alerts), use inline: `mb voice "text"`
- For longer text, pipe through stdin: `echo "..." | mb voice`
- The output file is in MP3 format
- Use `--play` only when the user explicitly wants to hear the audio (it blocks until playback completes)
- When saving files for the user, use `-o` with a descriptive filename
- To send the audio to the user in Feishu, copy the file to the outputs directory:
`cp /tmp/mb-voice-xxx.mp3 /tmp/metabot-outputs/<chatId>/`
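The last few guidelines (pipe long text through stdin, save with `-o`, copy the file to the outputs directory) can be combined in one helper. This is a sketch under stated assumptions: `send_voice` and the `OUTPUTS_ROOT` override are hypothetical names introduced here for illustration, and the helper assumes the chat's outputs directory already exists rather than creating it.

```bash
# Hypothetical helper: read text from stdin, save it under a descriptive
# filename with -o, then copy the MP3 to the chat's outputs directory.
# OUTPUTS_ROOT is an illustrative override; the default is the path used
# in the guideline above.
send_voice() {
  local chat_id="$1" file="$2"
  local dest
  dest="${OUTPUTS_ROOT:-/tmp/metabot-outputs}/$chat_id"
  # Assumption: the outputs directory already exists; fail rather than create it.
  if [ ! -d "$dest" ]; then
    echo "missing outputs directory: $dest" >&2
    return 1
  fi
  mb voice -o "$file" || return 1   # the text is read from stdin
  cp "$file" "$dest/"
}
```

Usage might look like `echo "Long text here" | send_voice "$CHAT_ID" weekly-summary.mp3`, where `CHAT_ID` holds the Feishu chat ID.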