Convert text to speech audio using the mb voice CLI. Use when the user asks you to speak, say something aloud, generate audio, or produce a voice recording.
Scanned 5/28/2026
Install via CLI
openskills install xvirobotics/metabot
---
name: voice
description: Convert text to speech audio using the mb voice CLI. Use when the user asks you to speak, say something aloud, generate audio, or produce a voice recording.
---
## Text-to-Speech (Voice Output)
Generate MP3 audio from text using the `mb voice` CLI.
### Quick Commands
```bash
# Generate MP3, prints file path to stdout
mb voice "Hello, this is a test"
# Generate and play immediately
mb voice "Hello" --play
# Save to specific file
mb voice "Hello" -o greeting.mp3
# Override provider and voice
mb voice "Hello" --provider doubao --voice zh_female_wanqudashu_moon_bigtts
# Pipe text (useful for long content)
echo "Long text here" | mb voice
echo "Long text" | mb voice -o output.mp3
```
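Because `mb voice` prints the generated file path to stdout, a script can capture that path for later steps. The following is a minimal sketch, assuming stdout carries only the path; `speak_to_path` is a hypothetical helper name, not part of the `mb` CLI.

```bash
# Hypothetical helper: run `mb voice` and return the path of the generated MP3.
speak_to_path() {
  local text="$1"
  local path
  # Assumption: stdout contains only the generated file path.
  path="$(mb voice "$text")" || return 1
  # Treat an empty result as a failure so later steps never get a blank path.
  [ -n "$path" ] || return 1
  printf '%s\n' "$path"
}

# Example use (requires the mb CLI to be installed):
#   mp3_path="$(speak_to_path "Hello, this is a test")"
```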
### When to Use
- User asks you to "say", "speak", "read aloud", or "generate audio/voice"
- User wants a voice recording or audio version of text
- User requests TTS (text-to-speech) output
### Available Providers & Voices
**Edge TTS (default, free, no key needed):**
- `zh-CN-XiaoyiNeural` (default) — Female Chinese
- `zh-CN-YunxiNeural` — Male Chinese
- `zh-CN-XiaoxiaoNeural` — Female Chinese
- `en-US-JennyNeural` — Female English
**Doubao (default when Volcengine keys are configured):**
- `zh_female_wanqudashu_moon_bigtts` (default) — Female Chinese
- Other Volcengine voice IDs from the TTS console
**OpenAI (when `OPENAI_API_KEY` is set):**
- `alloy` (default), `echo`, `fable`, `onyx`, `nova`, `shimmer`
**ElevenLabs (when `ELEVENLABS_API_KEY` is set):**
- Voice IDs from the ElevenLabs console
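As an illustration of picking a voice from the Edge TTS list above, here is a small sketch that maps a short label to a voice ID. The labels and the `edge_voice_for` name are hypothetical; only the voice IDs come from the list.

```bash
# Hypothetical helper: map a short label to one of the Edge TTS voice IDs listed above.
edge_voice_for() {
  case "$1" in
    zh-male)   echo "zh-CN-YunxiNeural" ;;
    zh-female) echo "zh-CN-XiaoyiNeural" ;;
    en-female) echo "en-US-JennyNeural" ;;
    *)         echo "zh-CN-XiaoyiNeural" ;;  # fall back to the documented Edge default
  esac
}

# Example use, assuming --voice may be passed without --provider
# while Edge TTS is the active provider:
#   mb voice "Hello" --voice "$(edge_voice_for en-female)"
```

If Volcengine keys are configured, Doubao becomes the default provider, so an Edge voice ID passed this way may not match the active provider.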
### Text Limits
- Doubao: ~300 Chinese characters (longer text is auto-truncated)
- OpenAI / ElevenLabs / Edge: ~4000 characters
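The limits above can be checked before calling the CLI, so that Doubao text is not silently truncated. This is a rough sketch using the approximate figures from this section; `tts_char_limit` and `fits_tts_limit` are hypothetical helper names.

```bash
# Hypothetical helper: print the approximate character limit for a provider.
tts_char_limit() {
  case "$1" in
    doubao) echo 300 ;;
    *)      echo 4000 ;;  # approximate limit documented for the other providers
  esac
}

# Hypothetical helper: succeed when the text is within the provider's limit.
fits_tts_limit() {
  local provider="$1" text="$2"
  local limit
  limit="$(tts_char_limit "$provider")"
  # In bash, ${#text} counts characters under a UTF-8 locale;
  # under the C locale it counts bytes, which overcounts Chinese text.
  [ "${#text}" -le "$limit" ]
}

# Example use:
#   fits_tts_limit doubao "$text" || echo "Text may be truncated by Doubao" >&2
```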
### Guidelines
- For short text (greetings, alerts), use inline: `mb voice "text"`
- For longer text, pipe through stdin: `echo "..." | mb voice`
- The output file is MP3 format
- Use `--play` only when the user explicitly wants to hear the audio (it blocks until playback completes)
- When saving files for the user, use `-o` with a descriptive filename
- To send the audio to the user in Feishu, copy the file to the outputs directory:
`cp /tmp/mb-voice-xxx.mp3 /tmp/metabot-outputs/<chatId>/`
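The last two guidelines can be combined into one step: generate the MP3 under a descriptive name with `-o`, then copy it to the chat's outputs directory. The sketch below is one possible arrangement; `send_voice_to_chat`, its argument order, and the optional base-directory argument are hypothetical, and the chat ID must be supplied by the caller.

```bash
# Hypothetical helper: generate an MP3 and copy it to the outputs directory for a chat.
# Arguments: text, chat ID, descriptive filename, optional base directory.
send_voice_to_chat() {
  local text="$1" chat_id="$2" filename="$3"
  local base="${4:-/tmp/metabot-outputs}"
  local tmp_file="/tmp/$filename"
  # Write to a named file; stdout is discarded since the path is already known.
  mb voice "$text" -o "$tmp_file" >/dev/null || return 1
  # Create the chat directory if it is missing (harmless when it already exists).
  mkdir -p "$base/$chat_id" || return 1
  cp "$tmp_file" "$base/$chat_id/" || return 1
  printf '%s\n' "$base/$chat_id/$filename"
}

# Example use (chat_id is whatever ID the current conversation provides):
#   send_voice_to_chat "Meeting starts in five minutes" "$chat_id" "meeting-reminder.mp3"
```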