- API Type
- Python library, plus a model API through the Hugging Face Transformers library; Suno provides no REST endpoint for this model
- Core Functions
- generate_audio() for end-to-end text-to-audio; text_to_semantic() for generating semantic tokens from text; semantic_to_waveform() for synthesizing an audio waveform from those tokens; save_as_prompt() for saving a generation as a reusable voice prompt
- Authentication
- No authentication required for the open-source model; a self-hosted deployment loads the model weights from local files
- Parameters
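- text_temp (0.0-1.0, default 0.7; higher values give more diverse semantic output), waveform_temp (0.0-1.0, default 0.7; higher values give more diverse audio), history_prompt for selecting a voice (a speaker preset or a saved voice prompt), early stopping controls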
- Output Format
- NumPy audio array at a 24 kHz sample rate; compatible with standard audio processing libraries
- Integration Methods
- Direct Python integration after a pip install, Hugging Face Transformers integration, OpenVINO optimization support, third-party platforms (Coqui TTS, Hugging Face)
- Documentation
- GitHub repository documentation, Transformers library docs, Coqui TTS documentation, OpenVINO examples, community tutorials
- SDKs and Libraries
- Python library available; unofficial SDKs and wrappers in community projects; OpenVINO integration for optimization
- Deployment Options
- Self-hosted on local GPU/CPU, cloud deployment (AWS, Google Cloud, Azure), Docker containerization, inference optimization via OpenVINO
- Rate Limits
- No rate limits for self-hosted deployment; inference speed limited by hardware capabilities
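
The core functions and output format listed above can be sketched as follows. This is a minimal illustration, assuming the library in question is Suno's open-source bark package with the call signatures as commonly documented; the helper names write_wav, synthesize, and synthesize_two_stage are hypothetical and exist only for this example. The model imports are deferred so that the WAV helper, which uses only the standard library, works even without the model installed.

```python
import struct
import wave

SAMPLE_RATE = 24_000  # 24 kHz output rate, as stated in the list above


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    `samples` can be any iterable of floats, including a 1-D NumPy array.
    Returns the number of frames written.
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(pcm))
    return len(pcm) // 2


def synthesize(text, voice=None, text_temp=0.7, waveform_temp=0.7):
    """One-call path: text in, audio array out (hypothetical wrapper)."""
    # Assumption: the bark package is installed; imported lazily on purpose.
    from bark import generate_audio, preload_models

    preload_models()  # loads model weights on first use
    return generate_audio(
        text,
        history_prompt=voice,
        text_temp=text_temp,
        waveform_temp=waveform_temp,
    )


def synthesize_two_stage(text, voice=None, temp=0.7):
    """Same result in two explicit stages (hypothetical wrapper)."""
    from bark import preload_models, semantic_to_waveform, text_to_semantic

    preload_models()
    tokens = text_to_semantic(text, history_prompt=voice, temp=temp)
    return semantic_to_waveform(tokens, history_prompt=voice, temp=temp)
```

With the model installed, a call such as write_wav("out.wav", synthesize("Hello there.")) would produce a playable file; passing a speaker preset name as voice selects a different speaker. The two-stage variant is useful mainly when the semantic tokens need to be inspected or reused before the waveform is synthesized.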