Audio Ad Production is a full pipeline for creating ready-to-upload audio ads for Spotify, Pandora, iHeartRadio, and programmatic audio platforms. You provide a topic and a few settings. ILLIXIS handles the rest: AI script writing, professional voiceover, original background music, and broadcast-standard mixing.
The pipeline runs in three automated stages. First, ILLIXIS writes three distinct script variations so you can A/B test different creative angles. Second, after you pick a script, AI voice generation creates a professional voiceover and AI music generation composes original royalty-free music (no stock library, no licensing issues). Third, the mixing engine combines voiceover and music with volume ducking, fade in/out, and loudness normalization (measured in LUFS) to meet platform loudness standards. The output is two downloadable files: an MP3 and a WAV.
Total production time is roughly 90 seconds after you approve a script.
The Audio Ads dashboard shows all your audio ads organized by status. Filter using the tabs at the top:
| Tab | What It Shows |
|-----|---------------|
| All | Every audio ad regardless of status |
| Draft | Ads being created or generating scripts |
| Script | Ads with scripts ready for review |
| Producing | Ads currently generating voiceover, music, or mixing |
| Complete | Finished ads ready for download |
To create your first audio ad, click the "Create Audio Ad" button on the dashboard.
The create form collects the information ILLIXIS needs to write your scripts. Required fields are marked with an asterisk.
| Field | Required | Description |
|-------|----------|-------------|
| Ad Title | Yes | A name for this ad (e.g., "Summer Sale 30s Spotify Ad"). Internal only -- not used in the script. |
| What are you advertising? | Yes | Describe the product, offer, or message. Include key selling points, pricing, and any details the ad should mention. The more specific you are, the better the scripts. |
| Duration | No | 15, 30, or 60 seconds. Defaults to 30 seconds. See "Duration Options" below for word count targets. |
| Ad Type | No | The structural template for the ad. Defaults to Direct Response. See "Ad Types" below. |
| Voice | No | The AI voice for the voiceover. 7 options: 4 female (Rachel, Charlotte, Matilda, Lily) and 3 male (Adam, Daniel, Charlie). Defaults to Rachel. |
| Tone | No | The delivery style. Options: Professional, Energetic, Conversational, Warm, Urgent, Authoritative, Friendly. Defaults to Professional. |
| Music Style | No | The background music genre. Options: Acoustic, Modern, Cinematic, Electronic, Ambient. Defaults to Modern. |
| Target Audience | No | Who the ad is aimed at (e.g., "Small business owners aged 30-50"). If left empty, the ad uses the target audience configured in your brand settings. |
| Call to Action | No | What the listener should do (e.g., "Visit example.com" or "Call 1-800-EXAMPLE"). Included at the end of each script variation. |
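The field table above can be summarized as a small data model. This is a minimal sketch for illustration only: the class name `AudioAdForm`, the field names, and the `validate` method are hypothetical and do not describe an ILLIXIS API. The option lists and defaults are taken from the table and from the "Ad Types" section.

```python
from dataclasses import dataclass

# Allowed values as listed in this guide.
DURATIONS = (15, 30, 60)
AD_TYPES = ("Direct Response", "Brand Awareness", "Testimonial",
            "Problem / Solution", "Announcement")
VOICES = ("Rachel", "Charlotte", "Matilda", "Lily", "Adam", "Daniel", "Charlie")
TONES = ("Professional", "Energetic", "Conversational", "Warm",
         "Urgent", "Authoritative", "Friendly")
MUSIC_STYLES = ("Acoustic", "Modern", "Cinematic", "Electronic", "Ambient")


@dataclass
class AudioAdForm:
    """Hypothetical model of the create form (not an official API)."""
    # Required fields
    ad_title: str
    description: str  # "What are you advertising?"
    # Optional fields with the documented defaults
    duration: int = 30
    ad_type: str = "Direct Response"
    voice: str = "Rachel"
    tone: str = "Professional"
    music_style: str = "Modern"
    target_audience: str = ""  # empty: brand-settings audience is used
    call_to_action: str = ""

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the form is valid."""
        problems = []
        if not self.ad_title.strip():
            problems.append("Ad Title is required")
        if not self.description.strip():
            problems.append("What are you advertising? is required")
        if self.duration not in DURATIONS:
            problems.append("Duration must be 15, 30, or 60 seconds")
        if self.ad_type not in AD_TYPES:
            problems.append("Unknown ad type")
        if self.voice not in VOICES:
            problems.append("Unknown voice")
        if self.tone not in TONES:
            problems.append("Unknown tone")
        if self.music_style not in MUSIC_STYLES:
            problems.append("Unknown music style")
        return problems
```

Only the title and description need to be supplied; every other field falls back to the default shown in the table.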
After submitting, ILLIXIS immediately begins generating three script variations. You are redirected to the Script Review page, which polls for completion automatically. Script generation typically takes 15-20 seconds.
Once generation completes, you see three script cards laid out in a grid. Each card represents a different creative approach:
| Variation | Approach | Description |
|-----------|----------|-------------|
| 1 | Emotional Hook | Leads with a feeling, aspiration, or relatable moment |
| 2 | Stat-Driven Hook | Leads with a surprising number, fact, or bold claim |
| 3 | Question Hook | Leads with a thought-provoking question |
Each card displays:
Selecting a script. Click "Select This Script" on the card you want. The selected card gets a highlighted border, and an edit area appears below the grid.
Editing a script. After selecting, you can modify the script text in the editor. A live word count shows your current count relative to the target, color-coded:
Click "Save Edits" to save changes. The word count and estimated duration update immediately.
Regenerating scripts. If none of the three variations work, click "Regenerate All Scripts" at the bottom of the page. This creates three entirely new variations using the same ad settings.
After selecting (and optionally editing) a script, click "Produce Audio Ad" in the page header. Production runs three steps sequentially, all fully automated:
| Step | Service | Duration | What Happens |
|------|---------|----------|--------------|
| 1. Voiceover | AI voice generation | ~30 seconds | Converts your script to spoken audio using the selected voice. Generates word-level timestamps for precision. |
| 2. Music | AI music generation | ~50 seconds | Composes original royalty-free background music based on the music style and the selected variation's music suggestion. Duration is matched to the voiceover plus a 2-second tail. |
| 3. Mixing | Audio engine | ~2 seconds | Combines voiceover and music. Trims or loops music to fit. Applies volume ducking during speech, fade in/out, and loudness normalization to -14 LUFS (Spotify/IAB standard). Exports MP3 (192 kbps) and WAV (16-bit, 44.1 kHz). |
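Two ideas from the table above are easy to illustrate: fitting the music to the voiceover length plus the 2-second tail, and computing the gain needed to reach -14 LUFS. The sketch below is an assumption about how such a step could work, not the ILLIXIS mixing engine. The function names are hypothetical, audio is represented as a plain list of samples, and measuring loudness itself (normally done per ITU-R BS.1770) is out of scope here.

```python
def fit_music(music: list[float], voice_len: int,
              sample_rate: int = 44100, tail_s: float = 2.0) -> list[float]:
    """Loop or trim `music` so it lasts as long as the voiceover plus a tail.

    `voice_len` is the voiceover length in samples.
    """
    target = voice_len + int(tail_s * sample_rate)
    if not music:
        return [0.0] * target  # no music: return silence of the right length
    repeats = -(-target // len(music))  # ceiling division
    return (music * repeats)[:target]   # loop if too short, trim if too long


def normalization_gain_db(measured_lufs: float,
                          target_lufs: float = -14.0) -> float:
    """Gain in dB that moves a mix from its measured loudness to the target."""
    return target_lufs - measured_lufs
```

For example, a mix measured at -18.5 LUFS would need +4.5 dB of gain to reach the -14 LUFS target.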
You are redirected to the Preview page, which shows a spinner and the current production step. The page polls every 3 seconds and refreshes automatically when production completes.
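The polling behavior described above follows a common pattern: check the status, wait 3 seconds, check again. The sketch below illustrates that pattern only. The `fetch_status` callable, the status strings `"complete"` and `"failed"`, and the timeout are all assumptions made for the example; the guide does not document a status API.

```python
import time
from typing import Callable


def wait_for_production(fetch_status: Callable[[], str],
                        interval_s: float = 3.0,
                        timeout_s: float = 300.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll `fetch_status` every `interval_s` seconds until a final status.

    `fetch_status` is a hypothetical callable that returns the current
    production step or a final status string.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in ("complete", "failed"):  # assumed terminal statuses
            return status
        if waited >= timeout_s:
            raise TimeoutError("production did not finish in time")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.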
The preview page appears once production finishes. It contains four sections:
Stats bar. Four metrics at the top:
Waveform players. Three audio tracks with waveform visualization powered by wavesurfer.js:
| Track | Waveform Color | Description |
|-------|---------------|-------------|
| Final Mix | Gold | The complete audio ad, ready for upload |
| Voiceover | Blue | The isolated spoken audio |
| Music | Green | The isolated background music |
Each track has play/pause controls and a time display. Playing one track automatically pauses the others.
Download cards. Two download options:
| Format | Specs | Best For |
|--------|-------|----------|
| MP3 | 192 kbps | Uploading to ad platforms (Spotify Ad Studio, Pandora AMP, etc.) |
| WAV | 16-bit, 44.1 kHz | Lossless audio for further editing in external tools |
Regeneration options. Three buttons at the bottom:
After production, you can regenerate individual components without starting from scratch. Each regeneration triggers a new mix automatically.
Voiceover regeneration. Generates a new voiceover from the same script text. You can switch to a different voice at the same time. After the new voiceover is generated, the system automatically re-mixes it with the existing music.
Music regeneration. Generates new background music. You can optionally provide a custom music prompt to override the AI's suggestion. After the new music is generated, the system automatically re-mixes it with the existing voiceover.
Re-mix. Re-runs the mixing step with adjusted parameters. The mixing service uses the following settings:
| Setting | Default | Description |
|---------|---------|-------------|
| Music volume | -12 dB | Volume of music relative to voiceover |
| Duck during voice | Enabled | Lowers music by an additional 8 dB during voiceover sections |
| Fade in | 500 ms | Music fade-in at the start |
| Fade out | 1000 ms | Music fade-out at the end |
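To make the defaults above concrete, the sketch below computes the music gain at a given moment in the ad. It is a simplified illustration, not the actual mixing service: the function names are hypothetical, the fades are assumed to be linear, and ducking is applied instantly, whereas a real mixer would typically smooth the transition in and out of ducked sections.

```python
def db_to_amplitude(db: float) -> float:
    """Convert a level in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def music_gain_at(t_ms: float, total_ms: float, voice_active: bool,
                  music_db: float = -12.0, duck_db: float = -8.0,
                  fade_in_ms: float = 500, fade_out_ms: float = 1000) -> float:
    """Linear gain applied to the music track at time `t_ms`.

    Defaults mirror the settings table: music at -12 dB, a further 8 dB
    reduction while the voice is speaking, 500 ms fade-in, 1000 ms fade-out.
    """
    level_db = music_db + (duck_db if voice_active else 0.0)
    gain = db_to_amplitude(level_db)

    # Linear fade envelope (an assumption; other curves are possible).
    fade = 1.0
    if fade_in_ms > 0:
        fade = min(fade, t_ms / fade_in_ms)
    if fade_out_ms > 0:
        fade = min(fade, (total_ms - t_ms) / fade_out_ms)
    return gain * max(0.0, min(1.0, fade))
```

With these defaults, music under speech sits at -20 dB relative to the voiceover, which is a linear gain of 0.1.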
Regeneration counts are tracked per ad. There is no hard limit on regenerations, but each regeneration incurs the same usage cost as initial generation.
Each ad type provides a specific structural template that guides the AI's script writing.
| Ad Type | Structure | Best For |
|---------|-----------|----------|
| Direct Response | Hook > Offer > Urgency > CTA | Sales, promotions, limited-time offers |
| Brand Awareness | Scene-setting > Brand story > Emotional connection > Tagline | Building brand affinity, top-of-funnel |
| Testimonial | Problem (before) > Discovery > Result (after) > CTA | Social proof, customer stories |
| Problem / Solution | Pain point > Agitate > Solution > Proof > CTA | Addressing specific pain points |
| Announcement | News hook > What's new > Why it matters > How to get it | Product launches, new features, events |
Duration determines the target word count. Scripts are written at approximately 2.7 words per second, which is standard pacing for audio advertising.
| Duration | Word Count Target | Tolerance | Best For |
|----------|-------------------|-----------|----------|
| 15 seconds | ~40 words | 36-44 words | Quick recall, retargeting, bumper ads |
| 30 seconds | ~80 words | 72-88 words | Standard ad unit, most common format |
| 60 seconds | ~160 words | 144-176 words | Storytelling, detailed offers, brand building |
The 30-second format is the most widely used in audio advertising and is the default.
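The word count targets and tolerances above can be checked with a few lines of code. This is a minimal sketch using the values from the table; the function name `check_script` and the returned dictionary keys are hypothetical, and words are counted by splitting on whitespace, which may differ slightly from the counter in the script editor.

```python
WORD_TARGETS = {15: 40, 30: 80, 60: 160}  # seconds -> target word count
WORDS_PER_SECOND = 2.7                    # pacing stated in this guide


def check_script(script: str, duration_s: int) -> dict:
    """Compare a script's word count with the target for its duration."""
    words = len(script.split())
    target = WORD_TARGETS[duration_s]
    margin = target // 10  # tolerance in the table is 10% either side
    return {
        "words": words,
        "target": target,
        "in_range": target - margin <= words <= target + margin,
        "estimated_seconds": round(words / WORDS_PER_SECOND, 1),
    }
```

For a 30-second ad, any script between 72 and 88 words is reported as in range.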
Audio ad production is a quota-based feature. Usage is checked before script generation.
| Plan | Audio Ads per Month | Overage |
|------|---------------------|---------|
| Trial | 2 total | N/A |
| Starter | 5 | $1 per additional ad |
| Professional | 20 | $1 per additional ad |
| Enterprise | 100 | $1 per additional ad |
One "audio ad" counts as one complete production -- script generation through final mix. Regenerating voiceover, music, or re-mixing an existing ad does not count against your quota.
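The plan table can be read as a simple rule for what happens when you start a new ad. The sketch below encodes that rule under one stated assumption: "N/A" overage for the Trial plan is interpreted as meaning no further ads can be created once the 2-ad allowance is used. The function name and return shape are hypothetical.

```python
PLAN_QUOTAS = {"Trial": 2, "Starter": 5, "Professional": 20, "Enterprise": 100}
OVERAGE_USD_PER_AD = 1


def quota_check(plan: str, ads_used: int) -> tuple[bool, int]:
    """Return (allowed, overage charge in USD) for creating one more ad.

    `ads_used` counts complete productions only; regenerations and re-mixes
    of an existing ad are not counted against the quota.
    """
    quota = PLAN_QUOTAS[plan]
    if ads_used < quota:
        return True, 0
    if plan == "Trial":
        # Assumption: overage is not available on Trial, so creation stops.
        return False, 0
    return True, OVERAGE_USD_PER_AD
```

For example, a Starter account that has already produced 5 ads this month can create a sixth for a $1 overage charge.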
Transform your articles into platform-ready video scripts with scene breakdowns, timing, and B-roll suggestions. Perfect for creating TikToks, Reels, and Shorts from existing content.
ILLIXIS offers voice cloning to give your brand a consistent, recognizable voice across all video and audio content. Clone your CEO's voice, brand spokesperson, or any voice that represents your brand.
ILLIXIS generates fully animated videos from still images using advanced AI models. Unlike Ken Burns animations that simply pan and zoom, AI video generation creates actual motion — hair blowing, fabric rippling, water flowing.
Ken Burns Mode transforms static images into engaging videos with smooth pan and zoom animations. It's the faster, lower-cost alternative to full AI video generation while still creating professional motion effects.
Social video captions in ILLIXIS have two configuration dimensions: Caption Type (the content angle) and Caption Style (the visual presentation).