Aurora enables natural, cinematic avatar generation for a wide variety of use cases — from talking heads and product demos to expressive performances.
This guide outlines **best practices** for achieving the most realistic and expressive results.
🧠 Overview
Aurora turns any photo or single image into a lifelike talking avatar.
It works best when the audio, avatar, and prompt align naturally in emotion and context.
The Aurora editor allows you to:
Upload or select an avatar photo
Add a voice input (script or audio)
Choose a prompt to define the avatar’s behavior and tone
🖼️ Image Guidelines
Aurora is designed to be flexible and works well with a variety of image types and styles — from realistic portraits to stylized art or full-body compositions.
To help the model interpret your input effectively:
You can use photos, renders, or character art — Aurora adapts automatically.
Ensure the main subject is clear and distinguishable in the image.
If motion or expressions look unnatural, try using an image with a clearer face or pose.
For multi-scene consistency, use images with similar style and framing (e.g., all portrait shots or all illustrations).
There are no strict limitations on angle, composition, or lighting — Aurora dynamically adjusts to your chosen image.
🔊 Audio & Voice Settings
🎧 Recommended Settings
Voice Model: Always use Voice Model V3 for the most natural sync and expressive range.
Voice Speed: Keep speech moderate and clear — avoid overly fast delivery.
Audio Source: You can upload recorded audio or type a script for TTS generation.
⚙️ Tips for Best Results
If lip-sync or movement feels off, slow down the voice slightly.
Use natural pauses between sentences to improve rhythm and breathing.
Keep tone consistent across multiple clips for seamless storytelling.
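One way to sanity-check pacing before generating is to estimate words per minute from the script and the audio length. The sketch below is only an illustration: the 160 words-per-minute ceiling is an assumed rule of thumb for conversational speech, not an Aurora setting, and the function names are hypothetical.

```python
# Assumed upper bound for "moderate" speech; not an Aurora specification.
MAX_MODERATE_WPM = 160.0


def words_per_minute(script: str, duration_seconds: float) -> float:
    """Estimate speaking rate from a script and the length of its audio."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(script.split())  # whitespace-separated words
    return word_count * 60.0 / duration_seconds


def pacing_hint(wpm: float) -> str:
    """Return a simple hint based on the assumed ceiling above."""
    return "slow down" if wpm > MAX_MODERATE_WPM else "ok"


script = "Welcome back. Today we are looking at three simple ways to save time."
rate = words_per_minute(script, duration_seconds=6.0)
print(round(rate), pacing_hint(rate))
```

If the hint says to slow down, lowering the voice speed slightly or adding pauses between sentences are the adjustments suggested above.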
🧩 Prompt Design — The Key to Realism
Prompts define how the avatar behaves — body movement, emotion, and style.
Use concise, descriptive prompts that match your voice tone and scene context.
🏗️ Base Prompt
Use this as the foundation for all Aurora generations:
4K studio interview, medium close‑up (shoulders‑up crop). Solid light‑grey seamless backdrop, uniform soft key‑light—no lighting change. Presenter faces lens, steady eye‑contact. Hands remain below frame, body perfectly still. Ultra‑sharp.
💡 Tip:
Use GPT to automatically optimize your final prompt by combining this base style with your specific use case. For example: “Generate an optimized Aurora prompt for a skincare product demo” — GPT will blend cinematic framing with the expressive behavior needed for that context.
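If you prefer to combine the base prompt with a behavior description by hand, a small helper can keep the result tidy. This is a minimal sketch, not part of Aurora: `build_prompt` is a hypothetical name, and the base prompt is abbreviated here, so paste the full text in practice.

```python
def build_prompt(base: str, behavior: str) -> str:
    """Join a base style prompt and a behavior description into one prompt."""
    parts = []
    for part in (base, behavior):
        text = " ".join(part.split())  # collapse stray whitespace and newlines
        if not text:
            continue  # skip empty parts
        if text[-1] not in ".!?":
            text += "."  # make sure each part ends as a sentence
        parts.append(text)
    return " ".join(parts)


# Abbreviated copy of the base prompt (assumption: shortened for the example).
BASE_PROMPT = "4K studio interview, medium close-up (shoulders-up crop). Presenter faces lens"
BEHAVIOR = "The person speaks with intense emotion and enthusiasm."

print(build_prompt(BASE_PROMPT, BEHAVIOR))
```

Keeping the base style first and the behavior second mirrors the "base setup plus behavior description" pattern this guide recommends.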
🎭 Prompt Examples
Below are Aurora prompts for common use cases:
🎤 Natural Talking Avatar
The person is talking while facing the camera directly and very naturally, with mild chest movement from breathing. Natural explaining gestures and eye movements.
🎶 Singing Avatar
The person moves in sync with the rhythm of the song, their body expressing the flow and emotion of the music. They sing with emotion, their voice and facial expressions reflecting the depth of the lyrics and music, and every note resonating with heartfelt feeling and passion for the performance.
🎙️ Podcast
The person is facing and looking to the side, as if talking to someone in that direction, with an engaged expression that shows interest in the topic.
🎥 Any Angles
The person faces and looks at the camera, talking naturally to viewers.
📱 Product Avatars — Holding Phone
The person holding the phone shows the phone's screen to the camera while talking, casually looks at the phone, and points at it from time to time.
💧 Product Avatars — Facial Skin Care
The person holds the product with its label facing the camera while explaining, and touches her face to show the product's effects on her skin.
🎁 Product Avatar — Any Object
The person holds the product with its label facing the camera while explaining, and points at the object in her hand from time to time.
💬 Product Avatars — Enthusiastic Explanation
The person holds the product with its label facing the camera while explaining, moving their hands enthusiastically to convey the product's benefits.
🤳 Selfie Avatar
The person is talking in front of the camera with one hand out of frame. The camera shakes slightly, as if hand-held.
🐾 Pet and Avatar
The person is stroking the pet's fur while talking in front of the camera.
⚡ Intense Expressions
The person speaks with intense emotion and enthusiasm.
🧍 Animation
The character speaks with unique characteristics and with natural movements.
🐕 Animals — Four-Legged
The animal speaks with natural animal mouth movements. The animal stands on all four legs.
🦊 Animal — Humanoid
The animal speaks with natural animal mouth movements while moving like a human and expressing the topic.
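To match a prompt to the type of content, it can help to keep the example prompts in a small lookup table. The sketch below is illustrative only: the keys are hypothetical labels, and only three of the prompts above are included.

```python
# Prompts copied from the examples above; the keys are hypothetical labels.
USE_CASE_PROMPTS = {
    "pet": "The person is stroking the pet's fur while talking in front of the camera.",
    "intense": "The person speaks with intense emotion and enthusiasm.",
    "animation": "The character speaks with unique characteristics and with natural movements.",
}


def prompt_for(use_case: str) -> str:
    """Look up the example prompt for a use case, ignoring case and padding."""
    key = use_case.strip().lower()
    if key not in USE_CASE_PROMPTS:
        options = ", ".join(sorted(USE_CASE_PROMPTS))
        raise KeyError(f"Unknown use case {use_case!r}; choose one of: {options}")
    return USE_CASE_PROMPTS[key]


print(prompt_for(" Pet "))
```

Failing loudly on an unknown label, rather than falling back to a default prompt, makes a mismatch between content type and prompt easier to notice.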
🧪 Troubleshooting & Optimization
If your video doesn’t look quite right, try these adjustments:
Refine your prompt — small tweaks can make big improvements in realism.
Adjust voice speed — slightly slower pacing often improves sync.
Try a different image if the motion feels mismatched to the pose.
Experiment with angles — e.g., “slightly tilted,” “shoulders-up,” or “three-quarter view.”
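When experimenting with angles, generating a few prompt variants up front makes side-by-side comparison easier. This is a rough sketch: the cue list comes from the examples above, while the "Camera angle:" wording and the function name are assumptions, not a documented Aurora syntax.

```python
# Angle cues taken from the troubleshooting examples above.
ANGLE_CUES = ("slightly tilted", "shoulders-up", "three-quarter view")


def angle_variants(prompt: str, cues=ANGLE_CUES) -> list[str]:
    """Return one copy of the prompt per angle cue, with the cue appended."""
    base = prompt.rstrip()
    # "Camera angle:" is an assumed phrasing; reword it to fit your prompt style.
    return [f"{base} Camera angle: {cue}." for cue in cues]


for variant in angle_variants("The person speaks with intense emotion and enthusiasm."):
    print(variant)
```

Changing one cue at a time keeps it clear which adjustment improved the result.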
🧭 Quick Checklist
| Category | Best Practice |
| --- | --- |
| Voice | Use Voice Model V3, moderate speech speed |
| Prompt | Use base cinematic setup + behavior description |
| Audio | Include natural pauses for breathing and emphasis |
| Regeneration | Adjust prompt or voice pace if sync feels off |
| Use Case Fit | Match prompt type to content (product, podcast, music, etc.) |
🌟 Final Notes
Aurora adapts beautifully to many creative styles — experiment freely.
Keep voice, image, and prompt emotionally consistent for the best results.
Use expressive cues like “gentle gestures,” “confident smile,” or “animated energy” to fine-tune realism.
If results vary, re-render with a slightly revised description or pacing.
