Automated & Human Video Workflow Overview
Hey Dave!
I did a ton of automated video in 2024 and pivoted to human content for 2025 with Vidzilla, but I’m still doing a lot of automation. It works. For automated content, I typically target 60-90 seconds in length, prioritizing quality.
Here's an example video:
Basic Workflow
- Script
- Audio
- Images
- Animation/Sync
- Render
1. Script
If you're working at scale, you might want to define:
- Number of paragraphs & sentences
- Purpose of each sentence
For example:
This ensures consistency across thousands of videos. Not mandatory but useful.
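A minimal sketch of what such a structure definition could look like in Python, plus a check that a generated script fits it. The purpose labels ("hook", "problem", and so on), the SPEC layout, and the naive sentence splitter are all my assumptions for illustration, not a fixed recipe.

```python
import re
from dataclasses import dataclass


@dataclass
class ParagraphSpec:
    sentences: int        # how many sentences this paragraph should have
    purposes: list[str]   # hypothetical purpose label for each sentence


# Hypothetical two-paragraph template; a real one would be longer.
SPEC = [
    ParagraphSpec(2, ["hook", "problem"]),
    ParagraphSpec(3, ["solution", "benefit", "call to action"]),
]


def count_sentences(paragraph: str) -> int:
    """Naive count: split on whitespace that follows ., ! or ?"""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return len([p for p in parts if p])


def matches_spec(script: str, spec: list[ParagraphSpec]) -> bool:
    """True if the script has the expected paragraph and sentence counts."""
    paragraphs = [p for p in re.split(r"\n\s*\n", script.strip()) if p.strip()]
    if len(paragraphs) != len(spec):
        return False
    return all(count_sentences(p) == s.sentences for p, s in zip(paragraphs, spec))
```

The same check can double as a cheap first-pass filter when you generate scripts in bulk: anything that doesn't fit the template gets dropped or regenerated.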
- I use GPT to write but I'm testing Deepseek. I always use the best model for the task.
- For 10-20 videos, I hand-edit scripts. For 1,000+, I run a helper script to filter out weaker scripts.
- I don’t use brand voice but assign emotion dynamically, e.g.:
- “For the third sentence, emphasize excitement in plain language.”
- “Explain it clearly, highlighting family fun.”
- CTAs (calls to action) are injected, e.g.:
- “In the last sentence of the second paragraph, invite users to reach out for assistance.”
At the end, I split the script so TTS can process each paragraph separately.
- Example: A 5-paragraph script becomes TTS1.txt, TTS2.txt, TTS3.txt, etc.
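Here's a rough sketch of that split step. It assumes paragraphs are separated by blank lines; the function names and the output folder are made up for the example.

```python
import re
from pathlib import Path


def split_script(script: str) -> list[str]:
    """Split a script into paragraphs on blank lines (assumed separator)."""
    parts = re.split(r"\n\s*\n", script.strip())
    return [p.strip() for p in parts if p.strip()]


def write_tts_files(script: str, out_dir: str) -> list[Path]:
    """Write each paragraph to TTS1.txt, TTS2.txt, ... inside out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, paragraph in enumerate(split_script(script), start=1):
        path = out / f"TTS{i}.txt"
        path.write_text(paragraph, encoding="utf-8")
        paths.append(path)
    return paths
```

Usage would be something like write_tts_files(script_text, "job_001"), with "job_001" being whatever folder you keep per video.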
2. Audio
- Eleven Labs API for TTS
- Saved favorite voices & settings for easy recall
- Highest quality settings → Outputs: TTS1.wav, TTS2.wav, etc. as mono, 48 kHz, 16-bit
Audio Processing
- Python to trim & reduce large gaps in .wav files
- Optional VST plugin stack (classic broadcast voiceover chain):
- 1176 compressor → LA-2A compressor → EQ
- Some Eleven Labs voices need EQ (sibilance or lack of body)
- Loudness settings:
- Voice files: -18 LUFS (Adobe Audition)
- Mixed music files: -14 LUFS (Python & Audition)
- Python stitches audio files into master.wav
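For the gap trimming and stitching, a standard-library-only sketch might look like this. The silence threshold, the maximum gap, and the pause between paragraphs are assumed values you would tune by ear, and the helper names are hypothetical. It only handles mono 16-bit files and does no loudness normalization.

```python
import sys
import wave
from array import array


def read_mono16(path):
    """Read a mono 16-bit WAV and return (sample_rate, samples)."""
    with wave.open(path, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError(f"{path}: expected mono 16-bit audio")
        rate = w.getframerate()
        samples = array("h")
        samples.frombytes(w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return rate, samples


def write_mono16(path, rate, samples):
    """Write samples as a mono 16-bit WAV."""
    out = array("h", samples)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(out.tobytes())


def shorten_gaps(samples, rate, threshold=500, max_gap_s=0.4):
    """Cap every run of quiet samples at max_gap_s seconds."""
    max_gap = int(rate * max_gap_s)
    out = array("h")
    quiet_run = 0
    for s in samples:
        if abs(s) < threshold:
            quiet_run += 1
            if quiet_run > max_gap:
                continue  # drop silence beyond the allowed gap
        else:
            quiet_run = 0
        out.append(s)
    return out


def stitch(paths, out_path, pause_s=0.3):
    """Trim gaps in each file, join them with a short pause, return seconds."""
    if not paths:
        raise ValueError("no input files")
    master = array("h")
    rate = None
    for i, path in enumerate(paths):
        file_rate, samples = read_mono16(path)
        if rate is None:
            rate = file_rate
        elif file_rate != rate:
            raise ValueError(f"{path}: sample rate differs from first file")
        if i > 0:
            master.extend([0] * int(rate * pause_s))
        master.extend(shorten_gaps(samples, rate))
    write_mono16(out_path, rate, master)
    return len(master) / rate
```

Something like stitch(["TTS1.wav", "TTS2.wav"], "master.wav") returns the master duration in seconds, which is handy later for sync.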
3. Images
Models change, so the best options vary.
Current go-to: Ideogram, Recraft, Stability (all via API).
Image Strategy
- Avoid AI for technical items (it makes up fake objects). Use real photos.
- Images per video: usually 8, with transitions.
- Consistent image types per video template (e.g., same type of image at each spot).
- Sometimes, I use Python to analyze scripts & generate 8 matching image prompts.
- Highly targeted prompts, e.g.:
- “A local HVAC company owner interacting with his community at a neighborhood picnic.”
- Focus on emotion & connection, not decoration. Images sell the story.
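As a very rough illustration of going from a script to 8 prompt slots: the sketch below just spreads the script's sentences across the slots so each image has a piece of the story to match. This is an assumption about one possible approach. The brief wording is hypothetical, and in practice each brief would be handed to an LLM or edited by hand to get a targeted, emotion-focused prompt.

```python
import re


def sentence_groups(script: str, slots: int = 8) -> list[str]:
    """Spread the script's sentences across a fixed number of image slots."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    if not sentences:
        raise ValueError("script has no sentences")
    n = len(sentences)
    groups = []
    for i in range(slots):
        start = i * n // slots
        end = (i + 1) * n // slots
        if start == end:
            # Fewer sentences than slots: reuse the nearest sentence.
            groups.append(sentences[min(start, n - 1)])
        else:
            groups.append(" ".join(sentences[start:end]))
    return groups


def prompt_briefs(script: str, slots: int = 8) -> list[str]:
    """Build one hypothetical brief per image slot."""
    return [
        f"Image {i}: a real-looking scene with people and emotion that matches: {text}"
        for i, text in enumerate(sentence_groups(script, slots), start=1)
    ]
```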
Thumbnail Creation
- AI-generated via Ideogram or Python.
- Python detects product photo size/shape, fits it to a canvas, and overlays text.
- Python checker script for detecting bad hands & artifacts (MS Phi works well).
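The "fit it to a canvas" part is mostly arithmetic. Here's a sketch that scales a photo to fit inside a canvas without distortion and centers it. The 1280x720 canvas and the 40 px margin are my assumptions; the actual pasting and text overlay would be done with whatever image library you use.

```python
def fit_to_canvas(img_w, img_h, canvas_w=1280, canvas_h=720, margin=40):
    """Return (new_w, new_h, x, y): scaled size and top-left paste position."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    avail_w = canvas_w - 2 * margin
    avail_h = canvas_h - 2 * margin
    # Use the smaller ratio so the whole photo stays inside the margins.
    scale = min(avail_w / img_w, avail_h / img_h)
    new_w = round(img_w * scale)
    new_h = round(img_h * scale)
    x = (canvas_w - new_w) // 2
    y = (canvas_h - new_h) // 2
    return new_w, new_h, x, y
```

A square 1000x1000 photo comes out as 640x640 pasted at (320, 40), and a wide 2400x600 photo as 1200x300 pasted at (40, 210).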
No AI Video Tools
- AI video doesn't connect with humans.
- Pexels for real video (avoid Storyblocks—legal risks).
4. Animation & Sync
After Effects via CLI → Better results.
Steps
- Python generates static PNGs for end, side, and image panels separately.
- Animated end slides → Exported as MP4; subtract their duration from the master.wav duration to get the time left for the image slides.
- Image slides animated into an MP4 (typically using slide-and-blur transitions with gradual scale-up expressions).
- Optional overlays (e.g., light leaks) for style.
- Text slides animated to match script segment durations (from split scripts).
- Final text animation video: textSlides.mp4.
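The duration math for the steps above can be sketched like this. Splitting the remaining time evenly across the image slides is my assumption (you could also weight slides by paragraph length), and the function names are hypothetical.

```python
import wave


def wav_duration(path):
    """Duration of a WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()


def image_slide_durations(master_s, end_slide_s, n_images=8):
    """Split the time left after the end slide evenly across image slides."""
    remaining = master_s - end_slide_s
    if remaining <= 0:
        raise ValueError("end slide is longer than the master audio")
    return [remaining / n_images] * n_images
```

With a 75-second master.wav and a 5-second end slide, each of 8 image slides gets 8.75 seconds. The text slide durations can come straight from wav_duration on each processed TTS file.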
5. Render
Final render components:
- intro.mp4
- slides.mp4
- textSlides.mp4
- outro.mp4
- master.wav
Rendering Process
- Python stacks MP4s (slides.mp4 underneath textSlides.mp4 in Z-order).
- Scene transitions applied with Python.
- Final video matches master.wav duration, ensuring smooth sync.
- YouTube-ready quality settings (balanced file size vs. quality).
- Python generates SEO-friendly file names for thumbnails & videos.
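For the file names, a small slug helper is usually enough. This sketch lowercases the title, strips accents, and turns anything that isn't a letter or digit into a hyphen. The 60-character cap is an arbitrary assumption.

```python
import re
import unicodedata


def seo_filename(title, ext, max_len=60):
    """Turn a video title into a lowercase, hyphenated file name."""
    # Strip accents so "Café" becomes "Cafe" instead of losing the letter.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    slug = slug[:max_len].rstrip("-")
    return f"{slug}.{ext.lstrip('.')}"
```

For example, seo_filename("Best HVAC Tips for Summer!", "mp4") gives "best-hvac-tips-for-summer.mp4".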
Bonus: Automating YouTube Upload & SEO
- AI agent handles YouTube uploads, descriptions, and metadata.
- Python script generates a unique blog post with video schema markup.
- AI agents can repurpose content for social media, Vimeo, etc., maximizing reach.
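For the video schema markup, here's a sketch that builds a schema.org VideoObject as a JSON-LD script tag to drop into the blog post HTML. The set of fields is a minimal selection on my part, and every value you pass in (URLs, dates, titles) is a placeholder you supply.

```python
import json


def iso8601_duration(seconds):
    """Format a length in seconds as an ISO 8601 duration, e.g. PT1M15S."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"PT{minutes}M{secs}S"


def video_schema(name, description, thumbnail_url, upload_date, duration_s, content_url):
    """Return a JSON-LD script tag describing the video."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": thumbnail_url,
        "uploadDate": upload_date,  # ISO 8601 date string, e.g. "2025-01-15"
        "duration": iso8601_duration(duration_s),
        "contentUrl": content_url,
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'
```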
Final Thoughts
Hope this was helpful—just a quick brain dump. Let me know if you have any questions! I formatted this using GPT so I could be more like Lucian!