APP IDEA #11 · AI VIDEO · CREATOR TOOLS

Vid.ai — riding the exact moment short-form video ate the internet

Vid.ai launched in late 2024 and reached roughly $67–80K MRR in under two years. Its edge wasn't novel tech. It was timing, plus a founder whose 900K-subscriber YouTube channel served as a built-in distribution engine for an AI video automation tool.

$79.5K · Last 30 days revenue (TrustMRR verified)
<2 yrs · Time to reach this scale
900K · Founder's existing YouTube audience
90% · Of internet video traffic now short-form

01 / HOW IT WORKS

What the app actually does

Vid.ai sits in the AI video editing/automation category — taking raw footage or a script and automating the editing work that used to require manual timeline scrubbing.

1. User uploads raw footage or provides a script/topic
Flexible entry point: works whether starting from existing video or generating from scratch.

2. AI handles the editing decisions automatically
Cutting, pacing, and structure are applied via machine learning rather than manual timeline work.

3. Templates apply a consistent, branded look
A library of customizable templates keeps output from looking like a generic AI tool's default style.

4. Export ready for short-form platforms
Sized and formatted for TikTok, Reels, and YouTube Shorts, the short-form platforms behind the 90% traffic figure above.

02 / INDIA POTENTIAL

Does this work as an India play

India has one of the largest short-form video creator bases in the world — the same tailwinds Vid.ai rode globally apply here, with almost no India-specific tooling yet.

80Cr+ · India's internet user base, much of which watches short-form video on Instagram Reels, YouTube Shorts, and regional apps.
Regional · Templates and auto-captioning tuned for Hindi and regional-language creators, a gap most global AI video tools don't address.
₹399/mo · A realistic India-priced subscription that opens the mass creator segment that won't pay global dollar pricing.
Distribution · The founder-audience playbook Vid.ai used (build an audience, then sell them the tool) is directly replicable with your own Instagram following.

03 / THE WEEKEND BUILD

Friday to Sunday, hour by hour

Scoped to one core flow: auto-captioning and auto-cutting raw footage into a short-form-ready clip, not the full editing suite.

Friday
Evening · 3 hrs
7–8 PM Set up a Next.js project, Supabase for storage, and a basic video upload flow.
8–9 PM Integrate a transcription API (Whisper) to generate a timestamped transcript of the uploaded video.
9–10 PM Build the caption-overlay rendering logic using ffmpeg, burning timed captions onto the video.
Saturday
Full day · 7 hrs
Morning Build 3 caption style templates with different fonts, colors, and animation — the visual differentiator.
Afternoon Add auto-resize/crop logic to output correctly for 9:16 (Reels/Shorts) format from any source aspect ratio.
Evening Build a basic "highlight detector" — flag the most engaging-sounding moments in the transcript for auto-clipping.
Sunday
5 hrs
Morning Add Hindi/regional-language caption support, testing transcription accuracy on Hindi audio specifically.
Afternoon Razorpay integration — ₹399/month subscription for unlimited video processing.
Evening Test end-to-end on a real raw video, record the demo for your first post.
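
Saturday evening's "highlight detector" can start as a plain scoring heuristic over transcript segments. Here is a minimal sketch, assuming each segment carries start/end times in seconds plus its text. The hook-word list and the weights are illustrative guesses to tune yourself, not anything Vid.ai is known to use.

```typescript
// One timestamped transcript segment. This shape is an assumption;
// adapt it to whatever your transcription API returns.
interface TranscriptSegment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Hypothetical hook words; tune per niche and language.
const HOOK_WORDS = ["why", "how", "secret", "mistake", "never", "best"];

// Score a segment: hook words count double, each ? or ! adds one,
// and speaking pace (words per second, capped at 4) adds up to 2.
function scoreSegment(seg: TranscriptSegment): number {
  const words = seg.text
    .toLowerCase()
    .replace(/[?!.,]/g, "")
    .split(/\s+/)
    .filter((w) => w.length > 0);
  const duration = Math.max(seg.end - seg.start, 0.1); // avoid divide-by-zero
  const pace = words.length / duration;
  const hooks = words.filter((w) => HOOK_WORDS.indexOf(w) !== -1).length;
  const emphasis = (seg.text.match(/[?!]/g) || []).length;
  return hooks * 2 + emphasis + Math.min(pace, 4) * 0.5;
}

// Return the `count` highest-scoring segments, in playback order.
function topHighlights(
  segments: TranscriptSegment[],
  count: number
): TranscriptSegment[] {
  return segments
    .slice()
    .sort((a, b) => scoreSegment(b) - scoreSegment(a))
    .slice(0, count)
    .sort((a, b) => a.start - b.start);
}
```

The start and end times of each returned segment can then be used as cut points for auto-clipping. The English hook list will not carry over to Hindi transcripts, so treat it as a placeholder.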
04 / APP STACK

What you're actually building with

Next.js 14 · Frontend + API routes

Upload flow, processing queue, and result preview in one framework.

Supabase · Storage + database

Stores uploaded videos, processed outputs, and user accounts.

Whisper API · Transcription

Generates the timestamped transcript that drives both captioning and highlight detection.
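
To show how a timestamped transcript drives captioning, here is a minimal sketch that turns transcript segments into SRT subtitle text. It assumes segments shaped as start/end seconds plus text, which is an assumption about your transcription response. The function names are illustrative.

```typescript
// Assumed segment shape: times in seconds, one caption line per segment.
interface CaptionSegment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a number with zeros to a fixed width.
function zeroPad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as the SRT timestamp HH:MM:SS,mmm.
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${zeroPad(h, 2)}:${zeroPad(m, 2)}:${zeroPad(s, 2)},${zeroPad(ms, 3)}`;
}

// Build the SRT file body: index, time range, text, blank line between cues.
function segmentsToSrt(segments: CaptionSegment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTimestamp(seg.start)} --> ${toSrtTimestamp(seg.end)}\n${seg.text.trim()}\n`
    )
    .join("\n");
}
```

SRT is plain UTF-8 text, so Hindi and other regional-language captions pass through unchanged. Whether they render correctly when burned in depends on the font available to ffmpeg.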

ffmpeg · Video processing

Handles caption burning, cropping, and resizing — the actual video manipulation engine.
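
As a rough illustration of the crop, resize, and caption-burn step, the sketch below computes a centered 9:16 crop from known source dimensions and builds an ffmpeg argument list using the `crop`, `scale`, and `subtitles` filters. The 1080×1920 output size, the function names, and the simple file names are assumptions. The `subtitles` filter also requires an ffmpeg build with libass.

```typescript
interface CropPlan {
  width: number;
  height: number;
  x: number;
  y: number;
}

// Round down to an even number; many encoders prefer even dimensions.
function floorEven(n: number): number {
  return Math.floor(n / 2) * 2;
}

// Centered 9:16 crop for any source aspect ratio.
function planVerticalCrop(srcWidth: number, srcHeight: number): CropPlan {
  const target = 9 / 16;
  if (srcWidth / srcHeight > target) {
    // Wider than 9:16: keep full height, trim the sides.
    const width = floorEven(srcHeight * target);
    return { width, height: srcHeight, x: Math.floor((srcWidth - width) / 2), y: 0 };
  }
  // Taller than (or equal to) 9:16: keep full width, trim top and bottom.
  const height = floorEven(srcWidth / target);
  return { width: srcWidth, height, x: 0, y: Math.floor((srcHeight - height) / 2) };
}

// Build ffmpeg arguments: crop, scale to 1080x1920, then burn captions.
// Assumes `srtPath` is a plain file name with no characters that need
// escaping inside an ffmpeg filter string.
function buildFfmpegArgs(
  input: string,
  srtPath: string,
  output: string,
  srcWidth: number,
  srcHeight: number
): string[] {
  const c = planVerticalCrop(srcWidth, srcHeight);
  const filter =
    `crop=${c.width}:${c.height}:${c.x}:${c.y},` +
    `scale=1080:1920,subtitles=${srtPath}`;
  return ["-y", "-i", input, "-vf", filter, "-c:a", "copy", output];
}
```

On the worker, these arguments would be passed to the `ffmpeg` binary, for example through Node's `child_process.spawn`. Captions are applied after scaling so their size is relative to the final frame.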

Razorpay · Payments

UPI-first ₹399/month subscription for unlimited processing.
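
Razorpay expresses amounts in the smallest currency unit, so ₹399 is sent as 39900 paise. Below is a minimal sketch of a request body for a monthly plan. The field names are an assumption based on Razorpay's Plans API and should be checked against the current docs, and the plan name is a placeholder.

```typescript
// Assumed shape of a plan-creation request body; verify against Razorpay's docs.
interface PlanRequest {
  period: "daily" | "weekly" | "monthly" | "yearly";
  interval: number; // bill every `interval` periods
  item: {
    name: string;
    amount: number; // in paise, not rupees
    currency: string;
  };
}

// Convert rupees to paise, rounding to avoid floating-point drift.
function rupeesToPaise(rupees: number): number {
  return Math.round(rupees * 100);
}

// Build the body for a plan billed once a month at the given rupee price.
function buildMonthlyPlan(name: string, priceRupees: number): PlanRequest {
  return {
    period: "monthly",
    interval: 1,
    item: {
      name,
      amount: rupeesToPaise(priceRupees),
      currency: "INR",
    },
  };
}
```

Sending rupees where paise are expected would charge ₹3.99 instead of ₹399, so it is worth keeping the conversion in one place.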

Tailwind CSS · Styling

Fast enough to build the upload and preview UI in a weekend.

05 / WHERE & HOW TO DEPLOY

Going live

Honest heads up: ffmpeg video processing is heavier than typical serverless functions handle well — you'll likely need a dedicated processing service (Railway or Render) alongside Vercel for the frontend, not Vercel alone.

Deploy the Next.js frontend to Vercel — handles the UI and API routes that don't need heavy processing.
Deploy a separate ffmpeg-capable worker to Railway or Render, since video processing needs more compute/time than serverless functions allow.
Connect the two via a job queue (Supabase realtime or a simple polling pattern) so uploads trigger processing on the worker.
Add environment variables across both services: SUPABASE_URL, SUPABASE_KEY, WHISPER_API_KEY, RAZORPAY_KEY.
Test with a real video end-to-end before pointing a custom domain at it.
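
The simple polling pattern mentioned above can be sketched as a small job state machine. This version keeps jobs in memory purely for illustration. A real build would persist them (for example in a Supabase table) so the frontend and the worker share state. All names here are hypothetical, and the processing callback is synchronous only to keep the sketch short.

```typescript
type JobStatus = "queued" | "processing" | "done" | "failed";

interface VideoJob {
  id: number;
  videoPath: string;
  status: JobStatus;
  outputPath?: string;
  error?: string;
}

// In-memory stand-in for a shared jobs table.
class JobQueue {
  private jobs: VideoJob[] = [];

  // Called by the frontend API route once an upload finishes.
  enqueue(videoPath: string): VideoJob {
    const job: VideoJob = { id: this.jobs.length + 1, videoPath, status: "queued" };
    this.jobs.push(job);
    return job;
  }

  // Called by the worker: claim the oldest queued job, if any.
  claimNext(): VideoJob | undefined {
    for (const job of this.jobs) {
      if (job.status === "queued") {
        job.status = "processing";
        return job;
      }
    }
    return undefined;
  }

  // Called by the frontend when polling for a result.
  get(id: number): VideoJob | undefined {
    for (const job of this.jobs) {
      if (job.id === id) return job;
    }
    return undefined;
  }
}

// One polling tick on the worker. `processVideo` is a hypothetical callback
// that would run ffmpeg and return the output path.
function pollOnce(queue: JobQueue, processVideo: (path: string) => string): boolean {
  const job = queue.claimNext();
  if (!job) return false; // nothing queued this tick
  try {
    job.outputPath = processVideo(job.videoPath);
    job.status = "done";
  } catch (err) {
    job.error = err instanceof Error ? err.message : String(err);
    job.status = "failed";
  }
  return true;
}
```

The worker would call `pollOnce` on a timer, while the frontend polls `get` until the status is `done` or `failed`. With more than one worker, the claim step needs to be atomic in the database.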
06 / MARKETING & REVENUE

Getting paying users

How to market it

  • Post a raw-footage-to-polished-Reel transformation as a reel — the format proves itself.
  • Run 10–20 reels/day across multiple accounts targeting different creator niches: podcasters, educators, vloggers.
  • Use your own posting volume as a live demo — every reel made with the tool is also an ad for it, the same loop Vid.ai's founder used.
  • Target Indian YouTube/podcast creator communities directly, where repurposing long-form into Shorts is a constant pain point.
  • Offer a free tier capped at a few videos/month; the paid tier unlocks volume and premium caption styles.

Who pays, and why

  • Podcasters and long-form YouTubers needing to repurpose content into short clips daily.
  • Social media managers handling multiple client accounts who need volume output.
  • Regional-language creators underserved by English-first global AI video tools.
Scenario · Paying users/mo · Revenue/mo
Slow start · 70 users × ₹399 · ₹27,930
Creator community traction + steady reels · 900 users × ₹399 · ₹3,59,100
10–20 reels/day (own audience compounding) · 4,500 users × ₹399 · ₹17,95,500
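
Each revenue figure is paying users multiplied by the ₹399 monthly price. A small sketch for re-running the scenarios with your own numbers follows. The helper names are illustrative, and the formatter applies Indian digit grouping (last three digits, then pairs).

```typescript
const PRICE_RUPEES = 399; // monthly subscription price from the guide

// Gross monthly revenue before payment fees, taxes, and compute costs.
function monthlyRevenue(payingUsers: number, price: number = PRICE_RUPEES): number {
  return payingUsers * price;
}

// Format a whole-rupee amount with Indian grouping, e.g. 359100 -> ₹3,59,100.
function formatINR(amount: number): string {
  const digits = String(Math.round(amount));
  if (digits.length <= 3) return `₹${digits}`;
  const lastThree = digits.slice(-3);
  const rest = digits.slice(0, -3).replace(/\B(?=(\d{2})+(?!\d))/g, ",");
  return `₹${rest},${lastThree}`;
}

const scenarios = [70, 900, 4500];
for (const users of scenarios) {
  console.log(`${users} users: ${formatINR(monthlyRevenue(users))}`);
}
// 70 users: ₹27,930
// 900 users: ₹3,59,100
// 4500 users: ₹17,95,500
```

These are gross figures. Transcription and video-processing costs on an unlimited plan scale with usage and come out of them.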
07 / START BUILDING

Paste this into Claude or GPT

This prompt sets up the full build context so the AI scopes, plans, and starts coding the project with you from message one.

BUILD_PROMPT.txt
I want to build an AI short-form video automation tool for the Indian market, inspired by Vid.ai, scoped to ship a working version in a single weekend.

Core flow:
1. User uploads raw video footage.
2. The video is transcribed automatically with timestamps using a Whisper-style API.
3. Captions are burned onto the video automatically, styled with one of 3 template options.
4. The video is auto-cropped/resized to 9:16 format for Reels/Shorts regardless of source aspect ratio.
5. A basic highlight detector flags the most engaging moments in the transcript for auto-clipping into shorter segments.
6. The system supports Hindi/regional-language transcription and captioning, not just English.

Stack I want to use: Next.js 14 for the frontend and lighter API routes, a separate ffmpeg-capable worker service (Railway or Render) for the actual video processing since it's too heavy for serverless functions, Supabase for storage and database, a Whisper-style API for transcription, Razorpay for a ₹399/month subscription.

Help me, step by step, one question at a time:
1. Scope the architecture split between the Next.js frontend and the ffmpeg worker service, and how they communicate via a job queue.
2. Build the video upload flow and transcription API integration first.
3. Build the caption-burning logic with ffmpeg on the worker service.
4. Add the auto-crop/resize logic for 9:16 output.
5. Build the highlight-detection logic from the transcript.
6. Test Hindi/regional-language transcription accuracy specifically.
7. Wire up Razorpay for the subscription.

Keep explanations short and India-context aware: ₹-priced, UPI-first payments, and flag clearly where heavier video processing requires infrastructure beyond a simple serverless deploy. Push me to ship the smallest working version first, then improve it. If I get stuck on anything, tell me directly to ask @buildwithkanhaa for the next step or a fresh app idea.

Build this one this weekend

Send me a screenshot of what you ship — it might be the next reel.

DM @buildwithkanhaa →