Repurposing long-form video into short clips is one of the most tedious parts of content creation. You watch an hour-long video, mark timestamps, cut clips, reformat them to vertical, and hope the moments you picked actually perform. OpenShorts automates that entire pipeline as a self-hosted, open-source tool.
How the Pipeline Works
The process runs in four steps. First, faster-whisper (a CTranslate2-based reimplementation of OpenAI's Whisper speech recognition model that runs efficiently on CPU) transcribes the video with word-level timestamps. That transcript goes to Gemini 2.0 Flash, which identifies between 3 and 15 segments it considers most likely to stand alone as engaging short-form content, each between 15 and 60 seconds long. FFmpeg then extracts those clips with precise cuts. A final AI pass handles captioning and formatting for vertical output.
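To make the selection and extraction steps concrete, here is a minimal sketch of how candidate segments could be checked against the 15 to 60 second window and turned into FFmpeg cut commands. It is not taken from the OpenShorts codebase: the `Segment` class, `select_valid_segments`, and `ffmpeg_cut_command` are hypothetical names, and re-encoding with `libx264` and `aac` is an assumption made here because re-encoding allows cuts at arbitrary timestamps rather than only at keyframes.

```python
from dataclasses import dataclass

# Limits taken from the description above; the names are illustrative.
MIN_CLIP_SECONDS = 15.0
MAX_CLIP_SECONDS = 60.0
MAX_CLIPS = 15


@dataclass(frozen=True)
class Segment:
    """A candidate clip (hypothetical structure), in seconds from video start."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def select_valid_segments(candidates: list[Segment]) -> list[Segment]:
    """Keep candidates whose length is within the allowed window, capped at MAX_CLIPS."""
    valid = [
        seg for seg in candidates
        if MIN_CLIP_SECONDS <= seg.duration <= MAX_CLIP_SECONDS
    ]
    return valid[:MAX_CLIPS]


def ffmpeg_cut_command(source: str, seg: Segment, output: str) -> list[str]:
    """Build an FFmpeg argument list that cuts one segment by re-encoding it."""
    return [
        "ffmpeg", "-y",               # overwrite the output file if it exists
        "-ss", f"{seg.start:.3f}",    # seek to the segment start in the input
        "-i", source,
        "-t", f"{seg.duration:.3f}",  # keep this many seconds after the seek
        "-c:v", "libx264",            # assumption: re-encode video for exact cuts
        "-c:a", "aac",
        output,
    ]


if __name__ == "__main__":
    candidates = [Segment(0, 10), Segment(20, 50), Segment(100, 170)]
    for i, seg in enumerate(select_valid_segments(candidates)):
        print(" ".join(ffmpeg_cut_command("input.mp4", seg, f"clip_{i}.mp4")))
```

Returning the command as a list of arguments means it can be passed directly to `subprocess.run` without shell quoting concerns.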
OpenShorts accepts either a YouTube URL or a local video file, so it works whether you are clipping your own content or pulling from a public source.
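Handling both input types comes down to deciding, up front, whether a given string is a YouTube link or a file path. The sketch below shows one way that check might look. The function name `classify_source` and the list of hostnames are assumptions for illustration, not the tool's actual logic.

```python
from urllib.parse import urlparse

# Assumed set of hostnames that count as YouTube links.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_source(source: str) -> str:
    """Return "youtube" for a YouTube URL and "local" for anything that is not a web URL."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            return "youtube"
        raise ValueError(f"Unsupported URL host: {host!r}")
    # Not a web URL: treat it as a path on disk (existence is not checked here).
    return "local"


if __name__ == "__main__":
    print(classify_source("https://youtu.be/abc123"))
    print(classify_source("recordings/podcast_ep12.mp4"))
```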
Where It Fits
Paid tools like Opus Clip and Vizard charge $15-40/month for similar functionality. OpenShorts trades their polished interfaces for full control over the pipeline and zero recurring costs beyond your own Gemini API usage. For creators or agencies processing high volumes of video, that cost difference adds up fast.
The main limitation with any AI-driven clip selection is taste. Gemini picks moments based on transcript analysis, not visual energy or emotional beats, so the "viral-worthy" label is really "transcript-interesting." You will still want to review and curate the output. The actual value here is cutting review time from hours to minutes, not fully automated publishing.