🎥

video-analyzer-skill

Download, transcribe, analyze any video

v1.1.0 · MIT License · ✓ OpenClaw
~clawhub install video-analyzer-skill

What it does

Works with YouTube, X/Twitter, and TikTok. Transcription runs locally with Whisper, so your videos stay private. Extracts a TL;DR, timestamped key moments, and actionable insights from each video.

How it works

Two-tier architecture built for speed. Level 1 grabs YouTube captions via yt-dlp in about 3 seconds, so results are near-instant when captions exist. Level 2 falls back to local Whisper transcription (about 20 seconds) for videos without captions, or when higher accuracy is needed. Everything runs locally: no API keys, no cloud costs, and your videos never leave your machine.
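The two-tier flow can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the skill's actual code: `transcribe`, `caption_command`, `fetch_captions`, and `run_whisper` are hypothetical names, and the yt-dlp flags shown are one plausible way to request captions without downloading the video.

```python
from typing import Callable, List, NamedTuple, Optional


class Transcript(NamedTuple):
    text: str
    tier: int  # 1 = captions, 2 = local Whisper


def caption_command(url: str, lang: str = "en") -> List[str]:
    """Build a yt-dlp command that fetches subtitles only (assumed invocation)."""
    return [
        "yt-dlp",
        "--skip-download",    # do not download the video itself
        "--write-subs",       # uploaded subtitles, if any
        "--write-auto-subs",  # auto-generated captions as a fallback
        "--sub-langs", lang,
        "--sub-format", "vtt",
        url,
    ]


def transcribe(
    url: str,
    fetch_captions: Callable[[str], Optional[str]],
    run_whisper: Callable[[str], str],
    force_whisper: bool = False,
) -> Transcript:
    """Try captions first; fall back to local Whisper when none exist."""
    if not force_whisper:
        captions = fetch_captions(url)
        if captions:  # None or "" both mean "no usable captions"
            return Transcript(captions, tier=1)
    # Level 2: no captions, or the caller asked for higher accuracy.
    return Transcript(run_whisper(url), tier=2)
```

Passing the two tiers in as callables keeps the fallback logic separate from how yt-dlp or Whisper is actually invoked, which also makes the flow easy to test with stubs.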

Features

⚡ Level 1: YouTube captions via yt-dlp (~3s), instant when available
🎙️ Level 2: Local Whisper transcription (~20s), works on any video
📥 Download MP4/MP3 from YouTube, X/Twitter, TikTok
📝 Timestamped transcripts with speaker detection
🧠 TL;DR + key moments + actionable insights extraction
🔍 Search by timestamp — find exactly where something was said
🔒 100% local/private — nothing leaves your machine
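
Timestamp search over a transcript can be illustrated with a short Python sketch. It assumes the transcript is available as a list of segments with a start time in seconds. The `Segment` type and the `search` and `format_timestamp` helpers are hypothetical names, not part of the skill's documented interface.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: float  # seconds from the start of the video
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS for videos over an hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def search(segments: List[Segment], query: str) -> List[str]:
    """Return '[timestamp] text' for every segment containing the query."""
    needle = query.casefold()  # case-insensitive substring match
    return [
        f"[{format_timestamp(s.start)}] {s.text}"
        for s in segments
        if needle in s.text.casefold()
    ]


segments = [
    Segment(18, "We're no strangers to love"),
    Segment(52, "Never gonna give you up"),
]
print(search(segments, "never"))  # ['[0:52] Never gonna give you up']
```

A plain substring match is the simplest choice here; fuzzy or semantic matching would be a separate design decision.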

Example

$ "analyze https://youtube.com/watch?v=dQw4w9WgXcQ"

⚡ Level 1: YouTube captions found (3.2s)
📄 Transcript: 3,847 words | Duration: 12:34

## TL;DR
Rick Astley's iconic 1987 hit. The video has become
the internet's most famous bait-and-switch meme.

## Key Moments
[0:00] Intro drum beat
[0:18] "We're no strangers to love..."
[0:52] First chorus — "Never gonna give you up"
[2:10] Bridge section
[3:14] Final chorus and fade out

## Insights
• Simple, repetitive structure = high memorability
• 80s synth production still holds up
• Meme status drove 1.5B+ views

Requirements

yt-dlp, ffmpeg, whisper-cpp, uv (Python)
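
Before installing, it may help to confirm the command-line tools are on your PATH. The sketch below is a hypothetical pre-flight check, not something the skill ships. The whisper-cpp executable name varies between installs, so it is left out of the defaults and should be passed in explicitly.

```python
import shutil
from typing import Iterable, List

# Executable names assumed to match the package names; adjust as needed.
REQUIRED_TOOLS = ("yt-dlp", "ffmpeg", "uv")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```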