How accurate is the extraction?

VidContext runs Gemini 3.1 Pro on video sampled at 2 frames per second at high resolution. It captures on-screen text, brand logos, audio cues, and scene transitions that most humans miss on a first watch.

What about compliance and data privacy?

Videos are deleted immediately after processing. No video storage, no retention. All traffic over HTTPS. API key authentication on every request. We never log or store video content.

API Video Intelligence Platform

Better than a Transcript

Give your AI agent eyes. One API call turns any video into structured intelligence — visuals, brands, on-screen text, audio, pacing, and every detail no transcript captures.

No credit card required. 20 free credits on signup.

8 analysis modes

97% cheaper than manual

<60s per video

Context Mode

response.txt

METADATA

Duration: 2:34

Resolution: 1920x1080

Platform: YouTube

VISUAL DESCRIPTION

[00:00-00:12] Wide aerial shot of modern warehouse.

Industrial LED lighting, concrete floors.

Three forklifts visible in background.

TRANSCRIPT

[00:00] Speaker 1: "Welcome to our Q3 operations facility tour..."

BRANDS DETECTED

OSHA (safety poster, 0:15)

Caterpillar (forklift logo, 0:22)

ON-SCREEN TEXT

"Q3 Operations Review" — white, top-center

Processed in 47s

5 layers analyzed

8 modes available
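
The sample above suggests that Context Mode returns plain text divided by uppercase section headers. Assuming that layout holds (the exact response format is not specified on this page), a minimal Python sketch for splitting such a response into sections might look like this. The `split_sections` helper and the trimmed `SAMPLE` text are illustrative and not part of the VidContext API.

```python
import re

# Trimmed copy of the sample response shown above (illustrative only).
SAMPLE = """METADATA
Duration: 2:34
Resolution: 1920x1080
Platform: YouTube

TRANSCRIPT
[00:00] Speaker 1: "Welcome to our Q3 operations facility tour..."

BRANDS DETECTED
OSHA (safety poster, 0:15)
Caterpillar (forklift logo, 0:22)
"""

# Assumption: a section header is a line made only of capitals, spaces, and hyphens.
HEADER = re.compile(r"[A-Z][A-Z -]+")


def split_sections(text):
    """Group non-empty lines under the most recent uppercase header."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if HEADER.fullmatch(line):
            current = line
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


sections = split_sections(SAMPLE)
print(sections["BRANDS DETECTED"])
```

An agent could then pass only the sections it needs, for example `sections["TRANSCRIPT"]`, into its prompt.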

The Problem

Your AI agent reads text, understands images, processes audio. But send it a video and it just... stops.

Transcripts miss 80% of what matters — visuals, brands, editing, pacing, on-screen text. Your agent deserves the full picture.

Text: Supported

Images: Supported

Audio: Supported

Video: Blind spot

VidContext fixes this with one API call

See The Difference

Transcript vs VidContext

What a transcript gives your AI

"Welcome to our Q3 operations facility tour. As you can see, we've made significant improvements to our safety protocols this quarter..."

That's it. Raw text. No visuals, no brands, no scene descriptions, no on-screen text, no pacing analysis.

What VidContext gives your AI

VISUAL DESCRIPTION

[00:00-00:12] Wide aerial shot of modern warehouse. Industrial LED lighting, concrete floors. Three forklifts visible.

TRANSCRIPT

[00:00] "Welcome to our Q3 operations facility tour..."

BRANDS DETECTED

OSHA (safety poster, 0:15), Caterpillar (forklift, 0:22)

ON-SCREEN TEXT

"Q3 Operations Review" — white, top-center

+ AUDIO, PACING, SCORES...

Comparison

What your AI sees today

Ask any AI model to analyze a video. Here's what you get.

ChatGPT

2/10

Can describe individual frames if you upload screenshots. Cannot process actual video files. No audio analysis. You have to manually extract frames and upload them one by one.

Manual frame extraction required

VidContext

9/10

Structured, scored output every time. Visuals, transcript, brands, on-screen text, audio analysis, pacing, scene descriptions — all in a consistent, machine-readable format. 8 specialized modes for different use cases.

Complete video intelligence

8 Modes

One video, eight lenses

Context

Full video understanding for any AI. Scenes, transcript, brands, on-screen text.

Editor

Frame-by-frame breakdown. Cut points, zoom cues, caption timing.

Creator Analysis

Content performance scoring. Hook, pacing, retention prediction.

Ad Analysis

Ad effectiveness. Message clarity, persuasion, compliance.

E-commerce

Product visibility, purchase psychology, conversion scoring.

Training

Pedagogical effectiveness. Cognitive load, engagement scoring.

UGC Vetting

Creator evaluation. Authenticity, brand safety scoring.

Competitor Intel

Competitive intelligence. Strategy, threat scoring.

Use Cases

Built for real workflows

Replace hours of manual video review

The problem

Your team watches videos frame by frame. At $25+/hour and about an hour of review per video, reviewing 100 videos costs $2,500 in labor.

The fix

Process 100 one-minute videos for $20 with structured scoring and recommendations.

97% cost reduction

<60s per video

8 modes

How It Works

Three steps. That's it.

Upload

Send a video file. MP4, MOV, WebM — up to 100MB free, 500MB on Pro.

Pick a mode

Choose from 8 specialized analysis frameworks.

Get intelligence

Structured, scored output in under 60 seconds.
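
Before uploading, a client can check a file against the limits quoted on this page (MP4, MOV, or WebM; 100MB and 60s on Free; 500MB and 15 min on Pro). The following is a rough Python sketch of such a pre-flight check. The `preflight` function, the plan names, and the choice of 1 MB = 1,000,000 bytes are assumptions made for illustration; the service may count sizes differently.

```python
from pathlib import PurePath

# Limits as quoted on this page. The byte value of "MB" is an assumption.
MB = 1_000_000
LIMITS = {
    "free": {"max_bytes": 100 * MB, "max_seconds": 60},
    "pro": {"max_bytes": 500 * MB, "max_seconds": 15 * 60},
}
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm"}


def preflight(filename, size_bytes, duration_seconds, plan="free"):
    """Return a list of problems; an empty list means the upload looks acceptable."""
    limits = LIMITS[plan]
    problems = []
    if PurePath(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > limits["max_bytes"]:
        problems.append("file too large")
    if duration_seconds > limits["max_seconds"]:
        problems.append("video too long")
    return problems


print(preflight("tour.mov", 200 * MB, 300, plan="free"))
print(preflight("tour.mov", 200 * MB, 300, plan="pro"))
```

Checking locally avoids spending an upload on a file that would be rejected anyway.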

Integrations

Works with your stack

REST API

MCP Server

cURL

Python

Node.js

Any HTTP Client

# One API call. Any language.
curl -X POST https://api.vidcontext.com/v1/analyze \
  -H "X-API-Key: vc_your_key" \
  -F "file=@video.mp4" \
  -F "mode=context"
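
The same call can be sketched in Python using only the standard library. The endpoint, the `X-API-Key` header, and the `file` and `mode` form fields come from the cURL example above; the helper names `build_request` and `analyze`, the timeout, and the assumption that the response body is UTF-8 text are illustrative choices, not documented behavior.

```python
import mimetypes
import os
import urllib.request
import uuid

API_URL = "https://api.vidcontext.com/v1/analyze"  # endpoint from the cURL example


def build_request(api_key, filename, video_bytes, mode="context"):
    """Build a multipart/form-data POST with the same fields as the cURL call."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="mode"\r\n\r\n'
        f"{mode}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + video_bytes + tail,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def analyze(api_key, path, mode="context"):
    """Upload a local video and return the response body as text (format assumed)."""
    with open(path, "rb") as f:
        request = build_request(api_key, os.path.basename(path), f.read(), mode)
    with urllib.request.urlopen(request, timeout=300) as response:
        return response.read().decode("utf-8")
```

Calling `analyze("vc_your_key", "video.mp4")` would send the request. Note that this sketch reads the whole file into memory, which is simple but may be wasteful for files near the upload limit.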

Pricing

Simple, transparent

Start free. Scale with credit packs. Go unlimited with your own Gemini key.

Free

Get started instantly

No credit card needed

✓ 20 credits on signup
✓ All 8 analysis modes
✓ 100MB files, 60s videos
✓ API access
✓ Credits never expire

Credit Packs

Pay as you go

50 credits: $10 ($0.20/credit)

200 credits: $30 ($0.15/credit), best value

One-time purchase. Credits never expire.

Unlimited

Pro (BYOK)

Bring your own Gemini API key

Monthly / Annual

$29/mo

Cancel anytime

✓ Unlimited video processing
✓ Use your own Gemini API key
✓ All 8 analysis modes
✓ 500MB files, 15 min videos
✓ Priority processing
✓ API + MCP access
✓ All 8 analysis frameworks

Video deleted immediately. Credits never expire. Cancel anytime. No contracts.

1 credit = 1 minute of video (rounded up). BYOK Pro uses your Gemini API key directly — you pay Google for processing.
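
The credit rule above can be written out as a short calculation. This is a minimal Python sketch that assumes rounding up applies per video and uses the two pack rates listed on this page; the function names are illustrative. Prices are kept in whole cents to avoid floating-point rounding.

```python
import math

# Per-credit pack rates from this page, in cents.
RATE_CENTS = {"50-pack": 20, "200-pack": 15}


def credits_needed(duration_seconds):
    """1 credit = 1 minute of video, rounded up (assumed to apply per video)."""
    return math.ceil(duration_seconds / 60)


def batch_cost_cents(durations_seconds, pack="50-pack"):
    """Total cost in cents for a batch of videos at the given pack rate."""
    total_credits = sum(credits_needed(d) for d in durations_seconds)
    return total_credits * RATE_CENTS[pack]


# The 2:34 sample video is 154 seconds long.
print(credits_needed(154))
# 100 one-minute videos at the $0.20 rate, in cents.
print(batch_cost_cents([60] * 100))
```

At the $0.20 rate, 100 one-minute videos come to 2,000 cents, which matches the $20 figure quoted in the use case above.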

FAQ

Questions & answers

What happens to my video after processing?

Deleted immediately. We never store your video files. Only the text output is saved in your account.

How is this different from a transcript?

A transcript gives you raw text — what was said. VidContext gives you everything: visual scene descriptions, on-screen text, brand detection, audio analysis, pacing, and expert scoring across specialized frameworks. It's the difference between giving your AI a paragraph vs giving it complete video intelligence.

What AI model powers the analysis?

Gemini 3.1 Pro, run on video sampled at 2 frames per second at high resolution. The extraction pipeline runs specialized analysis frameworks to produce structured output covering metadata, transcript, visual scenes, audio, on-screen text, brand detection, and expert scoring.

What is BYOK (Bring Your Own Key)?

Our Pro subscription lets you use your own Gemini API key. You get all 8 VidContext analysis modes, our prompt library, and our structured output pipeline, with no cap on the number of videos. You pay Google directly for the AI processing, at your own Gemini cost (~$0.03-0.11 per minute).
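
To judge when Pro pays off, compare the $29 monthly fee plus Gemini costs against buying credits. The following is a rough Python sketch, assuming one credit per minute, a steady monthly volume, and Gemini costs within the range quoted above; it ignores per-video rounding and any annual pricing. The function name is illustrative.

```python
PRO_MONTHLY_CENTS = 2900  # $29/mo from the pricing section


def break_even_minutes(pack_cents_per_credit, gemini_cents_per_minute):
    """Smallest whole number of minutes per month at which Pro costs no more than credits.

    Returns None when the Gemini cost per minute is not below the credit price,
    since Pro would then never catch up.
    """
    saving_per_minute = pack_cents_per_credit - gemini_cents_per_minute
    if saving_per_minute <= 0:
        return None
    # Ceiling division on integers avoids floating-point rounding.
    return -(-PRO_MONTHLY_CENTS // saving_per_minute)


# $0.15/credit pack versus the low and high ends of the quoted Gemini range.
print(break_even_minutes(15, 3))   # 242
print(break_even_minutes(15, 11))  # 725
```

Under these assumptions, Pro starts to pay off somewhere between roughly 240 and 725 minutes of video per month when compared with the 200-credit pack.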

Do I need an account to try it?

No. You get 5 analyses without even creating an account. Sign up for 20 free credits — no credit card required. After that, buy credit packs ($10 for 50 credits, $30 for 200) or subscribe to Pro with your own Gemini key for unlimited processing.

What are the 8 analysis modes?

Context (full video understanding), Editor (frame-by-frame breakdown), Creator Analysis (content scoring), Ad Analysis (ad effectiveness), E-commerce (conversion analysis), Training (pedagogical effectiveness), UGC Vetting (creator evaluation), and Competitor Intelligence (competitive threat scoring).

How hard is it to integrate?

One endpoint, one file upload, one response. Most developers integrate in under 5 minutes. We support cURL, Python, Node.js, and any HTTP client. There's also an MCP server for Claude, Cursor, and Windsurf.

Is this just a wrapper around Gemini?

Gemini handles the vision layer. The 8 analysis modes, scoring frameworks, structured output format, and the full extraction pipeline are proprietary. Raw Gemini gives you a paragraph. VidContext gives you expert-scored analysis with actionable recommendations.

Can I use the output commercially?

Yes. The output belongs to you. No restrictions on commercial use, redistribution, or downstream processing.

Give your AI agent eyes

Stop teaching your AI with transcripts. Start analyzing videos in seconds. No credit card. 20 free credits on signup.
