Give your AI agent eyes
LLMs cannot watch video. They cannot process frames, listen to audio, or read on-screen text from a video file. VidContext converts any video into structured text your agent can reason about — via a single API call or the built-in MCP server.
The gap in every AI agent
Your agent can read documents, search the web, query databases, and write code. But hand it a video file and it hits a wall. Video is the largest source of unstructured information on the internet, and your agent is blind to it.
VidContext bridges this gap. It watches the video — every frame, every word spoken, every piece of text on screen — and returns a structured text document your agent can process like any other input. Scenes, transcripts, detected brands, audio analysis, and scoring frameworks. All from one call.
Built for agent workflows
One-call video-to-text
Send a video file or URL, receive structured text your agent can immediately reason about. No multi-step pipeline to orchestrate.
MCP server included
Install with pip install vidcontext-mcp. Your agent gets video understanding as a native tool — no custom integration code required.
Works with any LLM
The output is structured text. Feed it into Claude, GPT, Gemini, Llama, Mistral, or any model. No vendor lock-in.
8 analysis modes
Context, Editor, Ad Analysis, Creator Analysis, E-commerce, Training, UGC Vetting, and Competitor Intelligence. Choose the lens that matches your agent's task.
Privacy-first processing
Video files are deleted immediately after analysis. Your agent gets the structured output — the raw video is never stored.
Predictable JSON schema
Consistent output structure across every video. Your agent's parsing logic works reliably without handling edge cases in response format.
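Because the schema is consistent, downstream parsing stays simple. The sketch below assumes a hypothetical response shape (the field names `scenes`, `transcript`, `on_screen_text`, and `brands` are illustrative; consult the API docs for the exact schema):

```python
import json

# Hypothetical response for illustration only; real field names may differ.
sample = json.loads("""
{
  "scenes": [
    {"start": 0.0, "end": 4.2, "description": "Presenter introduces the product"}
  ],
  "transcript": "Welcome to the demo...",
  "on_screen_text": ["50% OFF"],
  "brands": ["Acme"]
}
""")

# One parsing path works for every video because the structure never changes.
summary_lines = [
    f"{s['start']:.1f}-{s['end']:.1f}s: {s['description']}" for s in sample["scenes"]
]
context = "\n".join(summary_lines + ["Transcript: " + sample["transcript"]])
```

The resulting `context` string can be dropped straight into an agent's prompt.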
Two ways to integrate
REST API
curl -X POST https://api.vidcontext.com/v1/analyze \
  -H "X-API-Key: vc_your_key_here" \
  -F "file=@video.mp4" \
  -F "output_format=context"

# Returns structured JSON in ~50 seconds
# Feed the response directly into your agent's context
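The same call from Python, as a minimal sketch using the `requests` library (the endpoint, header, and form fields mirror the curl example above; the timeout value is an assumption):

```python
import requests

API_URL = "https://api.vidcontext.com/v1/analyze"

def analyze_video(path: str, api_key: str, output_format: str = "context") -> dict:
    """Upload a local video file and return the structured analysis as a dict."""
    with open(path, "rb") as f:
        resp = requests.post(
            API_URL,
            headers={"X-API-Key": api_key},
            files={"file": f},
            data={"output_format": output_format},
            timeout=120,  # analysis takes roughly 50 seconds for a 3-minute video
        )
    resp.raise_for_status()
    return resp.json()
```

The returned dict can be serialized and injected directly into your agent's context window.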
MCP server (for Claude, Cursor, and MCP-compatible agents)
# Install the MCP server
pip install vidcontext-mcp
# Add to your agent's MCP config:
{
  "mcpServers": {
    "vidcontext": {
      "command": "vidcontext-mcp",
      "args": ["--api-key", "vc_your_key_here"]
    }
  }
}
# Your agent can now call VidContext as a tool:
# "Analyze this video and tell me what brands appear"
# The agent handles the rest automatically.

The MCP server exposes VidContext as a tool your agent can call directly. No wrapper code needed.
What agents build with VidContext
Competitive intelligence agents
Monitor competitor video ads, product launches, and social content. Your agent watches hundreds of videos and surfaces trends, messaging changes, and brand positioning shifts.
Content moderation pipelines
Automate video review at scale. Your agent understands what the video communicates — not just what individual frames contain — and flags policy violations with full context.
E-commerce product analysis
Build agents that analyze product demo videos, unboxing content, and review videos. Extract product features, pricing, comparisons, and sentiment automatically.
Research and knowledge extraction
Build agents that watch lecture videos, conference talks, and training content. Extract key concepts, create summaries, and build searchable knowledge bases from video libraries.
Frequently asked questions
How do AI agents use video data?
AI agents built on LLMs like Claude, GPT, and Gemini work with text. They cannot natively process video files. VidContext converts video into structured text — scene descriptions, transcripts, on-screen text, brand detection, and audio analysis — so your agent can reason about video content the same way it reasons about any other document.
What is MCP and how does it work with VidContext?
MCP (Model Context Protocol) is a standard for giving AI agents access to external tools. VidContext provides an MCP server you can install with 'pip install vidcontext-mcp'. Once configured, your agent can call VidContext directly as a tool — no custom API integration code needed.
Which AI agent frameworks are supported?
VidContext works with any framework that can make HTTP requests or use MCP tools. This includes LangChain, CrewAI, AutoGPT, Claude with tool use, OpenAI function calling, custom Python agents, and n8n or Make automation workflows. If your agent can call a REST API, it can use VidContext.
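For frameworks that register tools via JSON Schema (the convention used by OpenAI function calling and many agent frameworks), a tool definition might look like the sketch below. The schema shape follows that convention; the parameter names and enum values are illustrative assumptions, not the documented API surface:

```python
# Illustrative tool definition an agent framework can register.
# Parameter names and enum values are assumptions for the sketch.
VIDCONTEXT_TOOL = {
    "name": "analyze_video",
    "description": "Convert a video into structured text (scenes, transcript, brands).",
    "parameters": {
        "type": "object",
        "properties": {
            "video_url": {
                "type": "string",
                "description": "URL of the video to analyze",
            },
            "output_format": {
                "type": "string",
                "description": "Analysis mode, e.g. a context or ad-analysis lens",
            },
        },
        "required": ["video_url"],
    },
}
```

Your framework calls the tool with these arguments, your handler forwards them to the REST API, and the structured text comes back as the tool result.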
Can I use this with Claude, GPT, or Gemini?
Yes. VidContext outputs structured text that any LLM can process. Use it as a tool in Claude's tool-use system, as a function in OpenAI's function calling, or as context injected into any prompt. The MCP server works natively with Claude Desktop and any MCP-compatible client.
How fast is processing for real-time agent workflows?
VidContext processes a 3-minute video in about 50 seconds. This is fast enough for automated pipelines, background processing, and agent workflows that handle video asynchronously. For agents that need to respond in real time, you can pre-process videos and cache the results.
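The pre-process-and-cache pattern can be sketched in a few lines. Here `analyze` stands in for whatever function calls the API (hypothetical in this sketch); each video is analyzed once and every later lookup is instant:

```python
import hashlib
import json
import pathlib

CACHE_DIR = pathlib.Path("vidcontext_cache")
CACHE_DIR.mkdir(exist_ok=True)

def cached_analysis(video_url: str, analyze) -> dict:
    """Return a cached result if present; otherwise call `analyze` once and cache it."""
    key = hashlib.sha256(video_url.encode()).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())
    result = analyze(video_url)  # the ~50-second API call happens only here
    path.write_text(json.dumps(result))
    return result
```

An agent that needs sub-second responses can run this in a background job over its video library, then read only from the cache at query time.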
Ready to start?
5 free analyses without an account. 20 credits on signup. No credit card required.
Try VidContext free