VidContext + LangChain

LLMs can read text and see images, but they cannot watch video. Register VidContext as a LangChain tool and your agent can analyze any video file — extracting transcripts, visual scenes, brands, and audio — then reason about the results like any other text input.

How it works

  1. Define the tool

    Create a Python function decorated with @tool that sends a video file to the VidContext API and returns the structured analysis as text.

  2. Register with your agent

    Add the tool to your agent's tool list. The agent's model decides when to call it based on the tool's name and docstring, so a clear docstring helps it route video tasks to the tool.

  3. Agent analyzes video

    When your agent encounters a video file, it calls the VidContext tool, receives structured text (transcript, scenes, brands, audio), and incorporates it into its reasoning.

  4. Agent acts on insights

    Your agent can now summarize the video, answer questions about its content, compare multiple videos, or trigger downstream actions based on what it found.
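Comparing multiple videos, as in step 4, mostly comes down to labeling each result so the model can tell them apart. The following is a minimal sketch rather than part of VidContext or LangChain: analyze_many is a hypothetical helper, and the analyze argument stands in for whatever callable wraps your VidContext request.

```python
from typing import Callable


def analyze_many(paths: list[str], analyze: Callable[[str], str]) -> str:
    """Run `analyze` on each path and join the results under labeled headers."""
    sections = []
    for index, path in enumerate(paths, start=1):
        try:
            body = analyze(path)
        except Exception as exc:
            # Keep going so one unreadable file does not discard the other results.
            body = f"ERROR: {exc}"
        sections.append(f"=== Video {index}: {path} ===\n{body}")
    return "\n\n".join(sections)
```

Passing the analyzer in as an argument keeps the helper testable without network access: you can substitute a stub function while developing the agent's prompts.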

LangChain tool implementation

import json
import os

import requests
from langchain.tools import tool

# Read the API key from the environment so it stays out of source control.
VIDCONTEXT_API_KEY = os.environ.get("VIDCONTEXT_API_KEY", "")

@tool
def analyze_video(file_path: str, mode: str = "context") -> str:
    """Analyze a video file and return structured text.

    Args:
        file_path: Path to a local video file.
        mode: Analysis mode. One of: context, editor,
              analysis, ad, ecommerce, training, ugc, competitor.

    Returns:
        Structured analysis including transcript, visual
        scenes, on-screen text, brands, and audio details.
    """
    # Upload the file as multipart form data. The long timeout leaves
    # room for large uploads and server-side processing.
    with open(file_path, "rb") as f:
        response = requests.post(
            "https://api.vidcontext.com/v1/analyze",
            headers={"X-API-Key": VIDCONTEXT_API_KEY},
            files={"file": f},
            data={"output_format": mode},
            timeout=300,
        )
    # Raise on 4xx/5xx so a failed request is not mistaken for analysis.
    response.raise_for_status()
    # Return the JSON as formatted text for the agent to read.
    return json.dumps(response.json(), indent=2)

Then add analyze_video to your agent's tools list alongside your other tools.
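Because the agent chooses the tool arguments itself, it can help to check them before uploading anything. Here is a minimal sketch, assuming the eight mode names in the docstring above are the complete set the API accepts. validate_request and the extension list are hypothetical, added for illustration only, and are not part of VidContext or LangChain.

```python
from pathlib import Path

# Mode names copied from the analyze_video docstring.
VALID_MODES = frozenset({
    "context", "editor", "analysis", "ad",
    "ecommerce", "training", "ugc", "competitor",
})

# Assumption: a guess at common video extensions, not an official list.
VIDEO_SUFFIXES = frozenset({".mp4", ".mov", ".webm", ".mkv", ".avi"})


def validate_request(file_path: str, mode: str = "context") -> list[str]:
    """Return a list of problems; an empty list means the call looks safe."""
    problems = []
    if mode not in VALID_MODES:
        problems.append(
            f"unknown mode {mode!r}; expected one of {sorted(VALID_MODES)}"
        )
    path = Path(file_path)
    if path.suffix.lower() not in VIDEO_SUFFIXES:
        problems.append(f"{path.name} does not look like a video file")
    elif not path.is_file():
        problems.append(f"{file_path} does not exist")
    return problems
```

Returning the problems as text, instead of raising, gives the agent something it can read and use to correct its next call.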

What you can build

Research agent with video evidence

Build an agent that can process video depositions, recorded interviews, or lecture recordings. It extracts the full transcript and visual context, then answers questions or writes summaries grounded in the actual content.

Content moderation agent

Create an agent that reviews user-uploaded videos for brand safety, policy violations, or compliance issues. VidContext extracts everything visible and audible, and your agent applies your moderation rules.

Marketing analysis agent

Give your marketing agent the ability to analyze competitor video ads. Use the ad mode to get scored evaluations of hook strength, messaging clarity, call-to-action effectiveness, and production quality.

Training content evaluator

Build an agent that reviews educational or training videos. It can assess whether key topics are covered, identify gaps in the material, and generate quizzes or summaries from the video content.

Alternative: MCP server

If your stack supports Model Context Protocol, skip the custom tool and run pip install vidcontext-mcp instead. Compatible with Claude Desktop, Cursor, Windsurf, and other MCP clients. See the full MCP guide.

Frequently asked questions

What does the VidContext tool return to my agent?

Structured text that includes a timestamped transcript, visual scene descriptions, on-screen text, detected brands, and audio analysis. Your agent can reason about all of this content as regular text.
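If the full JSON is more than your agent needs, you can flatten it into compact text before handing it over. This is a sketch under stated assumptions: the keys "transcript", "scenes", and "brands", and the "start" and "text" fields inside each transcript segment, are hypothetical placeholders. Inspect a real response and rename them to match.

```python
def summarize_analysis(result: dict, max_chars: int = 4000) -> str:
    """Flatten a parsed analysis into compact text for an LLM prompt.

    Assumption: every key read here is a hypothetical field name,
    not a documented part of the VidContext response.
    """
    lines = []
    for segment in result.get("transcript", []):
        lines.append(f"[{segment.get('start', '?')}] {segment.get('text', '')}")
    for scene in result.get("scenes", []):
        lines.append(f"Scene: {scene}")
    brands = result.get("brands", [])
    if brands:
        lines.append("Brands: " + ", ".join(brands))
    text = "\n".join(lines)
    # Truncate so one long video cannot fill the model's context window.
    if len(text) > max_chars:
        text = text[:max_chars] + "\n[truncated]"
    return text
```

The character limit is a blunt control. For long recordings you may prefer to split the transcript into chunks and let the agent request them one at a time.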

Which LangChain versions are supported?

The @tool decorator approach works with LangChain 0.1 and later. The VidContext API is a standard REST endpoint, so the request itself does not depend on LangChain and works with any version that supports custom tools.

Can I use the MCP server instead?

Yes. Install vidcontext-mcp with pip and configure it as an MCP server. This is the better fit if your client supports MCP natively, as Claude Desktop and Cursor do.

What analysis modes should I use for my agent?

The context mode works best for general-purpose agents. Use ad for marketing agents, ecommerce for product-focused agents, and competitor for competitive intelligence workflows.
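If you run several agents, the pairings above can live in a small lookup so each agent passes a consistent mode. This is a minimal sketch: the mode values come from the answer above, while the role labels and the pick_mode helper are hypothetical.

```python
# Mode values taken from the FAQ answer; the role labels are hypothetical.
MODE_BY_ROLE = {
    "marketing": "ad",
    "product": "ecommerce",
    "competitive_intelligence": "competitor",
}


def pick_mode(agent_role: str) -> str:
    """Return the analysis mode for a role, defaulting to "context"."""
    return MODE_BY_ROLE.get(agent_role.strip().lower(), "context")
```

Any role without an entry falls back to context, the general-purpose mode.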

Give your agent eyes

Get your API key and add video analysis to LangChain in under 10 minutes. 20 free credits on signup.
