Twelve Labs Grabs $100M as Video AI Battles Chatbots

Q: How much funding did Twelve Labs raise?

Twelve Labs raised $100 million in a Series B funding round.

Q: What is Twelve Labs building with its video AI platform?

Twelve Labs is working on AI that can make video searchable and usable at the level of individual seconds, rather than relying only on filenames, folders, captions or transcripts.

Q: Why does Twelve Labs think video AI matters?

CEO Jae Lee argues that the world happens in motion, not text, and that video is closer to the way humans receive signals to learn about the world.

Q: Who led Twelve Labs’ $100 million Series B round?

The round was co-led by NEA and NAVER Ventures, with participation from Amazon, Radical Ventures, Korea Investment Partners, Index Ventures, Quadrille Capital and Red Bull Ventures.

Q: How does PYMNTS connect Twelve Labs’ funding to the broader AI market?

PYMNTS frames the funding within a broader wave of AI-native software categories, including video generators, AI-native search products, coding assistants and AI companion apps.

The round was co-led by NEA and NAVER Ventures, with participation from Amazon, Radical Ventures, Korea Investment Partners, Index Ventures, Quadrille Capital, and Red Bull Ventures, according to TwelveLabs’ own announcement. The company did not disclose a valuation.

“Five years ago, we began with a simple observation: The world does not happen in text. It happens in motion,” Co-Founder and CEO Jae Lee wrote.

Twelve Labs lands $100 million to scale its video AI platform

TwelveLabs’ core thesis is blunt: text has been made programmable, but video remains largely locked away from machines. Lee argues that most AI systems still work from compressed descriptions of reality, while video carries the raw sequence: motion, sound, objects, speech, context, and timing.

TwelveLabs says the funding will help it advance Marengo and Pegasus, its core video models, and scale what it calls a Video Cognition System. In the company’s framing, that system is meant to make video archives searchable at the level of specific seconds, not just filenames, folders, captions, or transcripts.

The company’s blog describes three technical layers: perception, memory, and reasoning. Marengo maps visual, audio, speech, and on-screen text signals into a searchable representation. Pegasus turns those representations into descriptions, answers, summaries, scene boundaries, entities, temporal segments, and semantic context, according to the company.
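To make the idea of a “searchable representation” concrete, here is a minimal Python sketch of how moment-level search could work in principle. It is not TwelveLabs’ API or a description of how Marengo operates. It assumes each short stretch of footage has already been turned into a numeric vector by some video model, and the vectors, file names, and the `Segment` and `search_moments` names are all made up for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Segment:
    """A hypothetical indexed stretch of footage with its timestamps."""
    video_id: str
    start_s: float
    end_s: float
    embedding: list[float]  # assumed to come from some video model


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_moments(query_embedding, segments, top_k=3):
    """Rank segments by similarity to the query and return the best matches."""
    scored = [(cosine(query_embedding, s.embedding), s) for s in segments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy index with invented three-dimensional vectors.
index = [
    Segment("match_01.mp4", 0.0, 6.0, [0.9, 0.1, 0.0]),
    Segment("match_01.mp4", 6.0, 12.0, [0.1, 0.9, 0.1]),
    Segment("press_conf.mp4", 30.0, 36.0, [0.0, 0.2, 0.9]),
]

# Invented query vector standing in for an embedded text query.
query = [0.0, 1.0, 0.0]
best_score, best = search_moments(query, index, top_k=1)[0]
print(f"{best.video_id} @ {best.start_s:.0f}s-{best.end_s:.0f}s (score {best_score:.2f})")
```

The point of the sketch is the return value: a file plus a start and end time, rather than a file alone. A production system would likely replace the linear scan with a vector index and use far larger embeddings.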

That architecture matters because TwelveLabs is not pitching a consumer video generator in this announcement. It is pitching infrastructure for organizations sitting on footage they can’t efficiently query. PYMNTS connected the raise to a broader wave of AI-native software categories, including video generators, AI-native search products, coding assistants, and companion apps.

TwelveLabs’ claim	What it means in practical terms
Video is still “dark matter” to machines	Large archives exist, but much of their content is hard to search semantically
Every second should be addressable	Users should be able to locate exact moments, not just files
Models should be native to video	The company rejects treating video as captions plus sampled frames
Reasoning must span archives	Questions may require comparing events across many clips, not one video

The funding sharpens the race to make video searchable and useful for AI

This raise turns TwelveLabs’ argument into a capitalized test: can video become a first-class input for AI agents, the way text already has? Lee wrote that “the last decade of AI made text programmable,” while video has not yet had the same shift.

“The world’s video is still mostly dark matter to machines,” Lee said, noting that it sits in places like “archives … drones, and satellites,” mostly still accessed “through filenames, folders, captions, transcripts, and human memory.”

The company says video represents “upwards of 90%” of the world’s data, a figure it used in its Series B press materials. That is a company claim, but it explains the size of the bet: if even a fraction of that footage becomes searchable and usable by AI systems, video search stops being a narrow feature and becomes a workflow layer for enterprises.

The confirmed verticals are specific. TwelveLabs says it has traction in media and entertainment and is moving into the public sector, including work with governments. Its press materials also name advertising, security, sports, and automotive as areas driving demand for its platform.

The strongest counterpoint is that video AI is hard to operationalize. The company itself says brute-force approaches fail in both directions: feeding entire video libraries into a model’s context window would require technology and compute that enterprises could not justify, while turning video into a static database creates structure without intelligence. That is the technical gap TwelveLabs says its Video Cognition System is built to close.

Amazon’s role adds another layer. Amazon participated in the round, and company materials describe AWS as TwelveLabs’ preferred cloud provider and say the models are distributed through Amazon Bedrock and TwelveLabs’ own API. That connects the startup’s product story to infrastructure choices, a pressure point we’ve also covered in “Runaway AI Spending Forces a Return to Cloud Controls.”

The consumer side is also relevant, but only as context. PYMNTS previously described video generators as part of a new AI app boom, alongside AI companions, conversational search, and prompt-based coding tools. That same shift can be seen in adjacent software categories where AI changes how users discover, trust, and act on information, including the issue raised in “Shopify Trustpilot Deal Puts AI-Era Trust on the Line.”

Twelve Labs now has to turn video AI hype into enterprise adoption

The raise gives TwelveLabs room to build, but the next proof point is adoption, not vocabulary. Terms like Video Superintelligence and Video Cognition System sound ambitious. Enterprise buyers will care whether the system finds the right moment, answers with evidence, works across messy archives, and does so at a cost that makes sense.

The company’s own roadmap points to that test. It says the money will go toward advancing Marengo and Pegasus, scaling the Video Cognition System into major video archives, and expanding the team. It also says it is hiring researchers, engineers, product builders, and operators.

TwelveLabs has already moved beyond models into applications. Its press materials say the company recently launched Rodeo, its first application-layer product, as part of a push to put the system directly in the hands of creators, operators, and decision-makers without requiring integration work.

The open questions are commercial. The company’s materials do not disclose revenue, customer count, valuation, deployment scale, or pricing. They also do not show independent benchmark results for the latest models. That leaves investors and customers watching for evidence that TwelveLabs’ video AI can move from impressive search demos to daily production use.

The practical metrics are clear:

Deployments: Named enterprise rollouts in media, public sector, security, sports, advertising, or automotive.
Developer uptake: Usage through TwelveLabs’ API and Amazon Bedrock.
Model reliability: Accuracy across long, noisy, multi-speaker, multi-scene footage.
Workflow depth: Whether customers treat video search as a core operating tool, not a novelty.
Infrastructure fit: Whether the AWS relationship helps scale workloads without becoming a cost drag.

If TwelveLabs is right, video becomes one of the next major battlegrounds in AI infrastructure because it carries information text cannot preserve. If it is wrong, the $100 million buys time for a hard lesson: enterprises may want searchable video, but they will only pay for it when the answers are reliable, grounded, and fast enough to replace manual review.

The Bottom Line

The $100 million raise signals strong investor conviction that video could become a major AI interface.
Twelve Labs is targeting a hard problem: making vast video archives searchable and understandable at precise moments.
Backing from Amazon, NEA, NAVER Ventures, and others could help the startup scale its video AI models faster.

XOOMAR Insights Team

Twelve Labs Core Video AI Models

Model	Role
Marengo	Maps visual, audio, speech, and on-screen text signals into a searchable representation.
Pegasus	Generates descriptions, answers, summaries, scene boundaries, entities, temporal segments, and semantic context.