Twelve Labs just pulled in $100 million from investors making a deep tech bet that the next AI interface won’t be a chat box but searchable footage. The startup said Wednesday, July 1, that it raised a $100 million Series B to expand its work on models that can understand, index, retrieve, and reason over video, according to PYMNTS.

Twelve Labs Grabs $100M as Video AI Battles Chatbots
The round was co-led by NEA and NAVER Ventures, with participation from Amazon, Radical Ventures, Korea Investment Partners, Index Ventures, Quadrille Capital, and Red Bull Ventures, according to TwelveLabs’ own announcement. The company did not disclose a valuation in the supplied materials.
“Five years ago, we began with a simple observation: The world does not happen in text. It happens in motion,” Co-Founder and CEO Jae Lee wrote.
Twelve Labs lands $100 million to scale its video AI platform
TwelveLabs’ core thesis is blunt: text has been made programmable, but video remains largely locked away from machines. Lee argues that most AI systems still work from compressed descriptions of reality, while video carries the raw sequence: motion, sound, objects, speech, context, and timing.
TwelveLabs says the funding will help it advance Marengo and Pegasus, its core video models, and scale what it calls a Video Cognition System. In the company’s framing, that system is meant to make video archives searchable at the level of specific seconds, not just filenames, folders, captions, or transcripts.
The company’s blog describes three technical layers: perception, memory, and reasoning. Marengo maps visual, audio, speech, and on-screen text signals into a searchable representation. Pegasus turns those representations into descriptions, answers, summaries, scene boundaries, entities, temporal segments, and semantic context, according to the company.
That architecture matters because TwelveLabs is not pitching a consumer video generator in this announcement. It is pitching infrastructure for organizations sitting on footage they can’t efficiently query. PYMNTS connected the raise to a broader wave of AI-native software categories, including video generators, AI-native search products, coding assistants, and companion apps.
| TwelveLabs’ claim | What it means in practical terms |
|---|---|
| Video is still “dark matter” to machines | Large archives exist, but much of their content is hard to search semantically |
| Every second should be addressable | Users should be able to locate exact moments, not just files |
| Models should be native to video | The company rejects treating video as captions plus sampled frames |
| Reasoning must span archives | Questions may require comparing events across many clips, not one video |
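To make the "every second should be addressable" idea concrete, here is a minimal sketch of timestamp-level retrieval. It is not TwelveLabs' implementation or API. The `Segment` class, the `search` function, the file names, and the three-dimensional toy vectors are all hypothetical, and plain cosine similarity stands in for whatever representation Marengo actually produces.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """A hypothetical indexed slice of video, addressable by time range."""
    video_id: str
    start_s: float
    end_s: float
    embedding: list[float]  # toy vector; a real system would get this from a video model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index: list[Segment], query_embedding: list[float], top_k: int = 1) -> list[Segment]:
    """Return the top_k segments most similar to the query vector."""
    ranked = sorted(index, key=lambda s: cosine(s.embedding, query_embedding), reverse=True)
    return ranked[:top_k]


# Assumed toy index: two segments from one file, one from another.
index = [
    Segment("match_01.mp4", 0.0, 5.0, [1.0, 0.0, 0.0]),
    Segment("match_01.mp4", 5.0, 10.0, [0.0, 1.0, 0.0]),
    Segment("press_02.mp4", 12.0, 18.0, [0.0, 0.2, 1.0]),
]

# A query vector that points mostly along the second dimension.
best = search(index, [0.1, 0.9, 0.0])[0]
print(best.video_id, best.start_s, best.end_s)
```

The result is a file plus a start and end time rather than a file alone, which is the practical difference between searching moments and searching filenames.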
The funding sharpens the race to make video searchable and useful for AI
This raise turns TwelveLabs’ argument into a capitalized test: can video become a first-class input for AI agents, the way text already has? Lee wrote that “the last decade of AI made text programmable,” while video has not yet had the same shift.
“The world’s video is still mostly dark matter to machines,” Lee said, noting that it sits in places like “archives … drones, and satellites,” mostly still accessed “through filenames, folders, captions, transcripts, and human memory.”
The company says video represents “upwards of 90%” of the world’s data, a figure it used in its Series B press materials. That is a company claim, but it explains the size of the bet: if even a fraction of that footage becomes searchable and usable by AI systems, video search stops being a narrow feature and becomes a workflow layer for enterprises.
The confirmed verticals are specific. TwelveLabs says it has traction in media and entertainment and is moving into the public sector, including work with governments. Its press materials also name advertising, security, sports, and automotive as areas driving demand for its platform.
The strongest counterpoint is that video AI is hard to operationalize. The company itself says brute-force approaches fail in both directions: feeding entire video libraries into a model’s context window would require technology and compute that enterprises could not justify, while turning video into a static database creates structure without intelligence. That is the technical gap TwelveLabs says its Video Cognition System is built to close.
Amazon’s role adds another layer. The supplied materials say Amazon participated in the round, while additional company materials describe AWS as TwelveLabs’ preferred cloud provider and say the models are distributed through Amazon Bedrock and TwelveLabs’ own API. That connects the startup’s product story to infrastructure choices, a pressure point we’ve also covered in “Runaway AI Spending Forces a Return to Cloud Controls.”
The consumer side is also relevant, but only as context. PYMNTS previously described video generators as part of a new AI app boom, alongside AI companions, conversational search, and prompt-based coding tools. That same shift can be seen in adjacent software categories where AI changes how users discover, trust, and act on information, including the issue raised in “Shopify Trustpilot Deal Puts AI-Era Trust on the Line.”
Twelve Labs now has to turn video AI hype into enterprise adoption
The raise gives TwelveLabs room to build, but the next proof point is adoption, not vocabulary. Terms like Video Superintelligence and Video Cognition System sound ambitious. Enterprise buyers will care whether the system finds the right moment, answers with evidence, works across messy archives, and does so at a cost that makes sense.
The company’s own roadmap points to that test. It says the money will go toward advancing Marengo and Pegasus, scaling the Video Cognition System into major video archives, and expanding the team. It also says it is hiring researchers, engineers, product builders, and operators.
TwelveLabs has already moved beyond models into applications. Its press materials say the company recently launched Rodeo, its first application-layer product, as part of a push to put the system directly in the hands of creators, operators, and decision-makers without requiring integration work.
The open questions are commercial. The supplied materials do not disclose revenue, customer count, valuation, deployment scale, or pricing. They also do not show independent benchmark results for the latest models. That leaves investors and customers watching for evidence that TwelveLabs can move its video AI from impressive search demos to daily production use.
The practical metrics are clear:
- Deployments: Named enterprise rollouts in media, public sector, security, sports, advertising, or automotive.
- Developer uptake: Usage through TwelveLabs’ API and Amazon Bedrock.
- Model reliability: Accuracy across long, noisy, multi-speaker, multi-scene footage.
- Workflow depth: Whether customers treat video search as a core operating tool, not a novelty.
- Infrastructure fit: Whether the AWS relationship helps scale workloads without becoming a cost drag.
If TwelveLabs is right, video becomes one of the next major battlegrounds in AI infrastructure because it carries information text cannot preserve. If it is wrong, the $100 million buys time for a hard lesson: enterprises may want searchable video, but they will only pay for it when the answers are reliable, grounded, and fast enough to replace manual review.
The Bottom Line
- The $100 million raise signals strong investor conviction that video could become a major AI interface.
- Twelve Labs is targeting a hard problem: making vast video archives searchable and understandable at precise moments.
- Backing from Amazon, NEA, NAVER Ventures, and others could help the startup scale its video AI models faster.
Twelve Labs Core Video AI Models
| Model | Role |
|---|---|
| Marengo | Maps visual, audio, speech, and on-screen text signals into a searchable representation. |
| Pegasus | Generates descriptions, answers, summaries, scene boundaries, entities, temporal segments, and semantic context. |
Written by
XOOMAR Insights Team
Research and Editorial Desk