Building trust through more reliable AI

RagMetrics is an AI evaluation platform designed for Retrieval-Augmented Generation (RAG) systems. As enterprises adopt large language models (LLMs) for AI assistants and semantic search tools, ensuring reliable outputs is crucial. Our platform assesses retrieval relevance and generation accuracy, helping teams identify inaccuracies, conduct A/B tests, and automate quality assurance. RagMetrics is the first end-to-end observability solution for text-based generative AI, giving teams the visibility they need to make their AI systems trustworthy.

Join us

Leadership

RagMetrics was founded by seasoned technologists from Google, Microsoft, Chyron, and Cloudflare, leaders with decades of hands-on experience in machine learning and responsible AI. Together, they’re building the foundation for trustworthy, enterprise-grade generative AI.

The RagMetrics team brings deep expertise in evaluating LLMs and solving real-world challenges like hallucinations, retrieval drift, and prompt fragility. That expertise drives the RagMetrics mission: to help organizations deploy production-ready GenAI solutions.

Olivier Cohen

CEO

Hernan Lardiez

COO

Mike Moreno

CMO

Investor Opportunities

RagMetrics is currently self-funded and actively seeking early-stage investment to scale product development and accelerate enterprise traction.

Stage & Funding

Pre-seed / Seed, with founders having invested initial resources to build a robust MVP and validate core use cases.

Clear Market Signal

We address a fast-growing need for AI evaluation and QA tools, especially for RAG-based LLM systems.

Founding Team

Deep domain expertise from Microsoft, Google, and Chyron, highlighting strong founder-market fit and technical execution capability.

Interested investors are invited to connect for more details on roadmap, traction, and projected milestones.

Contact Now

RagMetrics Newsletter

Information about RagMetrics and the AI industry, including LLM Judge and AI Testing.

June 26, 2025

The Urgency of Testing GenAI and LLM Solutions

Traditional software development includes different testing methods, while Gen AI and LLM solutions are usually never thoroughly...

June 26, 2025

Bridging the Gap Between Theory and Practice in Hallucination Detection

Or how hallucination detection works!

June 26, 2025

AI Agents in Regulated Markets: Evaluation and Monitoring

AI Agents fail to meet consumers' needs when not deployed thoughtfully.

Frequently Asked Questions

Have another question? Please contact our team!

Contact Our Team

Do you have an API?

Yes, we do.

Can you run your system on a Private Cloud or on-prem?

Yes, we can run as a hosted service, on-prem, or on a private cloud.

How does an experiment work?

It's as easy as connecting your pipeline and your public model (Anthropic, Gemini, OpenAI, DeepSeek, etc.), creating a task, labeling a dataset, selecting your criteria, and running an experiment!

Which information do you need to run an experiment?

Your public API keys, the endpoint of your pipeline, a source of domain expertise for your labeled data, a concrete description of your model's task, and your own success criteria!
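
To make the experiment flow described in the two answers above more concrete, here is a minimal Python sketch of the general idea: a pipeline is called on each labeled example and every answer is scored against the chosen criteria. It is not the RagMetrics API. Every name in it (LabeledExample, run_experiment, stub_pipeline, exact_match, contains_expected) is hypothetical, the pipeline is a local stub standing in for your own endpoint, and the two criteria are simple string checks chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LabeledExample:
    """One row of a labeled dataset: an input and its expected answer."""
    question: str
    expected: str


# A criterion scores one answer against its label and returns a value in [0, 1].
Criterion = Callable[[str, str], float]


def exact_match(answer: str, expected: str) -> float:
    """Hypothetical criterion: 1.0 only if the answer equals the label."""
    return 1.0 if answer.strip().lower() == expected.strip().lower() else 0.0


def contains_expected(answer: str, expected: str) -> float:
    """Hypothetical criterion: 1.0 if the label appears anywhere in the answer."""
    return 1.0 if expected.strip().lower() in answer.lower() else 0.0


def run_experiment(
    pipeline: Callable[[str], str],
    dataset: List[LabeledExample],
    criteria: Dict[str, Criterion],
) -> Dict[str, float]:
    """Run the pipeline on every example and average each criterion's score."""
    if not dataset:
        raise ValueError("dataset must contain at least one labeled example")
    totals = {name: 0.0 for name in criteria}
    for example in dataset:
        answer = pipeline(example.question)
        for name, criterion in criteria.items():
            totals[name] += criterion(answer, example.expected)
    return {name: total / len(dataset) for name, total in totals.items()}


def stub_pipeline(question: str) -> str:
    """Stand-in for a real pipeline endpoint (assumption: yours would call a model)."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "4",
    }
    return canned.get(question, "I don't know.")


DATASET = [
    LabeledExample("What is the capital of France?", "Paris"),
    LabeledExample("What is 2 + 2?", "4"),
    LabeledExample("Who wrote Hamlet?", "Shakespeare"),
]
CRITERIA = {"exact_match": exact_match, "contains_expected": contains_expected}

scores = run_experiment(stub_pipeline, DATASET, CRITERIA)
print(scores)
```

In a real setup, stub_pipeline would be replaced by a call to your own pipeline endpoint, and the string checks by whatever success criteria you define for your task.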

Can I use my own foundational model?

Yes, it's as easy as copying and pasting your endpoint URL.

Validate LLM Responses and Accelerate Deployment

RagMetrics enables GenAI teams to validate agent responses, detect hallucinations, and speed up deployment through AI-powered QA and human-in-the-loop review.

Get Started
