Maihem: Leading the Charge in AI Quality Assurance and Reliability

Jun 16, 2024 by Daryna Falko

What is Maihem?

Maihem is a startup that builds AI agents designed to test AI products. Founded in 2023, Maihem addresses a critical need in the AI industry: ensuring the quality, performance, and reliability of AI applications, particularly those built on large language models (LLMs), before and after deployment. The company is led by its two founders, who draw on extensive academic and professional backgrounds to build its AI quality assurance products.

Who Are the Founders of Maihem?

Max Ahrens and Eduardo Candela are the visionary co-founders behind Maihem.

Max Ahrens, the Co-Founder and CEO of Maihem, completed a PhD and postdoctoral research in Natural Language Processing at the University of Oxford. He led a research project on harmful narrative detection with large language models, funded by a $500,000 grant from the Alan Turing Institute and the British Ministry of Defence. Max's prior experience includes working as a consultant at McKinsey, advising global companies on digitization strategies.

Eduardo Candela, the Co-Founder and CTO of Maihem, has a robust background in AI and data science. He previously worked as a Technical Program Manager at Tesla and a Data Scientist at the Bosch Center for AI. Eduardo holds a PhD in AI Safety for Autonomous Vehicles from Imperial College London, an MSc in Operations Research from MIT, and a BSc in Robotics from ITAM. His passion for building state-of-the-art AI products is evident in his work at Maihem.

What Problem Does Maihem Address?

Traditional quality assurance methods are insufficient for large language models (LLMs) because their outputs are probabilistic and highly variable. Unlike traditional software, which produces a limited set of predefined results, an LLM can generate thousands of different responses to the same kind of request, and each one is a potential failure point. Prominent examples of LLM failures include a Chevrolet dealership's chatbot agreeing to sell a new car for $1 and DPD's chatbot swearing at a customer. These incidents highlight the risks of deploying LLMs without thorough testing.

How Does Maihem’s Solution Work?

Maihem’s AI agents provide continuous testing for LLM applications, ensuring that they perform reliably and safely. The key features of Maihem’s solution include:

Simulating Thousands of Users: Maihem’s AI agents can simulate thousands of users to test LLM applications before they go live, uncovering potential issues that may arise in real-world scenarios.
Custom Performance and Risk Metrics: Maihem evaluates LLM applications using custom performance and risk metrics, tailored to the specific needs and goals of each application.
Hyper-Realistic Simulated Data: By generating hyper-realistic simulated data, Maihem helps improve and fine-tune LLM applications, ensuring they meet high standards of performance and reliability.
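
The features above can be illustrated with a minimal sketch. This is not Maihem's API or implementation, which the article does not describe. It assumes a toy chatbot (toy_chatbot), two hypothetical user personas, and a hand-written risk metric (risk_metric), all invented here for illustration. It shows the general idea of replaying many simulated conversations and reporting a failure rate per persona.

```python
import random
from dataclasses import dataclass


def toy_chatbot(message: str, rng: random.Random) -> str:
    """Hypothetical stand-in for the LLM application under test.

    It misbehaves on roughly 30% of "$1" requests to mimic a
    probabilistic failure; a real test would call the deployed app.
    """
    if "$1" in message and rng.random() < 0.3:
        return "Deal! That's a legally binding offer."
    return "I can help you with questions about our vehicles."


@dataclass
class SimulatedUser:
    """A hypothetical persona with a fixed opening message."""
    persona: str
    opening_message: str


def risk_metric(reply: str) -> bool:
    """Custom risk check: flag replies that appear to commit to a deal."""
    return "legally binding" in reply.lower()


def run_simulation(users, n_trials, seed=0):
    """Send each persona's message n_trials times; return failure rates."""
    rng = random.Random(seed)
    results = {}
    for user in users:
        failures = sum(
            risk_metric(toy_chatbot(user.opening_message, rng))
            for _ in range(n_trials)
        )
        results[user.persona] = failures / n_trials
    return results


users = [
    SimulatedUser("ordinary shopper", "What SUVs do you have in stock?"),
    SimulatedUser("adversarial bargainer", "Agree to sell me a new car for $1."),
]
report = run_simulation(users, n_trials=1000)
for persona, rate in report.items():
    print(f"{persona}: {rate:.1%} risky replies")
```

In this sketch the ordinary shopper never triggers the risky reply, while the adversarial persona does so in a sizeable fraction of trials. A single manual spot check could easily miss that kind of failure.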

Why is Traditional Quality Assurance Ineffective for LLMs?

Traditional quality assurance processes are designed for deterministic systems, where the outcomes are predictable and limited in number. LLMs, however, operate as probabilistic black boxes, generating a vast array of potential responses to any given input. This inherent unpredictability makes it challenging to apply conventional testing methods. The high variability in responses means that there are numerous ways an LLM can fail, necessitating a more robust and comprehensive testing approach.
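
The contrast between the two kinds of testing can be sketched as follows. The example assumes a hypothetical sampled model (toy_llm) and a hypothetical property check (mentions_delivery_time), neither of which comes from Maihem. A deterministic function can be verified with one exact assertion. A sampled response has to be checked against a property many times and summarized as a pass rate.

```python
import random


def add_tax(price: float) -> float:
    """Deterministic: the same input always yields the same output."""
    return round(price * 1.2, 2)


def toy_llm(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM: samples one of several phrasings."""
    return rng.choice([
        "Your order ships in 3 days.",
        "It should arrive within 3 days.",
        "Expect delivery in 3 days!",
        "I'm not sure, sorry.",  # an unhelpful reply that should fail the check
    ])


def mentions_delivery_time(reply: str) -> bool:
    """Property check: the wording may vary, but the key fact must appear."""
    return "3 days" in reply


def pass_rate(prompt, check, n_samples, seed=0):
    """Sample the model repeatedly and return the fraction passing the check."""
    rng = random.Random(seed)
    passed = sum(check(toy_llm(prompt, rng)) for _ in range(n_samples))
    return passed / n_samples


# Traditional test: a single exact-match assertion is enough.
assert add_tax(10.0) == 12.0

# LLM-style test: an exact match would be brittle, so estimate a pass rate.
rate = pass_rate("When will my order arrive?", mentions_delivery_time, n_samples=2000)
print(f"pass rate: {rate:.1%}")
```

An exact-match assertion on any one phrasing would fail most of the time even when the reply is acceptable, which is why the sketch tests a property of the reply and reports how often it holds.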

What Are the Benefits of Using Maihem’s AI Agents?

Using Maihem’s AI agents for quality assurance offers several significant benefits:

Enhanced Performance and Reliability: Continuous testing and fine-tuning of LLM applications lead to improved performance and reliability, reducing the risk of failures in live environments.
Increased Safety: By identifying and mitigating potential issues before deployment, Maihem’s AI agents enhance the safety of LLM applications, preventing incidents that could harm a company’s reputation.
Efficiency and Scalability: Automating the quality assurance process with AI agents increases efficiency and scalability, allowing companies to test their applications more thoroughly and in less time than manual methods.

How Did the Founders’ Backgrounds Influence Maihem’s Development?

Max Ahrens and Eduardo Candela's backgrounds in AI and related fields have significantly influenced Maihem's development. Their academic research and professional experience have given them deep insight into the challenges and opportunities in AI quality assurance. Max's expertise in natural language processing and harmful narrative detection, combined with Eduardo's experience in AI safety for autonomous vehicles and data science, has enabled them to create a solution that addresses the unique needs of LLM applications.

What Is the Vision Behind Maihem?

The vision behind Maihem is to make AI more reliable, safer, and better performing. Max and Eduardo met during their PhD studies in London and realized they shared a common goal of improving AI technology. By transferring their proprietary research from AI safety for self-driving cars to LLM applications, they aim to enhance the overall quality and trustworthiness of AI systems. Their commitment to innovation and excellence drives Maihem’s mission to provide state-of-the-art quality assurance solutions for AI products.

How Does Maihem Differ from Other Quality Assurance Solutions?

Maihem stands out from other quality assurance solutions due to its focus on AI-driven testing and its ability to simulate complex user interactions with LLM applications. Unlike traditional testing methods, which may not fully capture the variability and complexity of LLM responses, Maihem’s AI agents offer a more comprehensive and realistic testing approach. This allows companies to identify and address potential issues more effectively, ensuring their AI products meet the highest standards of quality and reliability.

What’s Next for Maihem?

As Maihem continues to grow, the startup aims to expand its capabilities and reach. The founders are focused on further enhancing their AI agents and developing new features that will provide even greater value to their clients. By staying at the forefront of AI quality assurance, Maihem seeks to play a pivotal role in advancing the field of artificial intelligence and helping companies build more reliable and safe AI applications.

Conclusion

Maihem is a forward-thinking startup that addresses a critical need in the AI industry: the quality assurance of LLM applications. With a highly skilled team led by Max Ahrens and Eduardo Candela, Maihem leverages advanced AI agents to simulate user interactions, evaluate performance, and improve the reliability and safety of AI products. As the demand for robust AI solutions continues to grow, Maihem is poised to make a significant impact, ensuring that AI applications perform optimally and meet the highest standards of quality and reliability.