llamafile Reloaded: What’s New in v0.10.0
llamafile 0.10.0 unifies portability and modern model features. Bundle weights, run multimodal models, and access tool calling and Anthropic Messages API support, all from a single executable.
AI is changing product development. When building becomes effortless, the real constraint is no longer code. It’s clarity, product judgment, and knowing when the right decision is not to ship yet.
Mozilla.ai joins Flower Hub as a launch partner with fed-phish-guard, a federated phishing detection project. The classifier trains across distributed clients and shares only model updates, allowing collaborative learning without centralizing browsing data.
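The core idea, sketched as a stdlib-only toy: each client trains on its own private data and shares only a weight update, which the server averages. The one-weight "model", the data, and the function names are illustrative, not fed-phish-guard's actual code.

```python
# Toy federated-averaging sketch: raw data never leaves a client; only
# model updates are shared and averaged by the server.

def local_update(weights, local_data, lr=0.1):
    """One local training step on a client's private (x, y) pairs."""
    grad = sum(2 * (weights * x - y) * x for x, y in local_data) / len(local_data)
    return weights - lr * grad

def federated_round(global_weights, clients):
    """Server averages the clients' locally computed updates."""
    updates = [local_update(global_weights, data) for data in clients]
    return sum(updates) / len(updates)

# Three clients, each holding private data that stays local.
clients = [[(1.0, 2.0)], [(2.0, 4.0)], [(0.5, 1.0)]]
w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
print(round(w, 2))  # converges toward 2.0, the shared underlying slope
```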
AI lets engineers generate thousands of lines of code in minutes. But humans still reason about systems slowly. That gap forces a rethink of ownership, reliability, and where safety really lives in modern software systems.
The Star Chamber runs code reviews across multiple LLM providers and aggregates their feedback by consensus. Instead of relying on one model’s perspective, developers get a structured view of where models agree, disagree, and raise unique insights.
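The aggregation step can be sketched in a few lines of stdlib Python: collect each model's findings, then bucket them by how many reviewers agree. The reviewer names and findings below are illustrative, not the Star Chamber's actual output format.

```python
# Toy consensus aggregation: findings shared by a majority of models are
# "consensus"; findings raised by a single model are "unique insights".
from collections import Counter

reviews = {
    "model_a": {"sql-injection in query builder", "missing null check"},
    "model_b": {"sql-injection in query builder", "unbounded retry loop"},
    "model_c": {"sql-injection in query builder", "missing null check"},
}

counts = Counter(f for findings in reviews.values() for f in findings)
quorum = len(reviews) // 2 + 1  # simple majority

consensus = sorted(f for f, n in counts.items() if n >= quorum)
unique = sorted(f for f, n in counts.items() if n == 1)

print("consensus:", consensus)
print("unique:", unique)
```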
any-llm now integrates with JupyterLiteAI, LangChain, and Headroom. A single provider-agnostic layer powering notebooks, agents, and context optimization across OpenAI, Anthropic, Mistral, and local models.
Run any model from any provider, whether OpenAI, Claude, Mistral, or llamafile, through one interface, now in Go. any-llm-go delivers type-safe provider abstraction, channel-based streaming, and normalized error handling across eight providers.
The newest integration with any-guardrail: Alinia AI, whose security models are specifically built to detect threats like prompt injection, data exfiltration, and policy violations by understanding the cultural and linguistic nuances of multilingual AI interactions.
A technical evaluation of multilingual, context-aware AI guardrails, analyzing how English and Farsi responses are scored under identical policies. The findings surface scoring gaps, reasoning issues, and consistency challenges in humanitarian deployments.
The State of AI report from OpenRouter and a16z offers valuable insight into API-based model usage. But many small models run locally on CPUs and consumer GPUs, outside managed services, so the report's picture of model usage warrants critical examination.
Octonous helps people stay in flow while work runs across connected apps. The assistant takes real actions, reads messages, updates tools, and creates items while showing every step and asking for approval before anything changes.
Remember early 2025? "Vibe coding" was a meme and seemed mostly a tool for casual builders or those new to coding. It's now 2026, and we find ourselves living in a new reality. Industry leaders like DHH, Karpathy, and Lutke are publicly embracing AI-generated code controlled by human prompting.
Product Release
Secure, encrypted LLM API key management across OpenAI, Anthropic, and Google providers. Track costs, set budgets, and avoid vendor lock-in. Free beta access is open now.
Product Release
The new plugin system transforms mcpd from a tool-server manager into an extensible enforcement and transformation layer—where authentication, validation, rate limiting, and custom logic live in one governed pipeline.
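The enforcement-pipeline pattern the post describes can be sketched as a chain of functions, each of which inspects, transforms, or rejects a request before it reaches a tool server. The plugin names and request shape here are illustrative, not mcpd's actual plugin API.

```python
# Toy middleware pipeline: auth, rate limiting, and a transformation
# plugin run in order; any plugin may raise to block the request.

def auth_plugin(request):
    if request.get("token") != "secret":
        raise PermissionError("unauthenticated")
    return request

def rate_limit_plugin(request, _calls={"n": 0}, limit=100):
    _calls["n"] += 1
    if _calls["n"] > limit:
        raise RuntimeError("rate limit exceeded")
    return request

def redact_plugin(request):
    # Example transformation: strip a sensitive field before forwarding.
    return {k: v for k, v in request.items() if k != "internal_notes"}

PIPELINE = [auth_plugin, rate_limit_plugin, redact_plugin]

def handle(request):
    for plugin in PIPELINE:
        request = plugin(request)
    return request  # would be forwarded to the tool server

out = handle({"token": "secret", "tool": "search", "internal_notes": "x"})
print(out)  # {'token': 'secret', 'tool': 'search'}
```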
Announcement
The year 2025 has been a busy one at Mozilla.ai. From hosting live demos and speaking at conferences to releasing our latest open-source tools, we made a lot of progress and explored plenty of new ground this year.
Technical Content
Leverage the JVM's polyglot capabilities to create a self-contained, enterprise-optimized server-side blueprint that combines the performance benefits of WebAssembly with the reliability and maturity of Java's ecosystem.
Product Release
any-llm managed platform adds end-to-end encrypted API key storage and usage tracking to the any-llm ecosystem. Keys are encrypted client-side, never visible to us, while you monitor token usage, costs, and budgets in one place. Supports OpenAI, Anthropic, Google, and more.
Product Release
Encoderfile compiles encoders into single-binary executables with no runtime dependencies, giving teams deterministic, auditable, and lightweight deployments. Built on ONNX and Rust, Encoderfile is designed for environments where latency, stability, and correctness matter most.
Product Release
With mcpd-proxy, teams no longer juggle multiple MCP configs. Run all servers behind one proxy and give every developer the same zero-config access inside their IDE.
Product Release
Gain visibility and control over your LLM usage. any-llm-gateway adds budgeting, analytics, and access management to any-llm, giving teams reliable oversight for every provider.
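Per-key budgeting of the kind the gateway provides can be sketched with a small tracker: record spend per call and refuse requests that would exceed a limit. The price table, numbers, and class shape are illustrative, not any-llm-gateway's real API.

```python
# Toy budget tracker: charge each call against a spending limit.
PRICE_PER_1K_TOKENS = {"openai": 0.002, "anthropic": 0.003}

class Budget:
    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, provider, tokens):
        cost = PRICE_PER_1K_TOKENS[provider] * tokens / 1000
        if self.spent_usd + cost > self.limit_usd:
            raise RuntimeError("budget exceeded")
        self.spent_usd += cost
        return cost

team = Budget(limit_usd=0.01)
team.charge("openai", 2000)      # $0.004
team.charge("anthropic", 1000)   # $0.003
print(round(team.spent_usd, 3))  # 0.007
```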
Technical Content
AI Agents extend large language models beyond text generation. They can call functions, access internal and external resources, perform deterministic operations, and even communicate with other agents. Yet, most existing guardrails weren’t built to protect these operations.
Product Release
Run models from any provider, whether OpenAI, Claude, Mistral, or llama.cpp, from one interface. any-llm v1.0 delivers production-ready stability, standardized reasoning output, and auto provider detection for seamless use across cloud and local models.
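The provider-abstraction idea behind this can be illustrated with a stdlib-only dispatch sketch: one call signature, with the backend chosen from the model string. The backend functions and model names are hypothetical; this is not any-llm's actual API.

```python
# Toy provider dispatch: 'provider/model' strings route to a backend.
def _openai_backend(model, messages):
    return f"[openai:{model}] " + messages[-1]["content"]

def _local_backend(model, messages):
    return f"[local:{model}] " + messages[-1]["content"]

BACKENDS = {"openai": _openai_backend, "llamacpp": _local_backend}

def completion(model, messages):
    """Dispatch a 'provider/model' identifier to the matching backend."""
    provider, _, name = model.partition("/")
    return BACKENDS[provider](name, messages)

msgs = [{"role": "user", "content": "hello"}]
print(completion("openai/gpt-4o", msgs))     # [openai:gpt-4o] hello
print(completion("llamacpp/qwen2.5", msgs))  # [local:qwen2.5] hello
```

The same call works for cloud and local backends, which is the portability the one-interface design buys.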
Product Release
Mozilla.ai is adopting llamafile to advance open, local, privacy-first AI—and we’re inviting the community to help shape its future.
Product Release
Building AI agents is hard, not just due to LLMs, but also because of tool selection, orchestration frameworks, evaluation, safety, etc. At Mozilla.ai, we’re building tools to facilitate agent development, and we noticed that guardrails for filtering unsafe outputs also need a unified interface.