Pruna AI was founded in 2023 by Bertrand Charpentier, John Rachwan, Rayan Nait Mazi and Stephan Günnemann.

It is a Franco-German startup based in Paris (Station F) and Munich that has just raised $6.5 million in seed funding from EQT Ventures, with participation from Daphni, Motier and Kima Ventures. Backed by recognized figures such as Roxanne Varza, Olivier Pomel and Hervé Nivon, Pruna AI has established itself as a key player in AI model optimization.

Its solution makes any AI model faster, smaller, cheaper and, above all, greener with a single line of code, regardless of the hardware used. It covers a wide spectrum of applications: computer vision, NLP, audio and graphs, for both predictive and generative use cases.

Pruna AI was selected as one of five European startups in the AI program launched by Meta, Hugging Face and Scaleway. Its optimization engine addresses a major challenge: the exponentially growing consumption of computing power, which has become an economic and ecological constraint for businesses. Its technology can reduce the carbon emissions of AI models by up to 91%.

Created by machine learning researchers from institutions such as the Technical University of Munich, Pruna AI aims to make AI accessible, high-performing and sustainable for businesses, with a particular focus on European SMEs.

With Pruna AI, Bertrand Charpentier and his team want to transform the artificial intelligence landscape. Their optimization engine enables companies, even the smallest ones, to deploy high-performance models while addressing economic and environmental challenges.

 

Tell us about the genesis of Pruna AI

Our goal is to solve a fundamental problem in the AI ecosystem: the cost and energy impact of models. If 80 million people, equivalent to Germany’s population, generated 5 pages with ChatGPT every day, it would take two nuclear power plants running 24/7 to cover the energy demand.

A recent Deloitte report highlights the growing environmental impact of AI and offers a detailed analysis of data center energy consumption. According to its projections, in a high-adoption scenario, electricity demand could reach 3,550 TWh by 2050, a figure exceeding the current consumption of the European Union. This growth would be driven largely by artificial intelligence, whose share of this consumption would rise from 11% to 68%.

Our solution is an AI model optimization engine. With a single line of code, we automatically reduce the size, execution time, cost and carbon footprint of models. It works on any type of hardware and covers several types of AI: vision, NLP, audio and graphs, whether for predictive or generative uses.

Access to computing power is a growing problem. During my PhD at the Technical University of Munich, I found that producing a single scientific paper costs around 10,000 euros in computing resources. For companies developing large models, this bill often runs to several million euros. For example, generating one high-definition image takes roughly the energy contained in a phone battery. Some scaleups produce up to 5 million images per day. Extended over a month, the energy cost savings can reach 1.5 to 1.7 million euros.

So we developed a solution to democratize access to AI performance. Thanks to our automatic compression, even startups and SMEs can run complex models at lower cost. This has transformed our way of seeing AI: it becomes an accessible, high-performing and, above all, sustainable tool.

 

Other companies regularly attempt to develop similar solutions. They are often acquired by major players like NVIDIA or Red Hat, who seek to internalize expertise around model compression.

Unlike a consulting approach that compresses models one by one, manually, we aim for complete process automation to guarantee reproducibility, speed and efficiency.

Some companies specialize in a specific type of model or hardware, which limits their adaptability. We take the opposite approach: a flexible solution that adapts to a variety of models and to hardware environments that change frequently.

 

What is your 3-year vision?

It rests on two major pillars: making AI accessible and making it sustainable. Today, many companies cannot use AI because of high costs or significant technological constraints, and models consume enormous amounts of energy, which limits their deployment in many use cases. By simplifying model compression, we lower these barriers and enable wider adoption.

Over a two- to three-year horizon, this vision materializes in several ways. For now, we mainly focus on compression for inference, that is, the phase where models are deployed and used in production. But we have also developed solid expertise in compression for training, that is, optimizing how AI models learn, which is a major lever for reducing costs and energy impact.

Another essential aspect is model performance and reliability, which are at the heart of our approach. Compression must never come at the expense of these criteria. This involves rigorous benchmarking to validate that the compressed model, like the base model, remains reliable, even under extreme conditions.

This is where my expertise in reliability through uncertainty estimation comes in. It consists of ensuring that the model maintains safe and consistent behavior, even when faced with abnormal or unexpected situations. This ensures AI that is not only high-performing, but also reliable and robust.

 

Which AI solutions have you selected?

Unlike other approaches that use a single compression method, we combine several techniques to find the ideal configuration according to needs: inference time reduction, cost minimization, hardware compatibility.

We also use platforms like Hugging Face. We take existing models, automatically compress them and republish them optimized. This allows companies to compare compressed models and gain performance without additional effort.

 

How did you overcome cultural and human challenges when integrating AI?

The first barrier is making AI understandable. Many companies still perceive it as “black magic.” We had to prove its reliability with concrete results: inference time reduced from 13.5 seconds to 3.6 seconds, and cost reduced from 10,000 euros to 2,600 euros for one million images. To show this, we developed a comparison tool, available at https://flux-pruna-benchmark.vercel.app/
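To put those two figures in relative terms, here is a minimal arithmetic sketch. It uses only the numbers quoted above; the helper names `speedup` and `reduction_pct` are illustrative and not part of any Pruna AI tool.

```python
# Figures quoted in the interview; helper names are illustrative only.

def speedup(before: float, after: float) -> float:
    """How many times faster the 'after' measurement is."""
    return before / after


def reduction_pct(before: float, after: float) -> float:
    """Relative reduction from 'before' to 'after', in percent."""
    return (before - after) / before * 100


# Inference time: 13.5 s down to 3.6 s
latency_speedup = speedup(13.5, 3.6)
latency_cut = reduction_pct(13.5, 3.6)

# Cost for one million images: 10,000 EUR down to 2,600 EUR
cost_cut = reduction_pct(10_000, 2_600)
cost_per_image_before = 10_000 / 1_000_000
cost_per_image_after = 2_600 / 1_000_000

print(f"speedup: {latency_speedup:.2f}x")
print(f"latency reduction: {latency_cut:.1f}%")
print(f"cost reduction: {cost_cut:.1f}%")
print(f"cost per image: {cost_per_image_before:.4f} -> {cost_per_image_after:.4f} EUR")
```

In other words, the quoted results correspond to a speedup of about 3.75 times and a cost reduction of about 74%, or roughly one cent per image before and a quarter of a cent after.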

We demonstrate through these tools that AI can bring real gains: increased performance, financial savings and reduced energy footprint.

 

What technical challenges did you encounter during implementation?

One of the major challenges was to make our solution easy to use while guaranteeing optimal performance. Our engine automatically tests several combinations of compression methods to adapt to the use case.
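The idea of testing several combinations of compression methods can be illustrated with a toy search. This is only a sketch under stated assumptions, not Pruna AI's engine: the method names, the multipliers and the quality threshold are hypothetical placeholders, and a real system would measure each candidate on actual hardware instead of multiplying fixed factors.

```python
from itertools import combinations

# Hypothetical methods with placeholder effects (NOT measured data):
# each maps to (latency multiplier, quality multiplier).
METHODS = {
    "quantization": (0.60, 0.98),
    "pruning": (0.80, 0.97),
    "compilation": (0.70, 1.00),
    "caching": (0.50, 0.95),
}


def evaluate(combo, base_latency=10.0, base_quality=1.0):
    """Estimate latency and quality after applying every method in combo."""
    latency, quality = base_latency, base_quality
    for name in combo:
        latency_mult, quality_mult = METHODS[name]
        latency *= latency_mult
        quality *= quality_mult
    return latency, quality


def best_combination(min_quality=0.95):
    """Exhaustively try all combinations; keep the fastest that stays above min_quality."""
    best, best_latency = (), evaluate(())[0]  # start from the uncompressed baseline
    for size in range(1, len(METHODS) + 1):
        for combo in combinations(METHODS, size):
            latency, quality = evaluate(combo)
            if quality >= min_quality and latency < best_latency:
                best, best_latency = combo, latency
    return best, best_latency


best, latency = best_combination()
print(best, round(latency, 2))
```

With these placeholder numbers, the most aggressive combinations are rejected because their estimated quality falls below the threshold, which mirrors the constraint described here: speed gains only count if the model's performance is preserved.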

Another challenge was establishing trust. Our customers want to be sure that compression does not degrade model performance, so we provide comparative benchmarks to validate each step.

 

What positive changes have you observed in your team’s dynamics thanks to AI?

We will be a team of 13 in January. We actively use tools like GitHub Copilot and Cursor to accelerate our work. This allows us to save time on repetitive tasks and stay focused on innovation and R&D.

From the start, we integrated AI into our processes, which fostered natural adoption. The team also keeps a constant watch on the field and tests the latest innovations.

 

What essential advice would you give to SMEs hesitating to take the leap towards AI?

You have to dare to test. Many companies don’t take the leap for fear of complexity or cost. But adopting AI is not an insurmountable challenge. Start with a simple project, measure the results and adjust.

The important thing is to define clear metrics: cost reduction, time savings, user satisfaction or carbon footprint reduction. AI is a science, not magic. The benefits are concrete and measurable.