MAI-Transcribe-1 is Microsoft’s first-generation in-house speech recognition model. It supports 25 languages and is optimized for real-world, noisy enterprise audio, such as meetings and call centers.
Today, we’re announcing MAI-Image-2 — pushing MAI into the top three text-to-image labs in the world on the Arena.ai leaderboard. You can try it now in the MAI Playground, where you can experiment with the latest available MAI models and share feedback directly with our teams.
Microsoft releases MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 in-house AI models. Inside the strategy reshaping the $13B OpenAI partnership.
Six months after the group's formation, MAI released models that can transcribe speech into text and generate audio and images.
Microsoft has launched three in-house AI models — MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 — through Foundry, undercutting OpenAI and Google on price.
With MAI-Voice-1 and MAI-Transcribe-1, Microsoft is delivering exactly that: a comprehensive, first-party audio AI stack purpose-built for developers.
Starting today, every developer can build with MAI models, including MAI-Transcribe-1, through Microsoft Foundry. You can also try them in the MAI Playground (US only).
Today we're announcing 3 new world-class MAI models, available in ...
MAI-Image-2-Efficient is your production workhorse. Use it when you need volume, speed, and tight cost control — product shots, marketing creatives, UI mockups, branded assets, batch pipelines.
MAI-Image-1 is currently available in all countries that can access Bing Image Creator and Copilot Labs. Today, we’re announcing MAI-Image-1, our first image generation model developed entirely in-house, debuting in the top 10 text-to-image models on LMArena.