The Microsoft Foundry mannequin catalog retains rising — and I don’t consider I’m the one one who thinks choosing the fitting mannequin for every activity is turning into a problem for anybody constructing brokers. Is GPT-5.4 overkill for this immediate? Ought to I be utilizing a Claude mannequin for this reasoning step? Do I actually need to wire up 4 totally different deployments simply to maintain my agent quick, good, and manageable?
That is the place Mannequin Router mannequin is available in — and it may simplify how brokers make the most of fashions in Microsoft Foundry. As a pleasant addition, it now routes to Claude fashions too, together with the most recent Claude Opus 4.7. In the identical wave, Microsoft Foundry additionally introduced in MAI-Picture-2e, a quicker and extra environment friendly sibling of MAI-Picture-2 for text-to-image era.
Let me stroll you thru each.
What’s Mannequin Router mannequin?
Mannequin Router is a mannequin in Microsoft Foundry that intelligently routes your prompts in actual time to essentially the most appropriate massive language mannequin behind the scenes. You deploy it as soon as — like every other Foundry mannequin — and from then in your agent (or app) talks to a single deployment, whereas Mannequin Router decides per request whether or not the immediate needs to be dealt with by a small, quick mannequin or by a top-tier reasoning powerhouse.
The present model is 2025-11-18 (newest) and it’s a dwelling model — new fashions and capabilities are added in place, no version-bump migration wanted.
The choice occurs based mostly on immediate complexity, reasoning wants, activity sort, and different attributes — and it doesn’t retailer your prompts. It honors your deployment knowledge zone boundaries.
Why use Mannequin Router in brokers?
For brokers, it is a significant shift. An agent sometimes does many small steps — some trivial, some reasoning-heavy — and utilizing the identical costly mannequin for each single one is wasteful. With Mannequin Router, the agent calls one single mannequin deployment, and Mannequin Router does the dispatching:
- A easy classification step? Routed to a small mannequin — low-cost and quick.
- A fancy multi-step reasoning activity? Routed to a prime reasoning mannequin — correct.
- A activity the place Claude is genuinely the very best device? Routed to Claude.
That makes Mannequin Router an extremely versatile workhorse — one mannequin your agent calls, many fashions doing the precise work beneath.
You may also decide a routing mode to bias the choice:
- Balanced (default) — considers each value and high quality dynamically. Nice general-purpose default.
- High quality — prioritizes accuracy. Finest for complicated reasoning and important outputs.
- Price — prioritizes value financial savings. Excellent for high-volume, budget-sensitive workloads.
And with Mannequin subset, you determine precisely which underlying fashions are eligible for routing — helpful for compliance, value management, or for guaranteeing a minimal context window throughout the set. Constructed-in computerized failover is the icing on the cake — if one mannequin has a transient problem, your request quietly falls over to the following greatest one.
Claude fashions in Mannequin Router — together with Opus 4.7 (preview)
The most recent Mannequin Router model provides a recent set of Anthropic fashions alongside the OpenAI, Meta, xAI and DeepSeek line-up:
- claude-haiku-4-5
- claude-sonnet-4-5
- claude-opus-4-1
- claude-opus-4-6
- claude-opus-4-7 — Anthropic’s most succesful mannequin
One necessary nuance — and that is the one most individuals miss on day one: Mannequin Router help for the Claude household is presently in preview, and Claude fashions should be deployed individually from the Microsoft Foundry mannequin catalog earlier than Mannequin Router can path to them. The OpenAI fashions within the routing set are run “from inside” Mannequin Router and don’t want a separate deployment. Claude is the exception — deploy the Claude variants you need first, then allow them in your Mannequin Router subset, and the magic kicks in.
Price noting that Claude will not be the one one in preview. The 2025-11-18 routing set additionally marks DeepSeek-V3.1, DeepSeek-V3.2, gpt-oss-120b, Llama-4-Maverick-17B-128E-Instruct-FP8, grok-4, and grok-4-fast-reasoning as Mannequin Router preview entries. The OpenAI GPT-4.x / GPT-5.x household is the GA core at present — the remaining is a quickly rising preview frontier.

That is precisely the sort of setup I would like for an enterprise agent — let Mannequin Router decide between OpenAI and Claude per immediate, in Balanced or High quality mode, and let me cease arguing with myself about which mannequin to hard-code. Simply plan for it as a preview at present and validate rigorously earlier than you push it into manufacturing.
Limits — nonetheless necessary
Right here is the catch I at all times remind clients about: the efficient context window of Mannequin Router is the restrict of the smallest underlying mannequin. Meaning an API name with a really massive context will solely succeed if the immediate occurs to be routed to a mannequin that may deal with it.
- Use Mannequin subset to limit routing to fashions that every one help the context window you want.
- Shrink the immediate — summarize it, truncate to the related elements, or use doc embeddings to retrieve solely what issues.

Area-wise, Mannequin Router is presently (when writing this text) out there in East US 2 and Sweden Central, on World Normal and Knowledge Zone Normal deployments.
Imaginative and prescient inputs are accepted (all underlying fashions settle for picture enter), however the routing choice itself relies on the textual content solely. Audio enter will not be supported.
When would I exploit Mannequin Router?
The largest cause for me is easy — I wish to give my brokers a much less mannequin endpoints – even simply the Mannequin Router one. I configure one Mannequin Router deployment, level my brokers at it, and from that second on Mannequin Router can use whichever mannequin in its disposal most closely fits the immediate — small and quick for trivial steps, top-tier reasoning for the onerous ones, and even Claude fashions for circumstances the place Anthropic is the fitting device (so long as I’ve deployed Claude fashions individually to my Foundry first).
That single-endpoint sample simplifies agent constructing. My agent code doesn’t have to know which mannequin is greatest for which step. It doesn’t want an enormous swap assertion of “if reasoning activity → name mannequin X, else name mannequin Y.” It simply calls Mannequin Router — and Mannequin Router does the dispatching throughout all the things I’ve made out there to it.
And no, Mannequin Router will not be a “silver bullet” that’s reply to all the things. There are various circumstances and the explanation why you wish to management which mannequin to make use of. There are additionally many circumstances the place Mannequin Router will simply work.
Mannequin Router provides additionally:
- A clear method to optimize for value or high quality with out rewriting agent code each time a brand new mannequin lands.
- A straightforward method to fold in brand-new fashions (like Claude Opus 4.7) — deploy them as soon as, add them to the subset, and the brokers decide them up robotically.
- Constructed-in failover for resilience.
Meet MAI-Picture-2e — Microsoft’s quicker picture mannequin
Now one thing almost-completely totally different — however nonetheless about fashions. The second I wish to spotlight is MAI-Picture-2e, one in every of Microsoft’s first-party picture era fashions in Microsoft Foundry.

MAI-Picture-2e is a text-to-image era mannequin that produces high-quality, visually wealthy photos from pure language prompts. It’s constructed on prime of MAI-Picture-2 with a transparent promise — as much as 22% quicker and 4 occasions extra environment friendly than MAI-Picture-2, whereas protecting the identical degree of high quality. For builders constructing at scale, that’s the smartest selection.
Key capabilities:
- Textual content-to-image era — high-quality photos from pure language prompts.
- Photorealistic picture synthesis — reasonable imagery with constant visible construction, effectively fitted to idea visualization and content material creation.
- Product, branding and industrial design — product imagery, advertising and marketing visuals, model property, and industrial artistic workflows.
Specs:
- Enter size: as much as 32,000 tokens for the immediate.
- Output: a single PNG picture.
- Picture measurement: each width and peak should be no less than 768 pixels. The entire pixel depend (width × peak) should not exceed 1,048,576 — equal to 1024×1024. Both dimension can exceed 1024 so long as the full stays inside that price range — for instance 768×1365 is okay.
- Areas: World Normal deployment in West Central US, East US, West US, West Europe, Sweden Central, and South India.
You may deploy MAI-Picture-2e like every other Microsoft Foundry mannequin — from the Foundry portal or with a one-liner in Azure CLI — and also you name it by means of the MAI picture era API endpoint at https://<resource-name>.providers.ai.azure.com/mai/v1/photos/generations, utilizing Microsoft Entra ID or an API key. Or simply experiment with it utilizing the Playground in Microsoft Foundry.
When would I exploit MAI-Picture-2e?
- Excessive-volume, fast-turnaround situations — product imagery at scale, advertising and marketing variations, branded property, anyplace effectivity and value per picture matter.
- Inventive content material era — idea artwork, illustrations, and design exploration the place the velocity increase enables you to iterate extra in the identical time.
- Photorealistic visuals for advertising and marketing and industrial use.
For those who want absolutely the highest-fidelity output and velocity will not be the precedence, MAI-Picture-2 remains to be within the catalog. However for many workflows I’ve been constructing these days, MAI-Picture-2e is the higher default — quicker, cheaper, identical high quality bar.
A hat tip to these of Microsoft Foundry
Microsoft Foundry continues to evolve and acquire new options — GPT-5.5 and Claude fashions in Mannequin Router and MAI-Picture-2e for picture era are two good examples. Mannequin Router is the piece that makes that composition sensible. MAI-Picture-2e is the piece that makes high-volume picture workloads sustainable.
Hat on, AI flowing — give Mannequin Router a attempt, and if doable with a Claude Opus 4.7 deployment in your subset, and spin up an MAI-Picture-2e deployment subsequent to it.
Keep tuned — there may be extra Foundry goodness coming and here’s a tip for that: Have you ever registered to Microsoft Construct 2026? When you have not – do it now, it’s taking place subsequent week! –> https://build.microsoft.com/

Sources & additional studying
Revealed by
I work, weblog and discuss Future Work : AI, Microsoft 365, Copilot, Loop, Azure, and different providers & platforms within the cloud connecting digital and bodily and other people collectively.
I’ve 30 years of expertise in IT enterprise on a number of industries, domains, and roles.
View all posts by Vesa Nopanen





