Have you ever wondered how much the model you pick in Copilot Cowork actually changes the result, and the bill? I ran a proper little experiment.
Today is July 1, 2026, the day you need Copilot Credits available to keep using Cowork. Which makes this the perfect moment to ask the question everybody keeps asking me: which model should I pick, and what does it cost me? In my post on the UI refresh and the /cost skill I showed you how to see what a task costs. This post is the natural follow-up: same prompt, different brains, measured side by side.
And the nice thing is that this becomes a reference I can re-run every time a new model shows up in the picker. 🤠 Let’s take a closer look.
- The experiment – one prompt, every model
- The results – same prompt, different outcomes
- Conclusion – what I would actually do
- Closing thoughts
The idea is simple. I take one prompt, run it through each model in Cowork, and then judge two things: how good is the result, and how many Copilot Credits did it burn (via /cost). Nothing fancy about the task: a training introduction deck for Cowork, the kind of thing a lot of you are building right now anyway.
The models in the ring: Claude Sonnet 5, Claude Opus 4.8, GPT-5.5, and Auto (which lets Cowork pick the best model for the job). Then, for good measure, I compared against Copilot Chat and Copilot in PowerPoint, because those don’t touch your credits at all.
One important note before the results: I deliberately didn’t say anything about how the slides should look. No template, no color guidance, no “make it fit our brand”. Just the content structure. That’s on purpose: I want to see what each model does when left to its own taste. Here is the exact prompt I used, the same one every single time:
Create a visual 6-slide presentation for a training introducing Microsoft Copilot Cowork to large enterprise company knowledge workers, who have used Microsoft 365 Copilot. Discover information from Microsoft Learn and Microsoft Support
Include these slides in the presentation
1. Title slide — "Introducing Cowork"
2. What's Cowork (plain-language definition)
3. Key capabilities across email, calendar, meetings and documents
4. Copilot Chat vs Copilot Agents vs Copilot Cowork
5. Skills and automation – what you can build
6. How to get started
Keep the wording concise
Let me take them one by one, because the spread here genuinely surprised me.
Claude Sonnet 5 – the efficiency champion
Sonnet used 1 Copilot Credit. One!
Now, I’m not going to pretend I fully trust that number: it is so small that I’m honestly not sure it’s being calculated correctly. But even if the real figure is higher, the point stands.
Update: One day later (2nd of July), when I asked for the cost in this same conversation, I got 23.7. I would not be surprised to see this rise to 100-150 credits eventually, as the Sonnet 5 token price is 2/5 of Claude Opus. There is very likely a bug in the Copilot Credits calculation regarding Sonnet 5.
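To show where a 100-150 guess can come from, here is a minimal back-of-the-envelope sketch. It assumes credits scale linearly with token price and that Sonnet would consume roughly the same number of tokens as Opus did on this task. Neither is confirmed, and the `token_ratio` parameter is a hypothetical knob rather than anything Cowork exposes. The Opus figure is the 325.8 credits measured in this test.

```python
OPUS_CREDITS = 325.8          # credits /cost reported for Claude Opus 4.8 in this test
SONNET_TO_OPUS_PRICE = 2 / 5  # token price ratio quoted in the update above


def estimate_credits(reference_credits: float, price_ratio: float,
                     token_ratio: float = 1.0) -> float:
    """Scale a measured credit figure by a price ratio.

    token_ratio is a hypothetical adjustment for a model that uses
    more or fewer tokens than the reference model on the same task.
    """
    return reference_credits * price_ratio * token_ratio


estimate = estimate_credits(OPUS_CREDITS, SONNET_TO_OPUS_PRICE)
print(f"Estimated Sonnet 5 credits: {estimate:.1f}")
```

With equal token usage the estimate lands at about 130 credits, which sits inside the 100-150 range.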

The resulting PowerPoint is quite good looking. Simple, but it looks good enough. For the price, this is a winner, no question.



Claude Opus 4.8 – a very good looking deck
Opus used 325.8 Copilot Credits, about $3.26 on PayGo at $0.01 per credit.
The presentation looks good. It’s clearly better than the one Sonnet made: the content is more thoughtful and it’s more visual. This is the frontier model doing frontier-model things. You pay more, you get more polish. This doesn’t mean every slide automatically looks better. For example, from these results I like Sonnet’s Copilot Chat vs Agents vs Cowork comparison more.



GPT-5.5 – the expensive lesson
This one took a really long time to create the presentation. And it used the most credits of the whole test: 1291.4, roughly $12.91.
Here is the honest part: the resulting PowerPoint didn’t even open until the PowerPoint app had repaired it a few times. And once it did open, the result was the weakest of all the options.



So my takeaway is not “GPT-5.5 is bad”. It’s more specific than that: based on this test, don’t reach for GPT-5.5 to build presentations. Use it for other kinds of work where it shines.
Auto – the smart middle path
Auto used 227.9 Copilot Credits, about $2.28.
And the result is really good. Visual and thoughtful. My hunch is that the slide deck itself was generated with Opus, while some of the research was handled by a lighter model to keep the price down, which is exactly what you want Auto to be doing. It came in cheaper than pure Opus (227.9 vs 325.8 credits) while landing very close on quality. I’d pick this result out of these three.



The scoreboard
Here is the whole thing in one view (cost at $0.01/credit on PayGo):
| Model | Copilot Credits | ~Cost | Verdict |
|---|---|---|---|
| Claude Sonnet 5 | 1 when writing the post, updated to 23.7 a day later. Probably a bug: I’d put this in the 100-150 range if it follows Sonnet 5 token costs. | ~$0.01 | Best value: simple but good enough. Still good value even if 100-150 credits were used, but with a smaller difference to Auto. |
| Claude Opus 4.8 | 325.8 | ~$3.26 | Excellent visuals and content material |
| GPT-5.5 | 1291.4 | ~$12.91 | Slowest, priciest, weakest deck |
| Auto | 227.9 | ~$2.28 | Excellent visuals and high quality, decrease price than Opus |
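
The cost column is plain arithmetic, so it is easy to reproduce and re-run when new numbers come in. The sketch below uses the credit figures from the scoreboard and the $0.01 PayGo rate quoted in this post. The function names are my own for illustration, not part of any Copilot tooling.

```python
PAYGO_USD_PER_CREDIT = 0.01  # PayGo rate quoted in this post

# Credits reported by /cost in this test
credits_by_model = {
    "Claude Sonnet 5": 1.0,   # first reading; a day later it showed 23.7
    "Claude Opus 4.8": 325.8,
    "GPT-5.5": 1291.4,
    "Auto": 227.9,
}


def credits_to_usd(credits: float, rate: float = PAYGO_USD_PER_CREDIT) -> float:
    """Convert Copilot Credits to US dollars at a flat per-credit rate."""
    return round(credits * rate, 2)


def saving_vs(baseline: float, candidate: float) -> float:
    """Percentage of credits saved by candidate relative to baseline."""
    return round((baseline - candidate) / baseline * 100, 1)


for model, credits in credits_by_model.items():
    print(f"{model}: {credits} credits = ${credits_to_usd(credits):.2f}")

opus = credits_by_model["Claude Opus 4.8"]
auto = credits_by_model["Auto"]
print(f"Auto saved {saving_vs(opus, auto)}% of the credits Opus used")
```

On these numbers Auto came in about 30% cheaper than Opus, and GPT-5.5 cost roughly four times as much as Opus.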
Stepping outside Cowork – the app comparison
Cowork isn’t the only place Copilot can build a deck for you. So I ran the same brief through two other routes that don’t spend any credits, because their usage is included in your Copilot license.
Copilot Chat – Design a presentation
I asked Copilot Chat (on Auto) to build the deck using the Design a presentation capability.
The result isn’t good. There is clearly some effort on visuals, but it’s not the one for training use. It would need a lot of manual work to get there. My tip: with Copilot Chat, it makes sense to do a planning session first, where you draft the outline in a conversation, and then feed that result into Design a presentation to generate the slides. Two steps, much better outcome.
Just a couple of slides from this run’s outcome:


Copilot in PowerPoint – the surprise
Then I asked Copilot in PowerPoint (on Auto) to create the same presentation. And this is where it gets good.
PowerPoint started by asking me questions about the presentation before it built anything. This generation took the longest of everything I tried, but the result is very visual, it also generates images, and the content is good too. Sonnet and Opus produced more readable slides, but that is a gap you could close in PowerPoint with a better prompt or a template. Or just fine-tune the colors of some elements to increase contrast and make the slides more accessible to read.



I have to say I was genuinely surprised by the quality here (perhaps I should have used this more during the past few months). The catch is the interaction model: there will be questions you have to answer, so you can’t just fire the prompt and walk away. In fact you need to keep PowerPoint open the whole time it generates, unlike Cowork, where you hand over the task and can close the browser entirely. Cowork keeps working in the cloud. That difference matters more than it sounds.
A few clear takeaways from this one:
- For credits-per-quality, Claude Sonnet 5 is the winner, assuming /cost is showing me the real number. Even if that “1 credit” is really 100-150, it’s still a good value. Auto and Opus 4.8 give you the better looking deck, but those visuals come at some cost. Still, I’d probably be using the Auto model if the difference is between 150 and 230. And GPT-5.5, based on this test, is one to steer away from presentations and point at other work instead.
- Copilot in PowerPoint is a winner too, and honestly the surprise of the day. It costs no credits (it’s included in your Copilot license) and it clearly uses Claude models in the background, judging by the result. The downside is you have to answer its questions and keep the app open while it works.
- We’re using generative AI here, so results will always differ. Sometimes Opus can show text alignment or other errors, and sometimes the content the model chooses just doesn’t work for the result. The better the prompt, the more consistent results you will get, and thus better results.
So if I’m creating a single, simple PowerPoint and I have the time to sit with it, I’d use Copilot in PowerPoint: no credits, great visuals. Otherwise I’d use Cowork with Sonnet to see if I get a result that’s good enough. Depending on the complexity of the source materials, I might create a PowerPoint with Cowork using Auto, but not before I have tested with Sonnet that the result is what I want.
But here is the bigger picture. This test is just one type of task. The moment you need multiple results from a single prompt (a deck and a Word document and an Excel summary and a couple of emails), going app by app means a lot of time hopping between Microsoft 365 apps and answering questions in each one. That’s exactly the job Cowork was built for: one prompt, walk away, come back to finished work.
For that kind of multi-output work in Cowork, lean on Sonnet to save credits while still getting pretty good results, and reach for Auto or Opus when the polish genuinely matters.
New models will keep landing in the Cowork picker (Sonnet 5 wasn’t even there yesterday). So instead of re-judging every new model from scratch, keep 2–3 standard test tasks (prompts and sources) in your back pocket and re-run them each time. Same input, new brain: the differences jump right out. Run these, jot down the /cost number and a one-line quality note each time, and you have a living benchmark instead of a gut feeling. Note to myself: keep this exact post as the template for the next model.
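If you want that living benchmark to be more than a note on paper, a small log is enough. This is only a sketch of one way to do it: the `BenchmarkRun` record, its field names and the task name are hypothetical, and the /cost number still has to be typed in by hand, since I am not assuming any API for reading it. The sample rows reuse the numbers from this post.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class BenchmarkRun:
    date: str       # ISO date of the run
    task: str       # name of the standard test task
    model: str      # model picked in Cowork
    credits: float  # number reported by /cost, entered manually
    note: str       # one-line quality note


def append_runs(handle, runs, write_header=False):
    """Write runs as CSV rows to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(BenchmarkRun)])
    if write_header:
        writer.writeheader()
    for run in runs:
        writer.writerow(asdict(run))


def cheapest(runs, task):
    """Return the run with the fewest credits for the given task."""
    candidates = [r for r in runs if r.task == task]
    return min(candidates, key=lambda r: r.credits)


runs = [
    BenchmarkRun("2026-07-01", "training-deck", "Claude Opus 4.8", 325.8, "excellent visuals and content"),
    BenchmarkRun("2026-07-01", "training-deck", "Auto", 227.9, "excellent, lower cost than Opus"),
    BenchmarkRun("2026-07-01", "training-deck", "GPT-5.5", 1291.4, "slowest, priciest, weakest deck"),
]

# In-memory buffer for the demo; swap in open("cowork_benchmark.csv", "a", newline="")
# to keep a file that grows with every new model.
buffer = io.StringIO()
append_runs(buffer, runs, write_header=True)
print(buffer.getvalue())
print("Cheapest:", cheapest(runs, "training-deck").model)
```

Cheapest is of course not the same as best, which is why the quality note sits right next to the credits in every row.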
What I like most about this little experiment is that model choice in Cowork is a real lever, for quality and for cost, and now, with /cost, it’s a lever you can actually measure. The winner today might not be the winner next month, and that’s the fun of it.
Have you run the same prompt across different models yet? I’d love to hear which one came out on top for your kind of work, so drop a comment. And go run /cost on the tasks you actually do this week. It’s the most useful five minutes you’ll spend in Cowork right now.
Thanks for reading, and here’s to letting the machines do the heavy lifting while we go grab that coffee or tea. ☕
PS. Using the Auto option, Cowork used just 76 credits to write the first draft of this text. The more information you provide, the better the result and the sooner you get it.
Published by Vesa Nopanen
I work, blog and talk about Future Work: AI, Microsoft 365, Copilot, Loop, Azure, and other services & platforms in the cloud connecting digital and physical and people together.
I have 30 years of experience in the IT business across multiple industries, domains, and roles.





