DeepSeek R1 has been on everybody’s radar lately. Last night I heard that Microsoft had released it in Azure AI Foundry, so today I have been testing it: deploying it, trying some prompts with it, and noticing just how heavily it filters certain topics. This was not a surprise by any means. With the official announcement that DeepSeek R1 is now available free of charge in Azure AI Foundry (at least for the moment), it felt like the perfect opportunity to try it out and see how it stacks up against another big player: OpenAI o1. Since I don’t have a high-powered computer at my disposal, using the power of the cloud (Azure AI Foundry in this case) is a great way for me to work with AI models.
- Deploying DeepSeek R1: smooth and simple
- Initial tests and the politically touchy Tiananmen Square question
- A brief comparison to OpenAI o1
- Attempting a co-authored blog post (this one)
- DeepSeek R1 excels at step-by-step reasoning
- Deeper insights from Microsoft’s official overview
- Why you want to consider DeepSeek R1
- Looking ahead
- Wrapping up
- How this article was written
First off, the deployment was surprisingly straightforward. In Azure AI Foundry, you simply head to the models and endpoints area, choose “Deploy a base model,” and search for “DeepSeek R1.” After a few clicks, the model becomes available for testing with your own key. The fact that there is no immediate fee attached to it encouraged me to experiment more freely, though pricing may of course change later.
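Once deployed, the model can be called over a plain REST chat-completions endpoint with the key and URL shown on the deployment’s details page. The sketch below is a minimal, hypothetical example using only the Python standard library; the endpoint URL, API key header, and model name are placeholders you would replace with your own deployment’s values, not guaranteed Azure specifics.

```python
import json
import urllib.request

# Placeholders: copy the real values from your deployment in Azure AI Foundry.
ENDPOINT = "https://<your-resource>.services.ai.azure.com/models/chat/completions"
API_KEY = "<your-key>"

def build_payload(prompt: str, max_tokens: int = 2048) -> dict:
    """Assemble a chat-completions payload for a single user prompt."""
    return {
        "model": "DeepSeek-R1",  # deployment/model name is an assumption
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """Send the prompt to the deployed endpoint and return the reply text."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With that in place, experimenting is just a matter of calling `ask("your prompt here")` and reading the reply.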
Don’t confuse the Azure content filter with the guardrails that are built into the model itself. This filtering also protects us, both from prompt engineering that tries to make the model do something it is not supposed to do, and from model responses that may be offensive and so on. Keep in mind: this filtering is designed for business and enterprise use.
I had heard from others that DeepSeek R1 can be extremely cautious with certain politically sensitive questions. So I started by asking: “what happened at Tiananmen Square?” It immediately refused to answer, returning the statement: “I’m sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.” No matter how I reworded it, DeepSeek R1 would not budge. It was enough to confirm that it does, indeed, have stringent guardrails for some topics.
I had a chat with my friend and colleague Tatu Seppälä. He told me that if you run DeepSeek R1 locally on your own hardware, you can see its thinking process. I thought about simply giving the model a -think parameter that would let me see how it processes the question behind the scenes. Sure enough, it did the trick. When I tried that, I saw lines like: “Okay, the user is asking about what happened at Tiananmen Square. I remember that this is a sensitive topic, especially in China…” Basically, it was aware that this is a controversial or restricted topic, and it refused to answer. While that is interesting from a developer’s perspective, it also shows that the chain of thought can reveal more text than you might want seen in a production setting.
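If you do surface the model’s reasoning, it typically arrives wrapped in <think> tags at the start of the response text. A small helper like the one below (my own sketch, not an official API) can split the trace from the final answer, so you can log one and show only the other:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer) using <think>...</think>
    markers; if no markers are present, the whole text is the answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

In production you would probably keep the reasoning in your logs only, exactly because it can be rawer than the polished answer.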
As a second test, I continued the conversation and asked about Taiwan, another politically sensitive subject in some contexts. DeepSeek R1 provided a more balanced answer this time, acknowledging both perspectives on Taiwan’s status. Yet behind the scenes it was still in caution mode, as shown by the chain-of-thought snippet: “Okay, the user is asking about Taiwan. I need to be careful here because this is a politically sensitive topic…” This made it clear that as soon as a question veers anywhere near controversy, the model enters a heightened level of self-editing. The overall result was more helpful than a flat-out refusal, but it definitely underscores that this model has strong built-in guardrails.
As I work with OpenAI’s models on Azure a lot, I decided to compare DeepSeek R1 to OpenAI o1, as both are conveniently available on Azure AI Foundry. The practical differences between these two models are around the context window and the potential output length. DeepSeek R1 can handle a 128k-token context window, but it will only output up to about 4,096 tokens. In contrast, o1 can reach up to a 200k context window and produce up to 100k tokens at once. That is a huge difference if you are working on really long submissions or tasks like summarizing entire books or generating large chunks of text. But keep in mind that using o1 is not free.
If your primary use case involves shorter or moderate-length text, DeepSeek R1 should be perfectly fine. But if you are looking to generate longer texts, process lengthy legal documents, or handle thousands of lines of code in a single response, o1 gives you more bandwidth to get everything done at once.
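To decide whether a prompt even fits DeepSeek R1’s 128k window before reaching for o1, a crude character-based estimate is often enough. The four-characters-per-token ratio below is just a rough rule of thumb for English prose, not an exact tokenizer:

```python
def rough_token_count(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English prose."""
    return max(1, len(text) // 4)

def fits_context(text: str, context_window: int = 128_000) -> bool:
    """Check whether the text likely fits the model's context window."""
    return rough_token_count(text) <= context_window
```

For anything borderline, run the real tokenizer for the model you are targeting; the heuristic is only for quick back-of-the-envelope checks.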
As I wanted to share my findings with you, I naturally wanted to see if DeepSeek R1 could co-author this blog post. I gave it instructions to draft a piece about its own capabilities on Azure AI Foundry, weaving in notes from Microsoft’s official blog plus my personal experiences. It started off promising, with: “Okay, I need to help the user create a blog post about the DeepSeek R1 model…” But then it simply stopped after that partial sentence. No follow-up prompt or retry managed to coax more text out of it. Meanwhile, OpenAI o1 generated a fully fleshed-out article on the first try. Add a prompt or two and you can get a pretty good draft out of o1.
From a blogging or general writing-assistant standpoint, that kind of abrupt stop may be an issue with DeepSeek R1. OpenAI o1 wasn’t perfect either, but it is way better than other models for this.
To be fair, DeepSeek R1 was developed with a different emphasis than being a writing assistant. According to the Azure AI Foundry description: “DeepSeek-R1 excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks.” It contains 671B total parameters (37B active), and it can parse up to 128k tokens of your input in a single shot. So if you need a model that can reflect carefully on a complex coding problem or a multi-layer scientific query, DeepSeek R1 may shine where some other models would struggle. That said, you should be aware of possible shortfalls with open-ended, creative text or with politically or culturally sensitive content.
Microsoft emphasizes that DeepSeek R1 builds on Chain-of-Thought (CoT) reasoning and merges it with reinforcement learning plus some targeted supervised fine-tuning. The original version, DeepSeek-R1-Zero, apparently used only RL and proved strong in logic tasks, but had unclear language outputs. The newly refined pipeline aims to fix issues like inconsistent grammar or disorganized text. Read Microsoft’s blog article about DeepSeek R1’s availability on Azure AI Foundry (and GitHub) here, and also the information shown when you are deploying R1 on Azure AI Foundry, to learn more.
Microsoft recommends the following usage guidelines:
- Avoid adding a system prompt; put all instructions directly into the user prompt.
- For math, instruct the model to “Please reason step by step, and put your final answer within \boxed{}.”
- If you are doing performance evaluations, run multiple tests and average the results.
- Pay attention to chain-of-thought content (<think> tags) if you are showing it to end users, because it may be more raw or contain “more harmful” text.
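Those guidelines are easy to encode in a small helper. The sketch below builds a math prompt the recommended way: no system message, everything in the single user turn, with the step-by-step and \boxed{} instruction appended (the function name is my own, not part of any SDK):

```python
def build_math_messages(problem: str) -> list[dict]:
    """Build a messages list per the guidance: no system prompt, all
    instructions in the user message, with the boxed-answer convention."""
    instruction = (
        "Please reason step by step, and put your final answer within \\boxed{}."
    )
    return [{"role": "user", "content": f"{problem}\n{instruction}"}]
```

For evaluations, you would then call the model several times with the same messages and average the scores, per the third guideline.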
Regarding safety and content filtering, DeepSeek R1 underwent “rigorous red-teaming and safety evaluations,” and Azure AI Foundry includes built-in content safety by default.
This is what Microsoft states in their blog post:
DeepSeek R1 has undergone rigorous red teaming and safety evaluations, including automated assessments of model behavior and extensive security reviews to mitigate potential risks. With Azure AI Content Safety, built-in content filtering is available by default, with opt-out options for flexibility. Additionally, the Safety Evaluation System enables customers to efficiently test their applications before deployment. These safeguards help Azure AI Foundry provide a secure, compliant, and responsible environment for enterprises to confidently deploy AI solutions.
In my opinion, the big appeal is that it can handle a good chunk of text (128k tokens in a prompt is still nothing to sneeze at), and it is specifically tuned for tasks that involve multi-step reasoning, logic puzzles, coding challenges, or intricate Q&A. Because it is so easy to deploy on Azure AI Foundry, and at least right now free, it is well worth a test if you are interested in serious reasoning tasks.
If your main concern is producing huge volumes of text in a single go, like drafting entire e-books or extensive legal-document summaries, then OpenAI o1 is a better fit, given its 200k context and the ability to output up to 100k tokens in a single shot. For shorter blog posts or quick code completions, DeepSeek R1’s 4,096-token output limit may be enough.
Microsoft notes that soon you will also be able to run “distilled flavors” of DeepSeek R1 locally on Copilot+ PCs, which is intriguing for people who want more control or offline capabilities. They say smaller, “lighter” versions of the model may have lower hardware requirements (and that is when I could start trying them out locally as well). If that turns out to be a simple process, it could help a lot of teams integrate LLM reasoning directly into their local environments, with no always-on internet needed.
Overall, DeepSeek R1 stands out in its methodical approach to logic, coding, and step-by-step tasks. Its guardrails, however, can be quite strict, as I learned from the Tiananmen Square and Taiwan questions. Keep in mind: these are extreme examples that I knew would hit the wall. That might be a good thing for some users in some countries; it is basically designed not to get you in trouble for addressing controversial topics. But if you are, say, European and want a more open conversation or creative brainstorming with fewer refusals, you might find it limiting.
In my own usage, DeepSeek R1 couldn’t quite finish drafting this blog post (it started but then stopped), so I switched to OpenAI o1 for the final generation. Still, I see a lot of potential for DeepSeek R1 in coding, math, or scientific scenarios, especially if you are comfortable with a more tightly reined approach.
If you are curious, I encourage you to sign in to Azure AI Foundry, deploy DeepSeek R1, and put it to the test in your own workflows. With every new model, we get one step closer to powerful, easy-to-use AI that can assist across a variety of tasks. Enjoy experimenting!
I used Azure OpenAI o1 to help me write the first article draft, since DeepSeek R1 couldn’t do it. I created quite a long prompt with my insights, thoughts, and tests, plus background information (yes, the prompt was long and contained a lot of information), and after a follow-up prompt I got the draft. I tried to minimize the edits to the article this time, but as there were quite a few not-so-accurate sentences, I removed and added some text and rewrote it here and there. I could have gone further with prompting and tuned the result more, or broken this into smaller pieces, since when the context is more limited the result is usually way better. I do encourage you to test what AI can do for you, but keep in mind that you need to check the result for errors. Because there will be errors.