Asserting new fine-tuning fashions and strategies in Azure AI Foundry

May 13, 2025

19

As we speak, we’re excited to announce two main enhancements to mannequin fine-tuning in Azure AI Foundry—Reinforcement Wonderful-Tuning (RFT) with o4-mini, coming quickly, and Supervised Wonderful-Tuning (SFT) for the 4.1-nano mannequin, out there now.

As we speak, we’re excited to announce three main enhancements to mannequin fine-tuning in Azure AI Foundry—Reinforcement Wonderful-Tuning (RFT) with o4-mini (coming quickly), Supervised Wonderful-Tuning (SFT) for the GPT-4.1-nano and Llama 4 Scout mannequin (out there now). These updates replicate our continued dedication to empowering organizations with instruments to construct extremely custom-made, domain-adapted AI programs for real-world influence.

With these new fashions, we’re unblocking two main avenues of LLM customization: GPT-4.1-nano is a robust small mannequin, supreme for distillation, whereas o4-mini is the primary reasoning mannequin you possibly can fine-tune, and Llama 4 Scout is a best-in-class open supply mannequin.

Reinforcement Wonderful-Tuning with o4-mini

Reinforcement Wonderful-Tuning introduces a brand new degree of management for aligning mannequin conduct with complicated enterprise logic. By rewarding correct reasoning and penalizing undesirable outputs, RFT improves mannequin decision-making in dynamic or high-stakes environments.

Coming quickly for the o4-mini mannequin, RFT unlocks new potentialities to be used instances requiring adaptive reasoning, contextual consciousness, and domain-specific logic—all whereas sustaining quick inference efficiency.

Actual world influence: DraftWise

DraftWise, a authorized tech startup, used reinforcement fine-tuning (RFT) in Azure AI Foundry Fashions to reinforce the efficiency of reasoning fashions tailor-made for contract era and assessment. Confronted with the problem of delivering extremely contextual, legally sound options to attorneys, DraftWise fine-tuned Azure OpenAI fashions utilizing proprietary authorized information to enhance response accuracy and adapt to nuanced consumer prompts. This led to a 30% enchancment in search end result high quality, enabling attorneys to draft contracts quicker and deal with high-value advisory work.

Reinforcement fine-tuning on reasoning fashions is a possible sport changer for us. It’s serving to our fashions perceive the nuance of authorized language and reply extra intelligently to complicated drafting directions, which guarantees to make our product considerably extra helpful to attorneys in actual time.

—James Ding, founder and CEO of DraftWise.

When must you use Reinforcement Wonderful-Tuning?

Reinforcement Wonderful-Tuning is greatest suited to use instances the place adaptability, iterative studying, and domain-specific conduct are important. You must take into account RFT in case your situation includes:

Customized Rule Implementation: RFT thrives in environments the place resolution logic is very particular to your group and can’t be simply captured by way of static prompts or conventional coaching information. It permits fashions to be taught versatile, evolving guidelines that replicate real-world complexity.

Area-Particular Operational Requirements: Superb for eventualities the place inner procedures diverge from business norms—and the place success is dependent upon adhering to these bespoke requirements. RFT can successfully encode procedural variations, resembling prolonged timelines or modified compliance thresholds, into the mannequin’s conduct.

Excessive Choice-Making Complexity: RFT excels in domains with layered logic and variable-rich resolution bushes. When outcomes depend upon navigating quite a few subcases or dynamically weighing a number of inputs, RFT helps fashions generalize throughout complexity and ship extra constant, correct choices.

Instance: Wealth advisory at Contoso Wellness

To showcase the potential of RFT, take into account Contoso Wellness, a fictitious wealth advisory agency. Utilizing RFT, the o4-mini mannequin discovered to adapt to distinctive enterprise guidelines, resembling figuring out optimum shopper interactions primarily based on nuanced patterns just like the ratio of a shopper’s internet price to out there funds. This enabled Contoso to streamline their onboarding processes and make extra knowledgeable choices quicker.

Supervised Wonderful-Tuning now out there for GPT-4.1-nano

We’re additionally bringing Supervised Wonderful-Tuning (SFT) to the GPT-4.1-nano mannequin—a small however highly effective basis mannequin optimized for high-throughput, cost-sensitive workloads. With SFT, you possibly can instill your mannequin with company-specific tone, terminology, workflows, and structured outputs—all tailor-made to your area. This mannequin will probably be out there for fine-tuning within the coming days.

Why Wonderful-tune GPT-4.1-nano?

Precision at Scale: Tailor the mannequin’s responses whereas sustaining velocity and effectivity.

Enterprise-Grade Output: Guarantee alignment with enterprise processes and tone-of-voice.

Light-weight and Deployable: Good for eventualities the place latency and value matter—resembling customer support bots, on-device processing, or high-volume doc parsing.

In comparison with bigger fashions, 4.1-nano delivers quicker inference and decrease compute prices, making it effectively suited to large-scale workloads like:

Buyer help automation, the place fashions should deal with 1000’s of tickets per hour with constant tone and accuracy.

Inside information assistants that comply with firm fashion and protocol in summarizing documentation or responding to FAQs.

As a small, quick, however extremely succesful mannequin, GPT-4.1-nano makes an awesome candidate for distillation as effectively. You should use fashions like GPT-4.1 or o4 to generate coaching information—or seize manufacturing visitors with saved completions—and train 4.1-nano to be simply as good!

Fine-tune gpt-4.1-nano demo in Azure AI Foundry.

Llama 4 Wonderful-Tuning now out there

We’re additionally excited to announce help for fine-tuning Meta’s Llama 4 Scout—a leading edge,17 billion energetic parameter mannequin which affords an business main context window of 10M tokens whereas becoming on a single H100 GPU for inferencing. It’s a best-in-class mannequin, and extra highly effective than all earlier era llama fashions.

Llama 4 fine-tuning is obtainable in our managed compute providing, permitting you to fine-tune and inference utilizing your individual GPU quota. Obtainable in each Azure AI Foundry and as Azure Machine Studying parts, you’ve gotten entry to further hyperparameters for deeper customization in comparison with our serverless expertise.

Get began with Azure AI Foundry right now

Azure AI Foundry is your basis for enterprise-grade AI tuning. These fine-tuning enhancements unlock new frontiers in mannequin customization, serving to you construct clever programs that assume and reply in ways in which replicate your corporation DNA.

Use Reinforcement Wonderful-tuning with o4-mini to construct reasoning engines that be taught from expertise and evolve over time. Coming quickly in Azure AI Foundry, with regional availability for East US2 and Sweden Central.

Use Supervised Wonderful-Tuning with 4.1-nano to scale dependable, cost-efficient, and extremely custom-made mannequin behaviors throughout your group. Obtainable now in Azure AI Foundry in North Central US and Sweden Central.

Attempt Llama 4 scout nice tuning to customise a best-in-class open supply mannequin. Obtainable now in Azure AI Foundry mannequin catalog and Azure Machine Studying.

With Azure AI Foundry, fine-tuning isn’t nearly accuracy—it’s about belief, effectivity, and flexibility at each layer of your stack.

Discover additional:

We’re simply getting began. Keep tuned for extra mannequin help, superior tuning strategies, and instruments that will help you construct AI that’s smarter, safer, and uniquely yours.

Buy now

Asserting new fine-tuning fashions and strategies in Azure AI Foundry

Reinforcement Wonderful-Tuning with o4-mini

Actual world influence: DraftWise

When must you use Reinforcement Wonderful-Tuning?

Instance: Wealth advisory at Contoso Wellness

Supervised Wonderful-Tuning now out there for GPT-4.1-nano

Why Wonderful-tune GPT-4.1-nano?

Llama 4 Wonderful-Tuning now out there

Get began with Azure AI Foundry right now

Discover additional:

Related Articles

NGA, Japanese group to strengthen 5G, 6G collaboration

What the rise of Klarna and related companies tells us concerning the financial system

RFK Jr. could be very blissful if each American wears an Apple Watch

LEAVE A REPLY Cancel reply

Latest Articles

NGA, Japanese group to strengthen 5G, 6G collaboration

What the rise of Klarna and related companies tells us concerning the financial system

RFK Jr. could be very blissful if each American wears an Apple Watch

Ultrasonic excitation: A brand new device within the industrial depowdering toolbox

AI Makes Staff Extra Productive, PwC Finds

Buy now

Asserting new fine-tuning fashions and strategies in Azure AI Foundry

Reinforcement Wonderful-Tuning with o4-mini

Actual world influence: DraftWise

When must you use Reinforcement Wonderful-Tuning?

Instance: Wealth advisory at Contoso Wellness

Supervised Wonderful-Tuning now out there for GPT-4.1-nano

Why Wonderful-tune GPT-4.1-nano?

Llama 4 Wonderful-Tuning now out there

Get began with Azure AI Foundry right now

Discover additional:

Related Articles

LEAVE A REPLY Cancel reply

Stay Connected

Latest Articles