
This blog post focuses on new features and improvements. For a comprehensive list, including bug fixes, please see the release notes.
Clarifai’s Compute Orchestration lets you deploy models on your own compute, control how they scale, and decide where inference runs across clusters and nodepools.
As AI systems move beyond single inference calls toward long-running tasks, multi-step workflows, and agent-driven execution, orchestration needs to do more than just start containers. It needs to manage execution over time, handle failure, and route traffic intelligently across compute.
This release builds on that foundation with native support for long-running pipelines, model routing across nodepools and environments, and agentic model execution using the Model Context Protocol (MCP).
Introducing Pipelines for Long-Running, Multi-Step AI Workflows
AI systems don’t break at inference. They break when workflows span multiple steps, run for hours, or need to recover from failure.
Today, teams rely on stitched-together scripts, cron jobs, and queue workers to manage these workflows. As agent workloads and MLOps pipelines grow more complex, this setup becomes hard to operate, debug, and scale.
With Clarifai 12.0, we’re introducing Pipelines, a native way to define, run, and manage long-running, multi-step AI workflows directly on the Clarifai platform.
Why Pipelines
Most AI platforms are optimized for short-lived inference calls. But real production workflows look very different:
Multi-step agent logic that spans tools, models, and external APIs
Long-running jobs like batch processing, fine-tuning, or evaluations
End-to-end MLOps workflows that require reproducibility, versioning, and control
Pipelines are built to handle this class of problems.
Clarifai Pipelines act as the orchestration backbone for advanced AI systems. They let you define container-based steps, control execution order or parallelism, manage state and secrets, and monitor runs from start to finish, all without bolting together separate orchestration infrastructure.
Each pipeline is versioned, reproducible, and executed on Clarifai-managed compute, giving you fine-grained control over how complex AI workflows run at scale.
Let’s walk through how Pipelines work, what you can build with them, and how to get started using the CLI and API.
How Pipelines Work
At a high level, a Clarifai Pipeline is a versioned, multi-step workflow made up of containerized steps that run asynchronously on Clarifai compute.
Each step is an isolated unit of execution with its own code, dependencies, and resource settings. Pipelines define how these steps connect, whether they run sequentially or in parallel, and how data flows between them.
You define a pipeline once, upload it, and then trigger runs that can execute for minutes, hours, or longer.
Initialize a pipeline project
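As a sketch, scaffolding starts from the Clarifai CLI. The pipeline subcommand below follows the same pattern as the existing clarifai model commands; treat the exact invocation as an assumption and check the Pipelines documentation for the authoritative flags.

```bash
# Scaffold a new pipeline project in the current directory
clarifai pipeline init
```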
This scaffolds a complete pipeline project using the same structure and conventions as Clarifai custom models.
Each pipeline step follows the exact same footprint developers already use when uploading models to Clarifai: a configuration file, a dependency file, and an executable Python entrypoint.
A typical scaffolded pipeline looks like this:
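The layout below is representative, inferred from the files described next; the step name is illustrative, and the versioned 1/ directory mirrors the custom-model convention.

```
my-pipeline/
├── config.yaml                  # pipeline-level orchestration
└── step_a/
    ├── config.yaml              # step inputs, runtime, compute
    ├── requirements.txt         # step dependencies
    └── 1/
        └── pipeline_step.py     # step execution logic
```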
At the pipeline level, config.yaml defines how steps are connected and orchestrated, including execution order, parameters, and dependencies between steps.
Each step is a self-contained unit that looks and behaves just like a custom model:
config.yaml defines the step’s inputs, runtime, and compute requirements
requirements.txt specifies the Python dependencies for that step
pipeline_step.py contains the actual execution logic, where you write code to process data, call models, or interact with external systems
This means building pipelines feels immediately familiar. If you’ve already uploaded custom models to Clarifai, you’re working with the same configuration style, the same versioning model, and the same deployment mechanics, just composed into multi-step workflows.
Upload the pipeline
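Assuming the upload flow mirrors clarifai model upload, you run it from the project root:

```bash
# Build and version each step, then register the pipeline
clarifai pipeline upload
```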
Clarifai builds and versions each step as a containerized artifact, ensuring reproducible runs.
Run the pipeline
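A sketch of triggering a run; any additional arguments for selecting an app or pipeline version are assumptions, so see the documentation for details:

```bash
# Trigger a run of the uploaded pipeline
clarifai pipeline run
```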
Once running, you can monitor progress, inspect logs, and manage executions directly through the platform.
Under the hood, pipeline execution is powered by Argo Workflows, allowing Clarifai to reliably orchestrate long-running, multi-step jobs with proper dependency management, retries, and fault handling.
Pipelines are designed to support everything from automated MLOps workflows to advanced AI agent orchestration, without requiring you to operate your own workflow engine.
Note: Pipelines are currently available in Public Preview.
You can start trying them today, and we welcome your feedback as we continue to iterate. For a step-by-step guide on defining steps, uploading pipelines, managing runs, and building more advanced workflows, check out the detailed documentation here.
Model Routing with Multi-Nodepool Deployments
With this release, Compute Orchestration now supports model routing across multiple nodepools within a single deployment.
Model routing allows a deployment to reference multiple pre-existing nodepools via a deployment_config.yaml (a sketch follows the list below). These nodepools can belong to different clusters and can span cloud, on-prem, or hybrid environments.
Here’s how model routing works:
Nodepools are treated as an ordered priority list. Requests are routed to the first nodepool by default.
A nodepool is considered fully loaded when queued requests exceed configured age or count thresholds and the deployment has reached its max_replicas, or the nodepool has reached its maximum instance capacity.
When this happens, the next nodepool in the list is automatically warmed and a portion of traffic is routed to it.
The deployment’s min_replicas applies only to the primary nodepool.
The deployment’s max_replicas applies independently to each nodepool, not as a global sum.
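To make this concrete, here is a hypothetical deployment_config.yaml with an ordered nodepool list. The field names are assumptions based on the behavior described above, not the authoritative schema; consult the Multi-Nodepool Deployment documentation for the real format.

```yaml
deployment:
  id: my-deployment
  # Ordered priority list: traffic targets the first nodepool until it is
  # fully loaded, then spills over to the next one in the list.
  nodepools:
    - id: gpu-pool-primary       # primary pool; min_replicas applies here only
      cluster_id: aws-us-east-1
    - id: gpu-pool-overflow      # warmed automatically on spillover
      cluster_id: on-prem-dc-1
  scaling:
    min_replicas: 1              # primary nodepool only
    max_replicas: 4              # enforced per nodepool, not as a global sum
```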
This approach enables high availability and predictable scaling without duplicating deployments or manually managing failover. Deployments can now span multiple compute pools while behaving as a single, resilient service.
Read more about Multi-Nodepool Deployments here.
Agentic Capabilities with MCP Support
Clarifai expands support for agentic AI systems by making it easier to combine agent-aware models with Model Context Protocol integration. Models can discover, call, and reason over both custom and open-source MCP servers during inference, while remaining fully managed on the Clarifai platform.
Agentic Models with MCP Integration
You can upload models with agentic capabilities by using the AgenticModelClass, which extends the standard model class to support tool discovery and execution. The upload workflow stays the same as for existing custom models, using the same project structure, configuration files, and deployment process.
Agentic models are configured to work with MCP servers, which expose tools that the model can call during inference.
Key capabilities include:
Iterative tool calling within a single predict or generate request
Tool discovery and execution handled by the agentic model class
Support for both streaming and non-streaming inference
Compatibility with the OpenAI-compatible API and Clarifai SDKs
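As an illustration of that last point, here is a minimal sketch of calling a deployed agentic model through Clarifai’s OpenAI-compatible endpoint. The model URL is a placeholder for a model you have uploaded with the AgenticModelClass.

```python
from openai import OpenAI

# Clarifai exposes an OpenAI-compatible endpoint; authenticate with a
# Personal Access Token (PAT).
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key="YOUR_CLARIFAI_PAT",
)

# Placeholder model URL for an agentic model you have uploaded.
response = client.chat.completions.create(
    model="https://clarifai.com/your-user/your-app/models/your-agentic-model",
    messages=[{"role": "user", "content": "Summarize today's top story."}],
)
print(response.choices[0].message.content)
```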
A complete example of uploading and running an agentic model is available here. The repository shows how to upload a GPT-OSS-20B model with agentic capabilities enabled using the AgenticModelClass.
Deploying Public MCP Servers on Clarifai
Clarifai already supported deploying custom MCP servers, allowing teams to build their own tool servers and run them on the platform. This release expands that capability by making it easy to deploy public MCP servers directly on the platform.
Public MCP servers can now be uploaded using a simple configuration, without requiring teams to host or manage the server infrastructure themselves. Once deployed, these servers can be shared across models and workflows, allowing agentic models to access the same tools.
This example demonstrates how to deploy a public, open-source MCP server on Clarifai as an API endpoint.
Pay-As-You-Go Billing with Prepaid Credits
We’ve launched a new Pay-As-You-Go (PAYG) plan to make billing simpler and more predictable for self-serve users.
The PAYG plan has no monthly minimums and far fewer feature gates. You prepay credits, use them across the platform, and pay only for what you consume. To improve reliability, the plan also includes auto-recharge, so long-running jobs don’t stop unexpectedly when credits run low.
To help you get started, every verified user receives a one-time $5 welcome credit, which can be used across inference, Compute Orchestration, deployments, and more. You can also claim an additional $5 for your organization.
If you want a deeper breakdown of how prepaid credits work, what’s changing from earlier plans, and why we made this shift, get more details in this blog.
Clarifai as an Inference Provider in the Vercel AI SDK
Clarifai is now available as an inference provider in the Vercel AI SDK. You can use Clarifai-hosted models directly through the OpenAI-compatible interface in @ai-sdk/openai-compatible, without changing your existing application logic.
This makes it easy to swap in Clarifai-backed models for production inference while continuing to use the same Vercel AI SDK workflows you already rely on. Learn more here.
New Reasoning Models from the Ministral 3 Family
We’ve published two new open-weight reasoning models from the Ministral 3 family on Clarifai:
A compact reasoning model designed for efficiency, offering strong performance while remaining practical to deploy on realistic hardware.
Ministral-3-14B-Reasoning-2512
The largest model in the Ministral 3 family, delivering reasoning performance close to that of much larger systems while retaining the benefits of an efficient open-weight design.
Both models are available now and can be used across Clarifai’s inference, orchestration, and deployment workflows.
Additional Changes
Platform Updates
We’ve made several targeted improvements across the platform to improve usability and day-to-day workflows.
Added cleaner filters in the Control Center, making charts easier to navigate and interpret.
Improved the Team & Logs view to ensure today’s audit logs are included when selecting the last 7 days.
Enabled stopping responses directly from the right panel when using Compare mode in the Playground.
Python SDK Updates
This release includes a broad set of improvements to the Python SDK and CLI, focused on stability, local runners, and developer experience.
Improved reliability of local model runners, including fixes for vLLM compatibility, checkpoint downloads, and runner ID conflicts.
Introduced better artifact management and interactive config.yaml creation during the model upload flow.
Expanded test coverage and improved error handling across runners, model loading, and OpenAI-compatible API calls.
Several more fixes and improvements are included, covering dependency upgrades, environment handling, and CLI robustness. Learn more here.
Ready to Start Building?
You can start building with Clarifai Pipelines today to run long-running, multi-step workflows directly on the platform. Define steps, upload them with the CLI, and monitor execution across your compute.
For production deployments, model routing lets you scale across multiple nodepools and clusters with built-in spillover and high availability.
If you’re building agentic systems, you can also enable agentic model support with MCP servers to give models access to tools during inference.
Pipelines are available in public preview. We’d love your feedback as you build.