

Liquid AI has launched LFM2-24B-A2B, a model optimized for local, low-latency tool dispatch, alongside LocalCowork, an open-source desktop agent application available in their Liquid4All GitHub Cookbook. The release provides a deployable architecture for running enterprise workflows entirely on-device, eliminating API calls and data egress for privacy-sensitive environments.

Architecture and Serving Configuration

To achieve low-latency execution on consumer hardware, LFM2-24B-A2B uses a Sparse Mixture-of-Experts (MoE) architecture. While the model contains 24 billion parameters in total, it activates only roughly 2 billion parameters per token during inference.
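The sparse-activation idea can be sketched as a top-k expert router. The layer sizes, expert count, and k below are illustrative placeholders, not Liquid AI's actual configuration:

```python
import numpy as np

def moe_forward(x, experts, router_w, k=2):
    """Route a token through only the top-k experts.

    experts: list of per-expert weight matrices. Only k of them run per
    token, so compute scales with k, not with the total expert count.
    """
    logits = x @ router_w                      # router scores, one per expert
    topk = np.argsort(logits)[-k:]             # indices of the k best experts
    gates = np.exp(logits[topk])
    gates /= gates.sum()                       # softmax over the selected experts
    return sum(g * (x @ experts[i]) for g, i in zip(gates, topk))

rng = np.random.default_rng(0)
d = 16
experts = [rng.standard_normal((d, d)) for _ in range(8)]
router_w = rng.standard_normal((d, 8))
y = moe_forward(rng.standard_normal(d), experts, router_w, k=2)
print(y.shape)  # only 2 of the 8 expert matrices were multiplied
```

The same principle, scaled up, is how a 24B-parameter model can run a forward pass with roughly the cost of a 2B-parameter dense model.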

This structural design allows the model to maintain a broad knowledge base while significantly reducing the computational overhead required for each generation step. Liquid AI stress-tested the model using the following hardware and software stack:

  • Hardware: Apple M4 Max, 36 GB unified memory, 32 GPU cores.
  • Serving Engine: llama-server with flash attention enabled.
  • Quantization: Q4_K_M GGUF format.
  • Memory Footprint: ~14.5 GB of RAM.
  • Hyperparameters: Temperature set to 0.1, top_p to 0.1, and max_tokens to 512 (optimized for deterministic, strict outputs).
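Under these settings, a minimal serving setup might look like the following. The model filename and port are placeholders, and the exact flag names can vary across llama.cpp versions:

```shell
# Launch llama-server on the Q4_K_M GGUF build with flash attention enabled
llama-server -m lfm2-24b-a2b-q4_k_m.gguf --flash-attn --port 8080

# Deterministic, strict-output sampling for tool dispatch
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "Which tool lists files in a folder?"}],
        "temperature": 0.1,
        "top_p": 0.1,
        "max_tokens": 512
      }'
```

The near-zero temperature and top_p collapse sampling onto the highest-probability tokens, which is what makes tool-name outputs reproducible enough to parse and dispatch.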

LocalCowork Tool Integration

LocalCowork is a fully offline desktop AI agent that uses the Model Context Protocol (MCP) to execute pre-built tools without relying on cloud APIs or compromising data privacy, logging every action to a local audit trail. The system includes 75 tools across 14 MCP servers capable of handling tasks like filesystem operations, OCR, and security scanning. However, the provided demo focuses on a highly reliable, curated subset of 20 tools across 6 servers, each rigorously tested to achieve over 80% single-step accuracy and verified multi-step chain participation.

LocalCowork acts as the practical implementation of this model. It operates entirely offline and comes pre-configured with a suite of enterprise-grade tools:

  • File Operations: Listing, reading, and searching across the host filesystem.
  • Security Scanning: Identifying leaked API keys and personally identifiable information (PII) within local directories.
  • Document Processing: Executing Optical Character Recognition (OCR), parsing text, diffing contracts, and generating PDFs.
  • Audit Logging: Recording every tool call locally for compliance monitoring.
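The dispatch-plus-audit pattern behind these tools can be sketched as follows. The tool registry and the JSONL log format here are illustrative stand-ins, not LocalCowork's actual schema:

```python
import json
import time
from pathlib import Path

AUDIT_LOG = Path("audit_log.jsonl")

# Illustrative local tool registry (stand-ins for MCP server tools)
TOOLS = {
    "fs.list": lambda path: sorted(p.name for p in Path(path).iterdir()),
}

def call_tool(name, **kwargs):
    """Execute a registered tool and append the call to a local audit trail."""
    result = TOOLS[name](**kwargs)
    entry = {"ts": time.time(), "tool": name, "args": kwargs}
    with AUDIT_LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")  # append-only, human-readable record
    return result

files = call_tool("fs.list", path=".")
print(len(files))  # the call itself is also recorded in audit_log.jsonl
```

Because every tool call flows through one chokepoint, the audit trail captures the agent's full action history without any network dependency.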

Performance Benchmarks

The Liquid AI team evaluated the model against a workload of 100 single-step tool-selection prompts and 50 multi-step chains (requiring 3 to 6 discrete tool executions, such as searching a folder, running OCR, parsing data, deduplicating, and exporting).
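A single-step evaluation of this shape reduces to scoring tool-name predictions against labels. The prompts, tool names, and keyword dispatcher below are toy placeholders standing in for the model and the benchmark set:

```python
def evaluate_single_step(dispatch, cases):
    """Tool-selection accuracy: fraction of prompts mapped to the expected tool."""
    correct = sum(dispatch(prompt) == expected for prompt, expected in cases)
    return correct / len(cases)

# Toy labeled prompts (hypothetical tool names, not the actual benchmark)
cases = [
    ("find leaked API keys in ./src", "security.scan"),
    ("extract the text from invoice.pdf", "ocr.extract"),
    ("list the files in ./docs", "fs.list"),
]

def keyword_dispatch(prompt):
    # Trivial stand-in for the model's tool-selection response
    if "keys" in prompt:
        return "security.scan"
    if "invoice" in prompt:
        return "ocr.extract"
    return "fs.list"

print(evaluate_single_step(keyword_dispatch, cases))  # 1.0 on this toy set
```

The reported 80% figure corresponds to this metric computed over the 100-prompt benchmark with the real model in place of the toy dispatcher.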

Latency

The model averaged ~385 ms per tool-selection response. This sub-second dispatch time is well suited for interactive, human-in-the-loop applications where fast feedback is necessary.

Accuracy

  • Single-Step Executions: 80% accuracy.
  • Multi-Step Chains: 26% end-to-end completion rate.
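The gap between the two numbers is roughly what independent per-step errors would predict: at 80% accuracy per step, a chain of n steps succeeds with probability 0.8^n, which falls to about 26% by n = 6. A quick check (assuming step independence, which real chains only approximate):

```python
# Per-step accuracy compounds multiplicatively over a chain of tool calls
p = 0.80
for n in (3, 4, 5, 6):
    print(n, round(p ** n, 2))
# 0.8**6 is about 0.26, close to the observed 26% end-to-end rate
```

This compounding is why even a solid single-step dispatcher degrades sharply as autonomous chains get longer.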

Key Takeaways

  • Privacy-First Local Execution: LocalCowork operates entirely on-device without cloud API dependencies or data egress, making it well suited for regulated enterprise environments requiring strict data privacy.
  • Efficient MoE Architecture: LFM2-24B-A2B uses a Sparse Mixture-of-Experts (MoE) design, activating only ~2 billion of its 24 billion parameters per token, allowing it to fit comfortably within a ~14.5 GB RAM footprint using Q4_K_M GGUF quantization.
  • Sub-Second Latency on Consumer Hardware: When benchmarked on an Apple M4 Max laptop, the model achieves an average latency of ~385 ms for tool-selection dispatch, enabling highly interactive, real-time workflows.
  • Standardized MCP Tool Integration: The agent leverages the Model Context Protocol (MCP) to connect with local tools, including filesystem operations, OCR, and security scanning, while automatically logging all actions to a local audit trail.
  • Strong Single-Step Accuracy with Multi-Step Limits: The model achieves 80% accuracy on single-step tool execution but drops to a 26% success rate on multi-step chains due to 'sibling confusion' (selecting a similar but incorrect tool), indicating it currently functions best in a guided, human-in-the-loop setting rather than as a fully autonomous agent.

Check out the Repo and technical details.

