Large Language Models
May 12, 2026

Cactus Releases Needle, a 26M Parameter Model for Function Calling on Consumer Devices

AI Summary

Cactus has open-sourced Needle, a 26 million parameter model designed for function calling on consumer devices. The model is built to run efficiently on budget phones and similar hardware, and was post-trained on 2 billion tokens of synthesized function-calling data.

  • Needle is a 26 million parameter model focused on function calling, developed by Cactus.
  • It operates at 6000 tokens per second during prefill and 1200 tokens per second during decoding on consumer devices.
  • The model treats tool calling as a retrieval-and-assembly process rather than a reasoning task, and uses a simple attention-only network with no multi-layer perceptron (MLP) layers.
  • Training involved pretraining on 200 billion tokens across 16 TPU v6e for 27 hours, followed by post-training on 2 billion tokens of synthesized function-calling data for 45 minutes.
  • The dataset includes 15 tool categories such as timers, messaging, navigation, and smart home functions.
  • Experimental results indicate that Needle outperforms larger models such as FunctionGemma-270M and Qwen-0.6B in single-shot function calling, although those models remain stronger in conversational contexts because of their larger capacity.
  • Users can test and finetune Needle on their own tools via a provided playground, and the project is part of a broader initiative by Cactus to create an inference engine for mobile and wearable devices.
  • The entire project is MIT licensed, and the model weights are available on Hugging Face.
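
To put the quoted speed and training figures in perspective, here is a rough back-of-the-envelope sketch. It uses only the numbers in the summary above; the 600-token prompt and 60-token output are hypothetical request sizes chosen for illustration, and the latency estimate assumes the quoted prefill and decode rates hold steady and ignores any startup overhead. The summary does not say what hardware was used for post-training, so only an aggregate rate is computed for that stage.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the summary.
# Helper names and the example request size are illustrative assumptions.

PREFILL_TOK_PER_S = 6000  # quoted prefill speed on consumer devices
DECODE_TOK_PER_S = 1200   # quoted decode speed on consumer devices


def estimated_latency_s(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate seconds for one call: prefill time plus decode time."""
    return prompt_tokens / PREFILL_TOK_PER_S + output_tokens / DECODE_TOK_PER_S


def tokens_per_second(total_tokens: float, hours: float, chips: int = 1) -> float:
    """Average training throughput in tokens per second, per chip."""
    return total_tokens / (hours * 3600) / chips


# Hypothetical request: 600 prompt tokens (tool definitions plus the query)
# and a 60-token function call as output.
latency = estimated_latency_s(600, 60)

# Pretraining: 200 billion tokens in 27 hours on 16 TPU v6e.
pretrain_rate = tokens_per_second(200e9, 27)
per_chip = tokens_per_second(200e9, 27, chips=16)

# Post-training: 2 billion tokens in 45 minutes (hardware not stated).
posttrain_rate = tokens_per_second(2e9, 0.75)

print(f"estimated latency: {latency:.2f} s")
print(f"pretraining: {pretrain_rate:,.0f} tok/s total, {per_chip:,.0f} tok/s per chip")
print(f"post-training: {posttrain_rate:,.0f} tok/s total")
```

Under these assumptions a typical single-shot call would finish in roughly 0.15 seconds, and pretraining averaged a little over 2 million tokens per second across the 16 chips.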
open source, tool calling, agentic models, function-calling, consumer devices