May 12, 2026
Cactus Releases Needle, a 26M Parameter Model for Function Calling on Consumer Devices
AI Summary
Cactus has open-sourced Needle, a 26-million-parameter model designed for function calling on consumer devices. The model is built to run efficiently on budget phones and other devices, and is post-trained on a large dataset of synthesized function-calling data.
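To make the idea of function calling concrete, the sketch below shows the host-side half of a single-shot call: the model emits a structured call, and the application validates and dispatches it. The tool name `set_timer`, its `minutes` argument, the JSON layout, and the `dispatch` helper are all hypothetical illustrations, not Needle's actual output format or the Cactus API.

```python
import json

# Hypothetical tool implementation; a real app would call a platform timer API.
def set_timer(minutes: int) -> str:
    return f"Timer set for {minutes} minutes"

# Registry mapping tool names to (callable, expected argument types).
TOOLS = {
    "set_timer": (set_timer, {"minutes": int}),
}

def dispatch(model_output: str) -> str:
    """Parse a JSON function call emitted by a model and run the matching tool."""
    call = json.loads(model_output)
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    func, schema = TOOLS[name]
    # Reject missing, extra, or wrongly typed arguments before executing anything.
    if set(args) != set(schema):
        raise ValueError(f"expected arguments {sorted(schema)}, got {sorted(args)}")
    for key, expected in schema.items():
        if not isinstance(args[key], expected):
            raise ValueError(f"argument {key!r} must be {expected.__name__}")
    return func(**args)

# Assumed model output for the request "set a timer for ten minutes".
example_output = '{"name": "set_timer", "arguments": {"minutes": 10}}'
result = dispatch(example_output)
print(result)
```

Validating the call against a registry before executing it keeps a malformed or unexpected model output from reaching application code, which matters when a small model is the only component producing the call.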
- Needle is a 26-million-parameter model focused on function calling, developed by Cactus.
- It runs at 6,000 tokens per second during prefill and 1,200 tokens per second during decoding on consumer devices.
- The design treats tool calling as a retrieval-and-assembly process rather than a reasoning task, and accordingly uses simple attention networks without multi-layer perceptrons (MLPs).
- Training consisted of pretraining on 200 billion tokens on 16 TPU v6e accelerators for 27 hours, followed by post-training on 2 billion tokens of synthesized function-calling data for 45 minutes.
- The dataset includes 15 tool categories such as timers, messaging, navigation, and smart home functions.
- Experimental results indicate that Needle outperforms larger models such as FunctionGemma-270M and Qwen-0.6B in single-shot function calling, although those models remain stronger in conversational contexts because of their greater capacity.
- Users can test and fine-tune Needle on their own tools via a provided playground. The project is part of a broader Cactus initiative to build an inference engine for mobile and wearable devices.
- The entire project is MIT licensed, and the model weights are available on Hugging Face.
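The throughput and training figures above lend themselves to a quick back-of-the-envelope check. The sketch below estimates the latency of one call and the training throughput implied by the reported token counts and durations. The 500-token prompt and 30-token output are assumed example sizes, not measurements from Cactus, and the per-TPU figure assumes the pretraining work was split evenly across the 16 accelerators.

```python
# Figures quoted in the summary above.
PREFILL_TOKENS_PER_S = 6000
DECODE_TOKENS_PER_S = 1200
PRETRAIN_TOKENS = 200e9
PRETRAIN_HOURS = 27
POSTTRAIN_TOKENS = 2e9
POSTTRAIN_MINUTES = 45
NUM_TPUS = 16

def call_latency_s(prompt_tokens: int, output_tokens: int) -> float:
    """Estimated seconds for one call: prefill the prompt, then decode the output."""
    return prompt_tokens / PREFILL_TOKENS_PER_S + output_tokens / DECODE_TOKENS_PER_S

def tokens_per_second(total_tokens: float, seconds: float) -> float:
    """Average throughput over a run of the given length."""
    return total_tokens / seconds

# Assumed example sizes: a 500-token prompt (tool definitions plus request)
# and a 30-token function call as output.
latency = call_latency_s(prompt_tokens=500, output_tokens=30)

pretrain_rate = tokens_per_second(PRETRAIN_TOKENS, PRETRAIN_HOURS * 3600)
posttrain_rate = tokens_per_second(POSTTRAIN_TOKENS, POSTTRAIN_MINUTES * 60)

print(f"estimated call latency: {latency * 1000:.0f} ms")
print(f"pretraining: {pretrain_rate:,.0f} tokens/s total, "
      f"{pretrain_rate / NUM_TPUS:,.0f} per TPU")
print(f"post-training: {posttrain_rate:,.0f} tokens/s total")
```

Under these assumed sizes a call would take a little over a tenth of a second, and the pretraining run works out to roughly two million tokens per second in aggregate.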
open source, tool calling, agentic models, function-calling, consumer devices