2025-04-10 TIL: AI Model Performance Analysis
This analysis highlights the tradeoffs between model size, speed, and tool-calling capability across several local models. llama3.1:8b offers the best overall balance, while llama3.2:3b is a good compromise when size matters more than accuracy.
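The ratings below came from exercising each model's tool calling by hand. A minimal sketch of such a probe using the Ollama Python client is shown here; the `get_weather` tool, the prompt, and the response handling are illustrative assumptions based on the Ollama docs, not part of the original notes.

```python
import json

def get_weather(city: str) -> str:
    """Dummy tool: returns a canned forecast for a city."""
    return json.dumps({"city": city, "forecast": "sunny"})

# OpenAI-style tool schema, the format the Ollama chat API accepts.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def probe_tool_calling(model: str) -> bool:
    """Return True if the model emits a tool call for an obvious prompt.

    Requires a running local Ollama server with the model pulled.
    """
    import ollama  # imported here so the schema above works without the server

    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": "What is the weather in Paris?"}],
        tools=[WEATHER_TOOL],
    )
    # Recent ollama-python versions expose tool calls on response.message.
    return bool(getattr(response.message, "tool_calls", None))
```

Against a running server, `probe_tool_calling("llama3.1:8b")` should return `True` for the tool-capable models in the table and `False` for the rest.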
| Model | Size | Speed | Tool calling | Notes |
|---|---|---|---|---|
| llama3.1:8b | 4.9 GB | Good | Excellent | Excellent tool calling with reasonable speed |
| llama3.2:3b | 2.0 GB | Fast | Good | Fast with working tool calling; encouraging enough to keep testing, but sometimes inaccurate |
| qwen2.5:7b | 4.7 GB | Fast | Good | Fast with good tool calling; performance degrades after extended use |
| qwq:32b | 19 GB | Slow | Good | Produces satisfying, high-quality reasoning traces, but extremely slow, which made testing frustrating |
| qwen2.5:0.5b | 397 MB | Very fast | None | Exceptionally fast thanks to its small size; cannot call any tools |
| MFDoom/deepseek-r1-tool-calling:8b | 4.9 GB | Somewhat slow | Poor | Occasionally shows its reasoning; slow, with inadequate tool calling |
| phi4-mini:3.8b | 2.5 GB | Fast | None | Fast and can identify tools, but cannot use the tools it identifies |
| mistral:7b | 4.1 GB | Fast | None | Fast and can identify tools, but cannot use the tools it identifies |
| ishumilin/deepseek-r1-coder-tools:1.5b | 3.6 GB | Fast | None | Fast and recognizes tools, but cannot answer questions or execute tools |
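The table can also be treated as data when picking a model programmatically. This is a small sketch, with the `smallest_tool_capable` helper being a hypothetical name; the entries mirror the table above, with sizes normalized to GB.

```python
# Each entry mirrors one row of the table; tool_calling is the rating column.
MODELS = [
    {"name": "llama3.1:8b", "size_gb": 4.9, "tool_calling": "excellent"},
    {"name": "llama3.2:3b", "size_gb": 2.0, "tool_calling": "good"},
    {"name": "qwen2.5:7b", "size_gb": 4.7, "tool_calling": "good"},
    {"name": "qwq:32b", "size_gb": 19.0, "tool_calling": "good"},
    {"name": "qwen2.5:0.5b", "size_gb": 0.4, "tool_calling": "none"},
    {"name": "MFDoom/deepseek-r1-tool-calling:8b", "size_gb": 4.9, "tool_calling": "poor"},
    {"name": "phi4-mini:3.8b", "size_gb": 2.5, "tool_calling": "none"},
    {"name": "mistral:7b", "size_gb": 4.1, "tool_calling": "none"},
    {"name": "ishumilin/deepseek-r1-coder-tools:1.5b", "size_gb": 3.6, "tool_calling": "none"},
]

def smallest_tool_capable(models):
    """Return the smallest model rated 'good' or better at tool calling."""
    capable = [m for m in models if m["tool_calling"] in ("good", "excellent")]
    return min(capable, key=lambda m: m["size_gb"])["name"]
```

Here `smallest_tool_capable(MODELS)` returns `"llama3.2:3b"`, matching the note's conclusion that it is the best compromise between size and functionality.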