2025-04-10 TIL: AI Model Performance Analysis


This analysis compares local AI models on size, speed, and tool-calling ability. llama3.1:8b offers the best overall balance of these features, while llama3.2:3b is a good compromise between size and functionality. A minimal tool-calling probe is sketched after the table.

| Model | Size | Speed | Tool Calling | Attributes & Notes |
|---|---|---|---|---|
| llama3.1:8b | 4.9 GB | Good | Very good | Excellent tool-calling capabilities with reasonable speed |
| llama3.2:3b | 2.0 GB | Fast | Good | Fast with working tool calling; encouraging enough to keep testing, but sometimes lacks accuracy |
| qwen2.5:7b | 4.7 GB | Fast | Good | Fast with good tool calling; performance degrades after extended use |
| qwq:32b | 19 GB | Slow | Good | Produces satisfying, high-quality reasoning traces; extremely slow, which makes testing frustrating |
| qwen2.5:0.5b | 397 MB | Super fast | None | Exceptionally fast due to its small size; unable to call any tools |
| MFDoom/deepseek-r1-tool-calling:8b | 4.9 GB | Somewhat slow | Poor | Occasionally shows reasoning traces; slow, with inadequate tool calling |
| phi4-mini:3.8b | 2.5 GB | Fast | None | Fast and can identify tools, but unable to use the tools it identifies |
| mistral:7b | 4.1 GB | Fast | None | Fast and can identify tools, but unable to use the tools it identifies |
| ishumilin/deepseek-r1-coder-tools:1.5b | 3.6 GB | Fast | None | Fast and recognizes tools, but cannot answer questions or execute tools |
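
Here is a minimal sketch of how a tool-calling probe like the one behind this table could be run against local models. It assumes the `ollama` Python client and a hypothetical `get_current_weather` tool schema; neither is taken from the original test setup.

```python
import ollama

# Hypothetical tool schema used only for this sketch (not the original test harness).
weather_tool = {
    'type': 'function',
    'function': {
        'name': 'get_current_weather',
        'description': 'Return the current weather for a given city',
        'parameters': {
            'type': 'object',
            'properties': {
                'city': {'type': 'string', 'description': 'City name, e.g. "Berlin"'},
            },
            'required': ['city'],
        },
    },
}

def probe_tool_calling(model: str) -> bool:
    """Ask a question that should trigger the tool and report whether the model called it."""
    response = ollama.chat(
        model=model,
        messages=[{'role': 'user', 'content': 'What is the weather in Berlin right now?'}],
        tools=[weather_tool],
    )
    tool_calls = response['message'].get('tool_calls') or []
    for call in tool_calls:
        print(f"  {model} called {call['function']['name']} with {call['function']['arguments']}")
    return bool(tool_calls)

# Models from the table above; each must already be pulled locally (`ollama pull <model>`).
for model in ['llama3.1:8b', 'llama3.2:3b', 'qwen2.5:7b', 'phi4-mini:3.8b']:
    print(model, '->', 'tool call emitted' if probe_tool_calling(model) else 'no tool call')
```

Under a probe like this, the "None" rows in the table would correspond to models that answer in plain text without emitting a `tool_calls` entry, even when the tool is clearly relevant to the question.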