2025-04-10 TIL: AI Model Performance Analysis
This analysis highlights the tradeoffs between model size, speed, and tool-calling capability across several local models. llama3.1:8b offers the best overall balance, while llama3.2:3b is a good compromise when size matters more than accuracy.
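The ratings below came from exercising each model's tool calling by hand. A minimal sketch of such a probe using the Ollama Python client is shown here; the `get_weather` tool, the prompt, and the response handling are illustrative assumptions based on the Ollama docs, not part of the original notes.

```python
import json

def get_weather(city: str) -> str:
    """Dummy tool: returns a canned forecast for a city."""
    return json.dumps({"city": city, "forecast": "sunny"})

# OpenAI-style tool schema, the format the Ollama chat API accepts.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def probe_tool_calling(model: str) -> bool:
    """Return True if the model emits a tool call for an obvious prompt.

    Requires a running local Ollama server with the model pulled.
    """
    import ollama  # imported here so the schema above works without the server

    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": "What is the weather in Paris?"}],
        tools=[WEATHER_TOOL],
    )
    # Recent ollama-python versions expose tool calls on response.message.
    return bool(getattr(response.message, "tool_calls", None))
```

Against a running server, `probe_tool_calling("llama3.1:8b")` should return `True` for the tool-capable models in the table and `False` for the rest.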
| Model | Size | Speed | Tool calling | Notes |
|---|---|---|---|---|
| llama3.1:8b | 4.9 GB | Good | Excellent | Excellent tool calling with reasonable speed |
| llama3.2:3b | 2.0 GB | Fast | Good | Fast with working tool calling; encouraging enough to keep testing, but sometimes inaccurate |
| qwen2.5:7b | 4.7 GB | Fast | Good | Fast with good tool calling; performance degrades after extended use |
| qwq:32b | 19 GB | Slow | Good | Produces satisfying, high-quality reasoning traces, but extremely slow, which made testing frustrating |
| qwen2.5:0.5b | 397 MB | Very fast | None | Exceptionally fast thanks to its small size; cannot call any tools |
| MFDoom/deepseek-r1-tool-calling:8b | 4.9 GB | Somewhat slow | Poor | Occasionally shows its reasoning; slow, with inadequate tool calling |
| phi4-mini:3.8b | 2.5 GB | Fast | None | Fast and can identify tools, but cannot use the tools it identifies |
| mistral:7b | 4.1 GB | Fast | None | Fast and can identify tools, but cannot use the tools it identifies |
| ishumilin/deepseek-r1-coder-tools:1.5b | 3.6 GB | Fast | None | Fast and recognizes tools, but cannot answer questions or execute tools |
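The table can also be treated as data when picking a model programmatically. This is a small sketch, with the `smallest_tool_capable` helper being a hypothetical name; the entries mirror the table above, with sizes normalized to GB.

```python
# Each entry mirrors one row of the table; tool_calling is the rating column.
MODELS = [
    {"name": "llama3.1:8b", "size_gb": 4.9, "tool_calling": "excellent"},
    {"name": "llama3.2:3b", "size_gb": 2.0, "tool_calling": "good"},
    {"name": "qwen2.5:7b", "size_gb": 4.7, "tool_calling": "good"},
    {"name": "qwq:32b", "size_gb": 19.0, "tool_calling": "good"},
    {"name": "qwen2.5:0.5b", "size_gb": 0.4, "tool_calling": "none"},
    {"name": "MFDoom/deepseek-r1-tool-calling:8b", "size_gb": 4.9, "tool_calling": "poor"},
    {"name": "phi4-mini:3.8b", "size_gb": 2.5, "tool_calling": "none"},
    {"name": "mistral:7b", "size_gb": 4.1, "tool_calling": "none"},
    {"name": "ishumilin/deepseek-r1-coder-tools:1.5b", "size_gb": 3.6, "tool_calling": "none"},
]

def smallest_tool_capable(models):
    """Return the smallest model rated 'good' or better at tool calling."""
    capable = [m for m in models if m["tool_calling"] in ("good", "excellent")]
    return min(capable, key=lambda m: m["size_gb"])["name"]
```

Here `smallest_tool_capable(MODELS)` returns `"llama3.2:3b"`, matching the note's conclusion that it is the best compromise between size and functionality.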