Location
https://www.kennesaw.edu/ccse/events/computing-showcase/sp25-cday-program.php
Event Website
https://ebelfarsi.com/benchmarking_intelligent_agents
Document Type
Event
Start Date
15-4-2025 4:00 PM
Description
Accurate dietary assessment remains a critical yet time-consuming task in health and nutrition monitoring. This study benchmarks the macronutrient estimation capabilities of three intelligent vision agents (GPT Vision, Claude, and Gemini) against manually logged food data. We unify two distinct datasets: MenuMatch, annotated by a professional nutritionist, and CGMacros, populated through user entries on MyFitnessPal. After flattening and cleaning both datasets, we first assess each model’s performance in calorie estimation. GPT Vision outperforms the others with the lowest percentage error (13.83%) and is subsequently used to benchmark the macronutrient estimates of Claude and Gemini. While Claude shows higher carbohydrate and fat estimation errors, Gemini yields the most balanced results across protein (12.55%), carbohydrates (19.57%), and fats (17.07%). These findings reveal strengths and trade-offs in current intelligent agents for visual food recognition, informing the development of more accurate, user-friendly, AI-powered nutrition tracking systems.
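The description reports percentage errors without giving the formula. As a minimal sketch, assuming the metric is a mean absolute percentage error over meals (an assumption, not confirmed by the abstract), the calculation could look like the following. The function name and the calorie values are hypothetical and are not study data.

```python
from statistics import mean


def mean_absolute_percentage_error(reference, estimated):
    """Return the mean absolute percentage error (in percent).

    Assumption: this is one common definition of "percentage error";
    the study may have used a different one.
    """
    if len(reference) != len(estimated):
        raise ValueError("reference and estimated must have the same length")
    if not reference:
        raise ValueError("at least one meal is required")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    # Average the per-meal relative errors, then scale to percent.
    return 100 * mean(abs(e - r) / abs(r) for r, e in zip(reference, estimated))


# Hypothetical calorie values for three meals (illustrative only).
logged_kcal = [500.0, 320.0, 750.0]  # manually logged reference values
model_kcal = [450.0, 352.0, 750.0]   # estimates from a vision model

calorie_mape = mean_absolute_percentage_error(logged_kcal, model_kcal)
print(f"Calorie error: {calorie_mape:.2f}%")
```

The same function could be applied per macronutrient (protein, carbohydrates, fats) by passing the corresponding reference and estimated columns.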
Included in
GRM-076 Assessing the Performance of Intelligent Agents in Visual Food Recognition Relative to Manual Data Entry
https://digitalcommons.kennesaw.edu/cday/Spring_2025/Masters_Research/10