M3CoT
API
A benchmark that evaluates large language models on a variety of multimodal reasoning tasks, including language, natural and social sciences, physical and social commonsense, temporal reasoning, algebra, and geometry.
0
Very Poor
0 reviews
Score Breakdown
Performance (weight 25%): 0.0
Reliability (weight 20%): 0.0
Ease of Use (weight 15%): 0.0
Value (weight 15%): 0.0
Trust (weight 15%): 0.0
Delight (weight 10%): 0.0
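The six weights in the score breakdown sum to 100%, which suggests the overall score may be a weighted average of the category scores. The page does not state the formula, so the following is only a minimal sketch under that assumption; the names `WEIGHTS` and `overall_score` are hypothetical and not part of any published API.

```python
# Category weights as listed in the score breakdown, converted to fractions.
WEIGHTS = {
    "Performance": 0.25,
    "Reliability": 0.20,
    "Ease of Use": 0.15,
    "Value": 0.15,
    "Trust": 0.15,
    "Delight": 0.10,
}


def overall_score(scores):
    """Return the weighted average of the category scores.

    Assumption: the overall score is a plain weighted sum. The page
    itself does not confirm this formula.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# With no reviews, every category score is 0.0, so the overall score is 0.0.
current = {name: 0.0 for name in WEIGHTS}
print(overall_score(current))
```

Under this assumption, a product scoring the same value in every category would receive that value as its overall score, since the weights add up to 1.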
Reviews (0)
No reviews yet