🤖 Model Performance Comparison Tool

Compare LLM performance on multiple-choice questions using Hugging Face models.

Format: Each line should have: Question,Correct Answer,Choice1,Choice2,Choice3

💡 Features:

  • Model evaluation using HuggingFace transformers
  • Support for custom models via HF model paths
  • Detailed question-by-question results
  • Performance charts and statistics

Enter the delimiter used in your dataset:

Choose a sample dataset or enter your own

Format Requirements:

  • Each data line: Question, Correct Answer, Choice1, Choice2, Choice3 (No header)
  • Use commas or tabs as separators (see the parsing sketch below)
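
Below is a minimal Python sketch of how a dataset in this format might be parsed. It assumes Choice1-Choice3 are the incorrect alternatives; the parse_dataset helper and the sample lines are illustrative only, not part of this tool.

```python
# Illustrative parsing sketch; not the tool's actual loader.
import csv
import io

SAMPLE = """What is 2 + 2?,4,3,5,22
Which planet is closest to the Sun?,Mercury,Venus,Mars,Jupiter
"""

def parse_dataset(text: str, delimiter: str = ",") -> list[dict]:
    """Parse 'Question,Correct Answer,Choice1,Choice2,Choice3' lines (no header)."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not fields:
            continue  # skip blank lines
        question, correct, *distractors = [f.strip() for f in fields]
        rows.append({
            "question": question,
            "correct": correct,
            "choices": [correct] + distractors,  # assumption: correct answer plus distractors
        })
    return rows

print(parse_dataset(SAMPLE))            # comma-delimited
# parse_dataset(tsv_text, "\t")         # tab-delimited variant
```
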
Select from popular models

⚠️ Note:

  • Larger models require more GPU memory; this tool currently runs on CPU only
  • The first run downloads each model, which may take time (see the loading sketch below)
  • Models are cached for subsequent runs
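
As a rough sketch, loading a model on CPU with the transformers library might look like the following; the model name "gpt2" is only an example of a small model, and the exact loading code in this tool may differ.

```python
# Sketch only: load a small causal LM on CPU with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # example; any HF model path works here
tokenizer = AutoTokenizer.from_pretrained(model_name)  # downloaded once, then cached locally
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,  # CPU-friendly dtype
)
model.eval()  # inference only; no gradients needed for evaluation
```
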

📊 Results

Results will appear here...

📥 Export Results

📋 Markdown Table Format

📊 CSV Format

Detailed results will appear here...
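
For illustration, results could be serialized into the two export formats roughly as follows; the result fields (model, accuracy, correct, total) are assumptions, not necessarily this tool's exact schema.

```python
# Illustrative export helpers; the result keys below are assumed, not the tool's schema.
import csv
import io

def to_markdown(rows: list[dict]) -> str:
    lines = ["| Model | Accuracy | Correct | Total |", "| --- | --- | --- | --- |"]
    for r in rows:
        lines.append(f"| {r['model']} | {r['accuracy']:.1%} | {r['correct']} | {r['total']} |")
    return "\n".join(lines)

def to_csv(rows: list[dict]) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["model", "accuracy", "correct", "total"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```
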


About Model Evaluation

This tool loads and runs HuggingFace models for evaluation:

๐Ÿ—๏ธ How it works:

  • Downloads models from HuggingFace Hub
  • Formats questions as prompts for each model
  • Runs likelihood-based evaluation (see the sketch after this list)
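
The sketch below shows one common way likelihood-based multiple-choice scoring is done: each choice is appended to the question prompt, the model's total log-probability of the choice tokens is computed, and the highest-scoring choice is taken as the prediction. The prompt template and helper names are illustrative assumptions, not necessarily what this tool uses.

```python
# Likelihood-based multiple-choice scoring sketch (assumes model/tokenizer are
# already loaded as in the CPU example above; the prompt template is an assumption).
import torch

@torch.no_grad()
def score_choice(model, tokenizer, question: str, choice: str) -> float:
    prompt = f"Question: {question}\nAnswer:"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + " " + choice, return_tensors="pt").input_ids

    logits = model(full_ids).logits                        # [1, seq_len, vocab]
    log_probs = torch.log_softmax(logits[:, :-1, :], -1)   # log P(token_i | tokens_<i)
    targets = full_ids[:, 1:]
    token_log_probs = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)

    answer_start = prompt_ids.shape[1] - 1  # score only the choice tokens
    return token_log_probs[0, answer_start:].sum().item()

def predict(model, tokenizer, question: str, choices: list[str]) -> str:
    scores = [score_choice(model, tokenizer, question, c) for c in choices]
    return choices[scores.index(max(scores))]  # highest log-likelihood wins
```

Note that summing log-probabilities tends to favor shorter choices; dividing by the number of answer tokens (length normalization) is a common variant.
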

⚡ Performance Tips:

  • Use smaller models for testing
  • Larger models (7B+) require significant GPU memory
  • Models are cached after first load

🔧 Supported Models:

  • Any HuggingFace autoregressive language model
  • Both instruction-tuned and base models
  • Custom fine-tuned models via HF paths