Korean Exam Leaderboard
Dataset Name: LLM Leaderboard Based on a Dataset of Past Questions from Korean Government and Professional Qualification Exams
Description:
This dataset comprises past questions from civil service and professional qualification exams administered by the South Korean government and public institutions. The exams span a wide range of fields, including the administrative, technical, educational, and legal sectors. They cover several levels, including national and local civil service exams (Grades 9, 7, and 5), as well as professional certifications such as Certified Judicial Scrivener, Actuary, and the Bar Exam. The dataset is structured for use in natural language processing (NLP), question answering (QA) systems, and AI models for education and evaluation. Each entry consists of the following fields: "Category and Level / Year of Exam / Subject / Question / Answer".
***Scores are percentages on a 100-point scale @ MIPA (Korea Mobile Infra Promotion Association)
Dataset Name (Korean): 한국 정부 공무원·전문직 자격시험 기출 문제 데이터셋 기반 LLM 리더보드
MIPA (한국모바일인프라진흥협회) is registered with the Ministry of Science and ICT (과학기술정보통신부).
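The entry layout and the 100-point scoring described above can be sketched in Python. This is only an illustration: the class name `ExamEntry`, its field names, the function `score_percent`, and the placeholder records are all hypothetical, and exact-match grading is an assumption, since the leaderboard does not state how answers are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExamEntry:
    """One record; field names are illustrative, not the dataset's actual keys."""
    category_and_level: str
    year: int
    subject: str
    question: str
    answer: str


def score_percent(entries, predictions):
    """Return the share of exactly matching answers on a 100-point scale."""
    if len(entries) != len(predictions):
        raise ValueError("entries and predictions must have the same length")
    if not entries:
        raise ValueError("cannot score an empty exam")
    correct = sum(
        1 for entry, pred in zip(entries, predictions)
        if entry.answer.strip() == pred.strip()
    )
    return 100.0 * correct / len(entries)


# Placeholder records (not real exam questions), used only to show the shape.
entries = [
    ExamEntry("National Civil Service Grade 9", 2020, "Subject A", "(question 1)", "1"),
    ExamEntry("National Civil Service Grade 9", 2020, "Subject A", "(question 2)", "3"),
    ExamEntry("National Civil Service Grade 9", 2020, "Subject A", "(question 3)", "2"),
    ExamEntry("National Civil Service Grade 9", 2020, "Subject A", "(question 4)", "4"),
]
predictions = ["1", "3", "4", "4"]  # hypothetical model outputs
score = score_percent(entries, predictions)
```

With three of the four placeholder answers matching, `score` is 75.0, which is the kind of value the score columns would hold under these assumptions.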
AI Model | Average | Korean Bar Exam | Senior Civil Service Exam | Judicial Service Grade 5 | National Assembly Grade 5 | Judicial Scrivener Exam | Police Executive Candidate | National Civil Service Grade 7 | Seoul City Grade 7 | Local Civil Service Grade 7 | Military Civil Service Grade 7 | Police Officer Exam | Police Promotion Exam | Fire Service Executive Candidate | National Assembly Grade 8 | Postal Service Exam | National Civil Service Grade 9 | Seoul City Grade 9 | National Assembly Grade 9 | Judicial Service Grade 9 | Local Civil Service Grade 9 | Military Civil Service Grade 9 | Firefighter Exam |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
OpenAI/GPT-o1 | 52.5 | 52.5 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
OpenAI/GPT-4.5 | 49.3 | 49.33 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
OpenAI/GPT-4o | 49.1 | 49.11 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
deepseek-ai/DeepSeek-R1 | 47.3 | 47.33 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
Anthropic/Claude 3.7 Sonnet | 42.7 | 42.66 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
Google/Gemini 2.0 PRO Experimental | 40.0 | 40 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
OpenAI/GPT-o3-mini | 37.0 | 37.05 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
nvidia/llama-3_1-nemotron-70b-instruct | 35.0 | 35 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
Qwen/QwQ-32B | 32.7 | 32.66 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
qwen/qwen2_5-coder-32b-instruct | 32.5 | 32.5 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
deepseek-ai/deepseek-r1-distill-qwen-32b | 30.0 | 30 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
meta/llama-3.2-90b-vision-instruct | 27.5 | 27.5 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
CohereForAI/c4ai-command-a-03-2025 | 27.3 | 27.33 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
nvidia/llama-3_3-nemotron-super-49b-v1 | 25.0 | 25 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
Google/Gemini 2.0 Flash | 25.0 | 25 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
meta-llama/Llama-3.3-70B-Instruct | 19.3 | 19.33 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
Google/Gemini 2.0 Flash Thinking Experimental | 12.5 | 12.5 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
mistralai/Mistral-Small-3.1-24B-Instruct-2503 | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
google/gemma-3-27b-it | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
LGAI-EXAONE/EXAONE-Deep-32B | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
Legend (범례)
Korean Bar Exam: 변호사
Senior Civil Service Exam: 국가직 5급
Judicial Service Grade 5: 법원직 5급
National Assembly Grade 5: 국회직 5급
Judicial Scrivener Exam: 법무사
Police Executive Candidate: 경찰간부후보생
National Civil Service Grade 7: 국가직 7급
Seoul City Grade 7: 서울시 7급
Local Civil Service Grade 7: 지방직 7급
Military Civil Service Grade 7: 군무원 7급
Police Officer Exam: 경찰(순경)
Police Promotion Exam: 경찰(승진)
Fire Service Executive Candidate: 소방간부후보생
National Assembly Grade 8: 국회직 8급
Postal Service Exam: 계리직
National Civil Service Grade 9: 국가직 9급
Seoul City Grade 9: 서울시 9급
National Assembly Grade 9: 국회직 9급
Judicial Service Grade 9: 법원직 9급
Local Civil Service Grade 9: 지방직 9급
Military Civil Service Grade 9: 군무원 9급
Firefighter Exam: 소방공무원
TBD: To Be Determined (평가 예정)
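Because most cells are still TBD, anyone reading the table programmatically has to skip those cells. The sketch below parses one leaderboard row and recomputes its average. It assumes that the Average column is the mean of the exams evaluated so far, rounded to one decimal place, which the leaderboard does not state explicitly. The helper names `parse_row` and `average_score` are hypothetical, and the sample rows are shortened to a few columns.

```python
from statistics import mean


def parse_row(line):
    """Split a pipe-delimited row into (model, evaluated exam scores).

    The second cell is the reported average and is skipped;
    cells marked TBD are ignored.
    """
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    model, _reported_average, *exam_cells = cells
    scores = [float(cell) for cell in exam_cells if cell != "TBD"]
    return model, scores


def average_score(scores):
    """Mean of evaluated exams, rounded to one decimal; None if none are evaluated."""
    return round(mean(scores), 1) if scores else None


# Rows shortened to three exam columns for illustration.
model, scores = parse_row("OpenAI/GPT-4.5 | 49.3 | 49.33 | TBD | TBD |")
average = average_score(scores)

pending_model, pending_scores = parse_row("google/gemma-3-27b-it | TBD | TBD | TBD | TBD |")
pending_average = average_score(pending_scores)
```

Returning `None` for a row with no evaluated exams keeps a pending model distinct from one that scored zero.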