Which 20-30B dense models are any good?


Qwen3.5-27b, I declare it god-tier!!!

At int6 it occupies exactly 22.5 GB of VRAM.
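That 22.5 GB figure is roughly what the weights alone imply. A back-of-the-envelope check (the overhead factor here is my own guess for KV cache and runtime buffers, not a measured value):

```python
def est_vram_gb(params_b: float, bits_per_weight: int, overhead: float = 1.1) -> float:
    """Rough VRAM estimate in decimal GB: weight bytes plus a fudge factor
    for KV cache, activations, and runtime buffers (the 1.1 is a guess)."""
    weight_gb = params_b * bits_per_weight / 8  # billions of params * bytes per weight
    return weight_gb * overhead

print(f"{est_vram_gb(27, 6):.1f} GB")  # 27B weights at int6 = 20.25 GB; ~22.3 GB with overhead
```

The weights-only number lands at 20.25 GB, so the observed 22.5 GB is consistent with a ~10% runtime overhead.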

Qwen3

Heavy hallucinations; industrial garbage.

Qwen2.5

Sizes: 14b, 14b-1M, 32b

Context: 128k

(attached benchmark images: ruler.png, short_result.png, qwen2.5-32B-instruct_wturbo.001.jpg)

QwQ

Hallucinates far too much; pass.

InternLM2.5

Context: 200k

Benchmarks:

Benchmark    InternLM2.5-20B    InternLM2-20B
MMLU         74.25              67.58
CMMLU        82.22              68.29
BBH          77.82              71.36
MATH         48.00              32.66
HumanEval    71.95              51.22
GPQA         37.88              31.31
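The generational gains are easier to eyeball as deltas; a throwaway sketch with the scores copied from the table above:

```python
# Scores copied from the table above: (InternLM2.5-20B, InternLM2-20B).
scores = {
    "MMLU":      (74.25, 67.58),
    "CMMLU":     (82.22, 68.29),
    "BBH":       (77.82, 71.36),
    "MATH":      (48.00, 32.66),
    "HumanEval": (71.95, 51.22),
    "GPQA":      (37.88, 31.31),
}
for bench, (v25, v2) in scores.items():
    print(f"{bench:<10} +{v25 - v2:.2f}")  # HumanEval jumps most: +20.73
```

2.5 improves on 2 across the board, with the biggest jumps on HumanEval (+20.73) and MATH (+15.34).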

GPT-OSS 20b

MoE, not dense; pass.

Gemma3

Hallucinates far too much; pass.

GLM-Z1-32B

Reasoning-oriented and nobody uses it; not stepping on that landmine.

Mistral Small

Vision-language multimodal

Context: 128k

Benchmark                    Small 3.1 24B Instruct    Small 3.2 24B Instruct
MMLU                         80.62%                    80.50%
MMLU Pro (5-shot CoT)        66.76%                    69.06%
MATH                         69.30%                    69.42%
GPQA Main (5-shot CoT)       44.42%                    44.22%
GPQA Diamond (5-shot CoT)    45.96%                    46.13%
MBPP Plus (Pass@5)           74.63%                    78.33%
HumanEval Plus (Pass@5)      88.99%                    92.90%
SimpleQA (TotalAcc)          10.43%                    12.10%