Published Jan 21, 2025


[AI] DeepSeek-R1 Released and Open-Sourced, with Performance Comparable to the Official OpenAI o1 Release


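DeepSeek-R1 Performance Overview

DeepSeek-R1 scores well across several domains. The tables below give its overall result first, then the subtask scores for each category: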

Overall (Average: 73.9, ranked second)

Model	Average	Coding	Data Analysis	Instruction Following	Language	Math	Reasoning
DeepSeek-Reasoner	73.9	65.7	71.8	87.0	53.7	79.9	85.3
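
The reported overall score of 73.9 matches the unweighted mean of the six category scores in the table above. The source does not state how the average is computed, so the equal weighting in this short sketch is an assumption that happens to reproduce the published number; the helper name `unweighted_mean` is purely illustrative.

```python
# Category scores for DeepSeek-Reasoner, copied from the table above.
category_scores = {
    "Coding": 65.7,
    "Data Analysis": 71.8,
    "Instruction Following": 87.0,
    "Language": 53.7,
    "Math": 79.9,
    "Reasoning": 85.3,
}


def unweighted_mean(values):
    """Return the plain arithmetic mean (assumed weighting: all equal)."""
    values = list(values)
    return sum(values) / len(values)


overall = unweighted_mean(category_scores.values())
print(f"Overall: {overall:.1f}")  # prints "Overall: 73.9"
```

Because every category counts equally under this assumption, the weaker Language score (53.7) pulls the overall result down as much as the strong Instruction Following score (87.0) pushes it up.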

Math (79.9, ranked second)

Model	AMPS_Hard	math_comp	olympiad
DeepSeek-Reasoner	83.0	91.667	64.923

Coding (65.7, ranked third)

Model	LCB_generation	coding_completion
DeepSeek-Reasoner	83.333	48.0

Reasoning (85.3, ranked second)

Model	spatial	web_of_lies_v2	zebra_puzzle
DeepSeek-Reasoner	78.0	98.0	80.0

Language (53.7, ranked third)

Model	connections	plot_unscrambling	typos
DeepSeek-Reasoner	74.167	43.046	44.0

Data Analysis (71.8, ranked first)

Model	cta	tablejoin	tablereformat
DeepSeek-Reasoner	64.0	61.46	90.0

Instruction Following (87.0, ranked first)

Model	paraphrase	simplify	story_generation	summarize
DeepSeek-Reasoner	88.75	85.467	88.0	85.6
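
Each category score quoted in the headings above is consistent with the unweighted mean of that category's subtask scores, rounded to one decimal place. The source does not describe the aggregation, so treat the following as a sanity check under that assumption rather than as the benchmark's official method; `category_average` is a hypothetical helper name.

```python
# Subtask scores for DeepSeek-Reasoner, copied from the per-category tables.
subtask_scores = {
    "Math": {"AMPS_Hard": 83.0, "math_comp": 91.667, "olympiad": 64.923},
    "Coding": {"LCB_generation": 83.333, "coding_completion": 48.0},
    "Reasoning": {"spatial": 78.0, "web_of_lies_v2": 98.0, "zebra_puzzle": 80.0},
    "Language": {"connections": 74.167, "plot_unscrambling": 43.046, "typos": 44.0},
    "Data Analysis": {"cta": 64.0, "tablejoin": 61.46, "tablereformat": 90.0},
    "Instruction Following": {
        "paraphrase": 88.75,
        "simplify": 85.467,
        "story_generation": 88.0,
        "summarize": 85.6,
    },
}

# Category scores as reported in the section headings.
reported = {
    "Math": 79.9,
    "Coding": 65.7,
    "Reasoning": 85.3,
    "Language": 53.7,
    "Data Analysis": 71.8,
    "Instruction Following": 87.0,
}


def category_average(tasks):
    """Plain mean over a category's subtasks (assumed: equal weights)."""
    return sum(tasks.values()) / len(tasks)


for name, tasks in subtask_scores.items():
    computed = category_average(tasks)
    print(f"{name}: computed {computed:.1f}, reported {reported[name]}")
```

The breakdown also shows where each category average comes from. In Coding, for example, a high LCB_generation score (83.333) is offset by a much lower coding_completion score (48.0), and in Language the connections score (74.167) sits well above plot_unscrambling and typos.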