What Are the Key Findings of Stanford's 2025 AI Index Report?


Highlights

To really understand AI progress, following the news is far from enough. You need solid data and a big-picture view.

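Stanford's 2025 AI Index Report is one such reliable source. Rather than telling stories, it tracks data across global AI research, technical performance, economic impact, policy and regulation, public perception, and other dimensions, so you can see the real picture of AI development. For example, it shows that U.S. private AI investment has reached $109.1 billion, far ahead of any other country, while China is quickly catching up on model performance; that 78% of organizations have adopted AI, yet standardized responsible AI (RAI) evaluations remain rare; and that global optimism about AI is rising (83% in China), though regional differences remain large (only 39% in the United States).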

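At its core, the report provides unbiased, rigorously vetted data. It documents sharp gains on benchmarks such as MMMU (up 18.8 percentage points in one year), accelerating adoption in healthcare (223 AI-enabled medical devices approved by the FDA in 2023) and transportation (Waymo providing more than 150,000 autonomous rides per week), and steep cost declines (inference cost for GPT-3.5-level performance down more than 280-fold). It also points out challenges, such as complex reasoning that is still weak and the consequences of industry dominance (nearly 90% of notable models came from industry). The report helps you cut through the noise and see how far AI has actually come.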

AI’s influence on society has never been more pronounced.

At Stanford HAI, we believe AI is poised to be the most transformative technology of the 21st century. But its benefits won’t be evenly distributed unless we guide its development thoughtfully. The AI Index offers one of the most comprehensive, data-driven views of artificial intelligence. Recognized as a trusted resource by global media, governments, and leading companies, the AI Index equips policymakers, business leaders, and the public with rigorous, objective insights into AI’s technical progress, economic influence, and societal impact.

Top Takeaways

1. AI performance on demanding benchmarks continues to improve.

AI benchmarks performance chart

In 2023, researchers introduced new benchmarks—MMMU, GPQA, and SWE-bench—to test the limits of advanced AI systems. Just a year later, performance sharply increased: scores rose by 18.8, 48.9, and 67.3 percentage points on MMMU, GPQA, and SWE-bench, respectively. Beyond benchmarks, AI systems made major strides in generating high-quality video, and in some settings, language model agents even outperformed humans in programming tasks with limited time budgets.

2. AI is increasingly embedded in everyday life.

AI in everyday life

From healthcare to transportation, AI is rapidly moving from the lab to daily life. In 2023, the FDA approved 223 AI-enabled medical devices, up from just six in 2015. On the roads, self-driving cars are no longer experimental: Waymo, one of the largest U.S. operators, provides over 150,000 autonomous rides each week, while Baidu’s affordable Apollo Go robotaxi fleet now serves numerous cities across China.

3. Business is all in on AI, fueling record investment and usage, as research continues to show strong productivity impacts.

AI business investment chart

In 2024, U.S. private AI investment grew to $109.1 billion—nearly 12 times China’s $9.3 billion and 24 times the U.K.’s $4.5 billion. Generative AI saw particularly strong momentum, attracting $33.9 billion globally in private investment—an 18.7% increase from 2023. AI business usage is also accelerating: 78% of organizations reported using AI in 2024, up from 55% the year before. Meanwhile, a growing body of research confirms that AI boosts productivity and, in most cases, helps narrow skill gaps across the workforce.
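The multiples quoted above can be checked with simple arithmetic. The sketch below uses only the figures from the paragraph; the constant and function names are hypothetical, and the 2023 generative AI figure it backs out is an implied estimate derived from the stated 18.7% growth, not a number taken from the report.

```python
# Figures quoted in the paragraph above, in billions of US dollars (2024).
US_INVESTMENT = 109.1
CHINA_INVESTMENT = 9.3
UK_INVESTMENT = 4.5
GENAI_2024 = 33.9
GENAI_GROWTH = 0.187  # 18.7% increase from 2023


def ratio(a: float, b: float) -> float:
    """Return how many times larger a is than b."""
    return a / b


def implied_previous(current: float, growth: float) -> float:
    """Back out the prior-year value from the current value and growth rate."""
    return current / (1.0 + growth)


print(f"US vs China: {ratio(US_INVESTMENT, CHINA_INVESTMENT):.1f}x")
print(f"US vs UK:    {ratio(US_INVESTMENT, UK_INVESTMENT):.1f}x")
print(f"Implied 2023 generative AI investment: "
      f"${implied_previous(GENAI_2024, GENAI_GROWTH):.1f}B")
```

The ratios come out to roughly 11.7 and 24.2, consistent with "nearly 12 times" and "24 times", and the implied 2023 generative AI total is about $28.6 billion.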

4. The U.S. still leads in producing top AI models—but China is closing the performance gap.

US vs China AI models comparison

In 2024, U.S.-based institutions produced 40 notable AI models, significantly outpacing China’s 15 and Europe’s three. While the U.S. maintains its lead in quantity, Chinese models have rapidly closed the quality gap: performance differences on major benchmarks such as MMLU and HumanEval shrank from double digits in 2023 to near parity in 2024. Meanwhile, China continues to lead in AI publications and patents. At the same time, model development is increasingly global, with notable launches from regions such as the Middle East, Latin America, and Southeast Asia.

5. The responsible AI ecosystem evolves—unevenly.

Responsible AI ecosystem

AI-related incidents are rising sharply, yet standardized RAI evaluations remain rare among major industrial model developers. However, new benchmarks like HELM Safety, AIR-Bench, and FACTS offer promising tools for assessing factuality and safety. Among companies, a gap persists between recognizing RAI risks and taking meaningful action. In contrast, governments are showing increased urgency: In 2024, global cooperation on AI governance intensified, with organizations including the OECD, EU, U.N., and African Union releasing frameworks focused on transparency, trustworthiness, and other core responsible AI principles.

6. Global AI optimism is rising—but deep regional divides remain.

Global AI optimism chart

In countries like China (83%), Indonesia (80%), and Thailand (77%), strong majorities see AI products and services as more beneficial than harmful. In contrast, optimism remains far lower in places like Canada (40%), the United States (39%), and the Netherlands (36%). Still, sentiment is shifting: since 2022, optimism has grown significantly in several previously skeptical countries—including Germany (+10%), France (+10%), Canada (+8%), Great Britain (+8%), and the United States (+4%).

7. AI becomes more efficient, affordable and accessible.

AI efficiency improvements

Driven by increasingly capable small models, the inference cost for a system performing at the level of GPT-3.5 dropped over 280-fold between November 2022 and October 2024. At the hardware level, costs have declined by 30% annually, while energy efficiency has improved by 40% each year. Open-weight models are also closing the gap with closed models, reducing the performance difference from 8% to just 1.7% on some benchmarks in a single year. Together, these trends are rapidly lowering the barriers to advanced AI.
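To get a feel for these rates, the sketch below converts the 280-fold drop into an average monthly factor and compounds the annual hardware figures. It assumes the period from November 2022 to October 2024 counts as 23 months and that the decline was smooth, which the report does not claim; the function names are hypothetical.

```python
def average_monthly_factor(total_fold: float, months: int) -> float:
    """Constant per-month factor that compounds to total_fold over months."""
    return total_fold ** (1.0 / months)


def remaining_cost_fraction(annual_decline: float, years: int) -> float:
    """Fraction of the original cost left after compounding an annual decline."""
    return (1.0 - annual_decline) ** years


def efficiency_multiple(annual_gain: float, years: int) -> float:
    """Overall efficiency multiple after compounding an annual improvement."""
    return (1.0 + annual_gain) ** years


# Assumption: November 2022 to October 2024 is treated as 23 months.
factor = average_monthly_factor(280.0, 23)
print(f"Average monthly cost reduction factor: {factor:.3f}")
print(f"Average monthly price decline: {1.0 - 1.0 / factor:.1%}")

# Hardware trends compounded over three years (illustrative horizon).
print(f"Hardware cost after 3 years: {remaining_cost_fraction(0.30, 3):.3f}")
print(f"Energy efficiency after 3 years: {efficiency_multiple(0.40, 3):.3f}x")
```

Under these assumptions, a 280-fold drop corresponds to prices falling by a bit more than a fifth each month on average, while three years of the stated hardware trends would leave costs at about a third of their starting level.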

8. Governments are stepping up on AI—with regulation and investment.

Government AI regulation and investment

In 2024, U.S. federal agencies introduced 59 AI-related regulations—more than double the number in 2023—and issued by twice as many agencies. Globally, legislative mentions of AI rose 21.3% across 75 countries since 2023, marking a ninefold increase since 2016. Alongside growing attention, governments are investing at scale: Canada pledged $2.4 billion, China launched a $47.5 billion semiconductor fund, France committed €109 billion, India pledged $1.25 billion, and Saudi Arabia’s Project Transcendence represents a $100 billion initiative.

9. AI and computer science education is expanding—but gaps in access and readiness persist.

AI education expansion

Two-thirds of countries now offer or plan to offer K–12 CS education—twice as many as in 2019—with Africa and Latin America making the most progress. In the U.S., the number of graduates with bachelor’s degrees in computing has increased 22% over the last 10 years. Yet access remains limited in many African countries due to basic infrastructure gaps like electricity. In the U.S., 81% of K–12 CS teachers say AI should be part of foundational CS education, but less than half feel equipped to teach it.

10. Industry is racing ahead in AI—but the frontier is tightening.

Industry AI development

Nearly 90% of notable AI models in 2024 came from industry, up from 60% in 2023, while academia remains the top source of highly cited research. Model scale continues to grow rapidly—training compute doubles every five months, datasets every eight, and power use annually. Yet performance gaps are shrinking: the score difference between the top and 10th-ranked models fell from 11.9% to 5.4% in a year, and the top two are now separated by just 0.7%. The frontier is increasingly competitive—and increasingly crowded.
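Doubling times are easier to compare once converted into growth per year. The sketch below does that conversion for the three doubling periods quoted above, assuming steady exponential growth; the function and dictionary names are hypothetical.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Yearly multiple implied by a quantity that doubles every doubling_months."""
    return 2.0 ** (12.0 / doubling_months)


# Doubling periods in months, as quoted in the paragraph above.
doubling_periods = {
    "training compute": 5,
    "dataset size": 8,
    "power use": 12,
}

for name, months in doubling_periods.items():
    print(f"{name}: about {annual_growth_factor(months):.2f}x per year")
```

On this reading, training compute grows a little over fivefold per year, datasets a little under threefold, and power use twofold.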

11. AI earns top honors for its impact on science.

AI scientific awards

AI’s growing importance is reflected in major scientific awards: two Nobel Prizes recognized work that led to deep learning (physics), and to its application to protein folding (chemistry), while the Turing Award honored groundbreaking contributions to reinforcement learning.

12. Complex reasoning remains a challenge.

AI complex reasoning challenges

AI models excel at tasks like International Mathematical Olympiad problems but still struggle with complex reasoning benchmarks like PlanBench. They often fail to reliably solve logic tasks even when provably correct solutions exist, limiting their effectiveness in high-stakes settings where precision is critical.

