
Google has released a key document detailing some information about how its latest AI model, Gemini 2.5 Pro, was built and tested, three weeks after it first made that model publicly available as a “preview” version.
AI governance experts had criticized the company for releasing the model without publishing documentation detailing safety evaluations it had carried out and any risks the model might present, in apparent violation of promises it had made to the U.S. government and at multiple international AI safety gatherings.
A Google spokesperson said in an emailed statement that any suggestion that the company had reneged on its commitments was “inaccurate.”
The company also said that a more detailed “technical report” would follow once it makes a final version of the Gemini 2.5 Pro “model family” fully available to the public.
But the newly published six-page model card has also been faulted by at least one AI governance expert for providing “meager” information about the safety evaluations of the model.
Kevin Bankston, a senior advisor on AI governance at the Center for Democracy and Technology, a Washington, D.C.-based think tank, said in a lengthy thread on the social media platform X that the late release of the model card and its lack of detail were worrisome.
“This meager documentation for Google’s top AI model tells a troubling story of a race to the bottom on AI safety and transparency as companies rush their models to market,” he said.
He said the late release of the model card and its lack of key safety evaluation results—for instance, details of “red-teaming” tests designed to trick the AI model into serving up dangerous outputs like bioweapon instructions—suggested that Google “hadn’t finished its safety testing before releasing its most powerful model” and that “it still hasn’t completed that testing even now.”
Bankston said another possibility is that Google had finished its safety testing but has a new policy that it will not release its evaluation results until the model is released to all Google users. Currently, Google is calling Gemini 2.5 Pro a “preview,” which can be accessed through its Google AI Studio and Google Labs products, with some limitations on what users can do with it. Google has also said it is making the model widely available to U.S. college students.
The Google spokesperson said the company would release a more complete AI safety report “once per model family.” Bankston said on X that this might mean Google would no longer release separate evaluation results for the fine-tuned versions of its models, such as those tailored for coding or cybersecurity. This could be dangerous, he noted, because fine-tuned versions of AI models can exhibit behaviors that are markedly different from those of the “base model” from which they’ve been adapted.
Google is not the only AI company seemingly retreating on AI safety. Meta’s model card for its newly released Llama 4 AI model is similar in length and detail to the one Google just published for Gemini 2.5 Pro, and it was likewise criticized by AI safety experts. OpenAI said it would not release a technical safety report for its newly released GPT-4.1 model on the grounds that it is “not a frontier model,” since the company’s “chain of thought” reasoning models, such as o3 and o4-mini, beat it on many benchmarks. At the same time, OpenAI touted GPT-4.1 as more capable than its GPT-4o model, whose safety evaluation had shown that it could pose certain risks, although OpenAI said those risks fell below the threshold at which the model would be considered unsafe to release. Whether GPT-4.1 might now exceed that threshold is unknown, since OpenAI says it does not plan to publish a technical report.
OpenAI did publish a technical safety report for its new o3 and o4-mini models, which were released on Wednesday. But earlier this week, the company also updated its “Preparedness Framework,” which describes how it will evaluate its AI models for critical dangers—everything from helping someone build a biological weapon to the possibility that a model will begin to self-improve and escape human control—and how it will seek to mitigate those risks. The update eliminated “Persuasion”—a model’s ability to manipulate a person into taking a harmful action or to believe misinformation—as a risk category the company assesses during its pre-release evaluations. It also changed how the company will make decisions about releasing higher-risk models, including saying it would consider shipping an AI model that posed a “critical risk” if a competitor had already debuted a similar model.
Those changes divided opinion among AI governance experts, with some praising OpenAI for being transparent about its process and for providing clearer release policies, while others were alarmed by the revisions.