site:the-decoder.com - Search News

Frontier models fail hard at "Humanity's Last Exam" but experts question if it matters

An international research team has developed a new benchmark that reveals the current limitations of LLMs. Even the most advanced models fail at 90 percent of the tasks - for now. The test, called ...

the-decoder · 3 days ago

OpenAI emerges as sole customer for proposed Stargate AI infrastructure project

Inside sources told the Financial Times that the Stargate AI infrastructure project will provide computing power exclusively to OpenAI. The project, announced earlier this week by OpenAI, SoftBank, ...

the-decoder · 4 days ago

"Nerd sniping" and "think less" attacks emerge as AI models get more time to reason

A new study by OpenAI shows that AI models become more robust against manipulation attempts if they are given more time to "think". The researchers also discovered new methods of attack. A recent ...

the-decoder · 4 days ago

OpenAI's Operator and Computer-Using Agent bring autonomous AI agents closer to reality

OpenAI has just launched Operator, an AI assistant that can navigate the web on its own. The tool, currently only available to US ChatGPT Pro subscribers, represents a step toward AI assistants that ...

the-decoder · 4 days ago

Perplexity announces new assistant for Android smartphones

Perplexity is stepping into Google's territory with a new AI assistant for Android that can control apps and handle tasks on its own. The move puts the startup in direct competition with Google's ...

the-decoder · 4 days ago

AI image generation gets a boost by borrowing ideas from reasoning models

A team of researchers from NYU, MIT, and Google has found a way to improve AI-generated images by borrowing ideas from recent AI reasoning models like OpenAI's o1. Their approach enhances image ...

the-decoder · 5 days ago

The Stargate Project: $500 billion for US AI infrastructure

A new company called "The Stargate Project" is bringing together some of tech's biggest names to build what could become the largest AI infrastructure network in history. The joint venture between ...

the-decoder · 5 days ago

OpenAI reportedly launching ChatGPT's first browser agent "Operator" this week

According to a report from The Information, OpenAI plans to launch "Operator" as a new ChatGPT feature for browser control later this week. The feature will offer several task categories, including ...

the-decoder · 5 days ago

Gemini 2.0 Flash Thinking: Google's smallest model takes lead in Chatbot Arena

Google's experimental AI model Gemini 2.0 Flash Thinking has jumped ahead of its competitors, scoring impressive results in math, science, and general performance tests. According to testing platform ...

the-decoder · 5 days ago

Google's Gemini AI inches closer to becoming a virtual agent with multi-app integration

Google is rolling out several updates to its Gemini AI assistant for Android, focusing on how it handles multimedia, works with other apps, and becomes more accessible. The biggest addition is Gemini ...

the-decoder · 6 days ago

Trump's reversal of Biden's AI safety rules puts US on fast track in AI race

Donald Trump has eliminated his predecessor's AI safety regulations, creating a regulatory gap for artificial intelligence development in the United States. In one of his first moves as president, ...

the-decoder · 6 days ago

Moonshot AI unveils Kimi k1.5, China's next o1 competitor

Following DeepSeek-R1's release, another reasoning model has emerged from China. Moonshot AI's new multimodal Kimi k1.5 is showing impressive results against established AI models in complex reasoning ...
