OpenAI o3 vs. Google Gemini 2.0: Which Model is Closer to AGI?
The year 2025 is shaping up to be a pivotal moment in AI innovation, driven by a race among technology giants to create artificial general intelligence (AGI), or AI that can reach human-level intelligence. Recently, OpenAI and Google (GOOGL) introduced their new AI models: o3 and Gemini 2.0. OpenAI’s o3, announced on December 20, is a reasoning model that CEO Sam Altman says could achieve AGI once it clears safety tests, while Google CEO Sundar Pichai describes Gemini 2.0 as the company’s “most thought-out model to date.” Both models demonstrate key capabilities associated with AGI, though their approaches differ. While OpenAI’s new model focuses on cognitive capabilities, Google is positioning Gemini 2.0 as a “highly integrated AI tool” designed for efficiency and real-time problem solving.
OpenAI’s o3 focuses on high-level reasoning, using “private thought processes” to solve problems. This approach allows it to excel at physics, mathematics and science-related reasoning. It has shown excellent results on the ARC-AGI test, a benchmark that measures an AI model’s ability to learn new skills outside its training data. The o3 model scored 87.5 percent and 75.7 percent on the high-compute and low-compute settings, respectively, roughly triple the performance of its predecessor, o1. (OpenAI reportedly avoided naming the model “o2” due to a trademark conflict with British telecommunications company O2.)
That success is expensive, however. It currently costs OpenAI about $20 per task in low-compute mode and thousands of dollars per task in high-compute mode. “These capabilities are new territory, and they demand serious scientific attention,” said François Chollet, creator of the ARC-AGI benchmark. It will be interesting to see how OpenAI prices o3 subscriptions, especially since Altman has said the company is losing money on its Pro subscription because of high usage costs.
Gemini 2.0’s strengths lie in its multimodal capabilities, such as audio processing. A prominent feature is “Thinking Mode,” which improves reasoning and provides step-by-step explanations. Gemini 2.0 can also generate integrated outputs, such as blog posts combining text, AI-generated visuals and multilingual text-to-speech audio, from a single prompt. Users can also fine-tune the tone and style of the audio.
Experts remain divided on whether these developments represent real progress toward AGI. “We’ve certainly made progress toward AGI, but I think there’s still a long way to go, and one of the complicating things is the marketing,” Thomas Malone, director of the MIT Center for Collective Intelligence, told the Observer. “Benchmarks are a new way to measure AI capabilities, but they don’t capture all of human intelligence.”
Chollet has expressed concern that OpenAI’s o3 may not yet have the kind of “general” intelligence that AGI requires. “I don’t think o3 is AGI yet,” he wrote in a blog post. He noted that the upcoming ARC-AGI-2 benchmark may still pose a major challenge to o3, potentially reducing its score significantly even under high-compute conditions.
“The biggest technical hurdle in AI’s progression to AGI is long-term memory, which would allow a model to retain the full context of every action it takes. Latency and cost are also challenges, but those will likely improve quickly—this is just the first generation,” Will Bryk, CEO of Exa, a company that builds web search infrastructure for AI chatbots, told the Observer. “The best definition of AGI is that it can automate a large part of the knowledge economy. We’re not there yet, but we’re getting closer.”