The big picture
Unless you're a specialist in AI or a related field, the results of benchmark tests may appear humdrum, offering little insight beyond identifying which models excel at specific tasks.
That’s perfectly fine, because benchmarks are just benchmarks. Ultimately, the true test of Gemini's capability (and that of any other AI model, for that matter) should come from everyday users like you, who may use it to brainstorm ideas, search for information, write code, and more.
In line with this perspective, Gemini is pretty cool, but it's not revolutionary, at least not yet.
Gemini holds an advantage over GPT-4 because Google can draw on information from its widely used search engine in addition to what it gathers from the wider internet, whereas OpenAI primarily works with the latter. Moreover, if SemiAnalysis's claims prove accurate, Gemini can tap into significantly more computing power than GPT-4, courtesy of Google's ready access to top-tier chips. Viewed against this backdrop, Gemini's superior results in the MMLU/MMMU benchmarks become even less surprising.
There are bigger, more pertinent questions that still need answering. For example, is Gemini less prone to hallucinating, or better yet immune to it, when most large language models (LLMs) currently make things up?
While Google is eager to impress the public with Gemini, even the company is quick to temper expectations. Demis Hassabis, co-founder and CEO of Google DeepMind, told Wired that “to deliver AI systems that can understand the world in ways that today’s chatbots can’t, LLMs will need to be combined with other AI techniques.”
Hassabis is probably correct. What he was alluding to is the idea of artificial general intelligence, or AGI for short, which the major players are clearly aware of and actively working toward. Rumors are even swirling that OpenAI has made a breakthrough in this regard with a project referred to as Q* (Q-star).
As more AI companies enter the fray and introduce newer LLMs and related technology, achieving AGI seems inevitable, albeit likely on a timeline too lengthy for the liking of enthusiasts and innovators.
Google, initially complacent, entered the AI race after OpenAI introduced ChatGPT to much acclaim. While its response, first with Bard and now Gemini, may not be groundbreaking, it exemplifies how latecomers can narrow gaps through concerted effort. If anything, this is the biggest takeaway for other AI enterprises seeking to catch up with the heavyweights.