Is this mystery chatbot really GPT-4.5 in disguise? Here's how to see for yourself

People are using the 'im-a-good-gpt2-chatbot' for some impressive tasks - and you can, too. Here's how.

Written by Sabrina Ortiz, Editor
May 8, 2024 at 11:39 a.m. PT

Since launching ChatGPT, OpenAI has continued to work on new AI projects that build on the success and popularity of its AI chatbot. Now, the appearance of a new mystery large language model (LLM) gives the public a sneak peek at its latest project -- and it's impressive.

Last week, "gpt2-chatbot" appeared on the Chatbot Arena, a benchmarking platform for comparing the performance of LLMs. The LLM caused quite the stir by outperforming many of the most popular LLMs on the market, such as Gemini, Claude, and even GPT-4. To the disappointment of many, however, Chatbot Arena quickly removed "gpt2-chatbot."

As of last night, however, if you visit the Chatbot Arena, you can encounter what seem to be two variants of the original chatbot: "im-a-good-gpt2-chatbot" and "im-also-a-good-gpt2-chatbot."

Although both models have "GPT" in their names, which usually denotes OpenAI's family of Generative Pre-trained Transformer (GPT) LLMs, the company has not officially acknowledged that it's behind them. OpenAI CEO Sam Altman did post cryptically on X, stating only the name of one of the LLMs, "im-a-good-gpt2-chatbot," as seen below.

im-a-good-gpt2-chatbot
— Sam Altman (@sama) May 5, 2024

Even though the models are available in Chatbot Arena, accessing them is tricky. The two models are not in Chatbot Arena's list of supported LLMs and thus you can't test them in the side-by-side comparison feature.

Instead, if you want to access them, you must keep initiating an Arena (battle) comparison -- which randomly selects two LLMs to compete against each other -- until one of the two new models comes up. It took me five rounds before one of the two finally appeared. If you're determined to test these models for yourself, the extra effort is worth it.
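For a rough sense of how many battle rounds to expect, the retry process can be treated as repeated random draws. The sketch below is a back-of-the-envelope estimate, not a description of how Chatbot Arena actually picks models: it assumes every pair of models is equally likely to be drawn, and the pool size of 40 is a hypothetical number chosen purely for illustration. The function names are made up for this example.

```python
from math import comb


def chance_per_round(pool_size: int, targets: int = 2) -> float:
    """Probability that at least one target model shows up in one battle.

    Assumption (not confirmed by Chatbot Arena): each battle draws two
    distinct models uniformly at random from a pool of `pool_size` models.
    """
    if pool_size < 2 or not 1 <= targets <= pool_size:
        raise ValueError("need pool_size >= 2 and 1 <= targets <= pool_size")
    # Count the pairs that contain no target model, then take the complement.
    pairs_without_target = comb(pool_size - targets, 2)
    return 1 - pairs_without_target / comb(pool_size, 2)


def expected_rounds(pool_size: int, targets: int = 2) -> float:
    """Average number of battles until a target appears (geometric mean 1/p)."""
    return 1 / chance_per_round(pool_size, targets)


if __name__ == "__main__":
    hypothetical_pool = 40  # illustrative guess, not a real Arena figure
    p = chance_per_round(hypothetical_pool)
    print(f"Chance per round: {p:.1%}")
    print(f"Expected rounds:  {expected_rounds(hypothetical_pool):.1f}")
```

Under those assumptions, a 40-model pool works out to roughly a 10% chance per round, or about 10 rounds on average, so a five-round wait would be on the lucky side. If the Arena favors newer models when pairing, the real wait could be shorter.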

Once you have either "im-a-good-gpt2-chatbot" or "im-also-a-good-gpt2-chatbot" open, you can keep chatting with the model and asking it questions until you decide to start a new round or hit refresh.

Users have put the new anonymous models through their paces with impressive results: creating a Flappy Bird clone with one prompt, building a code interpreter that uses Claude Opus, and even reasoning through basic physics questions.

Whoa the new gpt2-chatbot just created Flappy Bird clone in one-shot 🤯
And it was a dead simple prompt. 🧵👇 pic.twitter.com/rxwv6sJ5cw
— Min Choi (@minchoi) May 7, 2024

These capabilities have led people to speculate that the model is OpenAI's GPT-4.5 or GPT-5, released under a pen name so that OpenAI can benchmark its performance accurately. When one user asked "im-a-good-gpt2-chatbot" which exact LLM version it was, the model said, "I am based on the GPT-4 architecture, specifically the GPT-4.5 variant."

There's no way of knowing whether this is the result of a hallucination; until OpenAI confirms anything, it is best to err on the side of caution when using this LLM. If you are even the slightest bit curious, however, I encourage you to give it a try. It's free.
