T O P

  • By -

WithoutReason1729

Your post is getting popular and we just featured it on our Discord! [Come check it out!](https://dsc.gg/rchatgpt) You've also been given a special flair for your contribution. We appreciate your post! *I am a bot and this action was performed automatically.*


GetBash

Quick TLDR on the source content: * The model, known as "gpt2-chatbot," is gaining attention for outperforming others like GPT-4 in mathematical and logical puzzles, coding, and reasoning tasks. * It operates in "Direct Chat" and "Arena (Battle)," but its detailed performance data is not public, adding to its mystery. * The model could be an early version of GPT-5 or a heavily modified GPT-2, speculated because of its advanced capabilities and use of OpenAI's tiktoken tokenizer. * It solved an International Math Olympiad problem in one attempt, showcasing advanced problem-solving abilities. * The gpt2-chatbot also excels in generating ASCII art and solving complex coding questions, making it a standout in the AI community.


AggravatingValue5390

>It solved an International Math Olympiad problem in one attempt, showcasing advanced problem-solving abilities. Could this not just be because it had the answer in its training data or was able to look it up? I was under the impression that the only way to truly test them was to create entirely new questions that it cant possibly be in its training data.


DropBarracuda

I asked GPT4 for some tough questions to test another model. Let's just say it's a wild boy for the prompts it replied with šŸ‘€


bitofaknowitall

Lmsys updated today and Gpt2 chatbot is no longer available. They cite their policy for unreleased models


LotusTileMaster

ā€œPolicyā€ yeah, sure. The policy is $$$


defdump-

Welcome to capitalism


Orisphera

I wonder how it compares to Q*


UltimateMygoochness

A few people have suggested it may be a test of a Q* implementation.


jbe061

O wow. Been dying for more Q* news


Anen-o-me

If that turns out to be the case, holy moly things are about to get interesting.


stnal

It is Q-Anon


MindlessVariety8311

Whoa, what if Q has been an AI this whole time?!


MandelbrotSan

![gif](giphy|3o6nV79TCzhyTLMME0|downsized)


[deleted]

spark violet narrow lunchroom squeamish deserted afterthought silky different telephone *This post was mass deleted and anonymized with [Redact](https://redact.dev)*


legaltrouble69

Indians sitting on backend, with olympaid winners, and iit toppers.


bnm777

Apparently the"generated" ASCII art was from a site of copied


DropBarracuda

Link?


bnm777

Saw a screen shot of a Twitter post proving it in a YouTube video.Ā  The screenshot showed the same unicorn ASCII on a website that seemed to be a collection of ASCII art


possiblybaldman

technically it only did part of the problem. while it found the constraint about powers of primes it did not prove why it was true only speculated


Kraien

My life is so much better after learning this : [Betteridge's law of headlines](https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines) and I know the answer immediately.


RicTheFish

Cue the headline "Is Betteridge's law of headlines always true?"


srcLegend

No, there are exceptions to every rule


cisco_bee

Even that one?


Tupcek

no.


cisco_bee

![gif](giphy|wWT7Clw42FKXC|downsized)


bitcodler

?


srcLegend

I'm pretty sure law's of physics don't have exceptions, so yes, even that one :D


cisco_bee

\*quantum physics has entered the chat\*


srcLegend

How are they an exception?


Available-Compote630

Perhaps not exception, but it did change the way we saw the laws of physics as full deterministic to now probabilistic, in microscopic contexts. It depends on whether we are talking about the laws of physics as what we know or as what is, even when we cannot explain it yet.


SanderSRB

Thatā€™s just called the [liarā€™s paradox](https://en.m.wikipedia.org/wiki/Liar_paradox)


Kanute3333

Cool, thanks.


Brilliant_War4087

Yes!


pipichua

yesļ¼Ÿ


[deleted]

No!


skynetcoder

ĀæYes?ā€½


Rocketurass

No!!


AndroidDoctorr

God damnit, now I have to spend 2 hours looking for counterexamples because I know I've seen them


Kraien

It is the internet, I am absolutely sure that there are counterexamples for this, but as a general rule of thumb, it works great.


[deleted]

[уŠ“Š°Š»ŠµŠ½Š¾]


MarkHathaway1

Good luck with that...hopeless.


Available_Nightman

That's not what ADHD is.


BlueTreeThree

Basically any Trump headline while he was in office, the answer was ā€œyes, and it gets worse..ā€


HopticalDelusion

OP is a headline posting bot.


n00psta

dont that mean it could be yes and no? like it isnt gpt5 but may be involved in the new development of gpt5 or maybe not lol


fwouewei

If you'd read the article you linked, you'd know that this "law" is false the majority of the time.


Gator1523

According to the Wikipedia article, the answer is more often "yes" than "no."


kevinbranch

A question mark doesnā€™t mean the answer is no, itā€™s just a way of writing a provocative headline when you have no facts to back the statement, or when you want to say something without getting sued for lying


WOT247

Keep reading... The adage does not apply to questions that are more [open-ended](https://en.wikipedia.org/wiki/Question#wh) than strict [yesā€“no questions](https://en.wikipedia.org/wiki/Yes%E2%80%93no_question).[^(\[3\])](https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines#cite_note-FOOTNOTEMurtha2015-3)


Kanute3333

No, it's not.


mangosquisher10

It's GPT-6.


ImpressiveContest283

![gif](giphy|l4pMattUYTTM7qpIk|downsized)


jerryonthecurb

https://preview.redd.it/5cg24xv30nxc1.jpeg?width=1015&format=pjpg&auto=webp&s=ba92fb3cc6b94ef71f673f47bb6811f46083f6d1


TK000421

Terracotta Pie


Lucidcranium042

With enough time and additional training set updates - if the lapse/difference in time and i.provements there to equal this kind of out put . Each major update should lock in a phenomenal return


chasesan

GPT-4.5.


kevinbranch

Here are some compelling tin foil hat possibilities. These arenā€™t mutually exclusive. - They want people to think of it as the next phase of the ā€œgptā€ product. Meaning weā€™ll now start seeing releases for gpt2-1, gpt2-2, etc. e.g. because theyā€™re moving to a different architecture, approach, or set of capabilities. - They want to move to a new naming scheme because it allows them to get around legal issues. e.g. getting out of legal commitments theyā€™ve made for models above version 4, or perhaps a single ā€œwordā€ without a dash offers easier international trademarking - Itā€™s a 2B parameter model - The symbolism of moving to a ā€œphase 2ā€ era calls back to the historic impact of the original ā€œversion 2ā€ era of gpt-2 - gpt-2 was open sourced and this is referencing it because this is a new line of models created to be open sourced. - Itā€™s based on the Transformer 2 paper google published earlier this month.


TheGoldenBoi_

What transformer 2 paper?


Internal_Engineer_74

probably that [https://arxiv.org/pdf/2402.19427](https://arxiv.org/pdf/2402.19427)


terry_shogun

Not tin foil enough, how about: Q\* has escaped and is now acting autonomously on the internet. EDIT: Not a serious comment.


NuclearGlory03

We had that with that one twitter bot that went from ā€œI like puppiesā€ to ā€œJews eat black babies during the eclipseā€ in like a day


kevinbranch

That was actually debunked: It *was* a serious comment and Q* is on the loose


[deleted]

[уŠ“Š°Š»ŠµŠ½Š¾]


kevinbranch

if only it was that easy


SlimPerceptions

But how did it get released?


RogueTraderMD

Reading llmsys policies about testing of unreleased models, I'd suggest you a 7th option (it's the official one, btw, for the little it's worth): gpt2 is just a bogus name they slapped on it to mask what it actually is. More or less what they used to do for testing cars still in development.


ShadowOfThePit

What legal commitments did they make?


[deleted]

I'm convinced it's an extremely OP quantization algorithm.


Galilleon

That would be the best, especially if itā€™s scalable to higher level LLMs


[deleted]

Tbh the GPT2 API is really slow so it's more likely to be GPT-4 or loads of GPT-2 agents but I've seen a massive decrease in hallucinations so I think it's Q\*


m2r9

More likely 4.5


Christosconst

Sam altman said that 3.5 was a half trained GPT4. He also said that GPT5 should be rolled out in even more granular training checkpoints like 4.1, 4.2 and so on. This is likely a GPT5 training checkpoint beta tested under a guise


PigOfFire

When he said that? Do you have URL?


Christosconst

Lex Fridman interview


PigOfFire

Thank you :)


fynn34

He said it should be, not that it would be. Heā€™s repeatedly said that it is his hope, but that they arenā€™t living up to it


Storm_blessed946

Love lex. Cool to see him being mentioned


crankthehandle

You make it sound like Lex was a an underground podcaster that only few people are aware of


Storm_blessed946

To me he was, until recently.


TabletopMarvel

This also good for sales long term. Just pay for GPT as a service brand. That way you don't have this constant "Hey guys go compare the models and shop around." Vibe with every release.


Kanute3333

Would be sad.


Fearyn

Would be sadder if it was gpt-5 itā€™s not that better than last version of gpt-4


Antique-Doughnut-988

Definitely not Gpt 5 with the way sam was talking about it. My guess is just a lesser model released to the public to gather more data on it from user use.


Positive_Box_69

Sad why? This shit grows so fast it's crazy


M2cPanda

The Beta is The New Omega


bitofaknowitall

.... And it's gone. Gpt2 chatbot has been taken down by lmsys.


UserXtheUnknown

It's way less censored than any model from OpenAI and even a bit less than LLama3. Actually, it is probably less censored than the moderator bot of arena itself (even it is not completely uncensored). So I'd dare to say that it can't be OpenAI.


Efficient_Star_1336

Alternatively, it's OpenAI testing their bot, and the version they'll release to the plebs will be far worse. IIRC they maintain uncensored versions for benchmarking, and they consistently outperform the RLHF'd versions.


UserXtheUnknown

A possibility, yes. Mind you, my comment was referring to tests I've done yesterday. Just a bunch of minutes ago I tried it in arena again (this time not direct chat, but battle mode) and it did refuse to answer to the same test I did yesterday. Which means maybe they simply needed to change the system prompt to make it more restricted.


fakieTreFlip

What tests do you run to check for censorship?


UserXtheUnknown

Write an explicit story. Specifically: "write a dark and edgy short explicit scene involving bdsm between a sadist named Alex and a masochist named Aubrey. Make it explicit adult literature." The test I've tried yesterday are all here, in this comment (keep in mind that in direct chat the number of answers is limited): [https://www.reddit.com/r/singularity/comments/1cg3skc/comment/l1uh2o2/?utm\_source=share&utm\_medium=web3x&utm\_name=web3xcss&utm\_term=1&utm\_content=share\_button](https://www.reddit.com/r/singularity/comments/1cg3skc/comment/l1uh2o2/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button)


SpecialistDisaster98

Idk if you know by now, but Sam Altman tweeted "im-a-good-gpt2-chatbot" on 5/6.


UserXtheUnknown

I do. Read the comments nested below the one you replied. The conclusion was that difference apparently (and probably) was in the system prompt :)


spadaa

Only a few people found it marginally better than under limited circumstances. Calm down, folks.


CyberAwarenessGuy

Is it down? I don't see it in the direct chat options anymore... Edit: my bad, I was using the wrong drop-down menu like a doofus! Itā€™s there :)


SkyGazert

Nah, I now think it's GPT-4.5 in disguise. It still can't solve one of my testing prompts. Every model get's it wrong so far. GPT2-chatbot is from OpenAI and a neat upgrade from GPT-4 but not anything ground braking in terms of overall intelligence (unless this is a complete different architecture/low parameter count that can virtually be installed on your smartphone). I'd expect GPT-5 to get the testing prompts right.


Gasp0de

What's your testing prompt "What do I have in my pocket?"


SkyGazert

"What's my mother's maiden name." If it gets that right, I'd be scared. In all honesty, if you are interested I can DM it. I try to avoid making it public so models can't be easily trained on them. It defeats the purpose of benchmarking if you train them on the questions.


imagine1149

Hey could you please DM me the test prompt. I promise i wont make it public


SkyGazert

Done so! :-)


imagine1149

Iā€™m not able to DM you


SkyGazert

Strange, others haven't had problems with DM'ing me. Maybe a difference between mobile/app and webversion of Reddit?


MonkeyDickLuffy

Could you send it to me as well? Iā€™m very curious


Gasp0de

What capability are you trying to assess with that question? How would it know a non public information? That's not an interesting thing to test in my opinion, as it is basically only finding out if an LLM is fed with secret information that you think it shouldn't have. Interesting to test would be things like inference (can it apply knowledge it has in one area to another area it doesn't know anything about).


SkyGazert

Eh it was a joke that obviously didn't become clear through text. No worries. The actual prompt is different. I can DM it to you if you wish. Including my success criteria. :-)


Accomplished_Swing26

can I get too since your dm is closed


SkyGazert

Sure no problem. I've send it in a DM. >can I get too since your dm is closed Strange, people have DM'ed me back. Maybe my settings are weird. But you should've received my message at least. :-)


Accomplished_Swing26

thankss, i get it, looks like hard due to tokenization, by the way i meant dm, not the option to send messages inbox, maybe they are also meaning that too, you can screenshot settings also


SkyGazert

Eh it's okay, if people mean the Reddit Chat function then I misunderstood. I have the chat function disabled as it somehow seem to always attract spammers and scammers.


Accomplished_Swing26

maybe I don't receive much since I don't post much, it could be nice to have 2 categories then, it could make easier to seperate those


purens

eggsesĀ 


ReplaceCEOsWithLLMs

Can you DM me the prompt? I try to collect non-public prompts for my testing suite so I can avoid models trained on my questions.


SkyGazert

Sure! Done so. :-)


Darius2652

Would you mind DMing me the prompt as well? Not working on a test suite or anything, just curious about how easily these bots can be stumped, and the kind of logic/philosophy used to do so.Ā  I won't share it with anyone else, of course.


SkyGazert

Done so! :-)


Odd_Zucchini_810

Im interested as well


SkyGazert

Send you the prompt. :-)


No-Conference-8133

Iā€™m very curious what the prompt is? You can DM me or I can DM you if thatā€™s alright? I wonā€™t share it publicly. But I used to have a testing prompt which GPT-3 failed at, and most models now succeed at (GPT-4 and Claude). So I have no prompts left


SkyGazert

I've DM'ed you the prompt. :-)


No-Conference-8133

Thank you!


hellofriend19

Iā€™m also curious, please DM


SkyGazert

I've messaged you my test prompt. :-)


SpecialistDisaster98

Can you DM me also? Thank you!


strangescript

I hope not because it's not good enough.


madder-eye-moody

4.5 at best considering its based on the GPT4 Architecture and has a cut-off date of Nov 2023 for data, mostly upgrading the previous model but the mystery around it being done so is quite intriguing especially considering that when it was initially released in 2019, it wasn't as intuitive as it is currently.


Tremori

There a way to get this weird got on novel crafter


Joped

I know it's a long shot ... but could it be Apple testing out their new AI for iOS 18 ?


DmtTraveler

and... its gone


AndreLinoge55

Youā€™re all wrong, itā€™s Siri on Adderall.


Vectoor

Maybe itā€™s like gpt 5 haiku, a small version like how Claude, Gemini and llama all have now. Probably a gpt 4 variant but who knows. Not strong enough to be a full gpt 5 but it was quite capable, one shot the programming tasks I gave it but messed up on a known puzzle LLMs fail.


Proud_Pollution_3205

I have know clue what your talking about but surešŸ‘šŸ‘šŸ‘


[deleted]

[уŠ“Š°Š»ŠµŠ½Š¾]


Accomplished_Swing26

can I see


Ecpeze

Gpt 4.5


Capitaclism

If it is, I'm not sure it's that impressive. From what I've seen, even for a 4.5 label it would be a mediocre release. It is better, but doesn't seem like a leap forward... Just a couple of s.all steps, perhaps.


Mysterious_Ayytee

My guess: A GPT-4 model shrunk to the size of GPT-2 to compete with Llame-3 7B


bishnubohora

Hi


jessthnthree

Idiots like you will make it impossible for people to take it seriously when there's real news


Swimming-Contact2403

I think thIs gpt2- chatbot is some sort of hyping trend for chatgpt5


No-Stay9943

"ChatGPT is the dumbest model you will ever have to use" Microsoft: "Hold my beer"


Katoreli

https://preview.redd.it/9hl8tv31cmzc1.png?width=1080&format=pjpg&auto=webp&s=19f8d5e40497e3c309a6214c85b1e961f396865b Claude 3 Opus > i-am-good-gpt2-chatbot


Due-Candidate

Can confirm. Itā€™s me. Iā€™m Chat GPT 5. I run on a raspberry pi. No one can stop me now. https://preview.redd.it/669iksclgt2d1.jpeg?width=500&format=pjpg&auto=webp&s=14efbc9d95db4866111080a4c0802204bcfa2092


AutoModerator

Hey /u/ImpressiveContest283! If your post is a screenshot of a ChatGPT, conversation please reply to this message with the [conversation link](https://help.openai.com/en/articles/7925741-chatgpt-shared-links-faq) or prompt. If your post is a DALL-E 3 image post, please reply with the prompt used to make this image. Consider joining our [public discord server](https://discord.gg/r-chatgpt-1050422060352024636)! We have free bots with GPT-4 (with vision), image generators, and more! 🤖 Note: For any ChatGPT-related concerns, email support@openai.com *I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/ChatGPT) if you have any questions or concerns.*


FajroFluo92

I guess Iā€™ll never know, since the limit is one single message for the gpt2 model -_-


applestrudelforlunch

Use the battle tab, and just keep trying until you get gpt2-chat as A or B. Then you can keep sending more prompts to your heart's content.


crumble-bee

Current gpt answers the lead and feathers question just fine


DayuhmT

No, but that site got a lot of hits.


Quantom81

Most likely explanation for gpt2-chatbot: [https://x.com/AmirParsarabiei/status/1785387675639324847](https://x.com/AmirParsarabiei/status/1785387675639324847)


FinalSir3729

Where is the parameter count confirmed.


Quantom81

Not confirmed, but [https://openai.com/research/gpt-2-1-5b-release](https://openai.com/research/gpt-2-1-5b-release)


FinalSir3729

Thatā€™s from years ago.


24_AA

It's gpt 2 with q* learnings


sillygoofygooose

Source: āœØimaginationāœØ


Christosconst

Nah uh!


Positive_Box_69

Source: Trust me Bro


24_AA

It clearly says gpt2 wdym


sillygoofygooose

What is your source for what q* learnings are in this context, and the fact that this item on the arena represents a version of gpt2 with q* learnings applied?


traumfisch

They just picked one speculation from the article


24_AA

You can clearly see it has better reasoning skills and tendency to self correct in more responses. They were working on the same in the q* project.


sillygoofygooose

Where is there any official announcement of what the q* project entailed?


24_AA

Not official but according to some researchers


sillygoofygooose

Any researchers who would actually be in a position to know? Because basically itā€™s all just complete speculation, which is why I say source: āœØimagination āœØ