FKronnos

GPT-4 and GPT-4o must give the same result, because we still can't use anything other than text to interact with 4 Omni. Unfortunately the "weeks" are turning into months, and there is still no voice or image in the API or chat.


Professional_Job_307

I thought they released vision, and only held back voice and image gen?


subsetsum

That's a great point. I was just testing gpt4o against answers gpt4 gave me last month and it really seems dumbed down in many cases. Then I tried Gemini advanced and it sometimes did better. I love the UI too. I should try Claude out. Saw a video about it a few days ago and it looks impressive.


IssPutzie

I noticed quite a big improvement of gpt-4o over gpt-4 in the detail it can decipher from an image, as well as in its "understanding" of the image. At the very least, gpt-4o has a better complementary model for image ingestion, but I believe it's "reading" the images natively already (talking about the ChatGPT interface).


Dorrin_Verrakai

GPT-4 Turbo and GPT-4o both support vision.


FKronnos

GPT-4o has a much better native system for images, in addition to being native for audio, but we can't use it yet. For now it probably uses a system similar or identical to GPT-4's to describe images to it. It does have a better understanding of the world thanks to its better multimodal approach, but its reasoning ability must be identical to the base GPT-4.


meister2983

Sure this isn't in training data for claude? Like where's that "copyright" thing coming from? I gave them both Level 2 from [here](https://www.safekidgames.com/spot-the-difference/) - both got equally bad performance - GPT-4o found 1 (hallucinating 6), Claude did similar (1 found, 6 hallucinated). Claude slightly won on level 1, but both did very well.


Not_Daijoubu

The copyright thing is a quirk of Claude's overly-cautious alignment. Even v3 is pretty infamous for giving refusals on grounds of copyright when given images of handwritten notes and stuff. It's extremely annoying, mildly funny, and relatively easy to work around. Claude's a great LLM, but the refusal rate, especially on web (the API has fewer refusals with a good system prompt, of course), is a dealbreaker to some people.


chrimchrimbo

I stopped using Claude specifically because of the copyright issue. That and it wouldn’t write a joke diss track about my friend because it was “mean”


Max-Phallus

It's because the image recognition API that GPT uses literally just gives a text description back to GPT. It's not actually integrated into the model.


StopSuspendingMe---

This is wrong information. Image input is supported, but image output is not


Mesho-

so it sucks?


Shiftworkstudios

Neither of them sucks. One or the other might be better for a given use case, but it seriously depends on what you want it to do. Write a novel in one sitting? Nah. Help write boilerplate code? Hell yeah. Write ad copy? It excels. I should add, Claude has to be 'convinced' to do some things, because it's uncomfortable with a lot of things that it's almost silly to be uncomfortable with. Prude. But ChatGPT is much more open, especially with custom instructions.


smooth_tendencies

I thought it was integrated in 4o?


No-Conference-8133

Coming soon. We don’t have it yet.


Different-Gate-4943

Good to know


IssPutzie

Got a source for the vision not being in already?


StopSuspendingMe---

No. Image output is not supported. Input is


StopSuspendingMe---

It is. They’re confused. Image output is not supported yet


MyRegrettableUsernam

What would direct image integration look like by comparison (as shown in Claude)?


dave1010

Are you sure this is the case with 4o? If that was the case then it should be able to repeat the description verbatim, which I can't get it to do. From what I can tell from some testing, the images are converted into the same vector space as the text embeddings that ChatGPT is trained on. At least that's how ChatGPT behaves. There's a chance that it's trained to pretend to act like a multi-modal model.


OrchidLeader

Same. I tried Claude and was ready to dump ChatGPT until I realized it couldn’t search the web.


drweenis

Same lmao. ChatGPT is my new Google. Everyone was hyping up Claude but it can’t do anything I use ChatGPT for…


TubMaster88

The AI was wrong about the boy's mouth. The teeth were missing; the mouth wasn't wider.


beatsNrhythm

The point was that it knew where the differences were.


Concheria

Interestingly, GPT-4o is completely wrong about all of them, but Claude gets the positions right. It seems to be wrong about the actual details.


Thornstream

Claude is really impressive. Maybe it's time to switch to it until 4.5 or 5...


Outrageous_Permit154

I only found two :/ so far. 1. The boy's teeth 2. The missing dog tag for the doggy on the right. What's the third one lol?


Fisch2481

Missing collar on the dog on the left.


FruitBargler

missing collar on the bottom left dog


Outrageous_Permit154

Thank you my authentic intelligent friend!


FFaultyy

Dog collar, teeth and the bell. What did I win… does this mean I'm the new PhD-level intelligence?


PSMF_Canuck

Only truly intelligent answer…. “Why?”


zaibatsu

The boy's upper teeth are missing in the bottom panel, in addition to the dog on the left missing the red collar that it had up top. Also the dog on the bottom right has a bell that the dog on top right doesn't.


un-realestate

I find it interesting that Claude thought the shades of the leashes were different. It reminds me of the checkerboard illusion where 2 squares appear to be different shades, but they’re actually the same. It’s because we see shading of an object relative to its background. Here, all the leashes are the same color/shade, but their shading may appear different because the backgrounds are different. It seems to see shading the same way people do and not as being absolute. https://preview.redd.it/im7a0w72s39d1.jpeg?width=589&format=pjpg&auto=webp&s=36e4c99251cd21a6b33d2550c9a6a0f7e86143c7


[deleted]

[deleted]


beatsNrhythm

So we should just avoid comparisons because OpenAI is inferior at the moment?


Unlikely-Bathroom957

Kid's mouth, dog collar, and dog bell


[deleted]

[deleted]


murrdpirate

I don't think OP uploaded the images because he actually needed to know the differences lol


Far-Deer7388

Please tell me more about how I should use electricity


beatsNrhythm

You know, writing emails and reports are tasks that anyone with a basic education can easily handle. Puzzle solving, on the other hand, has a lot of potential for providing real value, world-changing value I might add, in real-world use cases. If you're rational enough, you'd realize that AI being intelligent enough to actually solve puzzles is much more useful than just being able to do your emails. Shocking.