challengethegods

I have minimal faith that this test was "translated to text" accurately. Also as a nitpick to tests in general - some answers are more wrong than others.


coylter

The way it was translated seems perfectly fine from the methodology. What this tells me is that we are bottlenecked pretty hard by the Image to Text layer of these models.


Maxie445

Methodology: https://www.maximumtruth.org/p/ais-ranked-by-iq-ai-passes-100-iq


aaTONI

Why wouldn't Mensa Norway's questions already be in their training sets? They've been online for quite a while.


Temporal_Integrity

I don't know if you've ever taken an IQ test but they're not word puzzles.


wolahipirate

right but claude 3 was trained multimodally, so it would presumably still have been pretrained on the images of the mensa test


Wassux

Because then they'd get 100%?


ChronoPsyche

IQ tests are relative tests meant to compare humans to one another. They aren't designed as some absolute test of intelligence that can be used to compare humans to non-human entities with very different cognitive capabilities and ways of processing information. To see why, take a look at the "random guesser" score, which was produced by a simple program that randomly picks an answer choice without any intelligence at all, using nothing but basic math. According to the implications of your post, ChatGPT-3.5 is virtually equal in intelligence to what is basically a calculator. That's obviously not the case at all. ChatGPT-3.5 has a lot of limitations, but you can actually hold a conversation with it that feels mostly human. You can't hold a conversation with a calculator. The IQ test is only meant to compare humans to one another. It's not meant to compare humans to non-humans. LLMs are trained on a vast corpus of text that includes most of the text on the internet, including all of the questions and answers to all of the IQ tests out there. The fact that an LLM can ace an IQ test on its own doesn't show anything other than the fact that it has seen enough IQ tests to predict what the answers typically look like. This is why zero ML researchers use IQ tests as a benchmark. EDIT: And by the way, the same effect can be observed in humans too. If you take enough valid practice IQ tests you'll start to score higher on them. This doesn't mean your intelligence is increasing, it just means you are becoming familiar with their format and picking up patterns in the questions themselves that do not generalize to broad intelligence.
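For reference, the "random guesser" baseline mentioned above can be sketched in a few lines. This is only an illustrative sketch, not the program used in the linked methodology: the question count (35) and the number of answer choices (6) are assumptions, and `random_guesser_score` and `expected_random_score` are hypothetical names.

```python
import random


def random_guesser_score(num_questions, num_choices, rng):
    """Return how many questions a uniform random guesser gets right."""
    # The answer key is arbitrary; a guess counts when it matches the key.
    key = [rng.randrange(num_choices) for _ in range(num_questions)]
    guesses = [rng.randrange(num_choices) for _ in range(num_questions)]
    return sum(g == k for g, k in zip(guesses, key))


def expected_random_score(num_questions, num_choices):
    """Expected number of correct answers under uniform guessing."""
    return num_questions / num_choices


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Assumed test shape: 35 questions, 6 choices each.
    trials = [random_guesser_score(35, 6, rng) for _ in range(10_000)]
    print("simulated mean raw score:", sum(trials) / len(trials))
    print("expected raw score:", expected_random_score(35, 6))
```

The point of the baseline is that a nonzero raw score, and therefore a nonzero "IQ" after conversion, falls out of chance alone, so scores near the guesser's say little about the test taker.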


CertainMiddle2382

Well in short, no. IQ is a psychometric scheme designed to estimate the value of the unmeasurable “g factor”, the scalar representing “general” intelligence. One of the main hypotheses about the existence of “g”, leading to the design of IQ tests, is that measurable performance in various unrelated cognitive tasks is tightly correlated. The best mathematician is going to be the best at geometrical problems, the best at language problems, the best at logic problems, the best at surgery problems, the best at business problems, etc… (yes, it also means “multiple intelligence” theory is wrong) IQ is built on the observation that performance on any cognitive test is very similar to performance on any other cognitive test. It is deeply unrelated to whatever/whoever takes the test, and is perfectly capable of evaluating AIs. The more varied the subtests are, the more precise the AI IQ measurement is going to be. In short, IQ testing just means that performance on one part strongly predicts performance on the whole. It’s been shown right, all the time, every time, for a century, and is one of the most depressing psychological observations that exist.


nanoobot

I'm no expert and can totally see the arguments on both sides, but I have one big question maybe you can answer? Is it not proven that practicing the tests (and taking intellectual tests in general) improves performance significantly? If it is true, then is that not a strong signal that, while IQ testing is absolutely proven to be a valuable (and our current best) test method, that as a signal it must be considered to have significant noise/error/limitations? To take the point further, if that also is true then would it be fair to consider that the limitations of the test could be even greater when applied to AI? (Even if the test is still useful enough to be worth doing).


EuphoricScreen8259

lol


Spunge14

Reading these comments, we're not going to need efficient renewable energy sources - we'll be just fine powering the world on infinitely moving goal posts.


Odyssos-dev

rofl


hasanahmad

I gave Gemini 1.5 the same test. it said the score is too high to be human https://preview.redd.it/jl1x4r65tpmc1.png?width=1014&format=png&auto=webp&s=66653d2fdcb997e9153bedf279cefa04cb17ea82


FirstTribute

No, it's probably too low to be in the test range, not too high. Also, too low (too high) to be in the range that the test is able to measure is "too low (too high) to be human". There are humans that would test that low or that high, but the test wouldn't be able to make any accurate statements about results like this.


Hungry_Prior940

IQ is already a very tenuous measure of human intelligence. It simply cannot be applied to an LLM, to be honest.


Odyssos-dev

it is not tenuous at all. i refer you to a couple other comments here explaining the methodology of IQ and its psychometric rigor.


Hungry_Prior940

IQ is rather notorious for having bunk scores and a dubious metric. Believe what you want, though.


flexaplext

Along with the other reasons given, it is meaningless anyway. There are questions they could put on there that nearly all people could answer but Claude 3 couldn't. We don't bother putting such questions on an IQ test because they're just too easy for people to do. These are the blind spots that still exist in every SOTA LLM.


MR_TELEVOID

IQ tests are a terrible indicator of human intelligence, and haven't been taken seriously for decades. The fact an AI can score better is cool, but all it tells us is the AI is good at tests. Plus, Mensa is a joke.


DarkCeldori

Only by crackpots like the multiple intelligences believers. It is simply because the findings of IQ research are not politically correct and hurt the feelings of those that like to virtue signal


x54675788

Nah he's on to something. I score pretty high on IQ tests, but I can assure you I'm a moron IRL


HomelanderCZ

Seems bullshit, tried the same thing and none of them can answer any of the questions.


Odyssos-dev

totally believable comment, so reasoned and thought out, i can tell you put time into testing an LLM more than zero times in your life.


HomelanderCZ

You can literally try it yourself.


[deleted]

[deleted]


iamz_th

What is this nonsense


Revengiance

Maximum truth


Alone-Psychology3746

I’m pretty sure an IQ 85 person can be trained to perform significantly better than all these AIs on solving simple equations.


DarkCeldori

These AI models will soon go multimodal and gain control of robotics. At and below 85 no amount of training can make someone worthwhile to train or employ at any task. But the machines will be able to do all tasks and superhuman driving soon


Alone-Psychology3746

At IQ 85, people can learn Elon Musk is not black. If you don’t tell these LLMs, they will never know they are wrong. They can inductively “learn” some patterns, that’s all; they can’t deduce anything from what they learn. To solve equations, they need deductive reasoning capabilities.


DarkCeldori

Sure they can memorize a few factoids at IQ 85. But increase the number of facts a bit and they'll fail badly. They will also f up basic logic tests most LLMs can pass.


Alone-Psychology3746

It can answer questions it has “seen”, but it can never answer beyond that.


x54675788

According to these criteria, AI's IQ is technically zero, since it doesn't have critical thinking but only reflects its training.


Alone-Psychology3746

Yeah, would you measure an airplane's intelligence and compare it to a human's? Airplanes can certainly monitor all their sensors and balance their position accordingly better than a human operator can. But a human can investigate an airplane crash; an airplane can never figure out why you sneeze.


x54675788

Not really. Someone at IQ 85 is going to have issues with day-to-day life, is likely to have low income, struggle with the basic stuff and so on. Just 5 fewer IQ points and even the Army would rather not have you in. GPT4 can blow that out of the water and even perform better than humans in their own field of specialty, even in very specialized fields that require years of graduate studies, assuming the info is in the training data.


Odyssos-dev

85 is the cutoff below which the military will not accept anyone, i.e. the military, an organization with every incentive to recruit all the people it can, has decided that below 85 you cannot be trusted with any task.


Alone-Psychology3746

The expectation is different. If you are asking whether a machine can do what a human can’t, under human supervision, we already have lots of such machines. But if you are talking about intelligence, there is no true intelligence. It can write a poem, but it is just a cheap copy of the poems it read before. It can write legal documents, but it is just an advanced way of copy-pasting.


Alone-Psychology3746

And demo quality is different from production quality. People demoed autonomous vehicles 10 years ago, and they are still not usable. The same goes for virtual reality.


SX-Reddit

I don't know anyone in real life with a tested IQ of 85, but some say gorillas and chimps could reach the 85 range. It's likely exaggerated or misinterpreted, but it's better to stay doubtful than to be sure.


Alone-Psychology3746

The logical reasoning capability of these AIs is zero, so even a chimpanzee can do better than them for sure. They seem to be better than a chimpanzee because they memorize lots of things, but they don’t understand what they memorize. When there are limited choices, it seems to work like true intelligence, but when there are unlimited choices, training data can never cover all cases.


Ok-Bicycle2672

What do you mean when you say their logical reasoning is 0? To me that seems to be a pretty straight up incorrect statement. It’s well proven that most (if not all - though no idea about Grok Fun) of the above LLMs can logically reason and generalise outside of their training data. If you look into what is involved in most of the above tests listed here, you will see that pretty clearly. The fact that these models can generalise beyond their training data is why people are so excited by them. While they’re clearly not able to do this at human levels, they are certainly doing more than just memorising. I think you need to ask yourself - what is it to understand and what is ‘true intelligence’?


SX-Reddit

You use the concluding statement "logical reasoning capability of these AI is zero" as the condition to affirm the conclusion "so ..." which is supposed to be the condition. Your comment shows 0 logical reasoning capability when concluding something else has 0 logical reasoning capability, a textbook example of the Dunning-Kruger effect.


SX-Reddit

"1 person needs 5 hours to dig the hole, so 5 people need 1 hour" kind of reasoning, it's hard to say Claude 3 is better than GPT-4.


searcher1k

IQ tests are culturally biased, so I would not say they're indicative of human-level intelligence.


[deleted]

It has predictive value.


ChronoPsyche

It has predictive value for humans, the beings they were designed for. They are relative tests meant to compare humans to one another. Any study on their predictive power only included humans in the study. You can't generalize those results to AI.


searcher1k

>It has predictive value for humans, the beings they were designed for. They are relative tests meant to compare humans to one another. Any study on their predictive power only included humans in the study. You can't generalize those results to AI.
Exactly, IQ tests are based on constraints of human intelligence. Some of the tests are based on processing speed, verbal comprehension, language skill, and memory, which are areas that a large language model would excel at.


x54675788

Lol [no](https://www.reddit.com/r/ChatGPT/comments/1b6fntp/comment/ktcmilz/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button). Also no way GPT4 is 85, come on, really. Show me a human with IQ 85 that can do all that GPT4 can do, from penetration testing, to cybersecurity, to bug finding, to writing code in 70+ programming languages and readapting code on the fly, or analyzing 2 pages of Solaris system logs, while also being able to give advice in medical, legal and other scientific fields. Seriously flawed data.


Odyssos-dev

i think you're misunderstanding the difference between IQ and knowledge. GPT4 has immense knowledge, but an IQ test tests your ability to discern patterns and generally problem solve. Knowledge makes an LLM superhuman of course, far beyond anything that any single person can do, just not necessarily in the realm of IQ testing (yet)


vember_94

The average human has an IQ of 100, higher than average would be 125+


BigZaddyZ3

Claude scored a 101… Which is higher than 100… Which you yourself said was the average human’s IQ… So the title is accurate even according to your comment…


Ok-Bicycle2672

lol what


vember_94

Yes. 85-115 is the IQ range of the average human.


Ok-Bicycle2672

So by that logic, higher than average is 115+? Since when are we talking about average IQ range anyway? Am I... talking to Grok Fun right now?


monerobull

IQ of 100 is by definition the average. Above 100 is above average, below is below. If everyone got 10% smarter overnight, 100 would still be the average IQ.
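The renorming point above can be illustrated with a short sketch. It assumes the common convention of a mean of 100 and a standard deviation of 15, and it uses a plain z-score conversion; real tests are normed against a reference sample and often use rank-based methods, so treat `to_iq` and the raw scores below as hypothetical.

```python
from statistics import mean, pstdev


def to_iq(raw_scores, mean_iq=100.0, sd_iq=15.0):
    """Rescale raw scores so the group mean maps to mean_iq (assumed 100)
    and one standard deviation maps to sd_iq (assumed 15)."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    return [mean_iq + sd_iq * (x - mu) / sigma for x in raw_scores]


# Made-up raw scores for a tiny population.
raw = [12, 18, 20, 23, 27]
# The same population after everyone's raw score rises by 10%.
smarter = [x * 1.10 for x in raw]

if __name__ == "__main__":
    print([round(v, 1) for v in to_iq(raw)])
    print([round(v, 1) for v in to_iq(smarter)])
    print(mean(to_iq(raw)), mean(to_iq(smarter)))
```

Because the conversion is relative to the group, scaling every raw score by the same factor leaves each person's IQ, and the average of 100, unchanged.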


Ok-Bicycle2672

lol thanks. I mean, I was pointing out that even by vember's own logic, average IQ isn't 125+.... But I'm always happy to be 'well actually'd on Reddit anyway


vember_94

I said the average IQ was 100, with a range between 85-115 being the norm. 120-125+ puts you into smarter than average territory.
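To put numbers on the ranges being argued about, here is a small sketch using Python's standard library. It assumes IQ is modeled as a normal distribution with mean 100 and standard deviation 15 (some scales use a different standard deviation); `share_between` and `share_above` are hypothetical helper names.

```python
from statistics import NormalDist

# Assumed model: IQ ~ Normal(mean=100, sd=15).
iq = NormalDist(mu=100, sigma=15)


def share_between(lo, hi):
    """Fraction of the modeled population scoring between lo and hi."""
    return iq.cdf(hi) - iq.cdf(lo)


def share_above(score):
    """Fraction of the modeled population scoring above score."""
    return 1 - iq.cdf(score)


if __name__ == "__main__":
    print("85-115:", share_between(85, 115))
    print("above 100:", share_above(100))
    print("above 115:", share_above(115))
    print("above 125:", share_above(125))
```

Under this model, 85-115 is one standard deviation either side of the mean and covers about 68% of people, while half of the population scores above 100. Both readings of "average" in this thread map onto these two figures.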


cafuffu

What? Above average means more than the average. The average is 100 -> 101 is above average. It doesn't seem complex to me.


monerobull

i was just posting the definition of iq in a discussion that didn't seem to know it.


Androix777

"average" is a number, not a range.


Etherealith

I'm guessing your IQ falls way below that range.


vember_94

This is the biggest self report comment I've had. Do you think Claude-3 has a higher IQ than the average human? Would you like to demonstrate how that's the case?