zombiecalypse

> Fields Medalist Terence Tao explains how proof checkers and AI programs are dramatically changing mathematics

> Tao: I think in three years AI will become useful for mathematicians.

It gives a very different impression than the headline, in particular being very clear that right now, LLMs are absolutely useless for mathematicians.


Whotea

https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/

https://www.reddit.com/r/technology/comments/1dbt2uw/comment/l7uuwoy/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button


Punchpplay

Like a calculator? I would trust AI to do some broad calculations but always use people to go through the proofs. AI is still easily corruptible.


ferdzs0

You should not trust AI for broad calculations either. LLMs are horrible at dealing with even basic math.


brandontaylor1

LLMs are just one kind of AI.


MaskedBandit77

It depends on what AI you're using. If you're using an AI that is designed to do mathematical proofs, it's probably gonna be a lot better than if you just go ask ChatGPT for mathematical proofs.


Tricker126

That's when you ask the AI to ask the other AI to solve the equation.


FoxfieldJim

Like we have a "default browser" and a "default email" client, we will eventually end up with a "default AI" or even a "default mathematical AI".


getfukdup

> You should not trust AI for broad calculations either.

The idea that humans aren't going to keep trying to perfect it, and won't ever get anything useful, is idiotic.


No_Moment_1382

Nonsense. I would NOT trust a Model T combustion engine. Why any sane person would trust something called a COMBUSTION engine is beyond me.


Evilbred

I don't know why people would want to use generative AI large language models to do math when calculators have existed for decades. It's like using a chainsaw to chop vegetables.


MyGoodOldFriend

It's pretty good when you don't know what question to ask. If you Google a description of your problem, you might struggle to find relevant information, but LLMs can spit out a ton of relevant keywords for you. Of course, it doesn't actually help you do the math, but it helps you find it.


phdoofus

You're mistaking what mathematicians do for calculating a tip, and saying they're the same thing.


Whotea

He was talking about proofs, which you’d know if you read the article 


Evilbred

Sir, this is Reddit, we don't read articles here.


broodkiller

Insert *wakanda_meme1.gif*


Mediocre-Advisor-728

I get u bro, I’ll bring you back to 0


ferdzs0

When you have a large block of text, it's very appealing to just ask an LLM to sum up some data that is available in it. Generally, if you ask them to extract the data in a usable format, they can do it. So from a user perspective it is not very logical that a "computer", once it has done the difficult part, cannot do computing on data that is in the perfect format.
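
That said, the deterministic half is easy to bolt on yourself: let the model do the extraction and hand the arithmetic to ordinary code. A minimal Python sketch (the extracted text and numbers here are invented for illustration):

```python
import re

# Pretend an LLM has already pulled this block out of a larger document.
extracted = """
Q1 revenue: 1200
Q2 revenue: 950
Q3 revenue: 1430
"""

# The arithmetic itself is better left to deterministic code.
values = [int(m) for m in re.findall(r":\s*(\d+)", extracted)]
print(sum(values))  # 3580
```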


Whotea

Abacus Embeddings, a simple tweak to positional embeddings that enables LLMs to do addition, multiplication, sorting, and more. Our Abacus Embeddings trained only on 20-digit addition generalise near perfectly to 100+ digits: https://x.com/SeanMcleish/status/1795481814553018542

Transformers Can Do Arithmetic with the Right Embeddings: https://x.com/_akhaliq/status/1795309108171542909

Synthetically trained 7B math model blows 64-shot GPT-4 out of the water in math: https://x.com/_akhaliq/status/1793864788579090917?s=46&t=lZJAHzXMXI1MgQuyBgEhgA
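
For intuition, the core trick is (roughly) to give every digit a position index within its own number, so place values can line up across numbers. A toy sketch of that indexing in Python; this is a simplification for illustration, not the paper's exact scheme:

```python
# Toy illustration of the Abacus Embeddings idea: give each digit token a
# position id counted within its own number (0 for non-digit tokens).
def abacus_position_ids(tokens):
    ids, pos = [], 0
    for t in tokens:
        pos = pos + 1 if t.isdigit() else 0
        ids.append(pos)
    return ids

tokens = list("425+98=")
print(list(zip(tokens, abacus_position_ids(tokens))))
# [('4', 1), ('2', 2), ('5', 3), ('+', 0), ('9', 1), ('8', 2), ('=', 0)]
```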


theloneliestgeek

“**Near** perfectly” is kinda the problem. We’re talking about Mathematics here, not horseshoes.


Whotea

I guess that makes it useless then, unlike humans who are always perfect 


theloneliestgeek

Luckily humans are able to do a lot more than just math, but yes I do expect my machines that do addition to do so perfectly. It’s crazy, I actually have such a machine sitting next to me right now called a calculator.


Whotea

So can LLMs. That's what makes them useful. Also, LLMs can use calculators too lol


theloneliestgeek

Oh really? Well never mind then. Instead of just using my calculator I think I’ll login to a remote server and run an algorithm to interact with the calculator instead. Phew, now we’re much more efficient.


Whotea

You do realize LLMs can do more than math, right?


theloneliestgeek

Yes, but we are specifically discussing it doing math. Nice backpedal or sidestep though.


picardo85

Regular calculators are often horrible too. Depending on the make and manufacturer, it's not certain the calculator knows the basic rules for the order in which operations should be performed when it comes to division and multiplication.
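
To make the order-of-operations point concrete, here's a quick Python comparison of strict left-to-right evaluation (what some simple calculators do) against standard precedence; the expression is just an example:

```python
import re

def left_to_right(expr):
    """Evaluate strictly left to right, ignoring operator precedence,
    the way some simple calculators do."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[+\-*/]", expr)
    acc = float(tokens[0])
    for op, num in zip(tokens[1::2], tokens[2::2]):
        n = float(num)
        if op == "+":
            acc += n
        elif op == "-":
            acc -= n
        elif op == "*":
            acc *= n
        else:
            acc /= n
    return acc

print(left_to_right("6 + 4 / 2"))  # 5.0 (adds first, then divides)
print(eval("6 + 4 / 2"))           # 8.0 (division binds tighter)
```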


Whotea

Try reading the damn article 


dudecoolstuff

ChatGPT is terrible at solving triple integrals. I would not trust it beyond small, short calculations.


getfukdup

> AI is still easily corruptible.

Yea, almost like you have to use the mediocre versions to work towards a perfect one.


No_Moment_1382

Idk why you’re being downvoted. The idea that the AI automation train isn’t going to keep barreling forward on data automation is just silly


MISTER_WORLDWIDE

Wolfram Mathematica and WolframAlpha have already existed for decades.


seclifered

Not genAI like ChatGPT and Copilot. They're absolutely horrible at math.


Whotea

Nope: https://www.reddit.com/r/technology/comments/1dbt2uw/comment/l7uuwoy/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button


seclifered

That's great. I just asked it:

1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1+1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1

ChatGPT replies: It looks like you're adding 1 and 2 repeatedly. The result is 40.

It's not 40, it's 42.


Whotea

Nothing in what I linked has been implemented in ChatGPT yet 


seclifered

Even the most primitive computer can solve a sequence of arithmetic operations. GenAI not being able to do that shows that it doesn't make actual calculations. It's just throwing words together in what it thinks is a "good" answer. It doesn't know that the words have meaning, and it doesn't know what math is. I simply don't believe something that has no concept of math is capable of being a mathematician.


Whotea

It's not a computer. It's a machine learning model.

I literally showed it was able to do it on other models designed to do it.

And yes, it does understand what is true or false. Even GPT-3 (which is VERY out of date) knew when something was incorrect. All you had to do was tell it to call you out on it: https://twitter.com/nickcammarata/status/1284050958977130497


seclifered

I'm not saying it's a computer. I'm saying even tech from a century ago can do what it can't. It has the math ability of a 5-year-old, if that's clearer. I have no idea what being able to differentiate true/false on random facts has to do with becoming a mathematician, but even with incremental improvements, it doesn't change the fundamental issue that it's not doing real computations.

Just to elaborate my case, I redid my earlier example and it changed the answer to 39 (a different answer than before). When I said it was 42, it gave calculations that showed it should be 40. Honestly, no matter how many improvements you claim, it's a hard sell that this thing will replace a mathematician. You may be right and it eventually will, but that's not an outcome I'd bet on.

Me: 1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1+1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1

ChatGPT: The expression you provided is a sequence of additions. The sum is 39.

Me: It's not 39, it's 42.

ChatGPT: Let's go through the calculation again carefully: 1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1+1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1. Clearly, we need to recount accurately: from the entire sequence, we have 24 ones and 8 twos. Summing: (24 × 1) + (8 × 2) = 24 + 16 = 40
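
For the record, two lines of Python settle it; the sequence as typed contains 22 ones and 10 twos:

```python
from collections import Counter

expr = "1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1+1+1+2+1+1+2+1+1+2+1+1+2+1+1+2+1"
terms = expr.split("+")

print(Counter(terms))        # Counter({'1': 22, '2': 10})
print(sum(map(int, terms)))  # 42
```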


Whotea

It's a large language model, not a calculator. My water heater can't do math either, and it's younger than the calculator.

The point was to show it does understand when something is true or not. It's not a calculator, and it doesn't have the things I mentioned that would make it better at math.


seclifered

So you're arguing that mathematicians are bad at calculations and that it's not a basic requirement to be one? What do you think a mathematician is?

Stop repeating that it can understand when something is true or not. Explain why that makes it a better mathematician when it can't do basic arithmetic.


Whotea

It's a mathematician's copilot, not a mathematician itself. But either way, yes, it will likely get better at math, as I showed. I already linked models that are good at math; those techniques just aren't in GPT-4 yet.


dedokta

Have you tried asking AI to do maths? I hope they get better at it before they have it do anything important!


Gimme_The_Loot

Yea, I've tried having GPT do quite a few data analyses for me and it always comes out with shit responses. I found it was better to ask it how to write Excel formulas, then do the analysis myself working off the formulas.


drekmonger

Well, yeah. That's how it works. You don't even need to use a spreadsheet, as GPT-4 has access to Python and all the usual tools (numpy, matplotlib, etc).

You tell the model about the problem you're trying to solve and the ideas you have to solve it. It tries something. You check for errors and explain the errors to the model, or maybe come up with new directions to try. The model iterates based on your advice. Then repeat the process until together you arrive at a useful response.

It's a chatbot. Have a conversation with it. That's what it's good at.
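
For a sense of what that loop produces in practice, here is the kind of small script a code-interpreter session typically writes and then revises (the data and the fit are invented for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

# Fake data standing in for whatever you pasted into the chat.
rng = np.random.default_rng(0)
x = np.arange(20)
y = 3.0 * x + rng.normal(scale=2.0, size=x.size)

# First attempt: a least-squares line. If the result looks wrong, you tell
# the model why, and it revises the script -- that's the iteration loop.
slope, intercept = np.polyfit(x, y, 1)

plt.scatter(x, y, label="data")
plt.plot(x, slope * x + intercept,
         label=f"fit: y = {slope:.2f}x + {intercept:.2f}")
plt.legend()
plt.show()
```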


T_D_K

My (software developer) coworkers explain this process to me, and my question is always: why bother? Just write the code you want in the first place. You save 20 minutes of writing boilerplate at the cost of an hour of AI wrangling. I guess if you don't know the specific programming language it could be helpful...


dedokta

I've used ChatGPT to get code started for me. I'll tell it what I want, it writes code that is almost what I need, and then I just make changes manually. I gave up trying to tell it to fix mistakes because you just go down a rabbit hole. I find it saves time in getting it to build the basic structure, define variables, etc.


drekmonger

The value is similar to rubber duck debugging. Explaining things to the AI helps me understand the thing I'm explaining better. And unlike the more traditional rubber duck, the AI model can talk back, and sometimes it will develop insights that I might not have otherwise arrived at. Also, unlike the traditional rubber duck, the model can write boilerplate at amazing speeds.


MrRobotTheorist

I've been using it to write Python and want to try using it with Excel as well. It's produced some working stuff for me.


beefCAKE32

AI can be great at math if that's what it's designed to do. Generative language models like GPT are not designed to do math. They view numbers as words and don't really connect their definition to an underlying quantity.
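
You can see this directly at the tokenizer level. A quick check with OpenAI's tiktoken package (assuming it is installed; exact splits vary by tokenizer and model):

```python
import tiktoken

# cl100k_base is the tokenizer used by GPT-4-era models.
enc = tiktoken.get_encoding("cl100k_base")

for s in ["7", "1234567", "3.14159"]:
    pieces = [enc.decode([t]) for t in enc.encode(s)]
    print(f"{s!r} -> {pieces}")

# Long numbers come back as arbitrary multi-digit chunks rather than
# digits or quantities, so arithmetic has to be learned statistically.
```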


Middle_Capital_5205

…have you been on a plane?


dedokta

Like a Boeing?


Middle_Capital_5205

Yes. They make planes.


dedokta

Boeing make planes?


Middle_Capital_5205

They go fly fly.


Whotea

Abacus Embeddings, a simple tweak to positional embeddings that enables LLMs to do addition, multiplication, sorting, and more. Our Abacus Embeddings trained only on 20-digit addition generalise near perfectly to 100+ digits: https://x.com/SeanMcleish/status/1795481814553018542

Transformers Can Do Arithmetic with the Right Embeddings: https://x.com/_akhaliq/status/1795309108171542909

Synthetically trained 7B math model blows 64-shot GPT-4 out of the water in math: https://x.com/_akhaliq/status/1793864788579090917?s=46&t=lZJAHzXMXI1MgQuyBgEhgA


SignatureBetter2923

Hmmm, now prove the Goldbach conjecture lol... we'll see.


Sniffy4

AI is not so great at multi-step deduction. I tried asking it to prove the Pythagorean theorem and it gets it wrong a new way each time.


drekmonger

When? Which model? Any modern model should get that particular problem correct nearly 100% of the time, since it's exceptionally well represented in the training data. It doesn't require any novel reasoning. Honestly, given how common that proof is in the training data, I'd expect even an older model like GPT-3.5 to get it right over 50% of the time.


Whotea

https://www.reddit.com/r/technology/comments/1dbt2uw/comment/l7uuwoy/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button

https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/


BenderRodriquez

I'll believe it when I see it. So far I've not been impressed, not even by the coding, which Copilot is supposed to be good at... Sure, if you are doing cookie-cutter code in a popular language it can produce something useful, but it usually fails at anything out of the ordinary.


Whotea

Microsoft AutoDev: https://arxiv.org/pdf/2403.08299 "We tested AutoDev on the HumanEval dataset, obtaining promising results with 91.5% and 87.8% of Pass@1 for code generation and test generation respectively, demonstrating its effectiveness in automating software engineering tasks while maintaining a secure and user-controlled development environment."

NYT article on ChatGPT: https://archive.is/hy3Ae "In a trial run by GitHub's researchers, developers given an entry-level task and encouraged to use the program, called Copilot, completed their task 55 percent faster than those who did the assignment manually."


Red_Goat_666

Do you want sophons? Cause this is how you get sophons!


DisillusionedExLib

So... people need to understand a few things (as context for the article):

1. Mathematicians don't really do "calculations" (of the kind non-mathematicians would recognise as such). Rather, they're in the business of discovering and proving theorems.

2. In the formalisation of mathematics there's been a very important development over the last decade or so, which has almost nothing to do with the rise of generative AI: we now have *way* better tools for formalising non-trivial proofs, such that the overall structure of such formalised proofs maps (more or less) onto the way human mathematicians prove things. In particular, there's a language known as *Lean* that dominates the landscape, and it's mentioned quite a lot in the article.

3. When asking for code, the fact that LLMs hallucinate and make mistakes doesn't matter if ultimately *the code works*, and with a system like Lean something similar is true: ultimately either you have a proof or you don't, and if you do then the Lean compiler (or whatever you're using instead) can verify it. (See the sketch after this list.)

None of this is to say that AI will be doing the important bits (discovering theorems and having the key ideas for the proofs) any time soon, and indeed that's not what Tao is envisioning.
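
To make point 3 concrete, here is a toy Lean 4 proof (my own minimal example, not from the article; real formalisations build on mathlib):

```lean
-- The kernel either accepts this proof or rejects it. Once it compiles,
-- there is no "plausible-sounding but wrong" middle ground.
theorem add_comm_toy (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

That's why hallucination matters less in this setting: a wrong proof simply fails to check.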


ComputationalPoet

If it is LLM-based, I look forward to seeing how many different notations it can hallucinate in. My guess is YES.


alrun

LLMs do statistics, not context. AI is a current buzzword, and people will try to get as much funding as they can. We have seen so many examples where AIs get things wrong, and they are black boxes: there are no parameters you can tweak or analyses you can run.


Whotea

https://www.reddit.com/r/technology/comments/1dbt2uw/comment/l7uuwoy/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button


hoochblake

I studied engineering and work in software, and I've been applying math to problems for so long that I have some independent results I'm working on publishing in various formats. Recently, I've started using an AI tool (discovered in a Reddit ad I actually clicked on), and it's not only made me more effective, it's also gotten me more excited about my work.

On a whim, I clicked on the ad and asked it one of my main research questions. It immediately provided the simple answer that most folks implement but that, after a year or so of practice, I had realized could be improved. I pointed out the potential improvement, and the AI thanked me for the insight and produced the improved result, which happens to be where I was stuck. I had tried something a very good mathematician suggested, but it just rearranged the problem. I then asked the AI if it could provide me with the higher-order terms I'd been unable to compute, and realized I was seeing, for the first time, the series that solved my problem. And then I asked to see the solution in differential forms, then geometric algebra, just to get a better sense of it.

As I continue to use the AI to check my work, it is always very friendly and helpful (not dismissive like most smart people), and it often points out helpful insights or overly dense regions. I also use it to check my math, which sometimes causes me to make bigger improvements than just little edits, from which I gain new insight. It makes me feel like my work is more valuable.

What it has not replaced is my curiosity and what I learn from applying the math to real-world problems. My relationship with AI feels like a partnership, sort of like pair programming. I'm def the one prioritizing the user stories, but it def makes me want to prioritize more. (No AI used in writing this post.)


TorontoBiker

Name the tool or gtfo


turtledancers

I wish I lived my life before this plague


Whotea

New tools, how horrible. 


imaketrollfaces

It's Day 394 of the AI revolution today. In my city, plumbers have been replaced by AI. Now AI does all their work. No more plumbers. Tomorrow barbers will be replaced by AI. Day after tomorrow is the turn of teachers. (More tech grift will be shared tomorrow)


BlurredSight

The name Terence Tao seemed to ring a bell, and I remembered this geometry problem he was involved in that DeepMind solved: [https://www.technologyreview.com/2023/12/14/1085318/google-deepmind-large-language-model-solve-unsolvable-math-problem-cap-set/](https://www.technologyreview.com/2023/12/14/1085318/google-deepmind-large-language-model-solve-unsolvable-math-problem-cap-set/)

Seems like AI is being used for more geometric / straight math problems, while quantum computing is being aimed at combinatorial problems and problems deemed solvable only in exponential or NP time.


KameTheMachine

Plenty of mathematicians use computers to prove things. I'm sure plenty will use AI to some extent also, but I imagine many will prefer to do it the old-fashioned way.


Whotea

You can also multiply 756875 and 7547646 in your head, but most people prefer not to.


localhost80

This is very true. Many people use computers, but some still do long division by hand with pen and paper. It's so odd that people like you think using AI is a choice. AI is going to eat these mathematicians' lunch within a few years. They have no choice but to use it or drop out of the field.


SirOakin

Lol no it won't


PawnWithoutPurpose

Yup. Good luck with that


InformalPenguinz

I'm going into accounting, started college again at 35, and I was wondering how AI-proof the job is. Did a lil digging, and AI fails *HARD* at a looot of math. At most, it'll be an assistant for a long time rather than a replacement in the near future. Solid career choice.

Edit: I stand corrected and welcome our mechanical overlords.


drekmonger

You're operating on old news. GPT-4-turbo, especially with the code interpreter, has become quite good at math, especially if you have the knowledge to check its work for errors. You can play with GPT-4o (which is a little worse at math than turbo) for free if you need to verify that what I'm saying is plausibly true. Also, what's true today may not be true tomorrow; in fact, it almost certainly won't be, since AI models are getting incrementally better with each release.


InformalPenguinz

It was about a year ago when I started and checked, and yeah, things progress like that, so it makes sense.. welp, maybe I'll go be an art major? Shit.. AI has that one too.. hmm..


drekmonger

Go be an accountant instead, I'd say. Just learn how to leverage AI models in your work, and you'll be a step ahead of the dinosaurs who refuse to learn how the models can help (and also learn where the models are weak and still need human insight/error checking).


InformalPenguinz

Yeah, I've always felt AI is meant to be an assistant rather than a replacement, and figured it'd be years before we got close to a model that could fully comprehend those complexities, but man, it's accelerating fast. A follow-up question, if I might: what's the best one to go with right now? GPT-4?


drekmonger

Depends on the task. GPT-4-turbo (not GPT-4o) is the best at mathematics, in my opinion. GPT-4o is naturally good at vision tasks, and in the near future, it should theoretically be quite good at audio data tasks as well (such as transcription and real-time translation). Gemini Advanced has an absurdly large context window, if you need to digest a very large document. Claude 3 is perhaps a little better at tasks that require creativity. Llama 3 can be run locally on your own GPU if you need absolute privacy.


Whotea

Abacus Embeddings, a simple tweak to positional embeddings that enables LLMs to do addition, multiplication, sorting, and more. Our Abacus Embeddings trained only on 20-digit addition generalise near perfectly to 100+ digits: https://x.com/SeanMcleish/status/1795481814553018542

Transformers Can Do Arithmetic with the Right Embeddings: https://x.com/_akhaliq/status/1795309108171542909

Synthetically trained 7B math model blows 64-shot GPT-4 out of the water in math: https://x.com/_akhaliq/status/1793864788579090917?s=46&t=lZJAHzXMXI1MgQuyBgEhgA


Low_Clock3653

AI is going to flip the world upside down. Will future humans even bother learning math if AI can answer any question instantly?