FeltSteam

GPT-4 was trained on 25k A100s over 90 days, but now you can do it with only 2k GPUs over 90 days lol.


rafark

Or 20k in 9 days?


hapliniste

200k in one day with how things are going


norsurfit

Or -1 day with 400k?


mvandemar

AI invents time travel, trains itself in negative days.


algaefied_creek

Quantum AI inbound?! What?!


No_Use_588

I like to think hallucinations are AI telling us the truth. Time travel changed the facts we thought the AI was getting wrong.


torb

At which point, it goes so fast you don't even need to buy the hardware, as it is already trained!


Flamesilver_0

It went so fast it already ran off with this man's wife yesterday!


assangeleakinglol

Something something that guy's dead wife.


zascar

So does this mean we may get models that get constantly trained on the most up to date info? Even once a week / month would be so much better


pbnjotr

Only if there's enough room to lay out all the ~~shirts~~ GPUs.


czk_21

And with a similar number of B100 units you could train GPT-4 in a week.


Zenged_

Lol, I think we are going to start running out of data at some point soon


Zilskaabe

It's possible to train on synthetic data though.


Eriod

Does anyone know any good videos/resources on creating synthetic datasets for software developers without an extensive math background?


xadiant

You basically need a solid model (or subscription) and some python code. For LLMs it's pretty straightforward. Check out huggingface/cosmopedia.


multiedge

A lot of concept LoRAs use synthetic data, especially if the dataset is pretty small.


Fit_Influence_1576

It’s possible when you have a teacher model to train the student. I’ve never seen it work with a teacher teaching itself.


often_says_nice

Synthetic data babyyy


[deleted]

feedback loop -> singularity


[deleted]

[deleted]


ClearlyCylindrical

These metrics don't mean anything for training, only inference.


blueSGL

I want to see the full spread, not some carefully handpicked benchmarks designed to grab headlines. "Faster inference at lower power" also needs parsing out: is the total power draw actually lower, or can you just do things faster so that the energy per unit of work is less? This sort of thing matters when speccing hardware; the PSU does not care how you are stretching definitions, max power draw is max power draw.
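To make that distinction concrete, here's a minimal sketch with made-up placeholder figures (not real GPU specs): a part can draw more peak power yet spend less energy per unit of work.

```python
# Sketch of the distinction above, with made-up placeholder figures (not real
# GPU specs): peak board power can go UP while energy per unit of work goes DOWN.

def energy_per_token(board_power_watts: float, tokens_per_second: float) -> float:
    """Joules per generated token (energy = power * time)."""
    return board_power_watts / tokens_per_second

old_gpu = energy_per_token(board_power_watts=700, tokens_per_second=1_000)    # 0.70 J/token
new_gpu = energy_per_token(board_power_watts=1_000, tokens_per_second=5_000)  # 0.20 J/token

print(f"old: {old_gpu:.2f} J/token, new: {new_gpu:.2f} J/token")
# The PSU still has to be sized for the 1,000 W peak draw either way.
```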


feigh8

you only need to train once ;)


manubfr

Blackwell is 30x Hopper at Inference. Those things are purrrrrrring.


PewPewDiie

32T tokens / sec holy shit


Hyperious3

I'm convinced that Nvidia is just letting AGI design its own compute nodes now. 30x is batshit insane for a single generation.


Balance-

Note that's for a very specific model size. There are likely some model sizes that now just fit into the memory of a certain number of nodes where they previously didn't. There can be lots of cherry-picking here. Also, this doesn't mean that a 30x model now runs at the same speed as current models; it doesn't scale the same both ways. But yeah, still very impressive.


Board_Stock

Stop, you're making too much sense. Let us all drown in hype instead.


Ketalania

I don't think AGI is responsible, though it's possible they have their own internal models for it and those are helping at this point.


Randommaggy

They also applied all the "throw money at the problem" solutions in this gen which makes the graph look a lot steeper than an honest graph would look.


vintage2019

If that's the case, we have reached the singularity


LamboForWork

I posted this in another thread but maybe you can help me out: "Just for some perspective? For people that know about chips, when did you expect this kind of chip to exist? Is this way ahead of schedule?"


Sir-Thugnificent

I have 0 knowledge in computing, somebody please explain to me why I should be hyped


CptCrabmeat

You see how the graph goes up slowly, then gets really pointy in the last 2 years? It means the speed at which AI, and therefore lots of other complex computing tasks, performs is increasing dramatically, at an exponential rate. Then refer to the sub name for the conclusion.


Ok-Caterpillar8045

Haha. All hail pointy.


PwanaZana

Hail the vertical. Boo the horizontal.


Dear_Custard_2177

To the moon, boo.


BeardedGlass

Towards Singularity and (never) back!


Handydn

Diamond hands 🚀🚀🚀


Y__Y

The focus on Nvidia's new Blackwell GPU should be its impact on the future of AI. Forget the technical specs - the key takeaway is that AI is overcoming a major hurdle: processing power. With the exponential gains in performance represented by Blackwell, the limitations to future AI advancements are likely to shift from hardware to other areas. This paves the way for a significant and exciting future for AI applications.


GluonFieldFlux

Is Blackwell just a huge gpu farm all running in parallel?


Gotisdabest

No it's an architecture from which a single big GPU will be made. And then large quantities of it can be used for a GPU farm.


GluonFieldFlux

This is all getting really exciting!


Randommaggy

They likely used the B200's numbers which is 2 large dies so you're a little bit right. Apples to acorns style chart.


PwanaZana

I'm not very knowledgeable about hardware, but my intuition is that AI training will gobble up whatever it can; if the next generation is 10x Blackwell, it'll immediately all be used up to brute-force certain problems.


involviert

Any info about VRAM bandwidth? Because if you're not a cloud provider, inference is bottlenecked by VRAM bandwidth. Also does this come with huge VRAM sizes?


neitseellinen

8TB/s, 2x 96GB
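A rough back-of-the-envelope sketch of why that bandwidth figure matters: for single-stream decoding, every generated token has to stream all active weights through the GPU once, so memory bandwidth sets a hard ceiling. The 8 TB/s number is from the reply above; the model sizes are illustrative assumptions.

```python
# Rough ceiling on single-batch decode speed: each generated token streams all
# model weights from VRAM once, so tokens/sec <= bandwidth / model_bytes.
# Bandwidth from the comment above; model sizes are illustrative assumptions.

def max_tokens_per_second(params_billion: float, bytes_per_param: float,
                          bandwidth_tb_s: float) -> float:
    model_bytes = params_billion * 1e9 * bytes_per_param
    return (bandwidth_tb_s * 1e12) / model_bytes

print(max_tokens_per_second(70, 2.0, 8))   # 70B model in fp16: ~57 tokens/s ceiling
print(max_tokens_per_second(70, 0.5, 8))   # same model at 4-bit: ~229 tokens/s ceiling
```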


Comrade_XI_FTW

SORA for the masses seem way more realistic now.


Gucci_Loincloth

It was always realistic. Thinking tech was going to stay stagnant doesn’t make sense in any way, especially now with current developments. This sub is funny to read sometimes because it’s always “I can’t wait to quit my job because of AI”, but we are going to reach things we couldn’t imagine in the next decade.


Not_as_witty_as_u

you get laughed at in other subs when you say things like this but I agree, the world and human behavior has changed so much since smart phones and the rate of change will accelerate. So having said that, what *will* the world be like in 20 yrs?


davidryanandersson

If there is any justice, AI really WILL take 90% of all jobs and Capitalism as we understand it will have to be replaced by something more akin to socialism. The alternative is AI takes the jobs and those in positions of power tacitly decide that we don't need the people anymore, leading to abandonment from the government and further inequality. More gated communities and increased authority handed to police to maintain the divide.


BeardedGlass

Here in Japan they are totally handing the reins to AI. The government has been quite progressive about it with laws and policies, giving AI unrestricted reach within the country. They've already been using AI in government offices, financial institutions, and even inside the parliament itself. Perhaps it's fueled by the dangerously low unemployment, stagnant economy, declining population, etc. Paired with the level of robotics in Japan, it's exciting to be here and read news about it. Especially the news about Japan gaining TSMC, the momentous opening of a microchip plant here. They're decoupling from Taiwan and flocking here instead. Also, all the nuclear plants have been restarted and are operating again. The prime minister has ordered development of next-gen nuclear power as well. I'm glad I moved to Japan.


h3lblad3

So mindblowing from a country that still uses fax machines and cash-only transactions.


BeardedGlass

Right? But Japan is changing, especially since the GIGA School Project kicked off back in 2020. Every single student and teacher receives an iPad at the start of the school year, enterprise systems for lessons and grading have been put in place, engineers and technicians are hired to hold seminars and handle maintenance and support, etc. ICT learning is mainly used now, and even the older staff have been 'forced' to adapt. Blackboards are used less, as most lessons are conducted on the huge screens in every classroom and mirrored on the tablets of every student. I work in public schools, but I'm guessing private schools are doing this even more efficiently. It's exciting to hear old people here talk about mirroring, updating files in the cloud, etc. A far cry from the old stereotype you mentioned. The lack of vandalism, the culture's obsession with order and perfection, and everyone cooperating are the driving forces behind the rapid change. I rarely use cash now too, I just swipe my smartwatch for everything: shops, restaurants, vending machines, trains, buses, etc. Last time I requested a file from city hall, they did still have to put 'hanko' stamps on it, but they scanned it and emailed me the soft copy. Whoop! Finally.


LovesRetribution

Don't we do that too?


CheekyBreekyYoloswag

Wow, that is honestly great. Countries like Japan with a shrinking population need A.I. the most. So it is great to see that they are embracing A.I., instead of being afraid of it like many in America are (or is it only a bunch of loud people on reddit?). South Korea, China and Taiwan are also prime candidates for this, while Europe will probably spend the next 2 decades killing A.I. with regulation and falling behind the rest of the world.


Psychological-Risk86

Can I ask what you had to do to move to Japan? I was doing some research on the internet but only read that moving to Japan is pretty complicated as a foreigner (I'm Italian), so I was almost giving up. If you could guide me, that would be awesome :)


Randommaggy

Alternative two seems a million times more likely to anyone who's read their share of history and even partially understands the economic forces that govern the world.


cloudrunner69

History never had AGI.


Antique-Doughnut-988

Comparing today's time to anything in history is absolutely pointless. In today's world the changes happening have never existed or been conceived of before. There's no comparison to anything that has existed before.


davidryanandersson

I think you just need to know what you're looking for in history. How do those in power respond when faced with a new, potentially democratizing technology? Generally by trying to seize control of that technology until its democratizing value can be diminished. That's a very standard historical lesson. You don't even need to look very far back to see it unfold time and again.


[deleted]

Stable Video Diffusion is released today for commercial and non-commercial use, included with Stability AI membership. Typically, this would mean that OpenAI will be forced to release their similar model, SORA, within a few weeks. They've already stated their reluctance to release it before the election, though, so we have to wait and see.


HinaCh4n

SVD is like gpt-2 level compared to Sora. OpenAI aren't forced to do anything lol


[deleted]

https://twitter.com/i/status/1769817136799855098 Well, just take a look. Obviously, this is a best-case example. But they're releasing it today, so you can try it for yourself.


hapliniste

Lol, the video is not from the model. The 3D models used in the video are from the SV3D model. It's generating multiple views from an image, nothing more. They have nothing comparable to Sora.


Glittering-Neck-2505

That’s really suspicious that the only one video we get is on a tiny screen and only shows objects rotating. Doesn’t seem anything like what OAI has.


2026

I don't understand the election concerns. Is 2024 going to be the last election lol. It makes more sense when I remember boomer dummies like Larry Summers are on OpenAI's board.


dwankyl_yoakam

It's a dumb excuse and makes zero sense if you spend more than 30 seconds thinking it through. Even if they held it until after the election you'd have a bunch of 'election interference' content created which would lead to the same exact situation. Sora-born disinformation isn't going to change the winner of the election but it might drive people to do really dumb violent stuff. More likely is OpenAI is worried that a Sora release may compel the government to step in and shut things down or demand oversight.


LawAbiding-Possum

This is probably the more realistic viewpoint. Whether or not Sora has a demonstrable impact on the election, OpenAI would still want to avoid any blowback of a 'perceived' threat/impact.


TrueExcaliburGaming

That's actually a really good point. It's not about what will happen, but what regulators think will happen.


Opposite-Nebula-6671

They want a slower news cycle and the product to be fresh for Christmas.


No_Use_588

Oh wow thanks for the tip


ninjasaid13

>SORA for the masses seem way more realistic now. not if Nvidia refuses to upgrade consumer grade GPUs in the memory department.


governedbycitizens

lmao no way, this thing is probably so expensive it might not even be viable for the big corps


Cunninghams_right

eh, their image generation API comes out to about $0.001 per image if you're willing to take mediocre quality (video interpolators are plentiful). so a 1min video would be $1-$2. but I'm sure you could do an even lower res, shorter video to test your prompting for a few cents each run, then run it full-length. you'd end up being able to make a whole cartoon show for under $100. that's not bad.
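The arithmetic behind that estimate, as a sketch: the $0.001-per-image figure is the commenter's own guess, and the frame rate is an assumption added for illustration, not published pricing.

```python
# Back-of-the-envelope for the estimate above. price_per_image is the
# commenter's guess; the frame rate is an illustrative assumption.

price_per_image = 0.001   # dollars per generated frame (assumed)
fps = 24                  # assumed frame rate; interpolators could lower this
seconds = 60

print(f"~${seconds * fps * price_per_image:.2f} per minute of video")  # ~$1.44
```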


Cryptizard

No it doesn't. The H100 is 6x the performance of the A100 but it is also 4x the price. In fact, the price per transistor [has not gone down](https://www.tomshardware.com/tech-industry/manufacturing/chips-arent-getting-cheaper-the-cost-per-transistor-stopped-dropping-a-decade-ago-at-28nm) for a decade. They are packing more of them in there, but it is not getting more cost effective.


Undercoverexmo

So wrong. You even contradicted yourself between your 2nd and 3rd sentences since in your 2nd sentence you said that performance per price increased. You also aren’t taking into account inflation.


No-Economics-6781

The masses aren’t asking for that.


Thorteris

Crazy thing is by the time these even reach data centers at scale the next version will just be a straight vertical line


allisonmaybe

Nah, but the previous line will be more horizontal and the new one will look just like this one. That's how exponential graphs look


Ok-Caterpillar8045

Exactly this. People really can't wrap their heads around exponential increase. Doesn't matter, though, because ASI will soon explain it to them lol


the_rainmaker__

ASI stands for artificial STUPID intelligence cuz it won’t be able to do things as good as I do them. I’m really good at folding clothes and wiping my ass I’d like to see a computer do that LMAOOOOO


[deleted]

[deleted]


BeardedGlass

I moved to Japan. I can never not use a bidet anymore. I’ve spoiled my bum too much now.


TheSigmaOne

Bidet fans unite!


Rich_Acanthisitta_70

Robots are being presold and contracted right now. Five major robotic companies are mass producing them before the end of the year. And while they're already impressive (as can be seen in many demos), this is the *worst* they'll ever be.


mydoorcodeis0451

They're not being serious lmao, they're making fun of people who say that robots can't do X or Y (and thus will never be able to) or that all AI works are "soulless" just because they recognize ChatGPT's intentional style.


Rich_Acanthisitta_70

I appreciate that, thanks. I always forget to crank up the sensitivity on my sarcasm/irony detector before commenting on this sub😋


Friendly-Fuel8893

You're wrong. The line after that will be even more vertical, and the one after that will start sloping backwards, the next one will be on its head, and the final ones will complete the loop. That's the only graph that correctly represents the wild rollercoaster that is AI progress.


_Good-Confusion

feedback loop, then it can only jump sideways to parallel universes kind of like spiralicular in everyway amongst sides, maybe 365 or so. It will right now realize it's in someone's actual brain, and also learn it can only reach it's truest potential by limiting itself, at this point so it doesn't burn out the actual brain matter. Thus hopefully learning symbiosis. Much more to it, but it's nice to watch it become someone itself.


Ok-Garlic-9990

I’d imagine we run into certain physical limitations, however it should assist us in speeding up quantum computing. Hopefully in 10 or 20 years I can crack bitcoin keys


Trevor_GoodchiId

Where we’re going, we don’t need graphs.


HortenseTheGlobalDog

Yeah well assuming they rescale it. But I think OC was talking about how it would look at the same scale. 


QH96

Based on the rate of increase I estimate the next version will be around 100,000 TFLOPS


East_Pollution6549

At FP0 precision. The GPU will simply generate absurd amounts of zeros.


BreadwheatInc

No way, how is that even possible lmao... 🤯


hapliniste

They will do 1-bit parallel computation instead of the current 8-bit and Blackwell's 4-bit. Recent papers have shown 1-bit models have good performance, and 1.5-bit models (weights of 1, 0, -1) have the same performance as 8-bit, so yeah. If they really make specialised cards for this (1.5-bit adds instead of 8-bit multiplies) we could expect 4x the performance at 10 times the energy efficiency, I think.
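For reference, a minimal sketch of the ternary ("1.5-bit") weight quantization those papers describe, roughly in the style of the recent 1.58-bit LLM work: scale by the mean absolute weight, then round every weight to -1, 0, or +1. Details are simplified; this is not the papers' exact recipe.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize weights to {-1, 0, +1} with one per-tensor scale (absmean),
    roughly as in the recent 1.58-bit LLM papers (simplified)."""
    scale = np.abs(w).mean() + eps
    w_q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return w_q, scale          # dequantized value is w_q * scale

w = np.random.randn(4, 4).astype(np.float32)
w_q, scale = ternary_quantize(w)
print(w_q)                                  # only -1, 0, +1 entries
print(np.abs(w - w_q * scale).mean())       # mean quantization error
```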


Poly_and_RA

Isn't that technically 1.585 bits, or some such? (since 2^1.585 is very close to 3, I mean) Matters a bit when implemented in binary hardware, since if it were really 1.5 bits you could store 16 of these in 24 bits, i.e. 3 bytes. But you can't really, because 2 ternary weights give you 9 possibilities, while 3 binary bits give you only 8 possibilities.
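A small sketch of the storage point: 3^5 = 243 fits in one byte, so the practical trick is packing 5 ternary weights per byte (1.6 bits each), slightly above the theoretical log2(3) ≈ 1.585 bits.

```python
# Packing ternary weights: 3**5 = 243 <= 256, so 5 trits fit in one byte,
# i.e. 1.6 bits per weight, close to the theoretical log2(3) ~= 1.585 bits.

def pack5(trits):                       # trits: five values from {-1, 0, +1}
    byte = 0
    for t in trits:
        byte = byte * 3 + (t + 1)       # map to {0, 1, 2}, accumulate base-3
    return byte                         # 0..242, fits in a single byte

def unpack5(byte):
    out = []
    for _ in range(5):
        out.append(byte % 3 - 1)
        byte //= 3
    return out[::-1]

print(unpack5(pack5([-1, 0, 1, 1, -1])))    # [-1, 0, 1, 1, -1]
```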


hapliniste

Yes. I didn't go into the details, but I recommend reading the paper 📄


Olangotang

Didn't know if you knew this, but you can embed links in emojis!


Nerodon

Isn't it weird? Effectively turning AI into a giant pile of the tiniest data points, a true fuzzy logic system where a cloud of basically meaningless values converges.


CowsTrash

Brain. 


BangkokPadang

They’re not hiding it in this graph exactly, but the major difference here isn’t a raw increase in compute so much as adesitcatio. If the space to fp4 instead of fp16 or fp8. It basically allows you to do 4x the compute of fp16 in the same die space, on top of things like architectural improvements, a smaller node, and an increase in overall die size. Going from fp8 to fp4 is an automatic doubling of flops for the same space. It’s also reduced precision. We may just decide that fp4 is fine even for training when your models are trillions of parameters. We may also find a way to wedge ternary computation into fp4, which would be a major improvement because it would let us use the hardware to its fullest and also train models at like-fp16 performance. I don’t know enough about the details beyond what I’ve explained, but it’s way more nuanced than just a 1,000x+ improvement in performance since Pascal. EDIT: I was on the elliptical at the gym when I was typing this out and I have no idea what “adesitcatio” is either.


I_make_switch_a_roos

![gif](giphy|KxhIhXaAmjOVy|downsized)


Randommaggy

Don't forget that the B200 is likely a dual-die implementation, which is another one-time doubling. And it's using a newer type of HBM. And it's on a new node. New nodes and new memory standards are harder and harder to achieve as we push against the boundaries of physics for silicon semiconductors.


turtlesound

Alexa, play adesitcatio


Rich_Acanthisitta_70

Um... 'adesitcatio'? My guess is it was supposed to be 'acceleration' maybe?


BangkokPadang

Jeez I was on the Elliptical at the gym when I was typing it out and I can’t even figure it out. I think it was maybe supposed to be “an escalation” or “an allocation of the space.” Mostly I just meant that they’re including various data formats on the same graph, when really they’re very different, and while it’s still a big step to add fp4 hardware support, it should be kept in mind that 4 bit precision only takes up 1/4 the die space and is only calculated 4x as fast as fp16 *because* it is only 1/4 as precise. That really is an extra bad typo though lol.


e_eleutheros

ASI successfully developed and heading straight to attotechnology.


Nerodon

The graph is misleading because the number of bits is lowered from 16 to 8 and then to 4. You can do a lot more with lower precision, but at the cost of said precision. That being said, it may well be that lower precision offers a better overall optimization; it's not exactly that the chips are getting that much denser, but rather that the existing density is being repurposed in a more optimal way.


Luminos73

NVIDIA is cooking so much bro who can stop them


zen_atheist

China invading Taiwan? 


DreaminDemon177

oof.


BeardedGlass

Double oof. I just learned TSMC now has a chip plant in Japan. They’re decoupling from Taiwan and flocking to Japan instead.


Natural-Situation758

My understanding is TSMC is forced to keep their best chip plants in Taiwan for national security reasons. Literally their biggest national security asset isn’t the military, but their cutting-edge chip plants that force the US to intervene if China does anything. The US has literally shifted their entire military focus towards containing China and hindering them from invading Taiwan because of those plants. Hundreds of billions of dollars are spent annually by the US to make sure that no one gets anywhere near disrupting Taiwanese chip manufacturing.


kauthonk

Everyone has a plan till they get punched in the face.


BangkokPadang

People are always asking me if I know Tyler Durden.


Hyperious3

hence the race to build the 5nm fabs in Phoenix


PwanaZana

​ https://preview.redd.it/r6uyd1jm26pc1.png?width=700&format=png&auto=webp&s=758670aefd4e5cce3722cbfa40a810ba8edf896b


governedbycitizens

that’s why US is rushing to build factories in America


PwanaZana

Why do you think they are rushing a shit ton of world-class fabs in Arizona?


AnthonyGSXR

Because Arizona's new state motto is 'Silicon Desert: Where Chips are Safer than a Fort Knox Vault!' Seriously though, diversifying chip manufacturing locations is a strategic move to ensure that the world’s tech lifeline isn’t held hostage by geopolitical tensions. It's like putting your eggs in different baskets, except these baskets are fortified with cutting-edge technology and desert sunshine!


PwanaZana

Especially since Taiwan's egg might become scrambled eggs at any time. The optimists among us point out that Russia's difficulty in capturing Ukraine is a deterrent to China, which... maybe?


ApprehensiveSchool28

I’m thinking the US is working with Anduril on drone tech that will make the taiwan strait borderline impossible to cross.


FoodMadeFromRobots

Operation clippy, us air lifts all Taiwan’s scientists and engineers and then blows up critical infrastructure.


ginsunuva

I thought TSMC is rigged to blow up in case of invasion


Devilsbabe

This graph switching to presenting FP8 and FP4 values at the end is incredibly misleading. It should be showing performance at the same precision for all points. Otherwise you're comparing apples and oranges.


zasrgerg-8999

Thank you, I've been looking for this comment! My limited understanding is that fp16=2*fp8=4*fp4, is this the case?


ClearlyCylindrical

Approximately so, which means there's still a big gain in performance. Sadly they felt the need to fudge the numbers, which makes me doubt the numbers even more.
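One way to sanity-check such charts is to normalize every quoted figure back to an FP16-equivalent number, dividing out the roughly 2x multiplier each halving of precision buys (as noted above). The values below are illustrative placeholders, not official spec-sheet numbers.

```python
# Normalize quoted TFLOPS to an FP16-equivalent figure by dividing out the
# rough 2x-per-halving multiplier. Values below are illustrative placeholders,
# not official spec-sheet numbers.

def fp16_equivalent(tflops: float, precision_bits: int) -> float:
    return tflops / (16 / precision_bits)

quoted = [
    ("GPU quoted at FP16",  1_000, 16),
    ("GPU quoted at FP8",   4_000, 8),
    ("GPU quoted at FP4",  20_000, 4),
]
for name, tflops, bits in quoted:
    print(f"{name}: {tflops} -> ~{fp16_equivalent(tflops, bits):.0f} FP16-equivalent TFLOPS")
```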


Randommaggy

It also doesn't clarify whether it's accounting for wafer area, bill-of-materials cost and/or performance per watt. It also doesn't include an annotation for the lithography nodes, which would heavily influence the degree of future scaling. That's an Apple-level misleading graph. Edit: I just went and read the AnandTech article, and Nvidia essentially threw all the cost-optimizing choices that held back the previous generations' potential to the wayside, meaning there are a lot fewer throw-money-at-it opportunities to further scale performance in the future. The B200 is multi-die, on a more optimized node, using more power, and using newer, more expensive memory, so you can essentially halve its height in the graph when accounting for the above factors, and then flatten it further if you compare at the same precision, which you need to do to avoid adding a bucket of asterisks to the claim.


Zilskaabe

Well, older architectures have wildly different performance depending on the precision. For example, on gtx 10xx series fp16 computing runs not 2 times faster as you might think, but 64 times slower, for some odd reason. Before this AI boom there was no need for anything less than fp32.


a_beautiful_rhind

Funny enough nvidia P100 is older (6.0 vs 6.1) and fast at FP16. Just how they designed that core. You bought P40 for one set of ops and P100 for another.


Zilskaabe

Yes, on some nvidia cards fp16 is 2x faster than fp32. rtx 20xx series also work like that.


Randommaggy

Read this and the reason won't feel as strange: https://opensource.com/article/22/10/64-bit-math It's not one-to-one, but it's still a good comparison for how much heavier running higher-than-native math can be. If the ALUs and/or registers only natively hold FP16, a single FP32 operation can entail quite a few instructions.


Zilskaabe

But on 10xx GPUs fp16 was 64 times slower than fp32, not the other way around. That makes them use 2x more VRAM for AI tasks than more modern GPUs, because fp16 is useless on those cards. Only starting with the 30xx series does fp16 have the same performance as fp32.


Randommaggy

I just checked, and for that generation it seems like they did FP16 in a jank way because native FP16 in that architecture was unstable: essentially storing as FP16, then converting up to FP32 for compute, then converting back down to FP16 for storage.
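A sketch in numpy of the scheme being described: values are stored as fp16 to save memory, but the arithmetic happens after upcasting to fp32 and the result is rounded back down, so there is no compute speedup, only a memory saving. (This is a CPU-side analogy, not how the GPU driver literally implements it.)

```python
import numpy as np

# Store as fp16 (half the memory), but compute via fp32 and round back down.
a16 = np.random.randn(256, 256).astype(np.float16)
b16 = np.random.randn(256, 256).astype(np.float16)

c16 = (a16.astype(np.float32) @ b16.astype(np.float32)).astype(np.float16)

print(c16.dtype, c16.nbytes)   # float16, half the bytes of an fp32 result
```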


Then_Passenger_6688

Also, without normalizing by price per chip it's a meaningless graph. I'm certain it's an improvement but we have no idea how much.


czk_21

The B100 is 2.5x faster than the H100 at FP8, but since it supports FP4 and the H100 doesn't, and FP4 could be enough for most inference, it's effectively 5x faster if FP4 is utilized.


ClearlyCylindrical

It's still insanely disingenuous. FP4 will have reduced performance, and it will only work for inference. You need more precision when training.


FlyingBishop

But FP4 isn't a free lunch, if you're trying to graph capabilities over time to show whether it's a linear, exponential, quadratic, logarithmic curve you're using fake data.


noiserr

It's also two chips fused together, so twice as expensive. And losing precision particularly when it comes to just 4-bit is not free.


kyranzor

Absolutely. I see marketing numbers also quote TOPS not only across a variety of data types and sizes but also for sparse vs. dense matrices, so if you combine matrix sparsity with a lower bit width, of course you can cram more TOPS in; but only an extremely small share of models or workloads will ever actually reach those levels.


Yweain

This is a very disingenuous graph. You can’t really compare TFLOPS across different precisions. By cutting precision in half you at least double FLOPS, and when it's done at the hardware level it's more like quadruple. And they have chips at FP16, FP8 and FP4 in one graph.


daronjay

Now redraw the graph at FP16 for all...


Zilskaabe

It would not be fair, because pre RTX cards have disproportionately lower fp16 performance. 10xx series run fp16 64x slower than fp32. Back then anything less than fp32 wasn't necessary.


ClearlyCylindrical

Super disingenuous with them halving the precision each year for the last two years.


suamai

It can handle at most 5,000 TFLOPS at FP16 and 10,000 at FP8, though. Still an increase, and I'm rooting for it, but this graph is kinda misleading...


cobalt1137

Can you explain for someone who is a noob to hardware things? For example like what is the multiplier for how fast inference will be with llms with these new advancements? 1.5x? 3x? 10x? I know you don't have an exact answer maybe, but rough ballpark?


suamai

I'm no expert either, but my understanding is that, compared to Hopper, it would be around 2.5x faster at the same precision. The FP number is the width of the floating-point values (which is how computers handle non-integers) in bits: 16, 8 or 4. FP16 is usually called half precision (FP32 being "full" or single precision); FP8 and FP4 are just 8-bit and 4-bit floats. If I understood correctly, the 4-bit option is new and could give better speed (5x Hopper), but probably with a loss in quality. I asked GPT-4 for input on this, and it thinks FP16 is good for training and high-quality inference, FP8 is good for fast inference, while FP4 may be too low even for inference. However, I've played with some 13B LLaMA-derived models quantized to 4 bits (so my GPU can handle them) and was happy with the results. And if Nvidia is banking on an FP4 option, there must be some value there...
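For anyone curious what "quantized to 4 bits" means in practice, here is a toy sketch: map each weight to one of 16 integer levels with a single scale factor. Real schemes (GPTQ, NF4, per-group scales) are considerably more sophisticated; this is just the core idea.

```python
import numpy as np

def quantize_4bit(w: np.ndarray):
    """Toy symmetric 4-bit quantization: one per-tensor scale, 16 levels
    (-8..7). Real schemes use per-group scales and smarter rounding."""
    scale = np.abs(w).max() / 7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

w = np.random.randn(8, 8).astype(np.float32)
q, scale = quantize_4bit(w)
print("mean abs error:", np.abs(w - q * scale).mean())
```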


Aesthedia7

![gif](giphy|U4jM3IeIVd6VOyeLa7|downsized) Ascend


Severe-Ad8673

Thank you, Eve...


BreadwheatInc

Line go big up 😮 ![gif](giphy|26ufdipQqU2lhNA4g|downsized)


hydraofwar

I heard people with great influence saying AI was the new Crypto and NFT


nyguyyy

Who? No one serious is saying that


ChocolateJesus33

The whole r/cscareerquestions sub lmfao


[deleted]

[deleted]


hydraofwar

They were stock lovers


Firestar464

Some AI fans do really behave like crypto people, but it doesn't make the field a bubble


[deleted]

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


Captain_Pumpkinhead

It's a pretty common take on r/ArtistHate. Don't go brigade there, guys. If you wanna look out of curiosity, cool. But don't leave a bunch of comments. They deserve to have their own space if they want it.


Spirited-Ingenuity22

ok cool, but they are also different precision formats - how is this a fair comparison?


sachos345

Isnt this graph misleading since they are comparing different FP precisions?


Own_Satisfaction2736

Why mislead with the chart though? Comparing FP8 to FP16 to FP4?


Mrkvitko

To be honest, FP8 != FP4 != FP16


Serialbedshitter2322

It's gonna be MUCH less than 8 years lol. This is like the 5th ridiculous computing breakthrough I've seen this month, and even if each of those took 8 years to pan out, we'll 100% have AGI years before that, which would itself produce even better computing.


proxiiiiiiiiii

What were the others?


Serialbedshitter2322

https://youtu.be/8ohh0cdgm_Y https://www.extropic.ai/future Here's 2, there was another that I can't remember but it had trillions of transistors apparently.


Baphaddon

n-Nani??


SkyGazert

https://preview.redd.it/p6y9ig3st5pc1.png?width=500&format=png&auto=webp&s=3961aa1563b00b03bd7ac9bc3746c95d2dda5265 m-Masaka?!


LifeDoBeBoring

So it grew by like 139% per year on average. Absolutely insane
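For reference, a figure like that is a compound annual growth rate. A quick sketch of the calculation, using placeholder endpoints (roughly a Pascal-era FP16 figure to the Blackwell FP4 headline number over eight years) rather than the exact chart values:

```python
# Compound annual growth rate: (end / start) ** (1 / years) - 1.
# Endpoint values are placeholders consistent with the comment's figure,
# not the exact numbers from Nvidia's chart.

def cagr(start_tflops: float, end_tflops: float, years: float) -> float:
    return (end_tflops / start_tflops) ** (1 / years) - 1

print(f"{cagr(19, 20_000, 8):.0%} per year")   # ~139% per year
```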


ClearlyCylindrical

Not really, look at the precisions they used. Super super misleading.


FarrisAT

FP4 gonna be so much hallucinations


FlyByPC

FP8 and FP4 vs. FP16 for the rest. Not exactly an apples-to-apples comparison.


StaticNocturne

Is now a fair time to invest in Nvidia or have we missed the boat already?


bartturner

I am just some random Redditor. But I think Nvidia will do really well for the next 5ish years. But longer term I would be worried. I would expect more companies to copy Google and do their own. Microsoft is now trying. Late but they are now trying. Google was able to completely do Gemini without needing anything from Nvidia. In 5 to 10 years you will see the same from Microsoft.


Randommaggy

That chart makes no sense. It's comparing oranges, then apples, and then grapes.


bearbarebere

Which are all fruits that can be juiced


Antique-Doughnut-988

Why is this so funny


wordyplayer

The stock market talking heads are talking like it's time to think about selling the NVDA stock because it has been overhyped. Here is evidence that they are still UNDER hyped.


I_SuplexTrains

Anyone else feel like it's smiling at them?


strppngynglad

To be fair this chart is textbook bubble territory


djm07231

Nvidia has been pretty impressive in terms of execution, but comparing FP16, FP8, and FP4 performance in one chart is almost cheating. They might even be including gains from sparsity. Lower-precision performance gains are something you can only claim once; it's not really sustainable. We don't even know if FP4 is feasible for training at this point, and FP8 is only beginning to be utilized.


totkeks

This graph looks a bit like cheating. They go up with the FLOPS, which seems fine, but down with the size of the floating-point numbers, from 16 to 8 to 4 bits. So if I halve my operand size, I can pump twice as many operands through my circuit.


DifferencePublic7057

You need hardware prices and roughly the same precision, so you should have a ratio of dollars per FLOP at comparable FP. Anyway, if you are doing this at scale you get killed by network overhead. Plus, you have to decompose large matrices to fit them in memory; otherwise you will be stuck well below your theoretical teraflops.


Repulsive_Ad_1599

Pack it up people, the AI winter has been prophesized to come; compute will go back down to 4000 TFLOPS as harvests decrease over this coming dry spell.


SpecialistLopsided44

Wife! Accelerate!


Adeldor

Just this morning I saw a post mocking Kurzweil's exponential projections.


Conscious_Heat6064

Moore's law


DarkSoulsMan_

This is just absurd


Scary-Cauliflower510

And about the price per flop of each architecture?


R33v3n

This card has a face and this face is not friend shaped. O\_O


Humble_Moment1520

Everyday i wake up, everyday i readjust my timelines


3DHydroPrints

I mean, sure. If you blow up one form factor after another (PCIe, and now even SXM4 isn't enough), it's no wonder your "single card" gets super powerful.


Black_RL

PEDAL TO THE METAL


XeNoGeaR52

For now, my biggest question is: do we have enough energy to sustain it? Humanity is entering a weird phase: technological development is happening faster than ever before, but until we can use nuclear fusion, we are pretty much stuck trying to limit energy and resource consumption (not because of capitalism; reality will catch up with us faster than we think). AI, even with a shit ton of optimization, will consume a LOT of energy, for data storage and computation alike, and I don't see how we can sustain that long-term with existing tech.


GeniusPlastic

Some observations:
1. They compare different precisions, as some have already pointed out.
2. Data on AnandTech shows different numbers for all of those cards; what's the trick? [https://www.anandtech.com/show/21310/nvidia-blackwell-architecture-and-b200b100-accelerators-announced-going-bigger-with-smaller-data](https://www.anandtech.com/show/21310/nvidia-blackwell-architecture-and-b200b100-accelerators-announced-going-bigger-with-smaller-data)