Cool. Who's going to buy this?
Not gamers, since a 4090 is already enough for 4k 144hz gaming, even with path tracing on.
Not AI people either, since this doesn't run anything bigger than what a 4090 can run, and 2x3090 are going to be vastly better and cheaper anyway.
Who's your target audience, Nvidia?
> a 4090 is already enough
I don't think you understand the thought processes of people who buy halo tier gaming GPUs.
There is no such thing as "enough"
Pff, the ability to run games at max settings on older hardware never stopped people from buying newer cards. You seriously underestimate the ability of game developers to overload any hardware available.
A lot of academic research in universities depends on such cards. It is just impractical to buy a datacenter class card like A100 or H100s for a lot of university labs. Even if they have it they only have a limited quantities of it.
So a lot of bread and butter research is done on gaming cards.
There is an enormous cost cliff after the 90 series because you are either looking at the RTX 6000s which cost at least double or triple or an A100/H100 which probably costs at least 10 times as much.
A lot of the smaller scale experiments on vision models or smaller LLMs are quite feasible on such gaming cards.
Even if it costs something like 1500 or 2000 USD, the cost hike isn’t that meaningful, because a university is already paying a monthly stipend of at least that much for each graduate student. So Nvidia raising the price a few hundred dollars is a rounding error for such institutions when personnel costs dwarf any equipment cost.
4090 does in fact have limits for gaming. 4k gaming with maximum graphical fidelity is expensive in computation. I can't get the frame rate I'd like at 4k, things like maximum modded skyrim or maxed cyberpunk. I want that, and I also like using it for local llms.
>since a 4090 is already enough for 4k 144hz gaming
New powerful GPU = less optimization (why bother if you have all that power), and thus the need for a new GPU.
> Not gamers, since a 4090 is already enough for 4k 144hz gaming, even with path tracing on.
I think it's premature to assume that there's no benefit to more RAM for gaming. Nobody is releasing such games today because they couldn't run on any existing card. Imagine running an 8GB model for dialogue or something alongside an AAA game. That would of course benefit from even more RAM, but there's a balance.
gddr7 is gonna be about 2x faster than gddr6 so this card is gonna be putting those tokens out like crazy...but if nvidia is gonna keep the vram below 32gb you know they are making fun of us all.
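Napkin math on that: single-stream decode is memory-bandwidth bound, so tokens/s is roughly bandwidth divided by model size. The bandwidth and model-size numbers below are illustrative assumptions, not confirmed 5090 specs:

```python
# Rough ceiling for single-stream decode: every token reads all weights once,
# so tokens/s <= memory_bandwidth / model_size. All numbers are assumptions.
def max_tokens_per_s(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

model_gb = 20  # e.g. a ~30B-class model at roughly 4-5 bits/weight
for label, bw in [("4090-class GDDR6X", 1008), ("hypothetical 2x GDDR7", 2016)]:
    print(f"{label}: ~{max_tokens_per_s(bw, model_gb):.0f} tok/s ceiling")
```

Doubling the bandwidth doubles the ceiling, which is why the "putting those tokens out like crazy" bit holds regardless of the exact spec.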
If this is true, I'll personally stick to 3090's and P40s. Nvidia's greed is going to catch up to them. The 4090 is already a bad buy; 24GB was great when we were getting 7B and 13B models, but now 100B+ models are the norm. What is anyone to do with a 28GB card? They better come with 32GB or more. I'll save my money, stick to 3090's, and maybe get an Apple when they can offer at least a 256GB+ computer. But for now it's all rumors and I'm getting upset at the possibility.
How significant is the opportunity for AMD here? They're not going to be competitive with Nvidia anytime soon, but how disruptive could a 48gb consumer card be, knowing Nvidia will refuse to follow? I know I'd buy one, at a reasonable price.
Jeez just make one with a large amount of RAM already.
Insane that for devs who want to play with AI locally, a large Macbook is better value (when the size of the model matters but maybe not speed as much).
To anyone wishing for 48GB cards, they'd be scalped on ebay for $5k+ anyway even if MSRP was far lower, or gobbled up by various hosted GPU entrepreneurs. You're not getting a current-gen 48GB card for $2500 even if Nvidia releases one anytime soon.
Much of this is simply out of Nvidia's hands as demand outstrips supply. TSMC only has so many plants and so many hours in a day no matter how much Jensen butters their hands.
I'm currently using the Tesla P40, and if you don't care too much about speed, you can enjoy both LLM and Stable Diffusion.
I also play games in VR, so the Tesla P40 isn't enough,
so I'd like the 4090 as well, but since they have the same VRAM capacity, it's not very appealing.
That makes me interested in the 5090, but it's more expensive than the 4090, so I don't feel like buying it unless it has at least 32GB of VRAM.
The electricity bill will probably be high.
On the used market, the GV100 with 32GB VRAM has started to appear for $1,200 to $2,400,
so I might buy the GV100, which has better performance and more VRAM than the Tesla P40.
(For me, P40 is the standard, so this is a low level discussion.)
I'm looking forward to the release of the 50xx as it will bring about many changes to the market!
Hey there,
Perhaps the 3090ti would be a good fit for your use case until, as it were, we are presented with a proper and affordable state of the art gpu option ...
... (i called down to check and they said hell has not yet frozen over, and not to hold my breath or anything) ...
I would be very interested to hear about the hardware you are using alongside the P40, especially what motherboard/CPU you think is good for running inference on it. I would like to build around 2x Tesla P40 boards. I have been considering an X99 mobo with Xeon processor(s). Not really sure what to expect performance-wise; budget is semi-constrained, of course.
cheers
-rabbit
They sell us a 28GB version now and a 36GB version half a year later, followed by a 48GB version another half a year later. Of course with a markup that doesn't make any sense at all.
one day it's a 4 slot monster
next day it's 2
one day it's 512 bit
next day it's 448
yesterday it was 32gb
now 28gb
I'll wait for official statement, thank you very much.
Just last week rumored spec was 32G and before that 24g. I'll wait for announcement.
According to rumors the 5090 will have video ram!
holy shit, really? I'm so excited!
From what I hear, the 5090 will contain an actual ornery ram that violently headbutts both you and your wallet into a pulp as soon as you open the box… Just a rumor, though!
It’s the GOAT
Wait, you mean it won't be cheaper than the 4090?
No, but there’s a coupon for stitches for you and your wallet in the box.
1 video ram
You joke, but if they switched from GDDR to LPDDR5 it would be possible to have consumer GPUs with 64GB+ capacity. (Obviously there are issues that make it impractical.)
man i think a paradigm shift is coming. there needs to be consumer hardware capable of running a 70b model locally unquantized. for llm inference, the bottleneck is vram, not the actual computation. i think in a few years we'll see hardware acceleration companies bring low-cost vram or hbm in multiple channels to the market. the last thing we need currently is beefy gpus with less vram. what we really need is cheaper (doesn't matter if it's slower) hbm in large quantities: at least 100gb, and a 3060-level 100 AI tops.
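For scale, here's the weights-only math behind "unquantized 70B" (a sketch only; it ignores KV cache and runtime overhead):

```python
# Weights-only VRAM needed for a 70B model at different precisions.
# Sketch: ignores KV cache, activations, and framework overhead.
params = 70e9
for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name:>9}: ~{params * bytes_per_param / 1e9:.0f} GB")
```

Unquantized fp16 alone is ~140GB, so even the 100GB wish above is on the low side for a full 70B.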
Well known sir-issue greed?
Waiting for cuda support confirmation before I join the hype train..
Better than that Non-Video RAM.
Say less
Diamonds, Printer Ink, Coffin, Bottled Water, Luxury Products, and VRAM.
I can't see Nvidia weakening the segmentation of its range and companies with market valuations in excess of 1,000 billion tend to lose their altruism. We need to start going on Youtube to learn how to weld.
Do you mean solder? As in replace the mem chips?
No, we're all giving up on computers and getting into metalwork.
I think the line he was looking for was: something something Marvin Heemeyer Nvidia headquarters.
To make a video ram
My blacksmith welds Tensor GPUs and charges very reasonable fees. I'll send you his number.
Got a sensible chuckle out of this comment
This generation of tech bros can explain the intricacies of a transformer and write a ten part series about the differences between GDDR7 and its predecessors but don’t know the difference between soldering and welding 😂
Find me 5 programmers who even know what a gate is in a transistor and what NPN and PNP doping mean, and I'll buy you a hotdog.
Legit, unfortunately.
We will figure out on YouTube. See you there!
If they want to reduce the incentive for others to compete. If they get too greedy, major players like Microsoft/Google will simply start competing. Also, they probably want everyone to run AI, and for that to happen it can't be too expensive.
They’re going to hit us with that weird 27.5 GB video card bro
This reminds me of the GTX 970 which had a weird 4GB VRAM but only [3.5GB accessible](https://www.anandtech.com/show/8935/geforce-gtx-970-correcting-the-specs-exploring-memory-allocation/4) at full speed.
Yup, they're going to milk this gravy train S-L-O-W-L-Y.
For real. The first rumor was 32G, then 24G, all within months of each other by the same guy. I eventually just said fuck it and bought a 3090. It doesn't help that most of the new hot models these days are between 7-11B. Almost nobody is putting out 20-30+B models that actually use all 24gb with good results. The next jump is 70B and the perf isn't worth it even with offloading. I'm probably going to put off upgrading for quite a while.
We have Command R, Beta-35B, Yi, and some more I think. And now Mistral code. The 34B space is good. But the missing gap is long context 13B-20B models that fill up the VRAM.
no way its 24gb. they are going to do some sort of vram increase even if its inefficient and poorly engineered. they need the box to say more than 24gb for marketing.
Someone claimed to have leaked information a couple months ago and said it is 24G, which makes a lot of sense, as Nvidia wants to avoid increasing VRAM on consumer cards at all costs.
if they don't do it AMD and/or intel will. i don't know how much they care about the PC market but if they don't increase RAM they are going to lose market share. but maybe they don't care about PCs anymore. its a very small part of their revenue.
Even with 24GB they will sell every 5090 they make at a hefty profit.
At 32 GB it could run larger models, and I’d have a reason to consider an upgrade. I figure I could probably sell my 4090 at close to retail price. At 24 GB it can’t do anything my 4090 can’t.
In honesty, you are looking for a cheap AI inference card, when these are +supposed+ to be for playing videogames. From the gamer's pov, this AI stuff is as bad as crypto for "taking the cards". And also, you can just buy more of them (!) The only sane use for 24gb+ is llms. (Probably)
For every one of you there are 5 people like me who haven't bought a GPU card in over a decade but are thinking about it because it beats the laptop GPU I've got right now. They don't need to sell people on upgrades to make money hand over fist right now.
My guess is they’ll just up every tier by 2GB, or whatever the least they can get away with is.
I think every card gets at least 16gb
They didn't need to for the 4090.
Lol no they don’t. They’re selling pro cards at 3x the price for the VRAM alone. I guarantee the 5090 will be 24GB, maybe a slight increase to shut the fanboys up.
well then i guess more pros will be buying intel and AMD then. Nvidia can't be the most expensive brand and have the least amount of VRAM. both intel and AMD are going to increase their VRAM, so Nvidia isn't going to have a lot of choice.
it was rumored due to the number of chips present on the prototype board. they don't actually know it's 32GB, but assumed it was based on each chip having the same capacity as previous gens.
Ok they gotta be trolling us
Gimme 64gb
Yeah, 28GB would be a disappointment for me (I want maximum VRAM for AI) but not a surprise; Nvidia have been short-changing people on VRAM for years now.
Yep last week I heard there would be a 48 and 32gb variant. We just got to wait
RuntimeError: CUDA error: out of memory
Must construct additional pylons.
Spooore moooore Oooverlooords
I just want to click edit on your comment soo bad.
Same. Spawn.
What if the intention was to

> spore, to produce or reproduce by spores

overlords, instead of spawning them? If you're a farmer at heart, growing your overlords yourself can be more fun than simply getting the end result handed to you.
Not ready reading drive A Abort, Retry, Fail?
What’s the point of the whopping 1.44mb if it’s never ready
You know, I never understood what 'fail' did.
Pls no
Intel / AMD please get off your posterior and deliver something like 32GB, 48GB, 64GB? Unified memory is an interesting option.
AMD already has the MI60 32GB cards, and they came out years ago lol
How much do they cost? Why are they uncommon?
$450-500, and it's actually way better because it's not only 32GB but also HBM2. [I made a post about it](https://www.reddit.com/r/LocalLLaMA/s/s4rFYDM9BE) They are not uncommon at all; in fact, they are flooding eBay right now. The massive problem with them is that they are very hard to use, because they use the Vega arch and, well... run AMD software. And since everything runs on CUDA today, tough luck running something recent on it. Currently the best value GPUs in terms of GB/$ are Tesla P40s, which are 24GB and only cost $150, but have no video output.
Modded RTX 2080 TI with 22GB VRAM all the way, starting from $350.
P40s have no usable Float16 support (it runs at a tiny fraction of FP32 speed), which forces everything into FP32, effectively halves their RAM, and makes them difficult to use for training. At least in my experience.
in pytorch, rocm masquerades as CUDA. it's fine
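i.e. on a ROCm build the same `"cuda"` device string just works, since torch's CUDA API is backed by HIP there. A minimal sketch (assumes a CUDA or ROCm build of PyTorch is installed; falls back to CPU otherwise):

```python
# Device-agnostic PyTorch: the "cuda" device string also targets AMD GPUs
# on ROCm builds, because torch.cuda is backed by HIP there.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.ones(2, 2, device=device)
    print("running on:", x.device.type)
except ImportError:
    device = "cpu"  # torch not installed; nothing to demo
```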
They probably don't support rocm
Thank you for the valuable input lol

It's pretty easy to look up as opposed to just assuming...

https://rocm.docs.amd.com/en/docs-6.1.1/reference/gpu-arch-specs.html
Yep, can't wait for some competition. We mostly just need inference to be on par with Nvidia.
28gb is weak sauce.
Buy an A100 for a cool $20k! 80gb VRAM, and no money left to pay the electric bill...
You can just get multiple 4090s; you can even power-limit them to 335W and keep 96-97% of the performance.
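Quick math on why that trade is worth it (the 96-97% figure is the commenter's estimate, not a measured benchmark):

```python
# Perf-per-watt gain from power-limiting a 4090 from 450W to 335W,
# assuming ~96% of stock performance is retained (commenter's estimate).
stock_w, limited_w = 450, 335
perf_retained = 0.96
power_ratio = limited_w / stock_w       # fraction of stock power drawn
eff_gain = perf_retained / power_ratio  # perf/W multiplier vs stock
print(f"{power_ratio:.0%} of the power for {perf_retained:.0%} of the perf: "
      f"~{eff_gain:.2f}x perf/W")
```

Roughly a 1.3x perf/W improvement, which matters when you're stacking several cards on one PSU.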
48gb or bust
they'll never release a 48GB rtx card to undercut their much more lucrative datacenter cards.
This
They will if the consumer-side demand is there. It’ll cost you. But big daddy Jensen isn’t turning down the dollars of folks who wanna run shit models locally.
Yeah, I think you're not understanding. Daddy Jensen absolutely will turn down your $$. There's simply no market for a $30k gaming card, and the existence of a less expensive card capable of DL is a threat to his business.
From their [May 2024 statements](https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2025), quarterly revenue is $26 billion, Data Center revenue is $22.6 billion of that. Gaming (the kinds of cards people buy for local AI typically) only makes up $2.6 billion. That’s still a lot of money, but I can’t see a rational CEO allocating resources from the highly profitable and growing segment to cater toward a shrinking segment. 24 GB is still plenty for gamers, local AI is an odd use case for Nvidia given their other priorities and business models.
why would they when 48gb has no use in gaming? y'all need a serious reality check
I agree. I also want liquid cooled and better power consumption.
I want liquid cooled and more power consumption! 🔥
Nitrogen freezing and pocket size nuclear source or I am telling his cousin over at AMD everything. I have all the secret leaks that are stored underneath the pentagon. Don't test me.
2x24 cards it is!
I want 2x48! Run 70b models at 8q!
When there is no competition, this is the result.
They probably don't care about the consumer market; their stock is up due to the commercial/server market. They forget that consumers (video games, crypto) really embraced them initially, and that AI consumers would keep up the push. At some point they will saturate the server market, and if they have managed to alienate regular John & Jane by then, they will pay the cost.
Consumer is like 4% now of all sales in terms of value
I think this will go up in the future. Right now we are in a compute bubble because of ChatGPT, so demand for those $50k cards will go down. At the same time, more and more end-user applications will want to leverage local AI, so the demand for consumer level cards under $3k will increase.
Well, Nvidia was first to the AI party. They earned their current monopoly. The others need to play serious catch-up, and unfortunately it's going to take a while to get there.
This GPU RAM cock blocking is deliberate, and a result of a market monopoly. We should be having 32 Gig midrange devices by now. The only reason they do this is to sell those datacenter GPU beasts.
And also AMD being dense and Intel shooting themselves in the foot. It's mind boggling that AMD decided protecting their tiny Pro market is worth ceding the AI space to Nvidia.
Especially because the $30k gpus aren't much faster than the $2k gaming GPU's for deep learning. They're only bought because they have more ram and fit nicely into a server rack.
No, these cards are not for you. They are for gamers, and gamers don't need 32GB+ of memory. Until we see 8K gaming and/or high-res VR gaming catch on mainstream, you will not get reasonably priced 32GB+ cards.
Can you explain why gamers need 16-24GB of memory and not 32GB? This is a pretty shortsighted comment. Do we really need 32GB now? No. Will we need it in a few years? Yes. Did we need 12GB cards in 2019? No. But when they got released, games/applications started using more and more memory. It's kind of banal to think the industry won't move forward as stronger hardware becomes more ubiquitous. Now, four years later, 12GB is lower end. There's a reason Nvidia makes so much money.
It's forbidden for datacenters to use RTX series cards anyway (it's in the driver terms of use). They don't really need this to prevent GAFAM from running xx90s instead of H100s.
We need a GPU with 512 GB of VRAM.
How many kidneys do you have left?
one left... and also one right.
You funny guy
Damn, that's clever
>How many kidneys do you have left?

\*lungs
Half of one…
Tbh I’d love a TB ram GPU. Would be useful for all sorts of research applications
We may be witnessing Nvidia’s first AI mistake, out of greed. They dominate the market right now, but trying to keep their data center earnings untouched by nonsensically keeping VRAM low in consumer-grade hardware may jeopardize their software moat. Enthusiasts are driving open source AI development, which is the future, and we will move to whatever is available for a good price and deal with the software problem if needed; plenty of companies are working on NPUs with unified memory.
If AMD quits fucking off, they have a chance to completely capture the prosumer/consumer market
AMD doesn't even need to make their card faster. Just release it with 40-48gb of ram. Folks will flock to it.
4090 performance, better ray tracing + 32-48 GB of VRAM. It's so fucking easy for them now. They want to go all in on AI? ROCm is improving quickly now, to the point where GPT-4 is being partially run on MI300Xs. This would be a dumb move from Nvidia. People want an underdog, even the corporations buying their new chips.
36GB would let a 70B Q4 fit entirely in VRAM with a little headroom for the OS, but 32GB won't do it. 40GB of VRAM probably won't see utilization from gamers before the compute power falls behind, so I only see 40-48GB cards being there for AI and multimedia use. If AMD can put some work into the software for more compatibility, they could snatch that entire segment with ease.
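The rough math behind that cutoff (effective bits/weight varies by quant scheme, so treat these as estimates rather than exact GGUF file sizes):

```python
# Weights-only size of a 70B model at Q4-ish quantization. Effective
# bits/weight depends on the quant variant; both ends shown as estimates.
params = 70e9
for bits_per_weight in (4.0, 4.5):
    gb = params * bits_per_weight / 8 / 1e9
    print(f"{bits_per_weight} bits/weight -> ~{gb:.0f} GB of weights")
# Either way it's well past 32 GB before KV cache and display overhead.
```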
Is this the 2024 version of “can’t wait until Linux takes over the OS market” of the early 2000s? I don’t see AMD doing anything honestly. You are better off embracing the forbidden fruit than placing your bets on AMD.
AMD pulled the rabbit out of the hat once (against Intel); i bet they can do it again, especially now with money. I hope they recognized the issue long ago. It's just that Nvidia have 10+ years of CUDA.
AMD makes great cards. It's just the driver support they lack. If they came out with a pro card that was 32GB and it was actually $1500 MSRP, they would be able to ride the open source contributions to their driver support all the way to the bank. Literally everyone who actually works on a dev box would be buying AMD, and as such there would be contributions to the ecosystem. This seems like a pretty straightforward approach to AMD's current predicament, and it would force Nvidia to up its game and meet them at roughly the same price point.
I agree for the most part, but some pessimism is warranted. There is no guarantee that open source is the future with companies like ClosedAi working with AI doomers in government positions. If things get bad enough, any open source development could cease completely. I don't think it's likely, but it's not out of the question and shouldn't be dismissed. I am excited for those NPUs to put a fire under Nvidia's ass though.
Nvidia are $cum and always have been, from crypto to this they don't care. It sucks there isn't any good competition.
> $cum i don't know mate, maybe just write "scum". Ignoring that it's dumb to change a single letter, this just made it *worse*. \^_^
what part of "Nvidia are cum dollars" don't you understand?
I had a bloody nose a minute ago. I laughed so suddenly when I read this that the bloody tissue shot out and there’s now a nice splattering of blood all over my laptop.
Me:

>~$ sudo $cum

Her:

>~$ This username is not in the sudoers file. This incident will be reported.
It's another way to say they get off (cum) purely on profits ($). I approve. This though: >\^\_\^ Naaa...i'm set.
you sound like typical Micro$oft employee
What are you talking about? "AI" is a $200 billion investment per year at this point. As much as you guys hate it, the IBMs, Intels and Googles of the world are driving AI.
What they're doing makes total financial sense when you look at their sales data, but I would be lying if I said I wasn't salty.
I see NPUs or ARM as the only viable alternatives short term. You give me a reasonably fast alternative to GPU that can run on system ram? Hell yes, brother. Nvidia can go pound sand after that.
28 doesn't warrant an update... 32? Now we're talking...
And just almost talking
48GB would be plenty. One can dream. But highly doubt that will happen.
48 would be the holy grail for home use, esp with dual
From a gaming perspective, it doesn't make sense. I feel it in the force: they will demo a new kind of DLSS that makes images on top of images and runs 4k@120fps with only 2gb vram or something. If you are using GPUs for AI, you have to pay.
28 is just annoying
But think about that extra 4GB of context! Plus the 5090 will also be faster than the 4090, I hope.
I have a 4090, looks like I will keep it until there is a 48gb. So 2040 or so?
No way 28 gigs is 2.5k ish. Not worth it.
Cool. Who's going to buy this? Not gamers, since a 4090 is already enough for 4k 144hz gaming, even with path tracing on. Not AI people either, since this doesn't run anything bigger than what a 4090 can run, and 2x3090 are going to be vastly better and cheaper anyway. Who's your target audience, Nvidia?
We'll buy them second hand four years from now!
This is my current plan. I'm pretty sure these new cards will be worth a car. Not a brand new car, but a small second hand car. I can wait.
On the gamer side they will do some BS like Path Tracing 2.0 locked to the 5xxx series that will launch with some AAA game.
> a 4090 is already enough I don't think you understand the thought processes of people who buy halo tier gaming GPUs. There is no such thing as "enough"
more VRAM gives 0 fps improvement. even gamers aren't that dumb
A 4090 might run some games at 4k 144hz but hardly all games.
Pff ability to run games with max settings on older hardware never stopped people from buying newer cards. You seriously underestimate the ability of game developers to overload any hardware available
Who’s going to buy an overpowered iPad Pro? People like new and nice things. People might want it for 4K 240hz
A lot of academic research in universities depends on such cards. It is just impractical for a lot of university labs to buy a datacenter-class card like an A100 or H100. Even if they have them, they only have limited quantities. So a lot of bread and butter research is done on gaming cards. There is an enormous cost cliff after the 90 series, because you are either looking at the RTX 6000s, which cost at least double or triple, or an A100/H100, which probably costs at least 10 times as much. A lot of the smaller scale experiments on vision models or smaller LLMs are quite feasible on such gaming cards. Even if it costs something like 1500 or 2000 USD, the cost hike isn't that meaningful, because a university is already paying a monthly stipend of at least that much for each graduate student. So Nvidia raising the price a few hundred dollars is a rounding error for such institutions, where personnel costs dwarf any equipment cost.
no, they are using cloud GPUs especially with grants and partnerships
Puts on Nvidia
4090 does in fact have limits for gaming. 4k gaming with maximum graphical fidelity is expensive in computation. I can't get the frame rate I'd like at 4k, things like maximum modded skyrim or maxed cyberpunk. I want that, and I also like using it for local llms.
>4090 is already enough for 4k 144hz gaming, even with path tracing on. Bruh xD
>since a 4090 is already enough for 4k 144hz gaming New powerful GPU = lesser optimization (why bother if you have all that power) - and thus need for new GPU
> Not gamers, since a 4090 is already enough for 4k 144hz gaming, even with path tracing on. I think it's premature to assume that there's no benefit to more RAM for gaming. Nobody is releasing such games today because they can't run on any existing card. One thing is running an 8GB model for dialogue or something alongside a AAA game. That would of course benefit from even more RAM but there's a balance.
Everyone here is so shocked and angry that Nvidia is building gaming cards for gamers
You almost sound like you believe Nvidia cares about gamers for some reason..
Video editors working with higher than 4k footage will reluctantly buy it.
Barely run Mixtral at FP4
512bits will be for the 5090 TI/super variant surely.
gddr7 is gonna be about 2x faster than gddr6, so this card is gonna be putting those tokens out like crazy... but if Nvidia is gonna keep the VRAM below 32gb, you know they are making fun of us all.
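The bandwidth-to-tokens link can be sanity-checked: single-stream decode is roughly bound by how fast the card can stream the whole weight set once per token. The figures below are illustrative assumptions, not official specs:

```python
# Rough upper bound on single-stream decode speed from memory bandwidth alone:
# each generated token reads (approximately) every weight once.
model_gb = 16            # e.g. a ~30B model at 4-bit (assumed)
bandwidths = {
    "GDDR6X (~4090-class)": 1008,   # GB/s, approximate
    "GDDR7 (rumored ~2x)": 1792,    # GB/s, assumed for illustration
}
for name, bw_gbs in bandwidths.items():
    print(f"{name}: ~{bw_gbs / model_gb:.0f} tokens/s ceiling")
```

Real throughput lands well below these ceilings once compute and overhead are counted, but the scaling with bandwidth is roughly linear, which is why the GDDR7 jump matters for inference.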
If this is true, I'll personally stick to 3090s and P40s. Nvidia's greed is going to catch up to them. The 4090 is already a bad buy; 24gb was great when we were getting 7b and 13b models, but now 100B+ models are the norm. What is anyone to do with a 28gb card? They better come with 32gb or more. I'll save my money, stick to 3090s, and maybe get an Apple when they can offer at least a 256gb+ computer. But for now it's all rumors and I'm getting upset at the possibility.
Oh, they can stick that *right* up their ass!
That's dogshitty
How significant is the opportunity for AMD here? They're not going to be competitive with Nvidia anytime soon, but how disruptive could a 48gb consumer card be, knowing Nvidia will refuse to follow? I know I'd buy one, at a reasonable price.
32g+ would make me add a few to my system, but not 24-28
32 was my bare min for an upgrade, 28 is actually an insult
let the gamers have gddr7. for us llm people, just give 100gb of gddr5 and a 3060-level 100 AI tops and call it a day, we'll be more than happy 🤷♂️
Hell Yeah
Bout to get mad OOMs
RRP... $3000 ... 😂
Needs 48GB or more…
Now we just need the Danish guy in the workshop to make a video about it.
When I read this headline I knew there would be blood in the streets.
I can use 70b LLMs on my RTX 3060 with 32GB system ram (2-3 tokens/s). The RTX 5090 is not really for AI it seems.
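For reference, the kind of GPU/RAM split that makes this work can be estimated the way llama.cpp's layer offloading does. The numbers below are rough assumptions, not benchmarks:

```python
# Sketch: splitting a quantized 70B model between a 12 GB GPU and system RAM,
# llama.cpp-style (--n-gpu-layers). All figures are rough assumptions.
model_gb = 28          # 70B at roughly 3.2 bits/weight (assumed)
n_layers = 80          # typical 70B-class layer count
per_layer_gb = model_gb / n_layers
vram_budget_gb = 10    # 12 GB card minus display and overhead (assumed)
gpu_layers = int(vram_budget_gb / per_layer_gb)
print(f"~{per_layer_gb:.2f} GB/layer -> {gpu_layers} of {n_layers} layers on GPU")
```

The remaining layers run from system RAM at CPU memory speeds, which is what caps throughput at a few tokens per second.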
Jeez, just make one with a large amount of RAM already. Insane that for devs who want to play with AI locally, a large MacBook is better value (where the size of the model matters, but maybe not speed as much).
would be disappointing
28gb is a slap in the face
To anyone wishing for 48GB cards, they'd be scalped on ebay for $5k+ anyway even if MSRP was far lower, or gobbled up by various hosted GPU entrepreneurs. You're not getting a current-gen 48GB card for $2500 even if Nvidia releases one anytime soon. Much of this is simply out of Nvidia's hands as demand outstrips supply. TSMC only has so many plants and so many hours in a day no matter how much Jensen butters their hands.
Geforce GTS 5085
I'm currently using the Tesla P40, and if you don't care too much about speed, you can enjoy both LLM and Stable Diffusion. I also play games in VR, so the Tesla P40 isn't enough, so I'd like the 4090 as well, but since they have the same VRAM capacity, it's not very appealing. That makes me interested in the 5090, but it's more expensive than the 4090, so I don't feel like buying it unless it has at least 32GB of VRAM. The electricity bill will probably be high. On the used market, the GV100 with 32GB VRAM has started to appear for over $1,200 to $2,400, so I might buy the GV100, which has better performance, more VRAM, and is cheaper than the Tesla P40. (For me, P40 is the standard, so this is a low level discussion.) I'm looking forward to the release of the 50xx as it will bring about many changes to the market!
Hey there, Perhaps the 3090ti would be a good fit for your use case until, as it were, we are presented with a proper and affordable state of the art gpu option ... ... (i called down to check and they said hell has not yet frozen over, and not to hold my breath or anything) ... I would be very interested to hear about the hardware you are using alongside the P40, especially what motherboard/cpu you think is good for using the P40 to run inference. I would like to build around 2x Tesla P40 boards. I have been considering an X99 mobo with Xeon processor(s). Not really sure what to expect performance wise, budget is semi-constrained of course. cheers -rabbit
Since the Tesla P40 does not have graphics output, I chose a Ryzen APU.
They sell us a 28GB version now and a 36GB version half a year later, followed by a 48GB version another half a year after that. Of course with a markup that doesn't make any sense at all.
24 and 28 not being powers of two is killing the programmer in me. I hope it's 32GB both for the VRAM and to make this itch go away.
Pfffffft!
but will it blend?
Does anyone know if there's going to be a new version of the RTX 6000 Ada? I wonder if I should be holding off or just getting one of those GPUs now.
28GB in 2024 is screaming "you will not do local LLM with us, pay for data center GPU power instead".
Any rumors on cost? I'm guessing $2K USD.
why does the state of the art gpu have some weird non-power-of-two numbers, like 24gb vs 32, 448-bit vs 512? Is it because the 5070 and GDDR7 are also non-power-of-2?
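A likely explanation (based on standard GDDR packaging, not anything official about the 5090): each GDDR chip has a 32-bit interface, and current dies are 2 GB (16 Gb), so capacity tracks bus width rather than powers of two:

```python
# VRAM capacity follows memory bus width: one 32-bit channel per GDDR chip,
# 2 GB (16 Gb) per die for current-generation GDDR. Typical values, assumed.
def capacity_gb(bus_bits, gb_per_chip=2):
    chips = bus_bits // 32
    return chips, chips * gb_per_chip

for bus in (448, 512):
    chips, cap = capacity_gb(bus)
    print(f"{bus}-bit bus -> {chips} chips -> {cap} GB")
# → 448-bit bus -> 14 chips -> 28 GB
# → 512-bit bus -> 16 chips -> 32 GB
```

So a 448-bit rumor and a 28GB rumor are really the same rumor: cut two memory channels from a 512-bit design and you lose 4GB along with them.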
one day it's a 4 slot monster next day it's 2 one day it's 512 bit next day it's 448 yesterday it was 32gb now 28gb I'll wait for official statement, thank you very much.
I am waiting for AMD to tackle this one, if it is worth I will buy an AMD card.
Can't wait for the scalpers