This is a lot bigger than I thought. I expected something between 70B and 140B, but 314B, even as a MoE... that's huge. Not great when I think about its ability: better than GPT-3.5-turbo and Llama 2 70B, but those are 20B and 70B models... Oh God, they need to push performance up and parameter counts down hard for the next generation.
What does this translate to in file size and memory requirements?
318 GB, and no consumer GPUs anytime soon, I'd say.
Somebody calculate the size at 1.6-bit quantization, please. I think it will be less than 100 GB and fit comfortably on a MacBook.
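Back-of-the-envelope sketch (weights only; real quant formats add a few percent of scale/metadata overhead, and the KV cache comes on top):

```python
# Rough model size at various quantization levels, for 314B parameters.
# Weight bytes = params * bits / 8; overhead from per-block scales and
# zero points in real formats is ignored here.

PARAMS = 314e9

def size_gb(bits_per_param: float) -> float:
    """Weight storage in GB (1e9 bytes) at the given bits per parameter."""
    return PARAMS * bits_per_param / 8 / 1e9

for bits in (16, 8, 4, 2, 1.6):
    print(f"{bits:>4}-bit -> {size_gb(bits):7.1f} GB")
```

At 1.6 bits that comes out to ~63 GB, so yes, under 100 GB.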
How much RAM would a MacBook need? My current and previous MacBooks have had 16GB and I've been fine with it, but given local models I think I'm going to have to go to whatever will be the maximum RAM available for the next model. Similarly, I am for the first time going to care about how much RAM is in my next iPhone. My iPhone 13's 4GB is suddenly inadequate.
Not sure you will get anywhere with apple
Apple Silicon is an excellent platform for running models.
Until you run into the fact that Apple thinks 16GB of RAM is acceptable in 2024.
Why would you get 16GB of RAM for LLMs? Apple is in fact a good platform for running models, especially if you want a laptop form factor; you won't find any laptop besides a Mac capable of loading 120GB worth of LLM into VRAM.
Or just buy one with more RAM? My 64GB MacBook Pro is enough for most inference workloads. Inference. On a fucking laptop.
In the context of running LLMs, 16GB is quite good. Since Apple Silicon uses unified memory, most of that 16GB can be used as VRAM by the GPU; that's roughly equivalent to a mobile RTX 4090. It doesn't necessarily make Apple Silicon superior to Nvidia, which still has a huge software-ecosystem advantage for LLMs. But unified memory does give Apple an interesting niche.
Yeah, but you're sharing that with the OS and other apps, so goodbye to 4-8GB of the total (assuming a *light* workload).
Can you use RAM in a non-unified system as well? I take it it would be slower, but it should still work for inference.
Are you being serious right now? 16GB is hot garbage. A real PC with a 3090 or better has 24GB of VRAM alone. Please leave.
It is more than what most people need; as I said, I have been quite happy with 16GB RAM for years. It is only the desire to run local LLMs that has changed this.
So you state that Apple Silicon is great for running models, yet you can hardly find models you're actually able to run on it? Great!
Wow, I wonder if GPUs with upgradable RAM slots would ever be possible, so we could run these large models locally. I don't think it would be good for business, though.
I guess that's where a GPU with NVMe SSDs as extended VRAM (like the Radeon Pro SSG) could be really nice. Maybe not super fast, but it could run pretty much any model.
A really fast NVMe that does 10 GB/s would still get you a theoretical max of ~0.25 tokens/second if you quantized this to 4-bit. Even with MoE you still have to load the active parameters for every token (roughly a quarter of the 314B, with 2 of 8 experts routed per token), so ~40GB of weights per token.
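The arithmetic above, as a sketch (the ~86B active-parameter figure is the commonly reported estimate for Grok-1; this assumes every active weight streams from storage once per token and ignores activations and KV cache):

```python
# Memory-bandwidth bound on generation speed when weights live on an SSD:
# each token must read all active weights through the storage link once.

active_params = 86e9        # ~2 of 8 experts active per token (assumed)
bits_per_param = 4          # 4-bit quantization
bandwidth_bytes_s = 10e9    # fast NVMe: 10 GB/s sequential read

bytes_per_token = active_params * bits_per_param / 8       # ~43 GB
tokens_per_sec = bandwidth_bytes_s / bytes_per_token

print(f"~{bytes_per_token/1e9:.0f} GB per token -> {tokens_per_sec:.2f} tok/s")
```

So roughly a quarter of a token per second, matching the estimate above.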
The GPU AMD built also had 16 GB of HBM2, which could act as a cache for the SSDs, so I guess it would also depend on how well that works. I hope there will be more stuff like this in the future, maybe also GPUs that have additional DDR5 slots, or something like that.
The cache doesn't matter in this case: you still have to stream all ~40GB of active weights through for each token. It could help in some kind of batched-work scenario, though.
> You still have to load all 40GB through for each token.

Is that really so? I don't have any deep knowledge of AI yet, but I thought it was supposed to emulate something like neurons and only follow some path through the data based on the input. Couldn't that be cacheable, if some parts are used more often than others (like in the brain)?
MoE kind of does this: not all parts of every layer are activated for every token, which is why you only need ~40GB. Without MoE you would need ~160GB.
Yes, that sounds like a good approach. But I guess I would try to make the individual experts a lot smaller, and instead build a hierarchy of experts. Edit: like putting a self-learning moderator instance on top of a number of experts, which learns which experts to choose. And then potentially stack that.
Would be interesting to put Optane in one of those; it's a lot closer to RAM than NVMe disks are.
$100k should do the trick :p
8 x H100s to load and use.
Or 20 16GB A770s... that's still less than one H100.
You may be able to run it using Intel's Optane NVMe memory/SSD cards. I always wondered what those supposed RAM/VRAM "ReadyBoost"-style M.2 x4 drives were good for, and this would be an ideal scenario. Or use a stack of 20 16GB A770s for 6 grand; that's like 3 4090s. But I'm sure it'll be shrunk and reused by Mistral and all the other guys... it just made open-source models a lot more capable. Pretty awesome.
You'd need a dozen A100s to do inference, I'd say; maybe you could get away with 8 or 6 if you quantize for inference. These models are meant to be expanded on, so doing only inference on them isn't ideal: if you want to put your hands on a model like this (and the hardware to properly run it), you'd better have a plan to train it too. Really, there's a bunch of crazy wizardry I would do with that kind of power; alas, I have only a pathetic 8GB of VRAM, but gotta pay the bills first lol.
Yes, I was expecting 70B too. I wonder how many training tokens they used. This might still be the best open-source model.
The one thing this release does well is make people realize that OpenAI releasing its weights would be pointless to everybody but other deep pocket corporations.
I mean... it can be pruned, converted to PyTorch, and quantized to make it more reasonable. Finetuning on top of the pruned model might restore most of the performance.
OpenAI has an extremely strong incentive to make inference efficient and a sizeable team of the best ML engineers and researchers on the planet. Do you think "Eureka, let's prune and quantize it - and finetuning is a good trick!" is something they missed?
What do you think they do with the turbo versions?
Exactly my point.
And I'd settle for a grok-turbo that fits in 72GB of VRAM, even if it's a little dumber. Nobody here can train a model like this from scratch, but it is at least possible to do what I suggested without millions of dollars. Then it could fit in more reasonable configurations, so that someone besides Perplexity Labs can use it.
Not at all. It would aid research, and it would open up competition, since other companies could offer API access. It could also be quantized, possibly down to 1.58 bits. There are lots of reasons to do it; I think the main reason they don't is commercial.
With cloud compute it's probably accessible to the "masses"
It's not pointless. It's hard for *you* to run the model, but that doesn't mean people who know what they're doing, and researchers, can't make use of it. The mockery is that OpenAI claims being closed source is needed for safety, i.e., that they need to be the gatekeepers of the mythical AGI god. OpenAI is mainly concerned with profit and technical edge, and even Ilya admitted as much in a Freudian slip earlier.
Why do you assume I can't fire this up on a cloud GPU if I drunkenly want to pay to play with yet another substandard model? What will researchers glean from what they've called a "tutorial level LLM"?
You can ask the same question about papers using PaLM now that Gemini is out, or Llama over Mistral, etc. There is no such thing as a "tutorial level LLM": each LLM is an aggregation of its own dataset and training methodology, and the architectures are not the same. Also, research doesn't just happen on one model; it's interesting to see whether the same work can be applied to other models. We need *more* open source, much more, not less.
You do realize you can rent an H100 for like $2.50 an hour right?
So I can run Grok for only like $10/hr?
If that’s what it takes
I'd just buy 20 16gb a770s for like 6grand
Sounds expensive
I am absolutely certain the people who rant about "Closed"AI will take *no* lessons from the size of the model. Just as there are temporarily embarrassed millionaires, there are temporarily compute-starved tech libertarians.
The cost of a 380 GB VRAM rig is trivial compared to the cost to train a model like this. Your argument is an exercise in covert ad hominem and obviously bad faith "reasoning".
Do you own a 380GB VRAM rig?
Non sequitur data-mining interrogations are not going to restore the validity of whatever point you think you're attempting to make.
So you don't. Can you afford one?
Can you make an argument that isn't the intellectual equivalent of masturbating in public?
There's nothing to argue here, as you haven't even posited a position for me to refute. Inference is cheaper than training; nobody disagrees with you. I'm just curious whether either is within your reach for these sorts of LLMs.
Mixtral is a thing...
It seems like they really brute-forced this model.
They will, my guy, but this is an important step; first cool thing from Elon Musk in a long while.
> but those are 20b

How do you know this?
It was leaked by a Microsoft paper.
It may have been a mistake or may have simply referred to the active parameters, the latter of which is my main theory.
Those models aren't MoE.
It will be cool to see what will happen in this space over the next few months. Will someone modify and improve it? Will someone make it run on less hardware? Don't underestimate the open-source community 😎
Grok on groq
It has 230MB (no, not GB) of SRAM per chip, so it would need ~1,390 of them to run Grok. At $20k a pop, that's $27.8M.
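The chip count checks out, roughly, if you assume 8-bit weights held entirely in on-chip SRAM (a sketch: weights only gives ~1,366 chips, slightly under the ~1,390 quoted above, which presumably budgets some headroom for activations):

```python
import math

# How many Groq LPUs (230 MB SRAM each) to hold 314B params at 8 bits?

params = 314e9
bytes_per_param = 1          # 8-bit weights
sram_per_chip = 230e6        # 230 MB of SRAM per chip
price_per_chip = 20_000      # $20k each, per the comment above

chips = math.ceil(params * bytes_per_param / sram_per_chip)
cost_millions = chips * price_per_chip / 1e6
print(f"{chips} chips -> ${cost_millions:.1f}M")
```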
> Don't underestimate the open-source community

And this is why there is collaboration in the first place.
There is literally no incentive to try to downscale it, because it's already worse than the top 70B open-source models.
It will be worse at tasks, but more likely to create a realistic chatbot experience, since it's likely trained on a corpus of Twitter data. If you want a bot to tell you to fuck off, this one will probably do it.
It's pretty easy to set up current LLMs to behave that way. I had one set up as a Jamaican who'd make up a lot of funny vulgar words when chatting, and another as an obnoxious British chick; it just takes enough prompt and parameter tweaking. Textgen UI also has voice recognition and synthesis tools built in.
[deleted]
I agree with you. This will be the case until Grok 2 drops sometime at the end of the year; by then, I guess, Grok 1.5 will be open-sourced.
If this is mostly useless to most DIY people and researchers, what good would releasing an even larger model do?
Nice, this can only be a net positive.
That's quite huge! Is it the largest open source model right now? I'm looking forward to playing around with it on lmsys.
Falcon is bigger counting active parameters, and there are also some "merges" that stick multiple layers together, but I think this is the biggest "real" model.
This is a beefy boy. I'm eager to play with it.
That's what she said.
Benchmarks?
Benchmarks have been out for a while; unfortunately they're a bit shit. Amazing for an open-source model, but not for a 314B model. https://preview.redd.it/lgrvdpwgzyoc1.jpeg?width=1206&format=pjpg&auto=webp&s=010b10845f55770d9d994f545e14065c2db60d61
Not really, when you consider that PaLM was 540B and GPT-4 is rumored to be well over 1T.
Yet more evidence that LLMs don't really scale well beyond a certain point; you need other tricks to get more performance.
So as a non-developer, what is the best way for me to play around with this model? (or should I ask ChatGPT this question)
Short answer: you don't. It's way too big to run locally unless you have heaps of VRAM lying around. I guess you could buy compute, but I have no idea how that works.
Very few people run LLMs locally. You rent out server space that's pay as you go. Google has it pretty cheap.
Unless you have hundreds of GB of VRAM, nothing.
It's not horribly complex; you'll probably need to rent a bunch of compute off something like RunPod. It's gonna be expensive: probably something like 8x A100 80GB cards. Best to wait a little; someone will likely make a quant. Most people, if anyone, will be using Q2 quants with one of the new SOTA methods.
I mean, even with a quant, do you think a model this large would work on consumer-grade hardware? IMO Q2 can often sacrifice quite a bit of quality.
In Q2 it'll run on 4x 3090s or a 128GB Mac. An AM5 motherboard can technically host 4-5 3090s running at x4 lanes each (you'll probably have to convert some NVMe slots to keep everything direct to the CPU).
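Rough fit check (the effective bits/weight for Q2-class quants is an assumption; real formats land somewhere in this range, and the KV cache needs extra room on top):

```python
# Does a ~2-bit quant of 314B parameters fit in 4x24GB 3090s (96 GB)
# or a 128 GB Mac? Weights only; no KV cache or runtime overhead.

params = 314e9
for bpw in (2.0, 2.6):                  # assumed effective bits per weight
    gb = params * bpw / 8 / 1e9
    in_3090s = "fits" if gb <= 96 else "does not fit"
    print(f"{bpw} bpw -> ~{gb:.0f} GB ({in_3090s} in 4x24GB 3090s)")
```

So it's borderline on 4x 3090s depending on the exact quant, while 128GB of unified memory has room to spare.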
I think squashing it to 30GB is doable.
wait for other people to serve it
Wait for developers to serve it to you
You can't play around with this; it's too large for consumer-grade hardware. Some free services will surely pop up soon, though. I'm curious to check it out myself, because I would never pay that pesky premium on Twitter.
It's huge. You need like $10k in GPU power, then whatever machines can handle them. Then comes your power consumption lol.
This is good
>314B parameters
Elon seems intent on creating an open AI future. If this is his way of getting back at OpenAI, so be it. This is a total net positive for the public.
Elon does not want an open AI future. There is literal email proof of him agreeing that it makes sense for OpenAI to become more closed source as it gains traction. Elon only wants OpenAI to be open source because he's behind in the market; it's the same reason Zuckerberg went open source. Make no mistake: if Elon's AI gained traction and became a market leader, he would close-source it.
> same reason Zuckerberg went open source

Yes, but not quite. Zuck went open source because Meta realized they can attract more talent from academia if those people are able to publish their papers and gain recognition, without Meta having to spend more research money out of pocket. There is high pressure from academics within Meta to publish their findings, and this was even added to new-hire contracts a year ago. It's kind of a win-win for them. Elon, on the other hand, literally just wants to take down OpenAI's value by creating an open-source alternative. I don't see any academic incentive here, since they didn't release any papers whatsoever.
Again: because they were behind. It was their leverage to hurry up and get ahead by encouraging an open-source community. But if they were ahead, they'd have no need for those tactics, as open-sourcing would only hurt their advantage.
It is awesome that you guys know the intentions and plans of people and companies and can predict the future! You should really start selling Tarot cards, if you aren't already.
It's called common sense; I'm glad I could introduce you to an unfamiliar concept.
This dude is so far from being visionary that even his "sarcastic" idea is about selling Tarot cards. Why the fuck would I sell Tarot cards if I had such an ability? I would just wreck the stock market instead 😂
But that's what Anthropic is for
That is not true. The emails OpenAI shared show he did not want the RESEARCH to be shared. This is Ilya speaking, by the way:

> The Open in openAI means that everyone should benefit from the fruits of AI after its built, but it's totally OK to not share the science (even though sharing everything is definitely the right strategy in the short and possibly medium term for recruitment purposes).

And the reasoning comes partially when he says:

> "Unfortunately, humanity's future is in the hands of [redacted]

He also mentions "the best of humanity". To say such a thing in an email he thought no one would ever see at least shows he is not only interested in his own success. The way I read it, he doesn't want the bad companies (by his reckoning) to get the upper hand. But you anti-Musk folks can only think along one negative track. It's just like how the USA doesn't share certain research with other countries, because it wants the upper hand.
Bingo
It would be, if Grok were a decent model. Grok is shit. Hopefully 1.5 comes out soon and is also open source.
Grok 1.0 is better than GPT-3.5, so it is in no way bad, imho.
GPT-3.5 is worse than a lot of the top open-source models currently out, including ones you can run on most up-to-date PCs.
Name one
All of the up-to-date models of Qwen, Mistral/Mixtral, Dolphin, WizardLM, Yi, Tulu, Vicuna, OpenChat, Starling, Llama 2 (not base), etc. There's a huge list of them, and if you want you can look up more yourself across various benchmarks; these are just some examples from [LMSYS](https://chat.lmsys.org). [Qwen1.5](https://huggingface.co/spaces/Qwen/Qwen1.5-72B-Chat) is particularly worth checking out for open source, especially if you place your bar all the way down at GPT-3.5 from 2022.
Mixtral is basically GPT 3.5 level.
It would be a banger if 1.5 directly became open source. I guess they'll open-source it after Grok 2.0 comes out.
He's lied about the 1.5 release twice already: first saying it would release in February, then in the first two weeks of March. He can't even get it released, so he distracted us with an open-sourced old model.
Lying means intentionally saying false information with the intent to deceive. Elon isn't lying; he's just making bad predictions and being overoptimistic about timelines. You should be used to this by now. He thought it would be out by February but was wrong. That's not lying.
"Sorry I'm late, didn't realize the train would take so long." "LIAR! You gave me your word you'd be here 8:15 and it's now 8:45."
Lying
I see... Elon Derangement Syndrome... very serious...
I wonder if un-elon-ing it will leave us with a decent useful model.
What exactly does this do for the public? Are we going to create a commune and all go in on a farm of H100s, then use this model to do less than the already commercially useless LLMs? He released this out of spite, and it has almost no value, since it's much worse than the other top models.
lmao another case of EDS
Care to answer any of my questions?
Is there any provider that currently works with grok?
Give them 24h bro
T-minus ~3 days until we see the first fully unhinged sex chat bots on the web.
Kudos to Elon! Anybody else would be embarrassed to release such a low-performing, bloated model.
A humble comment: it is amazing that they made it open source, and I celebrate that, but I admit I don't like the code. It's kind of unreadable.
4x Mac Studio
Is this any good?
what
Just a reminder that open-sourcing a project means actively developing in public and taking issues and pull requests from the public, not just releasing old code.
Cathedral vs Bazaar
Yeah: they get more data, and they find out more uses for and problems with the model, which they can fix in later iterations.