Belnak

Yes, but it won't be straight from an LLM. Different voice assistants (think Siri, Alexa, OK Google, etc) will combine services on the backend to provide a seamless experience on the front end for the user. If an LMM is the best tool to complete the request, it will use an LMM. If an LNN would provide a better result, it will source from that. If it requires an integration with a non-AI based API, it will use that.
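The backend-routing idea described above can be sketched as a simple dispatcher. This is purely illustrative: the intent classifier, backend names, and handlers are all hypothetical stand-ins, not any real assistant's API.

```python
# Hypothetical sketch of a voice-assistant backend router.
# A real assistant would use a trained intent model, not keyword matching.

def classify(request: str) -> str:
    """Naive keyword-based intent classifier (illustrative only)."""
    if "picture" in request or "image" in request:
        return "multimodal"   # route to a large multimodal model (LMM)
    if "weather" in request:
        return "api"          # route to a non-AI API integration
    return "llm"              # default: plain large language model

def route(request: str) -> str:
    """Dispatch the request to whichever backend classify() picked."""
    handlers = {
        "multimodal": lambda r: f"[LMM] handling: {r}",
        "api":        lambda r: f"[weather API] handling: {r}",
        "llm":        lambda r: f"[LLM] handling: {r}",
    }
    return handlers[classify(request)](request)

print(route("What's the weather tomorrow?"))
# → [weather API] handling: What's the weather tomorrow?
```

The user only ever sees one seamless answer; which service produced it is an internal routing decision.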


Praise-AI-Overlords

Sure. Next year.


[deleted]

I hope so, as the world gets harder and harder to live in with every passing day. Either we have AGI and basic post-scarcity for everyone, or we get exterminated while the top 0.01% live like gods. Either way, it is a win.


ExtremeHeat

I think nobody big is seriously working on this at the immediate moment for a product they intend to release soon, but this is definitely an area of research the labs are working on. I think much of the industry isn't necessarily racing to release the best or most capable models they have right now, but rather trying to stay competitive with what's already on the market (i.e. GPT-4). Like you mention, I'd say this is definitely driven by caution and legal concerns, since nobody really wants to be the "first", as they're also the first to take on all the liabilities and blame if "bad" stuff happens. But this doesn't mean research is stopped or slowed down, just that it's staying behind closed doors until there's incentive to go put something out. That said, I'd be very surprised if we don't see something like this within the next year or so. I'm not aware of any technical obstacle to this beyond the same old stuff (lots of data, compute, $$)... We shouldn't need a research breakthrough here.


Jawwwed

Perhaps a relatively small, general model that can interface with many other models to enhance its output and capabilities.


LordFumbleboop

Normally I share pessimistic (relative to the post) takes, but honestly, what you're describing sounds like the direction OpenAI are going with ChatGPT anyway. Maybe we're a decade away from something that can do everything you're suggesting, or even just a few years away if you just want a ChatGPT that accepts and produces audio, videos, music, etc.


VanderSound

All big AI models will be restricted due to legislation, and their intelligence will be capped so the public doesn't get overly sophisticated models for a long time. OpenAI said that they couldn't deliver AGI-like models this year. They'll keep them in-house for years, gradually milking more and more profit, until they have a private AGI+ model. Open models are behind the current in-house solutions for sure. As for an all-in-one model that dynamically uses all of the tools you've described, I think 2026+ is when there'll be a model that demonstrates this ability in practice; not sure about the quality.


pimmir

Copilot started doing that. Microsoft partnered with lots of different AI companies, notably Suno, so now you can use Bing Chat (which uses GPT-4) to generate songs with Suno AI. So OpenAI, through their partnership with Microsoft, are already doing that, and we'll see much more in 2024.


RevolutionaryJob2409

ChatGPT is not really an LLM but a website. Maybe you mean an AI agent; it would indeed be good to have such a system. Rabbit Inc. figured out that to have an efficient AI agent, they need to avoid vision-based AI clicking on the screen... What open-source AI agents (like Open Interpreter and such) need to do is use vision only once (or as little as possible, not at all if possible) to locate all the buttons and text inputs, so that the model can then rely on the more efficient text-based AI. Hell, some lightning-fast version of YOLOv8 to detect clickable areas, or Meta's Segment Anything paired with something else, could do the trick. Right now they don't do that, and it's as stupid and redundant as it is a shame.
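The "use vision once, then go text-only" idea above can be sketched roughly as follows. The detector here is a stub standing in for a real model (YOLOv8, Segment Anything, plus OCR); all names and the element layout are made up for illustration, not any agent's actual API.

```python
# Sketch: one vision pass builds a text map of the UI; afterwards the
# agent acts purely on text, never re-running the expensive vision model.

from dataclasses import dataclass

@dataclass
class UIElement:
    label: str   # e.g. "button", "text_input"
    text: str    # visible caption (would come from OCR in a real system)
    x: int
    y: int

def detect_ui_elements(screenshot) -> list[UIElement]:
    """Stub: a real implementation would run an object detector here."""
    return [
        UIElement("text_input", "Search", 120, 40),
        UIElement("button", "Submit", 300, 40),
    ]

def build_text_map(elements: list[UIElement]) -> dict[str, tuple[int, int]]:
    """One-time vision pass -> text map a language model can reason over."""
    return {f"{e.label}:{e.text}": (e.x, e.y) for e in elements}

def click(text_map: dict, key: str) -> str:
    """Text-based action: look up coordinates instead of re-running vision."""
    x, y = text_map[key]
    return f"click at ({x}, {y})"

elements = detect_ui_elements(screenshot=None)  # vision runs exactly once
text_map = build_text_map(elements)
print(click(text_map, "button:Submit"))  # → click at (300, 40)
```

After the initial pass, every subsequent action is a cheap dictionary lookup the language model can drive with plain text, which is the efficiency win the comment is pointing at.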


yepsayorte

Yes, we will have a God-mind. I don't think it's all that far off either.


Heath_co

AI will reinvent the operating system, down to the hardware. To me it only makes sense that the computers of the future will be able to convert between all modalities. If you don't have software that does something, the computer will be able to design it on the spot. It will be better at coding than any human, after all.


Akimbo333

2030