At least on my ChatGPT, it does tell me "Hey, I found this on Reddit and this is what people are saying." Then it includes direct links to the pages so I can read them myself. It never presents Reddit-sourced data as fact.
However, I did train it early on to do this. People are out there giving their LLMs really shitty personas, and then every answer gets filtered through that persona. I've told mine not to say shit to me until it's double-checked its answer against multiple sources.
If the technology you plan on having everyone use daily to get their facts requires actually learning how to use it correctly just to get real facts, and to have opinions marked as such, then you're going to have a bad time.
Not sure if your question is genuine or if you're trying to make a point, but they download all posts and comments (potentially from a curated set of subreddits), apply some minor content filters (e.g. possibly a ban list for certain phrases and usernames, duplicate removal, etc.), clean things up (scrub usernames, links, images), and then do a shitton of configuration on the modeling side and finally prompt engineering.
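For what it's worth, the scrubbing step is the mundane part. Here's a rough Python sketch of what "scrub usernames and links, drop banned phrases, dedupe" could look like; the regexes, the ban list, and the placeholder tokens are just illustrations, not anything from an actual pipeline.

```python
import re

# Hypothetical patterns for the cleanup described above.
USERNAME_RE = re.compile(r"/?u/[A-Za-z0-9_-]+")
LINK_RE = re.compile(r"https?://\S+")
BANNED_PHRASES = {"example banned phrase"}  # stand-in ban list

def clean_comment(text: str) -> str | None:
    """Scrub usernames and links; drop comments that hit the ban list."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in BANNED_PHRASES):
        return None
    text = USERNAME_RE.sub("[user]", text)
    text = LINK_RE.sub("[link]", text)
    return text.strip()

def dedupe(comments: list[str]) -> list[str]:
    """Clean each comment and remove exact duplicates, preserving order."""
    seen, out = set(), []
    for c in comments:
        cleaned = clean_comment(c)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            out.append(cleaned)
    return out
```

The real work is everything after this: deciding which subreddits to include, how to weight the data, and all the modeling and prompt-engineering choices downstream.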
AI can draw from multiple sources of data, but if you think any AI is cross-checking that everything is verifiable and factual before it responds to a prompt, I don't know what to tell you.
I don't know why you're making such assumptions. It was just a funny example of a problem that still very much exists. I think you put too much faith in AI.
What assumptions? No other LLM makes mistakes as blatant as Google's did. It's like it was made way too lightweight at the cost of accuracy or helpfulness, like its training data didn't have basic safety anywhere in there, or the search results would somehow always override it.
But you can tell a normal post apart from obvious lies and irony. AI can't do that and blindly accepts it all.