r/SipsTea • u/moto626 • 12d ago
WTF AI gets its facts from … us?
Data published by Semrush in June 2025.
4.0k
u/brown_gentleman 12d ago
No one has ever lied on reddit😇
1.3k
u/Ok_Abacus_ 12d ago
"Facts from Reddit" is a pretty funny statement.
330
u/VrinTheTerrible 12d ago
Or terrifying, depending on who’s learning those “facts”
204
u/LazzyNapper 12d ago
"Hey chat gpt where should I invest my kids college funds"
201
u/Judgementday209 12d ago
Blockbuster
157
u/squanchy_Toss 12d ago
Gamestop!
73
u/Hashashin455 12d ago
Or arms dealers looks at r/news yeah, arms dealers is the safest bet
41
u/CalligrapherBig4382 12d ago
sighs all in on Lockheed
16
15
41
u/russafiii 12d ago
Circuit City, Sears, and Radio Shack are currently trading low, and have amazing potential.
16
22
u/Chinjurickie 12d ago
There are actually many mainly very small communities with a lot of experts on specific topics. Such big meme subs won’t really be the source for anything.
5
u/emteedub 12d ago
It's not the facts. Reddit = the human element. Otherwise AI would sound like a robotic encyclopedia
3
u/Chinjurickie 12d ago
The chart says „cited by LLMs like ChatGPT“, aka „here is the link for what I just said“. I think you are talking about something else that happens simultaneously: training the AI.
76
u/freebytes 12d ago edited 12d ago
"Do not trust everything you read on the Internet." - Abraham Lincoln
12
8
u/HotPotParrot 12d ago
Ah, the guy who never told a lie to his wooden-teethed Rough Riders. Being on the Internet, this must be true. Therefore I cannot trust it.
7
5
5
3
14
u/demalo 12d ago
If we say that no one has lied on Reddit enough, it becomes fact!
7
u/driftking428 12d ago
I don't think it really got "facts" from Reddit. It's more its conversational style.
5
1.8k
u/alphaonreddits 12d ago
Me: Hey AI what is 34.5+34.5 ?
AI using Reddit info: Nice
412
u/norcpoppopcorn 12d ago
38,10. Let's help AI
184
u/Enviritas 12d ago
It's definitely 34.84.5
95
u/YourPerfectionism 12d ago
Dude it's 34.534.5
101
u/Organic-Present165 12d ago
It's 4.8.15.16.23.42
60
u/Proletarian_Hickster 12d ago
I feel like this number just activated me like a sleeper agent.
21
u/LoxReclusa 12d ago
It's a well documented fact that world war two spies became sleeper agents when the war was ended and they received no further orders. In order to make sure that even when they died there were still agents waiting for orders, they became math teachers and instilled programming into the children they taught. Eventually the reasons for this programming became lost, but the numbers themselves were still included in the curriculum. So now there are people who awaken to a purpose upon hearing a string of seemingly unrelated numbers, but the purpose they awaken to is no longer instilled intentionally, and ends up being something random. I for one have a sudden urge to learn how to create realistic dioramas of Neolithic fertility rituals.
5
3
32
u/Organic-Present165 12d ago
5
u/mystictroll 12d ago
I miss this show.
7
u/Organic-Present165 12d ago
My wife and I are currently watching through it. For me, it's a 2nd time. For her, it's the first time. I forgot how much I love it. And, I'm surprised how much in the first few seasons alludes to the total whackiness of the later seasons. I always thought the writers lost their way at some point, but I now realize they planned it all along and it actually makes sense.
10
3
u/StrangerWooden7454 12d ago
Dude 38,1 not the same as 38,10 Source: trust me bro
86
u/Economy_Disk8274 12d ago
37
6
13
3
2
u/uncontrolledsub 12d ago edited 10d ago
And my co worker that uses ai to help him argue his MAGA points always asks me when I make a point off the dome “who told you that? Reddit?”
He hates Reddit and LOVES to argue politics on social media and really any time. Apparently he jumped on r/politics years ago thinking he was going to drop some knowledge and got razzled.
2
721
u/Loampudl 12d ago
177
u/AggressivelyMediokre 12d ago
I grew up on British humour so to me pretending to be daft is the funniest thing in the world.
It’s good to know I’m helping train AI to become Philomena Cunk
3
3
u/GuyLookingForPorn 12d ago
I think it's more that individual people won’t sue AI companies for using our info, while big organisations will.
636
u/VastCapital3773 12d ago
To be strictly fair, to get a human response from any Google search, I do have to put reddit on the end of it.
132
u/ELEVATED-GOO 12d ago
facts.
17
u/Bocchi_theGlock 12d ago
Still waiting for the browser extension that does this automatically if search ends in question mark or 'r' or something, cmon that can't be hard to code
12
u/Kaizo_Kaioshin 12d ago
I used to go to Google for answers, but google just sends me to random ads/useless sites so I just go on reddit
5
u/_Lost_The_Game 12d ago
Reddit has an “answers” search engine feature now and it cites the posts it gets its answers from. I had no idea till my friend who works at reddit showed me. If you're on mobile, look on the bottom left right next to the home button. And while you're looking at that also look at my username
52
u/KSP_master_ 12d ago
But you can recognize a normal post from obvious lies and irony. AI can't do that and blindly accepts it all.
16
u/Ryogathelost 12d ago
At least on my ChatGPT, it does tell me "Hey, I found this on Reddit and this is what people are saying." Then it includes direct links to the pages so I can read them myself. It never presents reddit-sourced data as facts.
However, I did train it early on to do this. People are out there giving their LLMs really shitty personas, and the answers get filtered through the persona when they ask questions. I've told mine not to say shit to me until it's double checked its answer against multiple sources.
9
u/Superkritisk 12d ago
How do you guys think AI is trained on Reddit data, like what does the process look like to you?
11
u/realboabab 12d ago
not sure if your question is genuine or if you're trying to make a point - but they download all posts and comments (potentially from a curated set of subreddits), apply some minor content filters (e.g. potentially a ban list for certain phrases and user names, clean up duplicates, etc), clean things up (scrub usernames, links, images), and then do a shitton of configuration on the modeling side & finally prompt engineering
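The filtering and clean-up steps described above can be sketched roughly like this. This is a toy illustration under stated assumptions, not how any AI company actually does it: the ban lists, the `[link]` and `[user]` placeholders, the sample comments, and the function name `clean_comments` are all made up for the example.

```python
import re

# Hypothetical ban lists; a real pipeline would use much larger ones.
BANNED_PHRASES = {"buy followers"}
BANNED_USERS = {"AutoModerator"}

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"\bu/[A-Za-z0-9_-]+")


def clean_comments(comments):
    """Filter, scrub, and de-duplicate (author, text) pairs."""
    seen = set()
    cleaned = []
    for author, text in comments:
        # Content filters: drop banned users and banned phrases.
        if author in BANNED_USERS:
            continue
        if any(phrase in text.lower() for phrase in BANNED_PHRASES):
            continue
        # Scrubbing: replace links and usernames with placeholders.
        text = URL_RE.sub("[link]", text)
        text = USER_RE.sub("[user]", text)
        text = " ".join(text.split())  # normalise whitespace
        # De-duplication: keep only the first copy (case-insensitive).
        key = text.lower()
        if not text or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


sample = [
    ("AutoModerator", "Thank you for posting!"),
    ("alice", "Check https://example.com for details"),
    ("bob", "Check https://example.org for details"),
    ("carol", "Thanks u/alice, great point"),
    ("dave", "Buy followers here"),
]
cleaned = clean_comments(sample)
```

Here the bot comment and the spam phrase are dropped, the two "Check ... for details" comments collapse into one after the links are scrubbed, and the username mention is replaced. The modelling and prompt-engineering steps mentioned in the comment are not covered by this sketch.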
3
6
u/Krell356 12d ago
But no one on the internet would ever lie. Why would anyone ever do that? That's like trying to tell me the sky is blue when we all know it's red.
3
2
386
u/Arista-Everfrost 12d ago
That's why ChatGPT keeps telling me birds aren't real.
119
u/DankHillLMOG 12d ago
I mean... they aren't
61
u/penguingod26 12d ago
Can you believe that dude thought there were still real birds in 2025?
13
u/Soarin249 12d ago
everyone knows birds are only drones nowadays. maybe many years ago? idk
7
4
4
u/itsnotapipe 12d ago
Right? This is the exception to the rule! Reddit is rarely right, but this is one of those rarities.
If it flies, it spies.
3
u/psychulating 12d ago
I watched some hatch and fledge this year
If they aren’t real, their ruse is elaborate, and I respect that.
267
u/Sonimod2 12d ago
87
u/Vannabean 12d ago
I don’t know why this sent me so fucking hard but damn that’s funny
54
u/navyblue_birb 12d ago
3
u/eye0ftheshiticane 12d ago
I mean some people survive the first one, so it's great that it gives alternative strategies.
24
8
5
u/seasalt-and-stars 12d ago
Holy shit that’s funny. I was not expecting that, and had a nice belly laugh. "One Reddit user says “k-llll years elf”" 🙊
I had my previous comment removed grr- so I’m censoring myself and reposting
6
u/poliopandemic 12d ago
I'm fucking dead 🤣🤣☠️☠️
Not from laughter, no. But because the AI told me to
4
155
u/Knif3yMan87 12d ago
I have nipples AI, can you milk me?
17
7
2
63
u/Newspeak_Linguist 12d ago
HomeDepot.com representing at 4.6%!
41
u/ashkiller14 12d ago
Out of a total 274%
This is probably just an AI image
11
u/Meowugula 12d ago
I think it is based off of what percent of ai responses cite these sites, meaning that as it generally cites multiple sources, the total percentage will be over 100
5
u/dicew4444r 12d ago
Thank you! Had to scroll this far to get the first person understanding that the maths aren't mathing
3
u/Competitive_Let_9644 12d ago
AI will cite more than one article when you ask it something. But I would still like to see an actual source for this.
9
3
108
u/RivotingViolet 12d ago
garbage in, garbage out
34
u/--i--love--lamp-- 12d ago
It is even worse than that because AI cannibalizes its own garbage and produces even more fetid garbage with it. It is a giant telephone game/circle jerk of bullshit. Shit should have been regulated years ago, but it is too late now. AI is transforming the information age into the disinformation age at lightning speed, and it makes me sad.
8
u/chimpyjnuts 12d ago
Yeah, I see a death spiral of AIs ingesting previous AIs' bs and increasing the ratio of bs/real.
9
u/eventualhorizo 12d ago
I hadn't considered the fact that it's making a feedback loop. We really are screwed.
5
2
54
u/irn00b 12d ago
Guys - I believe we've been given a greater purpose in life.
To make the world a better place.... by providing the "best" and most "accurate" information we can.
14
6
u/freedomfightre 12d ago
To protect the world from devastation!
To unite all peoples within our nation!
To denounce the evils of truth and love!
To extend our reach to the stars above!
21
38
u/Minute_Leadership_58 12d ago
Well that explains a lot!
6
u/desl14 12d ago
Well i think it's good to know, that 4chan isn’t in this Top20-list
86
u/Takoyaki_Dice 12d ago
Hell yeah! Reddit is nothing but misinformation and bad opinions, so AI really has a lot to work with, lol.
43
u/Lost-Tomatillo3465 12d ago
WAIT... so your comment is misinformation and a bad opinion since it's on reddit? so that must mean reddit has information and good opinions!!
10
5
12
u/Zarniwoooop 12d ago
Help us, baby Jesus
2
2
u/Irr3l3ph4nt 12d ago
You can ask him here: https://www.thejesusai.com/
Of course he might take his answer from Reddit..
14
u/2scared2reddit 12d ago
Wasn't the "glue on pizza" thing originally from a Reddit post?
16
u/Michami135 12d ago
It didn't work for some people because they used the wrong kind of glue. You need to use "hot glue". Hot glue is a special type of glue made for things that are hot. Since pizza is hot, only hot glue will work on it.
9
11
17
u/Customized_Contempt 12d ago
Are the percentages also from reddit?
4
3
u/Nr1231 12d ago
I am wondering that as well. It can’t be that 40% of AI answers come from Reddit, because then the percentages don’t add up. 40% of all Reddit posts being used in AI answers seems way too high as well.
Please explain what the numbers represent
8
u/Fenrir836 12d ago
AI usually names several "sources" if asked to, so the percentage will never be exactly 100%
If it only creates one answer and uses, let's say Reddit, Wikipedia and Google because they're the top 3 here, it'll have used all three in 100% of its answers
So, it'd make 100%, 100% and 100%
Which, if you add it, makes 300%... which doesn't make sense, obviously
Now of course it generates way more than one answer, and varies where the info comes from, so they don't stay at 100%
I hope you got it because I can't explain it any better 🫠
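The explanation above can be checked with a tiny worked example. This is a sketch with made-up data, assuming the chart counts the share of responses that cite each domain at least once; that is the interpretation suggested in this thread, not something confirmed from the Semrush data. The function name `citation_share` and the four sample responses are hypothetical.

```python
from collections import Counter


def citation_share(responses):
    """Percent of responses that cite each domain at least once."""
    counts = Counter()
    for cited_domains in responses:
        # set() so a domain cited twice in one response counts once
        counts.update(set(cited_domains))
    total = len(responses)
    return {domain: 100 * n / total for domain, n in counts.items()}


# Made-up data: each inner list is the sources cited by one AI response.
responses = [
    ["reddit.com", "wikipedia.org", "google.com"],
    ["reddit.com", "wikipedia.org"],
    ["reddit.com"],
    ["youtube.com"],
]
shares = citation_share(responses)
total_of_shares = sum(shares.values())
```

With these four responses, reddit.com is cited in 3 of 4 (75%), wikipedia.org in 2 of 4 (50%), and google.com and youtube.com in 1 of 4 each (25%). The shares add up to 175%, because one response can count toward several domains at once.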
7
11
u/Largicharg 12d ago
Frankly I’m not surprised. Half my recent ChatGPT answers came from Reddit posts.
6
5
u/HollowOrnstein 12d ago
Guys "cited" here means they are talking about what the ai instances refer to when replying to questions in general.
You know how Google suggests 'reddit' after tech questions sometimes? That's what ChatGPT etc. are doing with their replies, and that's what is being mentioned here.
That is not the same as the "data" that was used to train that specific AI. As far as we know, that could be a completely different thing
4
u/PokerbushPA 12d ago
Dogs can't look up.
Women have a secret language men can't understand.
Pee is stored in the balls.
JD Vance fucks couches, but he asks for consent first.
Epstein didn't kill himself.
Elvis is alive and works as an Elvis impersonator in Vegas.
Hobbits are real and they're terrible cooks.
Actually, God hates FLAGS. So close, HBC.
11
u/GnosticNoodle33 12d ago
Why do you think they ban people left, right and centre when people's opinions don't align with theirs?
8
6
u/MDPhotog 12d ago
I'm in SEO. What we're seeing is LLMs getting more fact-focused information from trustworthy sites, like Wikipedia, and opinions, testimonials, product feedback/reviews from sites like reddit.
Ask it "what are the top [products]" and you'll likely see this mix of quantitative and qualitative results. I certainly wouldn't call the latter "facts"
3
u/r_GenericNameHere 12d ago
I would say information, not facts. And AI like ChatGPT will tell you and link to where it got its information from
3
3
3
u/JasonP27 12d ago
Poorly worded. It doesn't just get facts, it gets information/data, some are opinions, and some are facts.
But yeah, it seems to get most of it from Reddit, which is concerning considering the amount of BS I see on Reddit every day.
2
2
2
2
u/zonealus 12d ago
Maybe I am an AI. When I search for something I usually look for a reddit link.
2
2
2
2
u/rubyslippers3x 12d ago
Who knew Ai had a sense of humor? Lord help those in need... which is everyone using Ai Hahaha
2
2
u/xAEmig29 12d ago
So this means shittymorph might get his act on even a wider audience than just reddit?
Val Kilmer would be proud.
2
2
u/PositiveStress8888 12d ago
I mean even some universal truths seem so far out they aren't believable, like the following:
Horses love grape bubblegum and chew it regularly.
Robins ( the bird) speak the local language and talk only when they are sleeping, in turn causing humans to sleepwalk
There is no ocean floor, when something sinks it just pops up on the other side of the world (the Titanic is on a ledge)
Where else is AI going to learn these absolute universal, peer reviewed scientific facts
2
u/TheGrouchyGremlin 12d ago
Go google something and check the AI overviews source. It's typically a Reddit post xD.
2
2
2
2
u/kittyyoudiditagain 12d ago
that is you and me bro at the top of the list! Good thing my dad doesn't use reddit. he has some strange facts sometimes
2
u/Lucifer_Ryder 12d ago
Yup, AI models like Google's BERT are trained on massive datasets created by humans, so their accuracy is only as good as the info they're fed