OpenAI strikes Reddit deal to train its AI on your posts

return2ozma@lemmy.world · 1 year ago

myliltoehurts@lemm.ee · 1 year ago

So they filled reddit with bot generated content, and now they’re selling back the same stuff likely to the company who generated most of it.

At what point can we call an AI inbred?

jordanlund@lemmy.world · 1 year ago

BRB - changing my entire 15 year reddit comment history to “Fuck Spez”. LOL.

return2ozma@lemmy.world · 1 year ago

Know any bots or ways to perma delete all Reddit comments?

thejml@lemm.ee · 1 year ago

Reddit has backups, permanently isn’t an option.

db2@lemmy.world · 1 year ago

They don’t keep multiple versions, though: edit it and then delete it and it’s gone. They disabled all the tools to do it, though, so it’s manual or nothing now.

SchmidtGenetics@lemmy.world · 1 year ago

They just reload a previous cached comment, doesn’t matter how many times you edit or delete, it’s all logged and backed up.

bobs_monkey@lemm.ee · edit-2 1 year ago

I used redact.dev to mass edit all my comments, worked pretty well. Problem is that if you mass delete, they’ll restore them pretty quick, but so far they haven’t reverted my edits.

catloaf@lemm.ee · 1 year ago

https://github.com/j0be/PowerDeleteSuite

Rolando@lemmy.world · 1 year ago

Back when I deleted all my comments, I was told I could claim to be in Europe and make a request citing the European law that Reddit has to follow. I think Reddit had a page where you could make the request, but of course it was hard to find.

micka190@lemmy.world · 1 year ago

Realistically, when you’re operating at Reddit’s scale, you’re probably keeping a history of each comment for analytics purposes.

AlexWIWA@lemmy.ml · 1 year ago

LLMs have been training on Reddit posts since at least 2012. Nothing really new here.

UnderpantsWeevil@lemmy.world · 1 year ago

It’s ground zero for Bots training on other Bots

YIj54yALOJxEsY20eU@lemm.ee · 1 year ago

Now they get to train on all the “deleted” comments/posts as well.

SparrowRanjitScaur@lemmy.world · edit-2 1 year ago

Probably not, I’m sure they’re training on Reddit’s internal data set which likely includes all deleted posts.

YIj54yALOJxEsY20eU@lemm.ee · 1 year ago

Did you just say probably not then agree with me?

SparrowRanjitScaur@lemmy.world · edit-2 1 year ago

Ya, lol. Sorry, I’m not sure if I replied to the wrong comment or just misread your comment earlier. I agree with you.

YIj54yALOJxEsY20eU@lemm.ee · 1 year ago

Lol no worries

Everythingispenguins@lemmy.world · 1 year ago

Some day historians will be able to look back at this moment and be able to determine it was what caused ChatGPT to become horny and weird.

assassin_aragorn@lemmy.world · 1 year ago

Only an idiot would decide to mindlessly trawl Reddit to train an LLM. They’ll be confused when their model suddenly is confidently wrong about everything and have no clue.

Everythingispenguins@lemmy.world · 1 year ago

You are a hundred percent right, but how many idiots are there out there?

assassin_aragorn@lemmy.world · 1 year ago

Uncountably many

Everythingispenguins@lemmy.world · 1 year ago

Sadly looks like we have an answer

https://lemmy.world/post/15712886

frickineh@lemmy.world · 1 year ago

My comment history was like 50% shitposting about the beauty industry and 50% hating on Christian fundamentalists. There’s honestly no way it won’t make AI at least a little bit worse, and I’m not mad about it.

Flying Squid@lemmy.world · 1 year ago

That AI is going to be super anti-Christian fundamentalist (or possibly just anti-Christian), so maybe there is an upside.

filister@lemmy.world · edit-2 1 year ago

What makes you think that they are not scraping Lemmy too? The only reason they might not be is probably how niche Lemmy and the fediverse are, but I am sure there have been people already doing it.

Dr. Moose@lemmy.world · 1 year ago

Fediverse is designed to do exactly that. It’s free flow of information which is a good thing. Don’t let corporations hijack this beautiful concept. We all want information to be free.

olympicyes@lemmy.world · 1 year ago

I’m not mad about the scraping. The LinkedIn scraping case pretty much cemented that there was nothing that could be done to stop it. I’m just mad that I can no longer use the app of my choice. No such problem with Lemmy.

AlexWIWA@lemmy.ml · 1 year ago

Lemmy is even easier to scrape. Just set up your own instance, then read the database after ActivityPub pushes everything to you.

boatsnhos931@lemmy.world · 1 year ago

No wonder AI is crazy AF.

macrocephalic@lemmy.world · 1 year ago

All future AI will have autocorrect errors and will look like no one read it before hitting enter. You’re welcome.

boatsnhos931@lemmy.world · 1 year ago

No one says thank you, we already have that. WAIT JUST A GOT DAMN MINUTE!! YOU ARE ONE OF THEMS!!

Dr. Moose@lemmy.world · 1 year ago

This form of propaganda is my pet peeve. It’s not “your posts”. As soon as you put something out in public, you don’t get to have your cake and eat it too. It’s out there, you shared it. Don’t share it if you don’t want humanity to ingest and use it.

Azzu@lemm.ee · edit-2 1 year ago

It’s not about it being used to train AI. It’s about the AI either not being open source/I don’t get access to it (i.e. not benefitting me) or reddit being paid for my comments (i.e. also not benefitting me).

If this AI training would get me or the public access to the AI, or I would be paid for my comments instead of Reddit, I’d be fine with it.

Dr. Moose@lemmy.world · edit-2 1 year ago

Yeah, but you don’t get to choose that. You give away that right as soon as you participate in public discourse. It’s a zero sum game - either it’s public for everyone or for no one.

Don’t get me wrong, Reddit is a bitch but I think people want to cut their noses off to spite their faces here. It’s much more important to have free information flow than to fuck reddit.

My fear is that people will vote in some really dumb rules to spite AI and restrict free information flow accidentally.

Azzu@lemm.ee · edit-2 1 year ago

That’s how it is currently and maybe also your opinion. But that doesn’t mean it has to be like that in a society. It’s your opinion that everything public can go private at any time (training proprietary private AI), but we can decide as a society that’s not how we want to do things. We can require stuff that used public data to be public as well.

And yeah, I kinda get to choose that. In a democratic society, anything that the public (i.e. including me) decides, goes. Of course, if there are people like you that don’t want stuff trained on public data to be required to be public, democracy will also work in the sense that we don’t get that, as it is currently.

Dataprolet@lemmy.dbzer0.com · 1 year ago

You’re technically right, but nobody anticipated, and therefore nobody agreed to, their posts being used for training LLMs.

SparrowRanjitScaur@lemmy.world · 1 year ago

Public information is public information.

Dataprolet@lemmy.dbzer0.com · 1 year ago

Oh boy, do I have bad news for you. You ever heard of copyright?

Mastengwe@lemm.ee · 1 year ago

Isn’t this news like every month?

db2@lemmy.world · 1 year ago

Not my posts. Go ahead, look at what remains. The rest was edited and then deleted.

Fuck you, Steve. Right in the ass.

RizzRustbolt@lemmy.world · 1 year ago

Those poor silicon atoms…

villainy@lemmy.world · 1 year ago

“Strikes” made me think they were cancelling the deal. Like strike-through, crossed it out, etc. Too bad.

Kyrgizion@lemmy.world · 1 year ago

I didn’t delete my comments before nuking my account, but I’m pretty sure the grand majority were shitposts containing ample amounts of smut, gore and other ridiculous over the top shit. So I consider this a win.

LilDestructiveSheep@lemmy.world · 1 year ago

Gonk

FJT@lemmy.world · 1 year ago

So it’s going to be a libtarded libtard AI that doesn’t represent the majority of the people, got it.

Seasoned_Greetings@lemm.ee · 1 year ago

The beauty of being here on lemmy is that I genuinely can’t tell whether you said this because you’re far right or because you’re far left

Stupid opinion either way. That AI is going to catch its share of r/conservative idiots and be a nice blend of ignorance.

Reddit’s deal with OpenAI will plug its posts into “ChatGPT and new products”