Maven Imported 1.12 Million Fediverse Posts

Sean Tilley@lemmy.world · 9 months ago

lunarul@lemmy.world · 9 months ago

I was confused why a package manager would need to import posts from a social network.

Why name a new product the same as a very popular existing product?

GBU_28@lemm.ee · 9 months ago

I mean maven is super bloated so it wouldn’t surprise me

verstra@programming.dev · 9 months ago

Oh shit, the persona guy was right! We should all be adding a license to our comments, so they could not legally train models that are then used for commercial purposes.

GBU_28@lemm.ee · 9 months ago

Lol that shit don’t do shit

Pennomi@lemmy.world · 9 months ago

The easiest way is a sitewide NoAI meta tag, since it’s the current standard. Researchers are much more likely to respect a common standard and extremely unlikely to respect a single user’s personal solution adding a link to their comments.
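
For illustration, a sitewide tag of that kind usually looks like `<meta name="robots" content="noai, noimageai">`. The `noai` token is a convention rather than a formal specification, so treat the exact spelling as an assumption. Below is a minimal Python sketch of how a well-behaved crawler might check a page for it; the class and function names are hypothetical and not taken from any real crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            token = part.strip().lower()
            if token:
                self.directives.add(token)


def page_opts_out_of_ai(html: str) -> bool:
    """Return True if the page carries the assumed 'noai' robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noai" in parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
    print(page_opts_out_of_ai(sample))
```

A check like this only helps with scrapers that choose to honor the tag; it does nothing to stop one that ignores it.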

iAvicenna@lemmy.world · 9 months ago

I feel like the bad thing about this is, whereas the researchers will mostly respect this, companies who want to make money out of data will still secretly keep using the data anyways. I am more ok with the data being used for non-profit research and not for making money but this would likely have the opposite effect.

Pennomi@lemmy.world · 9 months ago

If that’s truly the case, nothing on earth can protect your data.

That being said, large corporations are far more liable to consumer protection lawsuits, especially in areas like the EU.

iAvicenna@lemmy.world · 9 months ago

They also have enough lawyer power to find loopholes

onlinepersona@programming.dev · 9 months ago

It’s especially for these kinds of dumb cases where they simply copy content wholesale and boast about it. With more people licensing their content as non-commercial, the “hot water” these companies get into could be not just trivial but actually legal.

Would be great if web and mobile clients supported signatures or a “licence” field from which signatures were generated. Even better would be if people smarter than me added a feature to poison AI training data. This could also be done by a signature or some other method.

Anti Commercial-AI license

Larry@lemmy.world · 9 months ago

Am I misunderstanding this, or did they just fuck up the integration so it’s one way with a plan to make it two ways after, and the AI alteration is just sentiment analysis on whatever they took?

Sean Tilley@lemmy.world · 9 months ago

They kind of fucked up everything in approaching this by not talking to the community and collecting feedback, making dumb assumptions about how the integration was supposed to work, leaking private posts, running everything through their AI system, and neglecting to represent the remote content as having come from anywhere else.

The other thing is that Maven’s whole concept is training an AI over and over again on the platform’s posts. Ostensibly, this could mean that a lot of Fediverse content ended up in the training data.