The network behind Stack Overflow no longer sends data to the Internet Archive


Stack Exchange has announced that in future it will offer its data dumps only through its own sites. The network of question-and-answer websites, which includes the developer site Stack Overflow, is thereby ending its previous partnership with the Internet Archive.


The operators announced the change on July 12 and gave more detailed reasons two weeks later in an update. Stack Exchange says its primary aim is to ensure that providers of large language models (LLMs) comply with the guidelines for socially responsible AI published on the Stack Overflow blog in January.

The most recent release to the Internet Archive took place in April 2024. Publishing the dumps there made it possible to use the material largely without restriction under the CC BY-SA 4.0 license.

In future, however, downloading the data from Stack Exchange will require agreeing to the following additional clause: “I understand that this file is being made available to me for my own use and for projects that do not involve the training of large language models (LLMs). Should I distribute this data for the training of LLMs, Stack Overflow reserves the right to deny me access to future downloads of this data dump.”

This clause has already caused some dissatisfaction among users. A post on Stack Exchange points out that the CC BY-SA 4.0 license explicitly prohibits imposing additional restrictions on licensed material.

LLMs are a challenge for Stack Overflow because they threaten the network’s fundamental business model. Anyone who used to search Stack Overflow for code examples that solve a specific problem now often asks GitHub Copilot or ChatGPT instead.

In late 2022, Stack Overflow also banned content created with ChatGPT from its own platform, arguing that such content was potentially harmful to the site and its users.

However, in February the company announced a strategic collaboration with Google on generative AI and, in May 2024, a partnership with OpenAI, the company behind ChatGPT and behind Codex, the basis for GitHub Copilot.

A Stack Exchange member with an extremely high reputation (the network’s rating scale) writes in a post: “They (Stack Exchange, editor’s note) do not say that they continue to sell image data for generative AI, they just don’t want the companies behind GenAI to get it for free.”

For users, the change means in the short term that the regular dump will not appear on the Internet Archive in July. Because the changeover still needs some development time, the dump will probably not be available on the company’s own network until mid-August, according to Stack Exchange.


(RME)
