US platform operator Meta is pausing the training of its AI models with EU data until further notice. In doing so, the company is complying with a request from the Irish data protection authority, which initially intended to approve the project but, following protests, now sees a need for further discussions.
Meta, the operator of Facebook, Instagram, Threads and WhatsApp, announced on Friday that it would not train its large language models (LLMs) with data from Facebook and Instagram for the time being. “We are disappointed by the request of the Irish Data Protection Commission (DPC Ireland), as the lead supervisory authority for European supervisory authorities, to suspend the training of our LLMs with content shared by our adult users from Instagram and Facebook,” the company said in an update to the project details it had published four days earlier. Meta says the supervisory authorities were informed of the plans as early as March.
Facebook takes on Google and OpenAI
The Zuckerberg empire believed it was already well on its way to working with large amounts of user data from the EU and the European Economic Area, the territory in which the GDPR applies. In its transparency report under the DSA, the company stated that, as of April, 260.7 million EU citizens use Facebook every month. Instagram has 264.3 million monthly users, 44 million of them in Germany. Meta now wants to feed its AI offerings with large amounts of content: Llama, for example, but also the so-called Meta AI assistant. The company argues that adaptation to local conditions is only possible with EU data and that Google and OpenAI have already used EU data for this purpose.

Meta sees itself as a model student: data from minors’ profiles is consistently filtered out, and direct messages between users are not used for training. Since May 22, more than two billion notifications have been sent to users to inform them about the plans and to give them the opportunity to object to the use of their own data via an objection form. According to Meta, what is also at stake is whether the EU can really make use of AI or whether it merely wants to watch as a spectator while the rest of the world benefits from AI innovations. Meta wants Europeans to share in these opportunities.
Meta argues that data processing for AI training does not require separate consent. In the company's view, it is permissible as a “legitimate interest” within the meaning of the General Data Protection Regulation. Data protection activists such as the Austrian organisation NOYB see things differently and have therefore filed complaints against Meta’s plans with other data protection supervisory authorities outside Ireland. The organisation, founded by Max Schrems, had already criticised the opt-out form weeks earlier.
Data for AI training is becoming scarce
The Irish data protection regulator welcomed Meta’s announcement that it would stop using the data for the time being. “This decision was made following intensive discussions between the DPC and Meta,” the authority said. It will continue its discussions with the company, together with other European supervisory authorities. The authority did not say on Friday what exactly had caused the DPC’s change of heart.
It is hard to verify whether Meta is really concerned with economic and social welfare in the EU. A completely different reason for its plans to evaluate data on a large scale for LLM training may lie in a problem that all major AI model developers currently face: in just a few years, training data for huge models may become scarce. According to an assessment by scientists published in April 2022, this problem, still an intangible one looming on the horizon in 2024, has become more intense.
