A data protection taskforce has spent over a year analyzing how the European Union’s data protection rules apply to OpenAI’s popular chatbot, ChatGPT. On Friday, the taskforce released its preliminary findings, revealing that it remains undecided on key legal issues, such as the lawfulness and fairness of OpenAI’s data processing practices.
This matter is crucial because violations of the EU’s privacy regulations can result in penalties of up to 4% of a company’s global annual revenue. Watchdogs can also order non-compliant processing to cease, posing significant regulatory risks for OpenAI in the region, especially as dedicated AI laws are still years away from full implementation.
Without clear guidance from EU data protection authorities on how current laws apply to ChatGPT, OpenAI may continue its operations as usual, despite numerous complaints alleging that its technology violates the General Data Protection Regulation (GDPR).
For instance, Poland’s data protection authority opened an investigation following a complaint that ChatGPT fabricated information about an individual and refused to correct the errors. A similar complaint was lodged in Austria.
The GDPR applies whenever personal data is collected and processed, and developers of large language models such as OpenAI’s GPT scrape data from the internet on a massive scale, including social media posts, to train them. The EU regulation empowers data protection authorities (DPAs) to stop any non-compliant processing, potentially affecting how ChatGPT operates in the region.
Last year, Italy’s privacy watchdog temporarily banned OpenAI from processing the data of local users, leading to a brief shutdown of ChatGPT in the country. The service resumed after OpenAI made changes in response to the DPA’s demands. However, the Italian investigation into OpenAI’s legal basis for processing data continues, keeping ChatGPT under a legal cloud in the EU.
To process data legally under the GDPR, entities must have a valid legal basis. The Italian DPA has instructed OpenAI that it cannot rely on contractual necessity, leaving it with consent or legitimate interests (LI) as possible bases. OpenAI has since claimed LI for processing personal data for model training. However, Italy’s DPA found in January that OpenAI violated the GDPR, though details of the findings have not been disclosed.
The taskforce’s report emphasizes that OpenAI needs a valid legal basis for every stage of ChatGPT’s data processing. It highlights the risks associated with web scraping, which can ingest large volumes of personal data, including sensitive information. The taskforce suggests that implementing safeguards and limiting data collection could help balance OpenAI’s legitimate interests with individuals’ privacy rights.
The report also stresses the importance of transparency and fairness, noting that privacy risks cannot be shifted to users. It calls for clear information on the use of data for training and compliance with the GDPR’s accuracy principle, particularly given ChatGPT’s tendency to generate inaccurate information.
The taskforce’s work, initiated after Italy’s intervention in 2023, aims to streamline the enforcement of EU privacy rules on emerging technologies. However, DPAs remain considerably uncertain about how to address ChatGPT, which has delayed enforcement.
OpenAI recently established an EU operation in Ireland, potentially benefiting from the country’s business-friendly regulatory approach. The Irish Data Protection Commission (DPC) now serves as OpenAI’s lead supervisor for GDPR oversight, centralizing the handling of cross-border complaints.
The taskforce’s report does not offer a definitive solution but suggests measures to improve compliance with the GDPR. OpenAI was contacted for a response but had not replied at the time of publication.