Google Sued for Stealing User Data to Train AI: "Everything Ever Created and Shared on the Internet"
Google, along with its parent company Alphabet and AI subsidiary Google DeepMind, is facing a lawsuit accusing the tech giant of unlawfully scraping data from millions of users without their consent. The proposed class action, filed by Clarkson Law Firm in federal court in California on July 11, 2023, alleges that Google violated copyright laws and used stolen data to train and develop its artificial intelligence (AI) products, per CNN Business.
Data theft and copyright violations
The complaint filed against Google claims that the company has been covertly stealing “everything ever created and shared on the internet by hundreds of millions of Americans.” This data, including creative and copyrighted works, is allegedly being used to train Google's AI products, such as the chatbot Bard. The lawsuit argues that Google has essentially taken the entirety of users' digital footprints to build its AI tools.
Response to the complaint—Google's revised privacy policy
Halimah DeLaine Prado, Google’s general counsel, told CNN in response to the complaint, “American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims.” Prado's response draws attention to a recent revision of Google's privacy policy that gives Google more latitude to train and build systems, not only LLMs, on public data. According to Prado, training AI models for applications like Google Translate with public data sets is in line with Google's AI principles, and the update simply underscores that newer services like Bard also adhere to this practice.
Representatives of Alphabet and DeepMind have not yet responded to requests for comment on the lawsuit. It remains to be seen how the tech giant will address the allegations and defend its practices.
Another lawsuit against Google
Gannett, the largest newspaper publisher in the United States, is also suing Google and its parent company, Alphabet, claiming that the search giant is monopolizing the digital ad market with its advanced AI technologies, per The Verge. Products like Google’s AI search beta have been accused of driving traffic away from websites and have been dubbed “plagiarism engines.”
Growing legal scrutiny
The Clarkson lawsuit comes at a time when AI tools have gained significant attention for their ability to generate written content and images based on user prompts. However, these tools, built on large language models, are drawing increasing legal scrutiny over potential copyright issues and their use of personal and sensitive data from users, including children.
Data ownership and compensation
The attorneys bringing the suit against Google argue that personal data and information are the property of individuals and have significant value. They assert that no entity has the right to take and use such data for any purpose without explicit consent. The lawsuit seeks injunctive relief, including a temporary freeze on commercial access to and development of Google's generative AI tools, as well as unspecified damages and compensation for individuals whose data was allegedly misappropriated. The law firm says it has eight plaintiffs, including a minor.
Balancing access and privacy
The attorneys suggest that Google should offer users an opportunity to opt out of having their data used for AI training while still being able to utilize the internet for their everyday needs. They highlight the contrast between Google's traditional indexing of online data for search engine purposes, which can drive traffic and revenue for content creators, and the alleged data scraping for AI training, which creates alternative versions of works and potentially diminishes incentives for purchasing them.
Unforeseen use of personal data
While internet users have grown accustomed to their data being collected for search results and targeted advertising, the lawsuit argues that the use of personal data for AI training was not anticipated. The attorneys assert that people could not have imagined their information being utilized in this manner, suggesting that greater transparency and control are necessary.