Apple's New AI Technology, Apple Intelligence: Controversy over Ethical Data Sourcing for Training Models

Brussels, Belgium
• Apple announced new generative AI technology called Apple Intelligence
• Apple's lack of transparency regarding data sources frustrates creatives
• Controversy over ethical data sourcing for training models
• Historically, companies have used internet-sourced data without consent or compensation, leading to lawsuits
• Some reports suggest Apple entered licensing deals with image databases like Shutterstock and Photobucket for training data

Apple's recent announcement of Apple Intelligence, its new generative AI technology, has sparked controversy within the creative community. Many artists and creatives are concerned that Apple may not have adhered to ethical standards when sourcing content to train its AI models. Historically, companies have used vast amounts of internet-sourced data for AI training without proper consent or compensation, leading to lawsuits.

Apple's lack of transparency regarding its data sources has left some members of the creative community feeling frustrated. Generative AI models need vast amounts of data to function effectively, often obtained from the public internet without consent or compensation. Some artists have filed IP infringement lawsuits against AI developers for using their work without permission or payment.

Under co-founder Steve Jobs, Apple built a reputation for its strong relationship with creatives. However, the company has remained quiet about its data sourcing practices for Apple Intelligence. Some reports suggest that Apple has entered licensing deals with extensive image databases like Shutterstock and Photobucket to obtain training data.

The controversy surrounding Apple Intelligence is not a new issue in the tech industry. With the rapid growth of commercial AI, transparency regarding data gathering and usage has become increasingly important. Companies must be forthcoming about their methods to maintain trust with their customers and avoid potential legal issues.



Confidence

80%

Doubts
  • Have any lawsuits been filed against Apple regarding data sourcing for AI?
  • Is Apple transparent about its data sources for Apple Intelligence?
  • What specific image databases has Apple entered licensing deals with?

Sources

85%

  • Unique Points
    • Apple collects data using a tool called AppleBot, but the company has remained quiet about how it trains its generative AI models.
    • Apple has entered licensing deals with extensive image databases like Shutterstock and Photobucket, but the company hasn’t publicly confirmed these reports.
  • Accuracy
    • Apple has entered licensing deals with extensive image databases like Shutterstock and Photobucket, but the company hasn’t publicly confirmed these reports.
    • Apple allows web publishers to opt out of having their content used for AI training, but there’s no straightforward process for removing previously gathered data or making an AI model forget specific information (see the robots.txt sketch after this source’s breakdown).
  • Deception (70%)
    The article contains editorializing and selective reporting. The author expresses their opinion that Apple's lack of transparency regarding its AI training data is a PR violation and a breach of trust with its customers, particularly given the company's long-standing stance on privacy. However, the author does not provide any evidence to support these claims beyond anecdotal statements from one individual and speculation about potential lawsuits against other companies. The article also selectively reports on Apple's lack of transparency without mentioning that the company has stated it collects data like everyone else and uses a tool called AppleBot for data collection, which is supposed to be more privacy-friendly. The author also fails to disclose any sources for their information.
    • It's an even bigger PR violation when considering Cupertino’s stance on privacy and that Apple has long positioned itself as the artist’s best tool.
    • However, with the rapid growth of commercial AI, that’s not good enough.
    • The only thing it has reported on the subject is that it collects data like everybody else, using a tool it calls AppleBot, which is supposed to be more privacy-friendly.
  • Fallacies (80%)
    The author makes an appeal to authority by quoting Jon Lam and Tim Cook. However, the quotes do not directly relate to the fallacy being discussed (lack of transparency in Apple's AI data gathering). The author also uses inflammatory rhetoric by stating that Apple's lack of transparency 'could not have come at a worse time' and that it is a 'PR violation'.
    • I wish Apple would have explained to the public in a more transparent way how they collected their training data.
  • Bias (90%)
    The author expresses a negative opinion towards Apple's lack of transparency in their AI training data gathering and calls out Apple for potentially infringing on artists' intellectual property rights. The author also mentions the negative feedback from some of Apple's most passionate supporters and industry players, implying that this is a PR issue for the company.
    • However, with the rapid growth of commercial AI, that's not good enough. One would think that with Apple’s slow roll on AI, it would have learned that the climate on information harvesting for generative model training has been and continues to be chilly.
    • It’s an even bigger PR violation when considering Cupertino’s stance on privacy and that Apple has long positioned itself as the artist’s best tool. The company charges a premium for its high-end production platforms that millions of creative users swear by. Tarnishing its reputation with unscrupulous data collection is the last thing it needs.
    • More than a few artists have filed IP infringement lawsuits against AI developers for using their work without permission or payment – over a dozen by Engadget’s count. Infringement lawsuits against AI providers have popped up from prominent industry players like The New York Times and Universal to the most minor independent artists.
    • The only thing it has reported on the subject is that it collects data like everybody else, using a tool it calls AppleBot, which is supposed to be more privacy-friendly.
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (100%)
    None Found At Time Of Publication
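
A brief technical note on the opt-out point flagged under Accuracy above: Apple's public crawler documentation describes a secondary user-agent token, Applebot-Extended, that site owners can disallow in robots.txt to keep content fetched by Applebot out of generative-model training. That token is not named in the source articles, so the exact name and behavior here are an assumption drawn from Apple's documentation rather than from this coverage. A minimal, site-wide opt-out sketch would look roughly like this:

    # Assumed robots.txt entry: intended to opt the whole site out of use for
    # Apple's AI-model training while leaving ordinary Applebot search crawling alone.
    User-agent: Applebot-Extended
    Disallow: /

Consistent with the Accuracy item, a directive like this would only govern future crawls; it would not remove data already gathered or make a trained model forget it.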

88%

  • Unique Points
    • Creatives have historically been loyal customers of Apple but are frustrated by the lack of transparency regarding data sources.
    • Generative AI models need vast amounts of data to function, often obtained from the public internet without consent or compensation.
    • Some artists have filed IP infringement lawsuits against AI developers for using their work without permission or payment.
  • Accuracy
    • Apple introduced new Apple Intelligence features at WWDC, including text summarization, natural conversations with Siri, and image and emoji generation.
    • Apple has entered licensing deals with extensive image databases like Shutterstock and Photobucket.
    • Developers need hundreds of millions or even billions of data samples for their models to remain competitive.
  • Deception (100%)
    None Found At Time Of Publication
  • Fallacies (85%)
    The article contains several instances of appeals to authority and inflammatory rhetoric. The author quotes several creatives expressing their concerns about Apple's lack of transparency regarding its AI models and where they get their data from. However, the author also makes assumptions about Apple's actions without providing concrete evidence, such as stating that 'Apple is trying to shove a huge privacy risk and tech that screams scraped off the internet without consent to the public.' This is an inflammatory statement that goes beyond what has been established in the article. Additionally, there are several appeals to authority when the author quotes creatives expressing their concerns and when they reference lawsuits against other companies for using copyrighted works without permission. While these sources add credibility to the article's claims, they do not necessarily prove that Apple is engaging in unethical practices.
    • Apple is trying to shove a huge privacy risk and tech that screams scraped off the internet without consent to the public.
    • The same day it showed off Apple Intelligence at WWDC, Apple posted an article on its Machine Learning Research blog explaining that it trains on licensed data, but it hasn’t said much beyond that.
    • Many felt that Apple, of all companies, would do better and still believe it should step up to the plate and be open about the sources of data used to train its AI models.
  • Bias (80%)
    The article expresses a clear bias against Apple for not being transparent about where they are getting the data for their AI models. The author uses language that depicts Apple as profiting from other people's hard work and violating intellectual property rights. The author also implies that Apple is not doing enough to address these concerns.
    • After all, there are billions of images freely available on the internet, but just because humans can find them and look at them doesn’t mean it’s appropriate to use them to train AI models – any more than it would be to grab a photographer’s image from their website and use it somewhere else without their express permission.
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (0%)
    None Found At Time Of Publication

93%

  • Unique Points
    • Many artists and creatives are concerned that Apple may not have adhered to ethical standards in sourcing content for training its AI.
    • Historically, companies have used vast amounts of internet-sourced data for training AI without proper consent or compensation, leading to lawsuits.
  • Accuracy
    • Apple faces scrutiny over AI transparency from artists and creatives.
    • Apple has reportedly signed licensing deals with Shutterstock and Photobucket but has not confirmed these arrangements publicly.
    • Generative AI relies heavily on extensive datasets for training and often involves violations of privacy and intellectual property rights.
  • Deception (100%)
    None Found At Time Of Publication
  • Fallacies (100%)
    None Found At Time Of Publication
  • Bias (80%)
    The author expresses a clear bias towards the concerns of artists and creatives regarding Apple's use of data for training its AI models. The author quotes several individuals in the creative community expressing their disappointment and frustration with Apple's lack of transparency. The article also mentions numerous lawsuits against other companies for similar practices, further emphasizing the issue.
    • I wish Apple would have explained to the public in a more transparent way how they collected their training data.
    • The bottom line is, we know [that] for generative AI to function as is, [it] relies on massive overreach and violations of rights, private and intellectual.
    • The entire creative community has been betrayed by every single software company we ever trusted.
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (100%)
    None Found At Time Of Publication