OpenAI's Media Manager: Empowering Creators to Control AI Use of Their Works

San Francisco, California, United States of America
  • Creators can identify their works and specify how they want them included or excluded from AI research and training through Media Manager.
  • Media Manager is expected to be introduced in 2025.
  • OpenAI allows artists to opt out of having their work used in image-generating models and lets website owners indicate via the robots.txt standard whether their content can be scraped for AI model training.
  • OpenAI has faced criticism for scraping publicly available data from the web, including a recent lawsuit by eight prominent US newspapers.
  • OpenAI is developing a tool called Media Manager to enable creators and content owners to control how their works are used in AI research and training.
  • Some creators have described OpenAI's opt-out workflow for images as onerous and criticized the relatively little payment they receive.
  • Third parties are attempting to build universal provenance and opt-out tools for generative AI, such as Spawning AI, Steg.AI, and Imatag.

OpenAI, a leading artificial intelligence (AI) research laboratory, is developing a tool called Media Manager to enable creators and content owners to control how their works are used in AI research and training. The tool aims to address concerns raised by creators whose content has been used for model training without their consent. OpenAI has drawn criticism for scraping publicly available data from the web and was recently sued by eight prominent US newspapers over the practice.

Media Manager is expected to be introduced in 2025 and will allow creators and content owners to identify their works and specify how they want them included in or excluded from AI research and training. OpenAI's goal is to have a standard in place by then, possibly through the Coalition for Content Provenance and Authenticity (C2PA), whose steering committee the company recently joined.

OpenAI has taken steps to meet content creators halfway by allowing artists to opt out of having their work used in image-generating models and letting website owners indicate via the robots.txt standard whether their content can be scraped for AI model training. The company also continues to ink licensing deals with large content owners, including news organizations, stock media libraries, and Q&A sites like Stack Overflow.
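For example, a site owner who does not want OpenAI's crawler to collect pages for model training can publish rules like the following in the site's robots.txt file. GPTBot is the user agent OpenAI documents for this purpose; the rule shown is only illustrative, and because robots.txt is advisory, the opt-out depends on crawlers choosing to honor it.

    # Opt the entire site out of OpenAI's training crawls
    User-agent: GPTBot
    Disallow: /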

Some creators have described OpenAI's opt-out workflow for images as onerous and say the payment they receive is relatively small. Partly in response, third parties are attempting to build universal provenance and opt-out tools for generative AI. Spawning AI, for instance, offers an app that identifies and tracks bots' IP addresses to block scraping attempts, as well as a database where artists can register their works to disallow training by vendors who choose to respect those requests. Steg.AI and Imatag help creators establish ownership of their images by applying watermarks imperceptible to the human eye, while Nightshade, a project from the University of Chicago, poisons image data to render it useless or disruptive for AI model training.
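The watermarking products mentioned above rely on proprietary and far more robust techniques, but the basic idea of an imperceptible mark can be sketched with a toy least-significant-bit scheme. The Python below is purely illustrative: the tag string and function names are made up for this example, and a mark this naive would not survive compression, resizing, or cropping.

    # Toy illustration of an imperceptible watermark: hide a short ASCII tag in the
    # least-significant bits (LSBs) of an image's pixel values.
    import numpy as np

    def embed_tag(pixels: np.ndarray, tag: str) -> np.ndarray:
        """Write `tag` into the LSBs of the first len(tag) * 8 pixel values."""
        bits = np.unpackbits(np.frombuffer(tag.encode("ascii"), dtype=np.uint8))
        flat = pixels.flatten()  # flatten() returns a copy, so the input is untouched
        flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # overwrite each LSB
        return flat.reshape(pixels.shape)

    def read_tag(pixels: np.ndarray, length: int) -> str:
        """Recover a `length`-character tag from the LSBs."""
        bits = pixels.flatten()[: length * 8] & 1
        return np.packbits(bits).tobytes().decode("ascii")

    image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
    marked = embed_tag(image, "artist:jane-doe")
    assert read_tag(marked, len("artist:jane-doe")) == "artist:jane-doe"

Each pixel value changes by at most one step, which the eye cannot see; production systems instead spread a redundant signal across the whole image so the mark survives common edits.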

OpenAI's new tool is a response to growing criticism of its approach to developing AI, which relies heavily on scraping publicly available data from the web. The company argues that fair use shields this practice; the artists, writers, and publishers now suing it disagree.

As OpenAI works on Media Manager and other solutions to address content creators' concerns, the debate around ethically sourced training data continues to gain momentum. Some advocates argue for a regime where AI companies only train algorithms on data with explicit permission from creatives and rights holders.



Confidence

91%

Doubts
  • Is OpenAI's argument that fair use shields its practice of scraping public data and using it for model training valid?
  • Will Media Manager be effective in addressing creators' concerns about data usage?

Sources

100%

  • Unique Points
    • OpenAI is developing a tool called Media Manager to let creators control how their content is used in AI research and training.
    • OpenAI has faced criticism for scraping publicly available data from the web for model training, including a recent lawsuit by eight prominent US newspapers.
    • OpenAI plans to introduce additional choices and features over time.
    • OpenAI continues to ink licensing deals with large content owners, including news organizations, stock media libraries and Q&A sites like Stack Overflow.
    • Some creators have described OpenAI’s opt-out workflow for images as onerous.
    • Startup Spawning AI offers an app that identifies and tracks bots’ IP addresses to block scraping attempts and a database where artists can register their works to disallow training by vendors who choose to respect the requests.
    • Steg.AI and Imatag help creators establish ownership of their images by applying watermarks imperceptible to the human eye.
    • Nightshade, a project from the University of Chicago, poisons image data to render it useless or disruptive to AI model training.
  • Accuracy
    No Contradictions at Time Of Publication
  • Deception (100%)
    None Found At Time Of Publication
  • Fallacies (100%)
    None Found At Time Of Publication
  • Bias (100%)
    None Found At Time Of Publication
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (100%)
    None Found At Time Of Publication

96%

  • Unique Points
    • OpenAI has developed a new image detection classifier to determine if an image was generated by its DALL-E AI image generator.
    • OpenAI is also working on new watermarking methods, including a tamper-resistant watermark that can tag content with invisible signals.
    • OpenAI is a member of C2PA along with companies like Microsoft and Adobe.
    • OpenAI has joined C2PA’s steering committee.
  • Accuracy
    No Contradictions at Time Of Publication
  • Deception (95%)
    The article contains editorializing and selective reporting. The author states that OpenAI's image detection classifier works with around 98% accuracy for detecting DALL-E generated images but only flags 5-10% of images from other generators. However, the author does not mention the percentage of false positives or false negatives for DALL-E generated images. This selectively reports information that supports OpenAI's position while omitting important context that could undermine it.
    • OpenAI claims the classifier works even if the image is cropped or compressed or the saturation is changed.
    • While the tool can detect if images were made with DALL-E 3 with around 98 percent accuracy, its performance at figuring out if the content was from other AI models is not as good, flagging only 5 to 10 percent of pictures from other image generators.
  • Fallacies (85%)
    The article contains an example of a dichotomous depiction (OpenAI's new image detection classifier works with high accuracy but has lower accuracy for other image generators) and an appeal to authority (OpenAI claims the classifier works even if the image is cropped or compressed).
    • The classifier predicts the likelihood that a picture was created by DALL-E 3. OpenAI claims the classifier works even if the image is cropped or compressed or the saturation is changed.
    • OpenAI, along with companies like Microsoft and Adobe, is a member of C2PA.
  • Bias (100%)
    None Found At Time Of Publication
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (100%)
    None Found At Time Of Publication

82%

  • Unique Points
    • OpenAI is facing lawsuits from artists, writers, and publishers over the use of their work to train AI systems.
    • OpenAI did not immediately respond to a request for comment on the matter.
    • Spawning's Do Not Train registry already has preferences for over 1.5 billion works, but Spawning is not working with OpenAI on the Media Manager project.
    • Jordan Meyer of Spawning worries about the proliferation of disparate opt-out tools and advocates for a universal, open system.
    • A growing movement advocates for a regime where AI companies only train algorithms on data with explicit permission from creatives and rights holders.
  • Accuracy
    No Contradictions at Time Of Publication
  • Deception (30%)
    The article contains selective reporting as OpenAI's Media Manager tool is described in a positive light without mentioning potential limitations or criticisms. The author also uses emotional manipulation by implying that OpenAI is fighting lawsuits and needs to appease creatives and rights holders.
    • OpenAI is fighting lawsuits from artists, writers, and publishers who allege it inappropriately used their work to train the algorithms behind ChatGPT and other AI systems.
    • The company says it will launch a tool in 2025 called Media Manager that allows content creators to opt out their work from the company’s AI development.
    • OpenAI did not immediately return a request for comment.
  • Fallacies (85%)
    The author makes several appeals to authority in the article. She quotes Ed Newton-Rex, CEO of Fairly Trained, and Jordan Meyer, CEO of Spawning, without explicitly stating their positions or qualifications. This can be misleading for readers who may not be familiar with these individuals or their organizations. Additionally, the author uses inflammatory rhetoric when describing OpenAI's actions as 'inappropriately used' and 'fighting lawsuits.' This language is not objective and can influence readers' perceptions of the situation.
    • The company says it will launch a tool in 2025 called Media Manager that allows content creators to opt out their work from the company’s AI development.
    • Ed Newton-Rex, CEO of the startup Fairly Trained, which certifies AI companies that use ethically sourced training data, says OpenAI’s apparent shift on training data is welcome but that the implementation will be critical.
    • Jordan Meyer, CEO of Spawning, says the company is not working with OpenAI on its Media Manager project but is open to doing so.
  • Bias (95%)
    The author expresses a clear bias towards the perspective of artists and content creators who are concerned about their work being used without permission in AI development. The author quotes Ed Newton-Rex multiple times, presenting his opinions as facts or expert analysis. The author also mentions the frustration of artists when Meta did not process their opt-out requests, implying that these requests should have been granted.
    • Many artists interpreted this as a way to ask for their work to be opted out of Meta’s AI projects and were frustrated when the company declined to process their requests.
    • Opting out should be simple and universal, which we believe requires an open system built by a third party.
    • The first major question on his mind: Is this simply an opt-out tool that leaves OpenAI continuing to use data without permission unless a content owner requests its exclusion?
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (100%)
    None Found At Time Of Publication

97%

  • Unique Points
    • OpenAI has developed an AI image detection classifier that can identify about 98% of images generated by its own image generator, DALL-E 3, as being AI-generated.
    • OpenAI has added tamper-resistant metadata to all images created and edited by DALL-E 3 that can be used to prove their origin.
    • OpenAI is launching a $2 million fund with Microsoft to support AI education and understanding.
  • Accuracy
    No Contradictions at Time Of Publication
  • Deception (100%)
    None Found At Time Of Publication
  • Fallacies (85%)
    The article contains a few informal fallacies and an example of inflammatory rhetoric. It uses the slippery slope fallacy by implying that since efforts to combat deepfakes have been imperfect, social media platforms may not be ready for the 2024 elections. Additionally, it employs inflammatory language when referring to tech companies 'rushing' to roll out tools and stating that experts fear they 'may not be ready' for the AI chaos during major global elections. No formal fallacies or dichotomous depictions were found.
    • The article implies that since efforts to combat deepfakes have been imperfect, social media platforms may not be ready for the 2024 elections - slippery slope fallacy.
    • Experts fear that social media platforms may not be ready to handle the ensuing AI chaos during major global elections in 2024 - inflammatory rhetoric.
    • Tech companies have rushed to roll out tools to help everyone better detect AI content - inflammatory rhetoric.
  • Bias (100%)
    None Found At Time Of Publication
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (100%)
    None Found At Time Of Publication

99%

  • Unique Points
    • OpenAI is releasing a deepfake detector to help identify content created by its image generator DALL-E
    • OpenAI is sharing the tool with a small group of disinformation researchers for testing and improvement
    • OpenAI is also working on other ways to fight deepfakes, such as developing watermarks for AI-generated sounds and joining the Coalition for Content Provenance and Authenticity (C2PA)
  • Accuracy
    No Contradictions at Time Of Publication
  • Deception (100%)
    None Found At Time Of Publication
  • Fallacies (100%)
    None Found At Time Of Publication
  • Bias (100%)
    None Found At Time Of Publication
  • Site Conflicts Of Interest (100%)
    None Found At Time Of Publication
  • Author Conflicts Of Interest (100%)
    None Found At Time Of Publication