OpenAI has announced that it is joining the steering committee of the Coalition for Content Provenance and Authenticity (C2PA), a move aimed at increasing transparency around AI-generated content. The company will integrate the open standard’s metadata into its generative AI models, so that the content they produce carries a verifiable record of its origins.


The C2PA standard enables digital content to be labeled with metadata that indicates whether it was created entirely by AI, edited using AI tools, or captured traditionally. This move comes amid growing concerns about the potential for AI-generated content to mislead voters ahead of major elections in the US, UK, and other countries this year.

By integrating C2PA metadata into its models, OpenAI aims to provide a way to authenticate AI-created media and to combat deepfakes and other manipulated content used in disinformation campaigns. The company has already started adding C2PA metadata to images generated by DALL-E 3, its latest image model, in ChatGPT and the OpenAI API, and plans to do the same for its upcoming video generation model, Sora, when it launches more broadly.

While technical measures like C2PA integration can help, OpenAI acknowledges that enabling content authenticity in practice requires collective action from platforms, creators, and content handlers to retain metadata for end consumers.

In addition to C2PA integration, OpenAI is developing new provenance methods, including tamper-resistant watermarking for audio and detection classifiers that identify AI-generated images. The company has opened applications, through its Researcher Access Program, for access to its DALL-E 3 image detection classifier, which predicts the likelihood that an image originated from one of OpenAI’s models.

Internal testing shows high accuracy in distinguishing non-AI images from DALL-E 3 visuals, with around 98% of DALL-E images correctly identified and less than 0.5% of non-AI images incorrectly flagged. However, the classifier is less reliable at distinguishing DALL-E images from those produced by other generative AI models.
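To put those two rates in context, the short Python sketch below applies Bayes' rule to estimate what share of flagged images would actually be DALL-E 3 output. It is an illustration only, not OpenAI's evaluation method: the function name `flag_precision` and the base rates are hypothetical, the 0.5% figure is treated as an upper bound, and the pool is assumed to contain only DALL-E 3 and non-AI images (no output from other generators).

```python
def flag_precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Share of flagged images that really are DALL-E 3 output (Bayes' rule)."""
    true_flags = tpr * ai_share            # DALL-E 3 images correctly flagged
    false_flags = fpr * (1.0 - ai_share)   # non-AI images wrongly flagged
    return true_flags / (true_flags + false_flags)


TPR = 0.98    # reported: around 98% of DALL-E images correctly identified
FPR = 0.005   # reported: less than 0.5% of non-AI images flagged (upper bound)

# Hypothetical shares of DALL-E 3 images in the pool being checked.
for ai_share in (0.5, 0.1, 0.01):
    precision = flag_precision(TPR, FPR, ai_share)
    print(f"DALL-E share {ai_share:.0%}: precision {precision:.1%}")
```

Under these assumptions, the rarer DALL-E 3 images are in the pool, the larger the share of flags that are false alarms: at a 1% share, roughly one flag in three would point at a non-AI image.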

OpenAI has also incorporated watermarking into its Voice Engine custom voice model, currently in limited preview.

The company believes increased adoption of provenance standards will lead to metadata accompanying content through its full lifecycle, filling “a crucial gap in digital content authenticity practices.”