Disclosure: First Step in Data Transparency

Many of the problems caused by AI today are, at their heart, issues of transparency.

The developers of the most popular chatbots, like ChatGPT, refuse to divulge any information about the datasets on which their AI models were trained. That denies corporate deployers and individual consumers the ability to judge the quality of the AI system. One of the oldest sayings in tech is “Garbage in, garbage out.” An AI model is only as good as the data on which it was trained. If developers aren’t required to disclose any information about training data, they have little incentive to use high-quality, legally obtained datasets.

This has led to widespread accusations of copyright violation—and dozens of federal lawsuits—as well as poor-quality chatbots offering inaccurate and wildly false responses.

Requiring AI developers to publish Data Declarations (a kind of digital ingredients list) would let others verify that training data was legally obtained and used, offer customers a metric to assess the quality of the AI model, and incentivize the ethical use of high-quality data. This would result in better AI systems and greater consumer trust in AI overall.
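
To make the "ingredients list" idea concrete, here is a minimal sketch of what a machine-readable Data Declaration might contain. It is illustrative only: the field names (name, source, license, legally_obtained) and the helper format_declaration are assumptions made for this example, not taken from any statute or published standard.

```python
from dataclasses import dataclass


@dataclass
class DatasetEntry:
    """One 'ingredient' in a hypothetical Data Declaration."""
    name: str               # human-readable dataset name
    source: str             # where the data came from
    license: str            # terms under which it was used
    legally_obtained: bool  # developer's attestation (assumed field)


def format_declaration(model_name: str, entries: list[DatasetEntry]) -> str:
    """Render the declaration as a plain-text ingredients list."""
    lines = [f"Data Declaration for {model_name}"]
    for entry in entries:
        status = "yes" if entry.legally_obtained else "no"
        lines.append(
            f"- {entry.name} | source: {entry.source} | "
            f"license: {entry.license} | legally obtained: {status}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values are invented for illustration.
    demo = [DatasetEntry("Example News Archive", "licensed from publisher",
                         "commercial license", True)]
    print(format_declaration("ExampleModel", demo))
```

A real declaration would likely carry more detail, but even a short list like this gives a deployer something specific to evaluate.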

Similarly, infusing the output of AI systems with a measure of transparency will prevent some of the worst current and future problems associated with this technology.  

In 2024, it’s still possible to spot the flaws in AI-generated images and video—to tell fake from fact. Within a year or two, however, GenAI technology will reach a point where it’s nearly impossible to tell the difference with the naked eye.

That’s why it’s imperative to adopt safeguards now.

Model Legislation for AI Disclosure

California’s AI Transparency Act, adopted in Sept. 2024, provides a model for this kind of disclosure. The Act requires AI developers to embed detection tools in the media their AI creates, and to post an AI decoder tool on their website. The decoder tool would allow consumers to upload digital media and discover whether the developer’s AI was used to create that content.
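
The embed-then-decode flow described above can be sketched in a few lines. This is a toy illustration, not the method the Act prescribes or any developer's actual tool: it assumes the developer appends a keyed tag to the media bytes, whereas production systems typically rely on watermarking or provenance metadata that survives editing. The names DEVELOPER_KEY, MARKER, embed_disclosure, and decode are hypothetical.

```python
import hashlib
import hmac

# Hypothetical values for illustration only.
DEVELOPER_KEY = b"example-secret-key"
MARKER = b"\nAI-DISCLOSURE:"


def _tag_for(media: bytes) -> bytes:
    """Compute a keyed tag that only the developer can produce."""
    return hmac.new(DEVELOPER_KEY, media, hashlib.sha256).hexdigest().encode()


def embed_disclosure(media: bytes) -> bytes:
    """Developer side: append the marker and tag to generated media."""
    return media + MARKER + _tag_for(media)


def decode(uploaded: bytes) -> bool:
    """Decoder side: True if the upload carries this developer's valid tag."""
    media, sep, tag = uploaded.rpartition(MARKER)
    if not sep:
        return False  # no disclosure marker present
    return hmac.compare_digest(tag, _tag_for(media))
```

In this sketch, decode(embed_disclosure(data)) returns True, while media with no marker, or media altered after tagging, returns False. The simplification matters: a tag appended this way is trivially stripped, which is one reason a sturdier embedding would be needed in practice.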

This technology already exists, and a few ethical AI developers are showing how it can be used to empower consumers. It’s up to policymakers at the state and national level to require it of all developers—for the common good.

Next: 

  • Data Privacy: What it Is, Why it’s Important
