The false positives of AI watermarking

It's far easier to watermark all real photos as real.

Jul 25, 2023

The AI watermarking concept fails an analysis of the relative cost and frequency of Type I vs Type II errors.

Suppose the law required every AI-generated image to have a bright red border around it, as Stuart Russell just proposed before the Senate.

This might work if just one company controlled the technology, if the internet made it easy to trace the provenance of shared images, and if real photos vastly outnumbered AI-generated ones. Yet in reality, the technology is largely open source, tracing provenance is nearly impossible, and generated images will soon vastly outnumber real photos.

Given this context, watermarking laws could easily backfire by enhancing trust in deepfakes that lack said watermark. That is, the rate of false negatives is impractically high.

There is a Coasian element to this: for whom is the social cost associated with content labelling lowest? There are two options: either label all deepfakes as fake, or label all real images as real.

The latter option seems clearly lower cost and easier to enforce. Rather than label all deepfakes as fake, digital cameras will label all real photos as real, whether through imperceptible watermarks or some kind of enhanced EXIF metadata. In turn, when presented with a salacious image, we will disregard it by default absent proof of its veracity, rather than believe everything we see by default absent proof of its fakeness.
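To make the base-rate logic concrete, here is a rough sketch in Python. Every number in it is a made-up assumption for illustration (the share of fakes, the compliance rates), as are the function names, and it assumes a "real" mark cannot be forged. It is a toy model of the argument, not a measurement of anything.

```python
def p_fake_given_unlabelled(fake_share, fake_label_rate):
    """Label-the-fakes regime: chance that an image with no 'AI' mark is fake."""
    unlabelled_fakes = fake_share * (1 - fake_label_rate)
    unlabelled_reals = 1 - fake_share  # real photos never carry the 'AI' mark
    return unlabelled_fakes / (unlabelled_fakes + unlabelled_reals)


def p_real_given_unsigned(fake_share, real_sign_rate):
    """Label-the-reals regime: chance that an image with no 'real' mark is real."""
    unsigned_reals = (1 - fake_share) * (1 - real_sign_rate)
    unsigned_fakes = fake_share  # assumption: fakes cannot forge the 'real' mark
    return unsigned_reals / (unsigned_reals + unsigned_fakes)


# Hypothetical inputs: 90% of circulating images are generated,
# and only half of each group complies with its labelling rule.
FAKE_SHARE = 0.9
COMPLIANCE = 0.5

print(f"{p_fake_given_unlabelled(FAKE_SHARE, COMPLIANCE):.2f}")  # 0.82
print(f"{p_real_given_unsigned(FAKE_SHARE, COMPLIANCE):.2f}")    # 0.05
```

Under these assumed numbers, trusting an image because it lacks an "AI" mark would be wrong about 82% of the time, while distrusting an image because it lacks a "real" mark would be wrong only about 5% of the time. The exact figures depend entirely on the inputs; the point is only that the default of disbelief holds up better as fakes come to dominate.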
