In my previous article, I discussed recent and popular types of AI-generated content in terms of media creation, e.g. text, images, video, audio. I also noted that many AI algorithms are, by design, built to produce output that is nearly indistinguishable from human-created works. We face a rapidly approaching, collective challenge: detecting the original source of any content we encounter. “Was this content generated by a person, or generated / modified by AI?” is now rather difficult to answer, and the long-term solution may not lie in analysis tools that detect residual artifacts.
(Consider the remainder of this article as a discussion of some conceptual ideas over a virtual beer. While not necessarily a UX discussion, I still believe it’s helpful for everyone in the room to think through it. This is a future we all will have to own; there’s no going back.)
Cryptography enters the chat
Most modern file formats for images, video, audio, and documents come with embedded metadata about the content itself. It can indicate the author, date/time it was created, GPS location (e.g. for photos or video), software used to edit it, rough categorizations, and more. In many cases, this metadata is easily modifiable long after the file was originally created. So while the metadata is useful, it cannot definitively authenticate the original source of the content itself.
Today we are all quite accustomed to the concepts of user identification and user authentication. You have had to prove that you are who you say you are, both in person and online, to some organization or entity. In the online setting, strong cryptography underlies it all, and I believe it can be a significant part of the answer to our content authenticity dilemma. Digital signatures are a promising piece of the solution, as they grant a high degree of confidence that the signed content is authentic (created by a known sender) and has maintained its integrity (has not changed since the digital signature was first generated). This type of cryptography is the foundation of our modern day-to-day digital lives.
Content source authenticity
Digitally signed files are typically signed on behalf of a person. Let’s consider an alternative where the content is digitally signed by something other than a person. Imagine we are using a modern smartphone’s digital camera to take a photo. Cameras normally embed modifiable metadata in the resulting image file, as we discussed. However, what if the resulting image file was digitally signed by the phone on creation, so that the date/time, GPS location, and the phone’s IMEI (serial #) could not be altered without invalidating the signature? The photo itself and this metadata can then be validated via the signature, and any later change to either one would cause that validation to fail.
Note that we could also embed the phone user’s identity, if recently authenticated, into the media file too. However, consider that we live in a time when governments routinely persecute protesters, and undeniably identifying the person who captured certain content may have dire consequences. So the problem I am trying to solve is authenticating what, when, and how the original content was captured…leaving out the who for now.
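To make the smartphone example a little more concrete, here is a minimal sketch of binding a photo and its capture metadata under a single tag, so that changing either one breaks verification. Everything in it is an assumption for illustration: the names `DEVICE_KEY`, `sign_capture`, and `verify_capture` are hypothetical, the metadata values are made up, and an HMAC from Python’s standard library stands in for a real digital signature. An actual device would more likely use an asymmetric key pair held in secure hardware, so that anyone could verify the signature without being able to forge one. Note that the sketch covers the what, when, and how, and leaves out the who.

```python
import hashlib
import hmac
import json

# Hypothetical device secret, for illustration only. A real phone would keep
# an asymmetric private key in secure hardware instead of a shared secret.
DEVICE_KEY = b"demo-device-key-not-a-real-secret"


def sign_capture(image_bytes: bytes, metadata: dict, key: bytes = DEVICE_KEY) -> str:
    """Return a hex tag that covers both the image bytes and the metadata."""
    # Canonical JSON so identical metadata always serializes to identical bytes.
    meta_bytes = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # The image digest has a fixed length (32 bytes), so concatenating it with
    # the metadata bytes is unambiguous.
    image_digest = hashlib.sha256(image_bytes).digest()
    return hmac.new(key, image_digest + meta_bytes, hashlib.sha256).hexdigest()


def verify_capture(image_bytes: bytes, metadata: dict, tag: str, key: bytes = DEVICE_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_capture(image_bytes, metadata, key)
    return hmac.compare_digest(expected, tag)


# Made-up capture data for the demonstration.
photo = b"placeholder bytes standing in for an image file"
meta = {
    "captured_at": "2024-05-01T12:00:00Z",
    "gps": [40.7128, -74.006],
    "imei": "000000000000000",
}
tag = sign_capture(photo, meta)

print(verify_capture(photo, meta, tag))  # True: nothing was changed

backdated = dict(meta, captured_at="2023-01-01T00:00:00Z")
print(verify_capture(photo, backdated, tag))  # False: metadata was edited

print(verify_capture(photo + b"!", meta, tag))  # False: image bytes were edited
```

The point of the design is that the image and its metadata are covered together. Signing only the pixels would leave the date/time and location as editable as they are today.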
On to the problems…
While we have addressed the source of the content, we cannot guarantee that AI played no part in manipulating the content before the digital signature was applied. In our previous smartphone camera example, also recording the digital signature of the software application used to capture the image can provide some guidance about what manipulation capabilities may have been employed when the media was captured. But it’s not foolproof.
A second issue is with the accuracy of the metadata itself. What if the date/time, GPS location, phone’s IMEI, etc. were compromised before the content was captured? For instance, the date/time can be manipulated manually on any phone to whatever the user wants prior to capturing the image. The file’s metadata will capture whatever date/time the phone provided.
For both problems, we would likely need hardware and OS capabilities to assist in the capture of authentic source metadata. I am not entirely sure how hardware would accomplish this, but my spidey-senses suggest that physically unclonable functions (PUFs) and open standards adopted by hardware manufacturers may come into play here.
We have discussed why content authenticity is important from a viewer’s perspective. What about the content author’s point of view? Some authors, like photojournalists, have a vested interest in ensuring all viewers know their original content is unadulterated. How can their content stand on its own once made public, with its authenticity unquestionable? Even if their content is copied, modified, and digitally signed again (i.e. dissemination of inauthentic copies), how do we know which one is the original?
One possibility is to use a decentralized blockchain specifically to capture content authenticity (e.g. as digital signatures) as the content is created. One primary advantage of a distributed blockchain is the impracticality of making changes to records that are already part of the chain. Per our smartphone example, when the content is captured and signed, if the phone can push the signature to a blockchain, the date/time of the block is immutably logged for that piece of content on the chain. Note that this date/time doesn’t correspond to when the photo was taken, but instead to when the block was added to the chain. If the phone has a wireless signal, the gap between the two could be short. Future viewers can search for the photo’s digital signature on the blockchain to verify the block’s creation date. This gives viewers a second level of trust beyond the date/time digitally signed in the image itself: the blockchain asserts a backstop confirming, “the image was first created sometime BEFORE this block’s date/time…but definitely not after.” Any future variations of the image, if registered on that same blockchain, would always be listed afterwards.
If you’ve read this far and remain even mildly interested, you can see there are more questions than answers. My goal was only to sketch directions for possible solutions to content authentication, and I hope we can agree that it will become more important to us all in the near future.