Archivists Want AI to Help Save, Analyze Everything Trump Says


(Credit: Joseph Sohm/Shutterstock)

A week hasn’t even passed since the inauguration, but television news is saturated with the flurry of activity from President Donald Trump’s administration. Trump, via Twitter, promised to launch an investigation into illegal voting and threatened to “send in the Feds” if Chicago police can’t fix the “carnage.” And that was just between Tuesday and Wednesday.

This heightened scrutiny compelled the Internet Archive, a repository of everything posted on the web, to launch its Trump Archive in early January. You may have already time-traveled digitally with the Internet Archive’s Wayback Machine, or checked out its free books, movies and software. The Trump Archive, which draws content from the Internet Archive’s TV News Archive, includes more than 520 hours of televised Trump speeches, interviews, debates and other broadcasts tracing back to 2009. It will continue to grow.

“There’s no accessible library of television news, so television ends up washing over us like a wave,” says Roger Macdonald, director of the Internet Archive’s TV News Archive.

The TV News Archive gives journalists, scholars and citizens a chance to breathe, reflect and process that television news whitecap after it crashes ashore. And in the case of the Trump Archive, it’s a tool to track Trump’s statements on public policy issues and to ensure footage doesn’t succumb to the ephemeral nature of the Internet.

Already, Anna Wiener used the archive to immerse herself in Trump TV for a piece in The New Yorker, and German Chancellor Angela Merkel, a physicist by training, is reportedly poring over archived Trump interviews to get a read on the new Commander in Chief.

So the Trump Archive is already serving its purpose, but for the archive’s curators, it’s only a framework for their larger vision. These archivists want artificial intelligence to play a deeper role in easing access to the statements of our elected officials in the archive and, in turn, to enhance accountability.

“Here’s a really clear public interest value for artificial intelligence,” says Macdonald. “We envision this as a multi-year project to model how machine intelligence could make media more accessible and interpretable, both by humans and machines.”

Going Deeper

Currently, closed captioning text is the data thread that ties together the TV News Archive’s 1.3 million shows, gathered since 2009. A search on the Trump Archive, therefore, is a search for keywords in captions. This hack makes broadcast news videos searchable.
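To make the idea of caption-based search concrete, here is a minimal sketch in Python. It is not the Internet Archive’s actual system: the caption segments, show IDs and function names are all hypothetical, and the approach shown (a simple inverted index from words to caption segments) is just one common way such a keyword search could work.

```python
import re
from collections import defaultdict

# Hypothetical caption segments: (show_id, start_seconds, caption_text).
# Real captions would come from broadcast closed-captioning data.
CAPTIONS = [
    ("show-001", 0, "The president promised an investigation into illegal voting."),
    ("show-001", 12, "Critics questioned the claim."),
    ("show-002", 5, "Chicago police responded to the president's remarks."),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(captions):
    """Map each word to the set of caption positions that contain it."""
    index = defaultdict(set)
    for position, (_show, _start, text) in enumerate(captions):
        for word in tokenize(text):
            index[word].add(position)
    return index


def search(index, captions, query):
    """Return (show_id, start_seconds) for segments containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return [captions[i][:2] for i in sorted(hits)]


INDEX = build_index(CAPTIONS)
print(search(INDEX, CAPTIONS, "illegal voting"))
```

Because each hit carries a show ID and a start time, a match in the text can point a viewer to the right moment in the video. The sketch also shows the weakness the archivists describe: a misspelled or missing caption word simply never matches.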

But closed captioning has its limits — try counting the errors in a live broadcast — and that’s where AI factors in. Beyond text, Macdonald and the archive team want to set loose facial recognition, voice identification, and other deep learning tools to put every second of video in context.

“We want to be able to extract novel metadata around our video collections: Who is talking, when, and what type of program is it?” says Dan Schultz, senior creative technologist at the TV News Archive. “Even conducting sentiment analysis is all within that scope of collecting novel metadata.” Sentiment analysis, quite simply, uses word choice and tone to assess whether a person’s language was, for example, negative or positive.
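As a rough illustration of the word-choice half of sentiment analysis, the Python sketch below scores a sentence against two tiny word lists. The lists, function names and thresholds are assumptions made up for this example; they do not reflect the archive’s tools, and real systems typically rely on much larger lexicons or trained models and may also weigh tone of voice.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicons for illustration only.
POSITIVE = {"great", "good", "tremendous", "win", "strong"}
NEGATIVE = {"bad", "carnage", "disaster", "weak", "fail"}


def sentiment_score(text):
    """Return a score in [-1.0, 1.0]: (positive - negative) / matched words."""
    words = re.findall(r"[a-z]+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    matched = positive + negative
    if matched == 0:
        return 0.0  # no lexicon words found, so treat as neutral
    return (positive - negative) / matched


def label(text):
    """Turn the numeric score into a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label("The economy is strong and great"))
```

Run over caption text grouped by date, a score like this could in principle be averaged per month to see whether the language on a topic has shifted, though a word list this small would miss sarcasm, negation and context.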

These algorithms will be key for journalists and curious citizens alike to interrogate the data with pointed questions (How has Trump’s language regarding the economy shifted in the past 6 months?) rather than more general inquiries, and get relevant answers in return. And, in a time when partisan battles over what’s “fake news” are being waged, AI will make it even easier to cut through the clutter.

Seeing and Believing

Artificial intelligence programs already excel at extracting information from text and images. Facebook’s facial recognition software can identify you and your friends, algorithms can automatically caption photos, and researchers routinely perform sentiment analysis using Twitter data. Video is a tougher nut to crack, but the nut is indeed cracking.

Twitter’s artificial intelligence team, known as Cortex, developed an algorithm that can recognize what’s happening in a live video feed; it can tell if you’re playing a guitar or petting a cat, according to the MIT Technology Review. Processing video, however, is far more computationally demanding than processing text or images, and that’s what makes the task difficult.

Comcast recently acquired a company called Watchwith, which built a system that automatically generates metadata for videos using computer vision and machine learning. Google uses speech recognition to automatically generate closed captioning for videos.

Netflix and Hulu have also invested in deep learning and computer vision methods to generate video metadata to improve personal recommendations. Other companies like Clarifai, Viisights and Movida’s Deeva API rely on AI to perform similar services.

In all of these efforts, the end goal is to make videos easier to find in a digital world. Still, there’s a ways to go. “I have become fairly (skeptical) about the effectiveness of AI methods having seen so few deliver on their promise, however, it is essential to keep an open mind,” Digital Asset Management News editor Ralph Windsor wrote. For Windsor, AI still has a lot to prove before professional archivists can rely upon the technology.

Expanding the Archive

For the TV News Archive team, Trump was first in line, and in the near future they plan to expand their archival efforts to majority and minority leaders in the House of Representatives and the Senate. And, yes, they will also be archiving the digital footprint from the Obama administration.

“It is worth noting that eight years ago we didn’t have the pipelines to technology to expose this sort of thing,” Schultz said when asked why they started with Trump. “It’s sort of a perfect storm of interest, and technical timing and it aligned with the general mission of the archives.”

In addition to saving video for posterity, the archive also serves as a vehicle for creative expression. For example, the TV News Archive team incorporated a tool called Popcorn, which allows anyone to piece together video compilations of the news in their browser without shelling out several hundred dollars for editing software.

“We’re very curious to see what will happen with it. We can’t even imagine how people will use our stuff,” says Nancy Watzman, managing editor of the Television News Archive.
