Abundance of digital data brings need for vigilance against its disappearance | St. Louis Public Radio

Aug 20, 2018

The digital age has ushered in many advancements and fresh possibilities – and also new concerns. One of those has to do with the need to protect vital scientific and public data resources from disappearing or even being intentionally suppressed.

While many libraries in the U.S. have long served as repositories in an effort to back up and preserve government information, that work has new urgency under a presidential administration that has expunged certain information related to topics such as climate change.

“These things [removing data] have gone on for a long time,” Washington University’s Aaron Addison said on Monday’s St. Louis on the Air, offering the missing Cook County, Illinois, data from the 1960 U.S. Census as one example. “[But] here we have a case where it’s not happening in a vacuum – it’s in concert with all these other decisions that the administration is making. And so it adds, certainly, to the concern.”

Addison, who is the director of data services for WU Libraries, joined host Don Marsh to talk about the importance of preserving data and other information resources, particularly in the public sector, that may be in danger of becoming inaccessible.

Addison and his colleagues are part of a broad “data rescue” movement that got going across the country last year. Those efforts stem from worries about potential gaps in the historical record, Addison explained.

“Any time you have a single source or a single responsible party for the storage or the accessibility of the data set, it becomes a point of concern, whether that’s happening at the federal level or the state level or even the local level,” he said. “We see examples of that across the board … if there’s a change in leadership at one of those levels, they either may not prioritize that and allocate resources to maintaining it, or they may have a different direction they want to go which includes pulling all of that data back.

“And so it’s definitely a point of concern, because there are a lot of research projects – there are a lot of data-informed decisions – that take place that rely on the availability and the accessibility of that data.”

One current conversation at the national level, Addison said, involves the U.S. Census Bureau’s preparations for the next decennial census in 2020. An open comment period is under way about which data sets may or may not be available.

Meanwhile, broader questions remain when it comes to social-media platforms like Twitter – and how they impact the historical record.

“Who owns the tweets, how long do the tweets last and so forth … there are a lot of decisions that go back and look at the patterns of those tweets,” Addison said. “For example, a natural disaster – you may have people that are tweeting from a hurricane site in the U.S. or an earthquake in Mexico. How’s that data going to be used? How long should it live?”

When Marsh inquired about policies in place to help avoid the loss or destruction of such information resources, Addison indicated that specific guidelines and laws haven’t kept up with the fast-moving evolution of data.

“For data in particular, that’s an area that really hasn’t been covered well in legislation, to say that we are going to keep these things,” he said. “There are broader laws that say – locally, Sunshine Act laws – where we might be able to demand that a piece of data be made available if it was collected with taxpayer dollars … at the federal level there is a website called data.gov. What gets posted there – the timeliness of what gets posted there, the completeness of what gets posted there – I think, to be fair, is very uneven.”

The conversation also touched on some best practices for everyday data preservation by individuals, including a strategy that Addison referred to as “three-two-one.”

“[It] suggests that you want to have at least three copies of your data on two different types of media, and one of those should be off site,” he explained. “So you’re guarding against disk failure, computer failure, but you’re also guarding against physical destruction of that disk through fire, flooding and those types of things.”
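
For readers who want to check their own setup against that rule, the three conditions can be written as a short test. What follows is only a sketch in Python, not anything Addison presented: the Copy record, the meets_three_two_one function and the sample photo locations are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a data set (hypothetical record for illustration)."""
    location: str   # where the copy lives, e.g. "laptop"
    media: str      # type of storage media, e.g. "external hard drive"
    offsite: bool   # True if the copy is kept away from the primary site


def meets_three_two_one(copies):
    """Return True if the copies satisfy the three-two-one rule as described:
    at least three copies, at least two media types, at least one off site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Assumed example: a family photo collection stored in three places.
photos = [
    Copy("laptop", "internal SSD", offsite=False),
    Copy("desk drawer", "external hard drive", offsite=False),
    Copy("cloud storage", "cloud", offsite=True),
]

print(meets_three_two_one(photos))      # all three conditions hold
print(meets_three_two_one(photos[:2]))  # only two copies, none off site
```

Each condition is checked separately, so a collection with three copies on the same kind of drive in the same house would still fail the test.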

Unfortunately, most people aren’t following such best practices, he added.

“We may have these cryptically named folders that have all of these digital photos in them, but nobody knows what they are, and we never curated them, we never went through and took out the ones that weren’t important to us,” Addison said. “Which is a different thing in this day and age than even it was when I was growing up 20 or 30 years ago. I could pull out a photo, and my grandmother or my mother had written on the back who was in the photo. We don’t even do that anymore, and so there’s this gap that’s forming in our family record, and that’s unfortunate.”

St. Louis on the Air brings you the stories of St. Louis and the people who live, work and create in our region. St. Louis on the Air host Don Marsh and producers Mary Edwards, Alex Heuer, Evie Hemphill and Caitlin Lally give you the information you need to make informed decisions and stay in touch with our diverse and vibrant St. Louis region.