data backup - Public Libraries Online
https://publiclibrariesonline.org
A Publication of the Public Library Association

Information Pulls a Disappearing Act
https://publiclibrariesonline.org/2017/02/information-pulls-a-disappearing-act/
Tue, 14 Feb 2017 17:55:41 +0000

The post Information Pulls a Disappearing Act first appeared on Public Libraries Online.

Several observers have sounded the alarm that information is disappearing. We’ve known for a long time that some of our oldest materials were deteriorating and that we needed to microfilm (now digitize) the items for preservation. What’s happening now is that new information is disappearing from current databases and resources.

Some of this is due to contractual agreements between the copyright holder and aggregator database providers such as ProQuest and EBSCOhost. Individuals also lose access because print content that was once available on the Internet Public Library is now only available digitally through aggregators such as Project MUSE and JSTOR. Unless a library nearby subscribes to these databases, individuals have to subscribe themselves when, in most cases, they only want to read one article. This makes libraries indispensable to access, yet the contractual agreement may prevent a library from giving access to the person who wants the information because that person is ‘out of bounds’ of the region or the academic institution. I remember once paying $30 to gain access to a book my daughter needed for her master’s degree work. The interlibrary loan system used to work, but with current licensing, that is not always the case.

Websites being taken down is another disappearing act, though these are sometimes available through the Internet Archive’s Wayback Machine. The archive doesn’t capture everything, nor does it capture some websites with valuable information and data at any regular interval. I found one university website that was deleted and then came back at the same URL with totally different information. This sort of thing has happened with ISBNs as well; reusing them is a serious breach of the program, but it happens frequently enough that you should be wary of what you are trying to get. In one scenario, a student can’t get access to a certain music methods publication because the database the university subscribes to dropped the magazine due to its contract with the content owner. In another, the information on climate change and civil rights was taken down from the White House website shortly after Trump took office as President.
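For readers who want to check whether the Wayback Machine holds a copy of a vanished page, here is a minimal Python sketch built on the Internet Archive's public availability endpoint. The function names are my own, and the sketch only handles the basic reply shape (an "archived_snapshots" object with a "closest" entry); treat it as a starting point under those assumptions, not a complete client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public endpoint of the Internet Archive's Wayback availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the request URL; timestamp (e.g. "20170120") asks for the nearest capture."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the closest snapshot's URL and timestamp from the JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return {"url": closest["url"], "timestamp": closest["timestamp"]}


def lookup(url, timestamp=None):
    """Make the network call; kept separate so the parsing can be tried offline."""
    with urlopen(build_query(url, timestamp), timeout=10) as reply:
        return closest_snapshot(reply.read().decode("utf-8"))
```

Calling lookup() with a page address and a date near the time you last saw the page returns the nearest capture, if there is one. A result of None means the archive reported no capture for that address, which is exactly the gap described above.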

There have been efforts to save this disappearing data. DataRefuge is one group trying to preserve climate data. GitHub is also working on a method to save digital content from extinction. The Library of Congress, the American Library Association, and CLIR have all been involved with what is now known as “born digital” information and data and are actively attempting to stem its loss. Yale University is involved, as are many other institutions.

I’m not sure whether this loss of digital content will change what future populations know as history, but some of the information loss will surely change the data available to researchers and historians, and possibly the conclusions drawn from that research. We live in a strange universe where researchers are now trying to replicate standing research, specifically on health issues, to see if it was done correctly and reached the right conclusions. Without the older information, that replication, and the new information and understanding it leads to, would not be available to us.

It may be a smart idea for public libraries to update the knowledge found in older work the way law books and encyclopedia yearbooks receive updates. This gives citizens and consumers the information to update their current understanding. Articles on the net often carry an “updated {date}” notice, but I wonder how many people go back to review the old article (possibly bad or erroneous), or even the updated one, rather than continuing to tell others and spreading the erroneous information. And are libraries capturing this changing information?

A Needle in a Haystack: Writing Digitally about Proper Digital Preservation
https://publiclibrariesonline.org/2016/01/a-needle-in-a-haystack-writing-digitally-about-proper-digital-preservation/
Fri, 15 Jan 2016 00:01:30 +0000

With a little searching, maybe someone can find a needle online in the haystack of information. At least, if they have some idea of where it might have been in the first place…

The post A Needle in a Haystack: Writing Digitally about Proper Digital Preservation first appeared on Public Libraries Online.

A few weeks ago, while researching my article regarding whether digital content is being properly preserved, I came across an article about knowledge preservation by Claire McInerney, a professor in the Rutgers University Library Program, an online-based master’s program. When I referred back to the article this week, this is what I found:

[Two images, courtesy of Troy Lambert; not preserved in this copy]

With a little searching, I was able to find a link to the same study on the American Society for Information Science and Technology[1] website. Still, it raised the question: what happened to the other article? And how did I have to structure my search to find it?

A simple Google search of “knowledge management” would not work. The article didn’t rank high enough according to Google to appear on the front page, and most users (including me) don’t look past page two, so I needed more information for my search string. Since I looked at the article recently, I knew what university and program the professor was affiliated with, and I remembered her last name. So a search of “knowledge management Rutgers Library McInerney” brought me to the information I was looking for, but the result was still on page two. This example highlights one of the many problems with proper preservation of digital content.

On November 20, 2015, Meredith Broussard of The Atlantic stated similar concerns in an article titled The Irony of Writing Online about Digital Preservation: “The Internet archive will allow you to find a needle in a haystack, but only if you know approximately where the needle is.”[2]

Imagine me trying to find the same Rutgers article, only a year from now. I’m likely to have read hundreds of other articles by then, and probably won’t remember the university the professor was from, and certainly not her name; I’d just have a vague notion of an article about knowledge management I would like to reference, and maybe a loose timeframe of when I read it, which has no real-world relation to the date the article was published.

Not to mention the haystack is constantly growing.[3] Any number of articles like this one, raising similar concerns over the preservation of knowledge, will be created and archived somewhere, maybe. That’s where the irony comes in. We are writing in a digital medium about the difficulty of preserving digital data, and our thoughts themselves are challenging to preserve.

It’s a vicious circle, and an ongoing problem, one that libraries are ill-equipped to solve; however, these concerns have many sources and possible solutions.

Content Management

“The challenges of maintaining digital archives are as much social and institutional as technological,” said a National Science Foundation and Library of Congress study[4] from 2003. “Even the most ideal technological solutions will require management and support from institutions that in time go through changes in direction, purpose, management, and funding.”

Each website is hosted on some kind of platform designed to manage how the content looks to an end user, and many have unique themes. These vary from Drupal (used by Time magazine) to WordPress (where the content on my website is hosted), and dozens of others, some custom created for large media organizations. Media outlets that also create print materials have yet another Content Management System (CMS) for print content. All of this should be easy to preserve, right?

Not as easy as you think. Large archival organizations like LexisNexis or EBSCO scoop up digital feeds, bundle the information in a database, and license those packages to libraries, who can then search them by title, author, keyword, where and when they were posted, depending on what the feed is able to gather. But comparing EBSCO searches with searches in Google reveals a stark difference in the quantity of articles indexed, revealing one of many data gaps.

Gone are the days of print material being converted to microfiche, but there is a hazard: organizations that switch CMSs or have several, with decades of information to preserve (e.g., The New York Times), all of it in different formats, face huge and costly challenges.

User Expectations

User expectations have changed as well: researchers expect nearly instant results and unlimited access to information. But putting and keeping it all on the web just isn’t practical, and experiments searching for specific articles show just how challenging that is.

Haystacks

Such searches also raise the question of how necessary such preservation is. Unless a user is looking for a specific quote by a specific person, the proliferation of material on any subject means similar information will be found in any search. In the example above, if I hadn’t been trying to find a specific article, mostly to prove it could be done, I could have used other sources that turned up in the search, which discuss knowledge management and contain nearly the same information.

Social Media Interactions

Not yet included in library-based searches are Tweets, Facebook posts and comments, and other online interactions authors have with their audience. These are also a source for relevant quotes and information, but social feeds are difficult for libraries to capture, archive, and preserve, let alone make useful. The Library of Congress has made an effort with Twitter, but has no idea (yet) how it will make the huge amount of data it has collected available to the public.[5]

The primary reason is cost, a constant issue with both preservation and public access. It’s not just about the hundreds of terabytes of storage, a number that grows daily, but about having servers fast enough to handle even the simplest search. Searching one term in a small portion of the tweets gathered, say from 2006 to 2011, would take twenty-four hours using the library’s current technology.

There are also privacy issues, even though technically each Tweet published or Facebook update posted is already in a certain portion of the public domain, depending on the user’s privacy settings. However, this is a different method of acquisition than anything libraries have done previously, and a system has to be in place to remove deleted Tweets and posts in order to comply with the same user agreements that make them public information.

Data Gaps


Even news sites struggle with shrinking budgets, migrating CMSs and changes in IT staff. A Newspaper Research Journal article reveals major data gaps.[6] “Not one publication has a complete archive of their website,” the article states. “Most can go back no further than 2008.”

So when you look for this article in a few months, how easy will it be to find? Even if you save it to your Twitter or Facebook feed, will the link still work? For how long? Fast forward a year. Two. Will our concerns even be the same? If you can find the article, will it be relevant? How quickly will it be lost in the haystack of other articles about digital preservation?

I don’t know how this site is being archived, or when Public Libraries Online will switch content management systems. I can save this article on my computer, or even in the cloud, but while that protects my access, at least for now, it doesn’t preserve it anywhere else. It’s likely the article I write today on digital preservation will not be preserved beyond a couple of years, whether it is of scholarly interest or not.

But with a little searching, maybe someone will find this needle in the haystack of information. At least, if they have some idea of where it might have been in the first place…


Sources

[1] McInerney, Claire. “Knowledge Management – A Practice Still Defining Itself.” American Society for Information Science and Technology 28, no. 3 (February/March 2002).

[2] Broussard, Meredith. “The Irony of Writing Online About Digital Preservation.” The Atlantic, November 20, 2015. http://theatln.tc/1Qyguv2.

[3] Fridman, Alan. “3 Ways Big Data Has Changed the Digital Age.” Inc.com, July 19, 2015. http://bit.ly/1fgNYho.

[4] Hedstrom, Margaret. “It’s About Time: Research Challenges in Digital Archiving and Long-Term Preservation.” Report presented to the Workshop on Research Challenges in Digital Archiving and Long-Term Preservation, Washington, DC, April 12-13, 2002.

[5] LaFrance, Adrienne. “Library of Congress has archive of tweets, but no plan for its public display.” The Washington Post, January 13, 2013. http://wapo.st/1mTDBUJ.

[6] Hansen, Kathleen A., and Nora Paul. “Newspaper archives reveal major gaps in digital age.” Newspaper Research Journal 36, No. 3 (2015): 290–298. DOI: 10.1177/0739532915600745.

Protecting Your Library Against a Data Breach
https://publiclibrariesonline.org/2015/03/protecting-your-library-against-a-data-breach/
Fri, 20 Mar 2015 20:41:59 +0000

With news breaking every month or so about a company that has had a serious data breach, is your library prepared to protect your information and library network?

The post Protecting Your Library Against a Data Breach first appeared on Public Libraries Online.

Sony has been in the news the past few months after its recent hacking scandal. Additionally, hacks have occurred against Target, Home Depot, and other businesses over the past year, causing customers to worry if they had used a credit card to shop at one of these places. As libraries, we don’t keep people’s credit card information, but it is still important to be secure. We want this post to encourage people to talk with their coworkers and in-building IT people. Just having the conversation makes all libraries more secure.

Some library people believe they don’t have to be especially secure because they are libraries. The idea is security through obscurity. However, all that does is cause libraries to play a waiting game. It is not a question of IF there will be a problem, but when.

Libraries have a plethora of computers with good bandwidth and servers with lots of space. By the very nature of libraries wanting to provide open access, they are a target for potential hackers. Open access is both a tenet of who we are as libraries and extremely important. It is not our intent, at all, to say there should not be open access! However, we must provide this service with our eyes open, knowing it could come back to bite us later. This mode of thinking isn’t meant to scare you, but to cause you to stop and think.

In order to continue to provide the best access possible, we pose the following questions:

When was your last security audit? Have you checked to see that all your recent computer updates installed properly? Did they fix security holes or make the existing ones bigger? Getting someone to do a security audit is similar to getting someone to do a home inspection. There are plenty of people you can call, but you want someone who really knows what he or she is doing so it saves you time and money later on. To find a good security auditor, check with current and previous customers of your potential contractor. Are they pleased with the service they received? Did they feel it was worth the money?

Have you kept up-to-date with your updates? Sometimes something as innocuous as not updating a browser plug-in like Flash or Acrobat can be a problem. Are all your Windows updates done? Is your anti-virus up-to-date?

How good are your back-ups? This is one of those questions that can strike fear into your heart. The idea is that back-ups are there if you have a problem, but do you know if they would even help you? Have you ever tried to restore anything from one? This is just about checking that the files you are backing up are ones you can actually use. How often are you rotating your back-ups? How far back do your back-ups go? A day? Two days? Do you set one of your back-ups aside every so often, so that compromised data does not overwrite the copy you would need to restore all your files?

Have you checked your technological band-aids? Sometimes changes to systems are made in the heat of the moment to accommodate immediate needs. Have you gone back and made sure they were done in the best possible way? Someone placed those band-aids in the best possible way at the time, but that may not be the best long-term fix for the problem.

How are you managing all your updates? There are programs like Ninite (https://ninite.com) and Wpkg (http://wpkg.org/) that can help you manage updates for your non-Microsoft applications. Are you paying attention and checking regularly for updates to your Windows programs as well?

Are you ignoring security concerns because you have Apple devices? There is a belief that if you run devices from Apple, you will not be a target for hacking. That is not wholly true. It is true that there are not as many Apple computers to target as Windows computers, but that again is security through obscurity or quantity. Recently Apple has had some security issues, so staying current on your iOS updates and Apple application updates is important. There are programs like “Get Mac Apps” (http://www.getmacapps.com/) that manage updates much as Ninite and Wpkg do for Windows devices.

My IT person says you guys are wrong! We’re okay with that. Everyone will have local concerns and parameters that make different levels or types of security better or worse for them. Security can’t impede workflow or be so lax that it’s nonexistent. In the end, if you are staying up to date with your virus protection and different program updates, you should be fine. But sticking your head in the sand and pretending security isn’t an issue won’t protect you from anything either. As long as you and your local security person have talked and made a plan that works for your library, then our work has been done.
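If you want to try the restore test raised in the back-up question above, one way to do it is to restore a back-up into a scratch folder and compare it, file by file, with the originals. The Python sketch below is only an illustration under that assumption: the function names and the two-folder layout are hypothetical, and it compares SHA-256 checksums rather than opening the files the way a person would.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(original_dir, restored_dir):
    """Compare each file under original_dir with its counterpart under restored_dir.

    Returns a dict with two lists of relative paths: 'missing' (absent from
    the restore) and 'changed' (present, but the content differs).
    """
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    missing, changed = [], []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            missing.append(relative.as_posix())
        elif file_digest(source) != file_digest(restored):
            changed.append(relative.as_posix())
    return {"missing": missing, "changed": changed}
```

Two empty lists mean every original file came back intact. Keep in mind that a file edited since the back-up ran will show up as changed, so run the comparison against a folder you know has not been touched in the meantime.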

Melanie A. Lyttle is the Head of Public Services at Madison Public Library. You can watch her YouTube channel, Crabby Librarian, at http://www.youtube.com/watch?v=7Rv5GLWsUow

Shawn D. Walsh is the Emerging Services and Technologies Librarian at Madison Public Library.

Every Cloud Leaks a Little
https://publiclibrariesonline.org/2014/11/every-cloud-leaks-a-little/
Tue, 18 Nov 2014 21:04:31 +0000

The post Every Cloud Leaks a Little first appeared on Public Libraries Online.

A recent media scandal involved compromising celebrity photos allegedly hacked from the cloud via the celebrities’ cell phones and then distributed to the general public. Shortly after this story broke, my local weather included rain. The jokes flew: every cloud eventually leaks a little.

This comment was said in jest, but rang painfully true. Those who know me are well aware that I have concerns about cloud security. Disks fail. This is a truism for electronics. All electronics. The reassurances of redundancy that are provided make me equally uncomfortable. If the outside source guarantees that my data will never be eaten by the dreaded ghost in the machine, this means that they are keeping copies of my data. In fact, to be assured they can keep this promise, they are keeping multiple copies of my data in different locations. When I delete my data, how do I know that all copies of it are truly gone?

Furthermore, electronics get hacked.  The bigger the system, the more likely it will be targeted at some point.  With my data kept in multiple locations, it also means that there are multiple opportunities.

And let’s not forget the obvious: accidents happen. Here is a true story. I am an avid online gamer. I play text-based RPGs (if that has no meaning to you, don’t worry). A few years ago, my preferred site announced it would be down for a few hours as our gaming data was transferred to a new, faster, and more advanced server space. The 20,000 or so of us registered at the site are almost all geeks. We weren’t worried about the 8 million or so posts on the site. A data transfer is easy. Site management was excellent. We paid handsomely for a service provider, located in California. The service provider who was updating the hardware said all the right things and provided all the right guarantees.

Our confidence failed when the few hours turned to a few days.  An “accident” had occurred when our provider went to copy the data.  Some people’s data ended up in the wrong place.  Some people’s data merged with other people’s data.  Ultimately, we learned that our gaming information was in the possession of a business in Sweden.  Our game site manager, located in Australia, had credit card information for a business in Europe.  Fortunately for us, the European business had its data merged with the Swedish company that had ours!  A deal was made.  Our geeks sorted out the business’ information, returning it all to the right parties, and we got our game posts back.

On the one hand, this was a heartwarming tale.  Lots of strangers worked together across the globe and solved a problem.  Of course, for us, it was easy.  The few weeks we were down did not “cost” us our livelihood.  We had no serious personal or financial data stored; only personally chosen usernames and email accounts.  The others had far more serious breaches.  It was lucky too that the credit card data was accidentally delivered to non-criminally minded nerds, who actively sought its rightful owners.

Still, the world of cloud security did get a boost this summer. On June 25, the Supreme Court in U.S. v. Wurie and Riley v. California held that police generally require a warrant to search information on cell phones. The ruling was unanimous.

What the court understood, and most people do not, is that the information (photos, email, etc.) accessed via cell phones is not actually IN the cell phone. It’s in ‘the cloud’; in other words, it’s sitting on the cell phone service provider’s server (e.g., Verizon, AT&T, Sprint, Virgin Mobile). In fact, it’s probably sitting on several servers.

The Supreme Court’s ruling is an evolution of Fourth Amendment rights. As this has been applied to cell phones, it is likely that this will set the precedent for the ruling to be applied to all cloud stored information.  While this is bad news for law enforcement, it is great news for the public and for libraries.

With the Patriot Act, many libraries stopped keeping particular kinds of data for fear that the government could swoop in, grab the computer, and learn a myriad of information about their patrons. Don’t get me wrong, this concern is still real and the government can still do this. However, this new ruling can extend the Fourth Amendment protections of individuals, which in the past existed only in their residences, to public venues such as libraries.

It will be interesting to see if this gets tested. Though I, for one, hope mine is not the library that has the experience.

Personal Digital Archiving: Backing Up Your Collection
https://publiclibrariesonline.org/2013/07/personal-digital-archiving-backing-up-your-collection/
Tue, 02 Jul 2013 17:32:13 +0000

The post Personal Digital Archiving: Backing Up Your Collection first appeared on Public Libraries Online.

This is a bi-weekly post from staff at the Library of Congress about personal digital archiving. We recognize that public libraries have a unique function as centers of information for their communities and that their role in the spread of digital literacy is expanding. We hope that librarians and the communities they serve can benefit from our resources.

Once you have organized your files – gathered them all together, decided which ones to save, sorted the files into folders and given the important files easy-to-recognize names – it’s time to transfer a copy of your digital collection off your computer and onto another storage medium.

There is no “best” storage medium. CDs, DVDs, flash drives, solid-state drives, spinning-disk hard drives, networked cloud storage and backup tape all have benefits and drawbacks. CDs and DVDs are light and easy to store but they can be easily scratched and damaged. Flash drives do not have moving parts like spinning disks do but they are more affected by extreme temperatures. Spinning-disk drives – the kind that “whir” when you turn them on – have a large storage capacity but you could damage them easily just by dropping them.

Cloud services relieve you of the responsibility of tending to storage hardware but your collection becomes inaccessible if the network connection is disrupted. No online backup service is as reliable as a storage device that you can see and touch. Cloud storage should only be a secondary backup option.

The best strategy is to back up copies on at least two different types of media and store a copy in a different geographic location in case some disaster strikes your home or office. In fact, professional photographers, who have a financial stake in the accessibility and safekeeping of their digital photographs, have a “3-2-1” rule: make three copies, store two on different types of media and one in a different location.
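As an illustration, the “3-2-1” rule can be written down as a short checklist in Python. This is a sketch only: the BackupCopy record, its field names and the check_3_2_1 function are invented for the example, and it assumes you list every copy you hold, including the working copy on your computer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str     # hypothetical name for the copy, e.g. "laptop"
    medium: str    # type of storage, e.g. "hard drive", "DVD", "cloud"
    location: str  # where the copy is kept, e.g. "home", "office"


def check_3_2_1(copies, primary_location):
    """Return the parts of the 3-2-1 rule that are not met (empty list = all met)."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than three copies")
    if len({copy.medium for copy in copies}) < 2:
        problems.append("fewer than two types of media")
    if not any(copy.location != primary_location for copy in copies):
        problems.append("no copy in a different location")
    return problems
```

For example, a collection kept on a laptop, an external hard drive and a set of DVDs, all stored at home, passes the first two checks but is flagged for having no copy in a different location.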

This practice may seem extreme but institutions routinely practice backup redundancy. Unforeseen things happen and it is best to be prepared. Also, a general rule is to keep storage devices in the same environment that you would be comfortable in — not too hot, not too cold and not too humid. Keep them out of the attic or a damp basement.

One last thing to be aware of is that all storage devices eventually become obsolete (think of floppy disks and zip drives). Therefore, in order to keep your files accessible, you should move your collection to a new storage medium about every five to seven years. That’s about the average time for something new and different to come out. Keep migrating your collection forward to new media periodically. If you take responsibility for your digital collection and manage it wisely, your collection will always remain accessible.
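A simple reminder can help with the five-to-seven-year migration schedule described above. The Python sketch below flags storage media that have reached a chosen age; the function names and the five-year default are assumptions made for the example, and the dates are the days you started using each medium.

```python
from datetime import date


def age_in_years(started, today):
    """Whole years elapsed between the day a medium entered service and today."""
    before_anniversary = (today.month, today.day) < (started.month, started.day)
    return today.year - started.year - int(before_anniversary)


def due_for_migration(media, today, max_age_years=5):
    """Return the sorted labels of media that have reached max_age_years."""
    return sorted(
        label for label, started in media.items()
        if age_in_years(started, today) >= max_age_years
    )
```

Here media is a dictionary that maps a label you choose (for instance "family photos DVD") to the date that disc or drive went into service. Anything the function returns is a candidate for copying to newer media.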

You can find related information and resources at digitalpreservation.gov.[1]

[1] “Personal Archiving,” Digital Preservation, accessed June 23, 2013, http://digitalpreservation.gov/personalarchiving/.
