Understanding the Library of the Future: Converting to Digital Archives

In this age, many companies are suffering from a glut of information. Warehouses are backlogged. Post boxes are overflowing. And SPAM isn’t just for dinner anymore. On one hand, a company must seek to satisfy its customers’ demands for information regarding its products and/or services. On the other hand, a company must also figure out how to best allocate, share, and/or constrain the information it possesses. What’s a company to do when suffering from information overload?

By far, the most prevalent solution in today’s market is to transfer the access, storage, and maintenance of information to a digital medium. If you wish to store 100,000,000 documents in a physical space, you’ll need a lot of warehouses. But keeping the same number of documents in an optimized virtual archive, such as a Storage Area Network (SAN), can require less physical space: one data center can handle all your documents and others besides. If a company can consolidate its paperwork, archives, and images, and store all of it in a digital space a fraction of the size the physical items would require, then why isn’t everyone on the digital archive bandwagon?

Having been asked this question by several people, I thought to provide some semblance of an answer. This document is an overview of certain issues that need to be addressed, and logistical decisions that must be made, when transitioning to a digital archiving system or paperless office. To move ahead, one must gain an understanding of “The Library of the Future!”

Your Neighborhood Library, now Open 24/7

Libraries are no longer just dusty old buildings in the town square. Today’s libraries are hubs of dynamic information access in a fast-paced digital world. When information moves at the speed of light, how fast will you have to go to keep up with it? The fact that practically every public library in the country, and many of those abroad, now offers some form of internet access to its patrons is a clear indicator that the librarians get it. Librarians understand that knowledge really is power. And those who control the flow and exchange of information greatly influence the structure of the world as we know it.

Learned people are always in pursuit of enlightenment. This was true in the age of Leonardo, and it’s still true today. With the advent of technology such as the internet and automated digitized archiving systems, enlightenment instead pursues you, sometimes whether you want it to or not. If you aren’t prepared to catch up and catch the train, rest assured it will keep chugging along without you and you will be left behind. “The rest of us get it,” they all say, “why don’t you?” But what is it you’re supposed to be getting, exactly?

The Masters of the Universe

The internet is like a piece of cake we all want to eat. Except that this slice of cake can be instantly customized to the exact flavor that each individual drooling over it most desires. Imagine if you could have access to any information you wanted, right when you needed it most. This is the promise of “The Library of the Future.”

But be aware, satisfying the thirst for knowledge and information is certainly a full-on concern. The leadership roles that define a society usually are. Therefore, the guardianship of the Library of the Future should not be undertaken lightly. This is serious business. Like it or not, you are in charge of a wealth of information, a repository of cultural ideas and development. It may only be your individual company’s culture, or it might be information relevant to the entire global community. Regardless, your information is a valuable commodity that can be bought and sold. This is the crux of “The Library of the Future.”

Now you must be wondering, “Am I supposed to know all this stuff? How do I determine what is relevant? How do I know if I’m going about it in the right way? Is there a wrong way?” Yes. And no. Methods, of course, must be customized to each individual situation. And expert knowledge of the specifics should be employed. But there are certain issues which everyone must address, sooner or later. Let’s examine a few of the major concerns that one should think about…

IMPORTANT: Before you do anything else, make sure you have access to the advice of a professional(s) in the fast-growing field of digital archiving and the mass digitization of documents. Whether this will mean hiring a consulting firm or simply adding a new employee to your company, having such expertise at hand will quickly prove its worth.

Once the determination has been made to replace paper with pixels, use the following outline to help steer you in the right direction.

Approaching a Digital Archive: Key Concerns

I. How to Convert Your Documents

A. Image Files vs. Delineated Text

1. Documents can be scanned as a single image file, as word-by-word copy, or as a composite of the two. The digital format selected is generally dictated by the intended end use of the specific information.
a) For instance, let’s say you have a rare 12th-century manuscript in a Norse dialect containing beautiful miniature hand-painted images. The handiwork is not only historically important, but also of visual significance. Scholars studying 12th-century literature of the Nordic countries would not only want to know what the text says (in modern translation), but would also be interested in seeing what the manuscript looks like. If you decided to use only optical character recognition (OCR) to scan the text, and then threw the manuscript away, you would have just tossed a wealth of incalculable information and a very valuable portion of your archive into the rubbish bin. (Not to mention the fact that you would also have single-handedly managed to completely upset the very scholars whom you desire to become your customers and patronize your digital archive.)

2. Text scanned via OCR can be converted into hypertext using a markup language, such as SGML. (SGML is the all-encompassing markup language preferred in the Digital Information Services industry for coding large format documents like books. By the way, the better-known HTML was originally defined as an application of SGML.)

3. Hypertext encoded documents can allow for search and linking functions, making a document interactive and responsive to an individual user’s needs.
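
To make the points above a little more concrete, here is a minimal Python sketch of a composite record that keeps the page image alongside its OCR text, wraps the text in simple HTML, and builds a basic word index for searching. The OCR step itself is not shown, and every name here (`PageRecord`, `to_hypertext`, `build_index`, `search`, the file paths) is a hypothetical chosen for illustration, not part of any particular archiving product.

```python
import html
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PageRecord:
    """One digitized page: the preserved image plus its OCR transcription."""
    page_id: str      # hypothetical identifier, e.g. "p1"
    image_path: str   # where the scanned image is kept (assumed layout)
    ocr_text: str     # plain text produced by an OCR step not shown here


def to_hypertext(page: PageRecord) -> str:
    """Wrap the OCR text in simple HTML with an anchor id and a link to the image."""
    return (
        f'<section id="{html.escape(page.page_id)}">'
        f"<p>{html.escape(page.ocr_text)}</p>"
        f'<a href="{html.escape(page.image_path)}">View page image</a>'
        f"</section>"
    )


def build_index(pages: list[PageRecord]) -> dict[str, set[str]]:
    """Map each lowercased word to the ids of the pages that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for page in pages:
        for word in re.findall(r"\w+", page.ocr_text.lower()):
            index[word].add(page.page_id)
    return index


def search(index: dict[str, set[str]], word: str) -> list[str]:
    """Return the sorted ids of pages containing the word (case-insensitive)."""
    return sorted(index.get(word.lower(), set()))
```

Because each record keeps both the image path and the text, a reader who finds a page through search can still follow the link to see what the original looks like.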

II. Intellectual Property Rights

A. Controlling Your Content

1. When making digital documents available to a wide audience – and believe us, if a document is online, then the potential audience will be wide indeed – you should ask yourself several questions, namely:

– Who owns the information?
– Where will we store the information?
– Who will have access to the information?
– What do we do with all this paper, afterwards?

2. The answers to these questions will guide you through the process of converting your archive and making it available online. Let’s answer each question in turn:

– Who owns the information? Hopefully, you are the copyright owner, or you at least possess the reprint rights of any content you plan to digitize.

– Where will we store the information? A storage solution could involve an in-house entity or department specifically created to handle the digital archive and its maintenance, or you can choose to outsource the storage of your archive via a company that handles such needs.

– Who will have access to the information? Information access can be controlled through a variety of methods. For example, you can set up a subscription service restricted to members only. You can choose to allow open access. Or you can allow access only via institutional organizations, which in turn are responsible for the management of a subscription. You can even give limited free access that will convert to a subscription after a trial period.

– What do we do with all this paper, afterwards? Depending on the type of documents you are converting, the actual papers may be stored as original source material, held for legal reasons, or even recycled if no longer pertinent after the conversion process.
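
The access options described above (open, members-only subscription, institutional, and a free trial that converts to a subscription) can be sketched as a single check. This is only an illustration under stated assumptions: the names `AccessModel`, `User`, and `can_access`, and the 30-day default trial length, are hypothetical and would need to match your own business rules.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class AccessModel(Enum):
    OPEN = "open"
    SUBSCRIPTION = "subscription"
    INSTITUTIONAL = "institutional"
    TRIAL = "trial"


@dataclass
class User:
    name: str
    has_subscription: bool = False
    institution: Optional[str] = None     # organization managing the subscription
    trial_started: Optional[date] = None  # None means no trial was ever started


def can_access(user, model, today, subscribed_institutions=frozenset(), trial_days=30):
    """Decide whether a user may view the archive under the chosen access model."""
    if model is AccessModel.OPEN:
        return True
    if model is AccessModel.SUBSCRIPTION:
        return user.has_subscription
    if model is AccessModel.INSTITUTIONAL:
        return user.institution in subscribed_institutions
    if model is AccessModel.TRIAL:
        if user.has_subscription:
            return True  # the trial has already converted to a subscription
        if user.trial_started is None:
            return False
        # Assumed rule: the trial covers trial_days days from its start date.
        return today < user.trial_started + timedelta(days=trial_days)
    return False
```

Keeping the model as an explicit setting makes it easier to change your mind later, for instance moving from open access to a trial, without touching the rest of the site.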

B. Management of Your Content Rights

1. Once you’ve decided who will be allowed to have access to your archive, you should determine what level(s) of access will be granted, and how this will be done.

a) Membership has its privileges. Restricting premium content to a paid subscriber base is one way of controlling content access, and providing for a residual income to offset or subsidize the future maintenance of your archive(s).
1) Make sure to clearly post a privacy policy that lets potential members know what, if anything, you will do with any information you request from them when they register to access your archive.
2) Clearly state any marketing aims that you have for personal information you collect.
3) Consider age-restricting access to mature audiences only.
4) Determine the legal limits of personal information you may ask of a minor.
5) Allow members to opt-in to special promotions, third party offers, and the like.
6) Consider making your archive available in various languages or limiting access to users who speak a certain language.
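
As a rough illustration of the membership points above, the following Python sketch models a registration in which nothing is opted into by default, and in which the minimum age and supported languages are settings you supply. All names (`Registration`, `validate`, `may_send`) are hypothetical, and the sketch deliberately leaves the actual age threshold to you: the legal limits depend on your jurisdiction and should come from professional advice, not from this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Registration:
    email: str
    birth_date: date
    accepted_privacy_policy: bool
    opt_ins: set = field(default_factory=set)  # nothing is opted into by default
    language: str = "en"


def age_on(birth_date: date, today: date) -> int:
    """Age in whole years on the given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def validate(reg: Registration, today: date, minimum_age: int, supported_languages: set) -> list:
    """Return a list of problems; an empty list means registration can proceed."""
    problems = []
    if not reg.accepted_privacy_policy:
        problems.append("privacy policy not accepted")
    if age_on(reg.birth_date, today) < minimum_age:
        problems.append("below minimum age")
    if reg.language not in supported_languages:
        problems.append("language not supported")
    return problems


def may_send(reg: Registration, promotion_type: str) -> bool:
    """Only contact members about promotion types they explicitly opted into."""
    return promotion_type in reg.opt_ins
```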

b) Your documents are digital; the information can travel without you.
1) Will you allow printouts of your webpages? If so, how many copies can a person print?
2) Will you watermark your pages to prevent their unauthorized photocopying and distribution?
3) Will you allow images to be downloaded, or will they have separate redistribution rights?
4) One solution that prevents your information from being disseminated in an unauthorized manner is to employ a digital document encoding (such as PDF) that allows for the management of content rights, including the restricting of printouts, of editing, etc.
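
The questions above about printouts, watermarks, and downloads can be thought of as a small rights policy attached to your archive. The Python sketch below tracks a per-user print quota and stamps a visible watermark line on rendered text. It is an application-level illustration only: the names `UsageRights` and `RightsTracker` are hypothetical, and it does not show how a format such as PDF enforces restrictions inside the file itself.

```python
from dataclasses import dataclass, field


@dataclass
class UsageRights:
    allow_print: bool = False
    max_prints_per_user: int = 0
    allow_image_download: bool = False
    watermark_text: str = ""  # empty string means no watermark


@dataclass
class RightsTracker:
    rights: UsageRights
    prints_used: dict = field(default_factory=dict)  # user id -> prints so far

    def request_print(self, user_id: str) -> bool:
        """Grant a printout only while the user is under the quota."""
        if not self.rights.allow_print:
            return False
        used = self.prints_used.get(user_id, 0)
        if used >= self.rights.max_prints_per_user:
            return False
        self.prints_used[user_id] = used + 1
        return True

    def render_page(self, text: str, user_id: str) -> str:
        """Append a visible watermark line naming the archive and the user."""
        if not self.rights.watermark_text:
            return text
        return f"{text}\n[{self.rights.watermark_text} | issued to {user_id}]"
```

Naming the user in the watermark does not stop copying, but it does make an unauthorized copy traceable to the account it was issued to.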

1) Once you have decided how best to digitize your archive; 2) after you have performed the digitization of your archive (or hired someone to do it for you); 3) after you have cataloged the digital files and made them interactive; 4) after you have allocated for or outsourced their storage; 5) after you have designed (or hired someone to design) a database-driven website to handle the access to your files; 6) after you have set up an authentication system to control who can access your files; 7) after you have assigned content rights to control the reprinting and distribution of your files; and 8) after you have employed a staff to oversee all this: just think, you’re almost done!

Yes, it’s a lot to remember, and we realize it probably seems daunting to the uninitiated. But look at it this way: you’re responsible for preserving a wealth of crucial information for the benefit of the common good, and for your own prosperity. Would you expect anything less than the implementation of a thorough, well-drafted plan to accomplish your goals?
