Bytes, Copyright, and Info-Survival
The accelerating transfer of information from paper to computer files may be a boon to academia, but it is creating questions about who owns what data. Then there’s the issue of making sure none of it gets lost in the ether.
by Bruce Fellman
February 2001
Just about every year since he started teaching at Yale in the 1940s, Vincent Scully would dutifully trek to the slide library to pick out the images he intended to discuss during the lectures for his introductory art history course. And when his students needed a closer look at the artwork, they would dutifully trek to Street Hall, where copies of the images were tacked up on what was not so affectionately called the “Wailing Wall.” So heavy was the traffic that on the nights before examinations the campus police would occasionally catch desperate undergraduates breaking into the building for a final peek.
That unloved ritual is no more. Thanks to computers, a high-speed data-transmission network, and all the other hardware and software that make up the foundation of Yale’s information-processing system, Scully’s images are only a click away. Anyone with a campus network connection can visit the course website at classes.yale.edu/hsar112a/.
The transformation of a vintage art history course is only one example of the ways computers are altering the academic landscape at Yale. Many other entries in the Blue Book also have a presence on the World Wide Web, and almost all of them have one thing in common: a database of digital information that can be accessed, retrieved, and used in a variety of ways. A database can encompass the works of Shakespeare, every issue of a scientific journal, the entire corpus of knowledge about the human genome, the catalogue raisonné of the late photographer Robert Mapplethorpe, or any collection of material that can be rendered into bits and bytes and then posted on an Internet site or burned onto a CD.
At Yale, as elsewhere, there is a rush to create innovative educational opportunities in electronic form. But according to Ann Okerson, associate University librarian, the process is fraught with unexpected perils. “We’re all trying to invent the future,” says Okerson, an expert on the impact of electronic publishing. “But we need to figure out how to set things up so that the digital age happens right.”
For Okerson and her colleagues, the issue is less one of technology (although that is certainly important) than of ensuring that information is accessible—both now and in the future. Questions such as who owns the information, who is allowed to see and use it (and at what cost), and what is the right way to display it in order to best fulfill the University’s mission once had relatively straightforward answers. “You bought a book or subscribed to a journal and put it on the shelves,” Okerson explains. “But nothing is simple anymore.”
Consider the current effort by the Art Library to digitize its half-million prints and slides. “In typical Yale fashion, this started because of a building project,” says Max Marmor, the arts librarian. “We had more than 4,000 square feet of library space invested in storing these images, and in 1995, when we started to plan for a library renovation, we saw that the future was digital.”
Transforming every photograph and transparency into electronic form would certainly have freed up shelf space, but Marmor and his colleagues opted for a more targeted approach. “We decided to create a coherent set of digital collections that directly supported teaching and research rather than a complete digital archive,” he says.
One reason, of course, was money. “Digitizing 500,000-plus images is expensive and labor-intensive,” says Marmor. But an equally important factor was programmatic. “Our goal is to build a meaningful digital library,” he explains, “and in many ways, this is no different from developing a conventional collection. You have to make choices. To make the change to digital in a cost-effective way, we had to determine precisely what we were going to digitize to best meet scholarly needs.”
The result of this analysis is what the Library has dubbed the “Imaging America” project, a database that will eventually hold thousands of images drawn from Yale’s many collections, as well as from those outside the University, and which will be used to support courses and research in American Studies both here and elsewhere. But the effort has a wider purpose than simply enabling students or professors to examine a painting by Thomas Eakins, a print by Winslow Homer, or a photograph by Walter McClintock from the comfort of a dorm room or office.
“The ultimate aim of the project is to help create and shape a digital marketplace,” says Marmor, admitting that such entrepreneurship runs counter to the ethos of the librarian. “As librarians, we’re naturally subversive—we like to give things away—but there’s also an abiding tension here,” he says. “While we’re not in it for the money, we recognize the need to create something that recovers its own costs: products and services that are academically worthwhile and worth subscribing to.”
Two years ago, with a grant from the Getty Trust, Marmor invited a dozen like-minded universities and museums to begin shaping the Imaging America digital library. The group met for the first time in April of 1999, and since then, Yale has become a testing ground for the development of a marketable image database that could eventually draw on the resources of all the consortium members.
One reason for Yale to take the lead is that the University actually owns the pictures to be digitized. All too often, notes Marmor, slides that are used in teaching have been copied from books with little regard for copyright. The practice is considered allowable under the “fair use” provision of the law as it applies to the classroom. However, because the Imaging America program aspires to be a sustainable public service, it had to be limited to material that complied with copyright law.
Yale had another advantage that made it a logical candidate for the testing phase. The University was already working with a California-based software company known as Luna Imaging to develop image databases that both teachers and students could access over the Web and customize to fulfill classroom and research requirements. Using the Luna package of software, it was possible, while sitting in front of the computer, to retrieve the necessary digital images from a server on campus, assemble them into a slide show, and present it to a class. There was no risk of jammed projectors or upside-down slides, and if a professor needed to substitute an image, it could be done quickly and painlessly.
Art historian Mary Miller, who has used the system for presentations to her classes in Meso-American art and architecture, likes the quality of the projected images and the ability to zoom in on details. But after extensive experimentation, she has also found that the test version of the software is not quite ready for prime time. “There are lots of glitches, so I’ve recently gone back to the security of slides,” says Miller. “But when they’ve worked out all the bugs, I’ll be first in line to use it.”
Bugs or no bugs, nearly 10 percent of the library’s budget is now spent on “things that people will never actually touch,” says Okerson. “But these kinds of resources pose all sorts of new challenges for universities, and their libraries in particular.”
One issue that has taken on major importance in the electronic landscape is copyright, the section of the legal code that deals with the ownership of intellectual property. “Since it was first written into law in 1709, copyright has served us well in protecting the public interest and promoting the dissemination of knowledge,” says University Librarian Scott Bennett. “But until recently nobody was paying attention to digits, and there’s a concern that the Web and other new media opportunities offer publishers the means to achieve a perfect information monopoly.”
One critical issue, says Bennett, is an effort by at least some publishers to eliminate the concept of fair use from copyright law so that they can control (and charge money for) any use of material they own. “If we don’t find a way to preserve the fair-use concept, we’re in serious trouble,” Bennett says.
Cost crops up repeatedly in any conversation about digital information. The dawn of the cyber age had led many information managers to expect lower costs, as journals discussed abandoning their print versions and publishing only on the Internet. But Joseph Esposito, a publishing strategist, is skeptical. “The big myth is that it’s cheaper to publish online than it is by the traditional, hard-copy route,” says Esposito, who has directed the development of a Web version of the Encyclopedia Britannica. The hope in that case was that revenue from on-screen advertising would sustain the cost, but things have not worked out that way. “Academic content is expensive not because of production costs or because publishers are greedy, horrible people,” Esposito says. “It’s because the economic factors of the mass market don’t work for academic journals.”
What Esposito and others call “the crisis in pricing” is a function of the fact that academic publishing is not a growth business with an unlimited audience. “What works for John Grisham—lowering your price and recouping it in extra volume—doesn’t apply to the Journal of Hand Surgery,” he says.
Okerson expects that in time the proliferation of digital resources may result in at least some savings simply because Yale’s library system won’t have to keep expanding to find shelf space for all the new material. But the switch from outright ownership of information to a leasing arrangement creates a potential problem of its own.
The contents of a conventional publication owned by a library can reside in the stacks indefinitely, but the rights to access electronic material often live with—and can be withdrawn by—its producer. In the worst-case scenario, the outcome might be a kind of digital dictatorship, with too much information controlled by too few hands. But even if such a dire situation never comes to pass, there are other, more immediate difficulties to solve.
Bennett and Okerson, through a $150,000 grant awarded in December by the Andrew W. Mellon Foundation, are working on ways to ensure long-term access to digital resources. Through the Mellon grant, the librarians are developing a collaboration with Elsevier Science, a publisher of scientific journals, to create the foundations of an access system. “The grand tradition of the library is that when readers come in, they can see everything we own,” says Okerson. “But because we’re often not dealing with artifacts anymore, the access relationship may come down to a case of ‘let’s make a deal.’ The library is signing contracts that give us certain access rights to content.”
The purpose of the relationship with Elsevier is to figure out how to build an electronic archive that will persist over time. “Perpetual access is a big issue right now,” says Okerson. “We need to determine how to write contracts that enable us to get information we’ve already paid for if, say, we drop our subscription to a journal, or if the company that created the resource goes out of business.”
The library is certainly no stranger to what is in some ways a conservation problem. Indeed, creating an environment conducive to the long-term storage of books was the impetus for the recently completed $20 million renovation effort at Sterling. “We’ve known about paper deterioration since 1823, but it wasn’t until the late 1950s that we understood the chemistry of the problem and learned how to combat it,” says Bennett.
In the early 1990s Yale made a commitment to correct the situation. The result is better heating and cooling, along with one of the premier book conservation laboratories in the country and a program called “Collections Care,” the purpose of which is to keep books in good circulating condition. “Conventional scholars worried that we would go berserk with electronic materials and neglect traditional media, but that’s not going to happen,” says Okerson. “We’re committed to information in every format.”
But while bits and bytes may not be subject to the same kinds of degradation suffered by their paper counterparts, it’s becoming apparent that the endless strings of ones and zeros that make up computer code have their own unique conservation requirements. “Digital material is in fact a lot more fragile than its traditional counterpart,” says Dale Flecker, associate director for planning and systems of the Harvard library system. “It’s not a matter of preserving bits of data, but rather preserving their utility.”
Older computer users are familiar with what’s called the “Five-and-a-Quarter-Inch Floppy Disk Syndrome.” Disks of this size used to be the standard for storing and transporting data; now, one would be hard-pressed to find a computer on the Yale campus capable of reading such a thing.
Obsolescence is commonplace in the electronic landscape, says Flecker, who is working on a five-year, $12-million effort to ensure that digital resources will not disappear into the ether of inaccessibility. “Technology is moving extremely fast,” he explains. “When it comes to books and paper, it’s conceptually easy to do such things as microfilming and rebinding to keep these resources usable, but it’s hard to predict the future formats of digital material.”
One strategy for overcoming the predictability problem is to routinely practice “data migration.” This means that every time there’s a major change in the form in which data is stored and accessed, the people responsible for housing the material must ensure that it changes shape to match. The task may be as simple as transferring files stored on five-and-a-quarter-inch floppies to three-and-a-half-inch disks, or as massive as Project X, the recently completed (and notoriously bug-infested) migration of all of Yale’s administrative data into a new system.
“You can’t avoid the fact that waves of changing technology will roll over you faster than you can plan for them, so regular data migration is a fact of life,” says Lawrence Gall, systems manager for the Peabody Museum of Natural History and the director of its ongoing effort to create an online museum. “The need to do what might be called creative destruction has to be built into your thinking and planning about the future.”
So, clearly, is the need to determine what kind of structure the “creative destruction”—the seemingly endless cycle of digital construction and reconstruction—is supposed to produce. “We’re getting to the point where an article published in electronic form is less about content than it is about communications,” says Joseph Esposito, pointing out that most Web material now includes links to new sources of information and to the people who created it. “Ultimately, this is about changing the boundaries of organizations and creating communities that have never before existed.”