Linux and CD-ROM Networking: An Academic Library's DIY Solution

Simon Bains & Howard Richardson

ONLINE, March 2000
Copyright © 2000 Information Today, Inc.

The CD-ROM has played a major part in allowing libraries to provide end-user access to electronic information services. Single subscriptions replaced the complicated charging mechanisms of online hosts, interfaces became user-friendly, and it became "safe" to let users loose without an intermediary (putting to one side the whole issue of searching expertise). On the other hand, the impending doom of this "transient technology" was being discussed as early as 1990 [1]. Throughout the 1990s, the rapid development of the Web, and of the underlying technologies it depends on, has allowed database providers and users to return to online versions. This time, however, the online versions offer the advantages of both online and CD-ROM access. The interface may not be all that we could wish for, and speed can still be an issue, but Web-based delivery is good enough for library managers to question investments in local CD-ROM networks.

So do we still need the CD-ROM at all? The answer to this will always depend on your institutional circumstances, but is likely to be yes, at least for the short term. Some databases stubbornly resist migration to the Web. At City University, with a Journalism Department, access to electronic newspapers is essential. The cost of subscribing to an online host would be prohibitive, and only one UK national newspaper now has a reasonably substantial archive on the Web (The Guardian). The others continue to be provided on CD-ROM, with no indication of when they will offer Web-based services (which, one imagines, will happen sooner or later). Another reason for maintaining a CD-ROM subscription is that some customers may not be entitled to use the institutional Internet connection. At City University, for example, the funding arrangements for some students attending our School of Nursing do not allow them Internet access. Other reasons might also be cited, such as licensing or technical concerns, so it seems likely that most academic libraries, for the time being, will need to offer a mixture of CD-ROM and Internet databases [2].

Strategic decisions about database provision, therefore, must continue to take account of local CD-ROM holdings. However, the trend toward total remote access argues that large-scale investment in local CD-ROM holdings, and the networking infrastructure to support them, may be unwise. It was within this environment that Library Information Services (LIS) at City University (London, UK) took the decision to move ahead with plans for local CD-ROM networking, but to work on the assumption that the system might become redundant in a fairly short period of time, and so would not justify large-scale expenditure.

Useful Web Sites

GNU General Public License
http://www.fsf.org/copyleft/gpl.html

Junkbuster Proxy
http://www.junkbusters.com/ht/en/ijb.html

K desktop environment
http://www.kde.org

KSamba
http://www.ksamba.org

Opensource
http://www.opensource.org

SAMBA Web Pages
http://www.samba.org

Sun SITE
http://sunsite.doc.ic.ac.uk

SuSE
http://www.suse.de/en/default.htm

UK Mirror Service
http://www.mirror.ac.uk/

THE OPTIONS

The options available were as follows:
  1. To retain a standalone system

  2. To provide a closed CD-ROM network within the main university library

  3. To provide CD-ROM access within the University's Wide Area Network (WAN)

Option 1 had been recognized as inefficient, requiring the loan of CDs to users. This created extra work for both library staff and users, and exposed the CD-ROMs to the risk of damage or theft. It became imperative that this system be replaced by a networked solution. Option 2 would allow total control over the network, but was rejected, since it would have meant locating the server close to the client PCs in a public area. Option 3 offered the potential to provide access to all University PCs, but achieving this would have meant overcoming complex technical obstacles and imposing significant security restrictions. Instead, it was decided to use the University WAN, but to configure only four dedicated client machines in the Library to provide access to the databases. These PCs were owned by LIS, and so total control over configuration was available, which would not have been the case with non-Library PCs.

This decision made, it was then necessary to decide on how to proceed. A number of companies offer "turnkey solutions" which are attractive, as they promise to address the complicated issues of networking the varied interfaces, providing security, and integrating differing license requirements. A number of products were investigated, and one company was invited to provide a system on a trial basis. The outcome of the trial was disappointing. The time and technical skills required for proper configuration were substantial, and the software was very unstable and expensive. Other solutions exist, such as thin client architecture [3], which will allow remote access to the databases (i.e., beyond the university WAN). But the expense related to this solution was regarded as unacceptable as well. Instead, LIS decided to build its own system using Linux. All the necessary software and documentation is freely available, meaning that the only significant cost would be the hardware selected to form the server. Note that it is necessary to consider licensing issues when deciding on an appropriate solution. If, for example, you are not entitled to cache entire CDs, you will need to think about buying a disk changer and software to manage partial caching.

THE LINUX SOLUTION

Linux originated as a project to build a Unix work-alike, Unix being the industry standard on large servers. The code was publicized on the Internet, allowing other programmers to develop and refine it. This formed a dedicated core of Linux users, who used it to operate a variety of applications. Linux is becoming increasingly popular now that it provides graphical user interfaces and applications that are a match for commercially available alternatives. Linux provides a stable and low-cost platform for all aspects of networking, and IT professionals recently ranked it ahead of Windows NT and Sun's Solaris [4].

Linux is freely available as source code, but for ease of installation and use there are a number of pre-compiled distributions available that have particular features for certain types of computer systems. These distributions are available freely via FTP (or on CD-ROM at a small cost). Most Linux software is developed as open source, so the user is free to copy, use, and modify it according to local needs, with no cost involved. The GNU General Public License guarantees users this freedom.

For some months prior to the development of the CD server, the IT Support Officer for LIS had been experimenting with the integration of the Linux operating system (OS) into an entirely Windows-based network. The original intention had been to test the viability of a Linux network using free or inexpensive software. A number of dual-boot workstations had been set up, allowing the user to choose between Linux and Windows. Advances in installer technology made this a reasonably straightforward process, and these workstations were able to share files and printers on a Windows network while also using Windows files on Linux-based office applications. However, the lack of a complete set of Microsoft-compatible software packages, and the relatively high level of IT literacy required for library staff to feel comfortable using Linux, were sufficient reasons to justify retaining a Windows environment, at least for the time being. Nevertheless, the familiarity gained with the Linux OS proved indispensable when it came to using Linux to solve the CD-ROM networking issue.

THE LINUX CLIENT/SERVER SET-UP

The first task was to write a specification for the server machine. It needed sufficient speed and hard-drive storage space to carry all the networked databases. The library purchased a 400MHz Intel PC with 256MB RAM and 40GB hard drive space (sufficient for about 60 CD-ROMs and the Linux OS).
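The sizing can be checked with a rough calculation. The sketch below assumes a nominal 650MB per disc, a figure that does not come from the project itself; real discs are often less than full, which is what leaves headroom for the operating system.

```python
# Rough storage estimate for caching whole CD-ROMs on the server's hard drive.
CD_CAPACITY_MB = 650  # assumption: nominal capacity of a standard data CD-ROM
DISC_COUNT = 60       # number of discs quoted in the article
DRIVE_GB = 40         # drive size quoted in the article


def cache_space_gb(disc_count, mb_per_disc=CD_CAPACITY_MB):
    """Return the space needed to cache disc_count full discs, in decimal GB."""
    return disc_count * mb_per_disc / 1000


needed = cache_space_gb(DISC_COUNT)
print(f"{DISC_COUNT} full discs need about {needed:.0f}GB of the {DRIVE_GB}GB drive")
```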

The installation of Linux over the Internet was a simple matter, using the university's own broadband connection. There are a number of versions of Linux easily available to the UK academic community over Sun SITE or the UK Mirror Service. The German SuSE distribution was chosen, since it is regarded by the Linux community as user-friendly and easy to install [5,6]. The only prerequisite was a Linux boot floppy disk, available as a downloadable disk image with the distribution. Once the PC was booted with this disk, SuSE Linux continued to install automatically from the Internet.

One benefit of Linux is that compatibility with the major networking protocols and file-system standards is built in. Most workgroup networking is carried out using the NetBIOS standard, which can be implemented over the TCP/IP, NetBEUI, and IPX protocols. NetBIOS over TCP/IP was used in this case, since TCP/IP is the standard protocol suite of the Internet. This made it possible to plug the server straight into the existing LAN, with the confidence that any computer with an Internet connection could connect (subject to security clearance).

Linux comes with a package called SAMBA, which implements NetBIOS services over TCP/IP, allowing OS-transparent access to shared file resources. This meant that Windows 95 PCs could be used as clients. It is easy to configure SAMBA using the KSamba program, which runs on the graphical K Desktop Environment (KDE). KSamba can be run locally or remotely, and provides the facilities to specify security settings, connection limits and read/write permissions, among many other options. It has obvious applications for a CD-ROM network, since it allows control over which machines can connect to the databases, and can ensure adherence to license restrictions on concurrent use.
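KSamba ultimately writes its settings to SAMBA's configuration file, smb.conf. As an illustration only, the sketch below builds the kind of share definition that the controls described above correspond to, using the standard smb.conf parameters "read only", "max connections", and "hosts allow". The share name, cache path, and client addresses are hypothetical, and the settings actually used at City University are not recorded here.

```python
def render_share(name, path, max_connections, allowed_hosts):
    """Build an smb.conf share section for one cached CD-ROM.

    The share is read-only, limited to max_connections simultaneous
    connections (for license compliance), and restricted to the listed hosts.
    """
    lines = [
        f"[{name}]",
        f"   path = {path}",
        "   read only = yes",
        "   guest ok = yes",
        f"   max connections = {max_connections}",
        f"   hosts allow = {' '.join(allowed_hosts)}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values: one newspaper disc, licensed for four concurrent users,
# visible only to the four dedicated library client PCs.
CLIENT_PCS = ["192.0.2.11", "192.0.2.12", "192.0.2.13", "192.0.2.14"]
print(render_share("GUARDIAN", "/cdcache/GUARDIAN", 4, CLIENT_PCS))
```

Generating the section from a function rather than typing it by hand keeps every cached disc on the same restrictions, which matters when the license terms are the same across a publisher's titles.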

The automation of caching and labeling of the CD-ROMs was straightforward. A caching script was written to read the CD label from the disk, create a directory with the same name, copy the disk across, set the correct file permissions, and add the directory to the list of shared resources. When a new disk needs caching, all the administrator needs to do is insert the disk and enter the command to cache. Within 10 minutes, the CD should appear on the client machines in the Windows Network Neighborhood.
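The original caching script is not reproduced in this article. The steps it performs can be sketched as follows, in Python rather than whatever scripting language was actually used. The sketch assumes the disc is in ISO 9660 format (whose volume label sits at a fixed offset in the primary volume descriptor) and is already mounted; the function names and all paths are hypothetical.

```python
import os
import shutil
from pathlib import Path

# ISO 9660: the primary volume descriptor starts at sector 16 (byte 32768);
# the 32-byte volume identifier begins 40 bytes into that descriptor.
LABEL_OFFSET = 32768 + 40
LABEL_LENGTH = 32


def read_volume_label(device):
    """Read the volume label from a CD-ROM device or disc image file."""
    with open(device, "rb") as f:
        f.seek(LABEL_OFFSET)
        raw = f.read(LABEL_LENGTH)
    return raw.decode("ascii", errors="replace").strip(" \x00")


def cache_disc(device, mount_point, cache_root, smb_conf):
    """Copy a mounted disc into the cache and share it under its label."""
    label = read_volume_label(device)
    # Refuse labels that would make an unsafe directory or share name.
    if not label or not all(c.isalnum() or c in "_-" for c in label):
        raise ValueError(f"unusable volume label: {label!r}")

    # Create a directory named after the label and copy the disc across.
    # copytree raises FileExistsError if this disc has already been cached.
    target = Path(cache_root) / label
    shutil.copytree(mount_point, target)

    # Set permissions: directories traversable, files read-only for everyone.
    for root, dirs, files in os.walk(target):
        os.chmod(root, 0o755)
        for name in files:
            os.chmod(os.path.join(root, name), 0o444)

    # Add the directory to the list of shared resources.
    with open(smb_conf, "a") as conf:
        conf.write(f"\n[{label}]\n   path = {target}\n   read only = yes\n")
    return target
```

With a disc mounted, the administrator's single command would then amount to a call such as cache_disc("/dev/cdrom", "/mnt/cdrom", "/cdcache", "/etc/smb.conf"), where all four paths are placeholders for the local set-up.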

The simplest option for configuring the client machines was to write MS-DOS-style batch scripts to map all the cached disks as network drives. Links to the batch files were then placed in the Start menu and on the desktop as icons. The batch scripts provide basic validation, so an error is returned if the maximum number of concurrent users is already connected, or if the CDs are unavailable for any other reason. It was decided that access to Web databases would also be provided on the client PCs, but that general Web surfing would be prohibited. To enforce this, another freely available open-source software package, called Junkbuster Proxy, was installed. This makes it possible to screen out Web sites that are not used by approved database services.
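The batch scripts themselves are not listed here. As a rough illustration of the approach, the sketch below generates one from a table of drive letters and share names. It assumes that the Windows "net use" command reports failure through a non-zero errorlevel, and the server name, share names, and messages are all hypothetical.

```python
def render_batch(server, drive_map):
    """Build a DOS batch script that maps each cached disc to a drive letter.

    drive_map maps a drive letter (e.g. "F") to a share name on the server.
    Any failed mapping jumps to a single error message.
    """
    lines = ["@echo off"]
    for drive, share in sorted(drive_map.items()):
        lines.append(f"net use {drive}: \\\\{server}\\{share}")
        lines.append("if errorlevel 1 goto failed")
    lines += [
        "echo CD-ROM databases connected.",
        "goto end",
        ":failed",
        "echo The databases are unavailable, or all licensed connections are in use.",
        ":end",
    ]
    # Batch files conventionally use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical server and share names.
print(render_batch("cdserver", {"F": "GUARDIAN", "G": "TIMES"}))
```

Because the server enforces the connection limit, the client script only has to notice that a mapping failed; it does not need to know how many users are connected.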

CONCLUSION

Within the higher education (HE) community, Linux has, until recently, been regarded as an OS of interest only to computing departments, Unix specialists, and anyone interested in having an alternative to Microsoft on their desktop. This project demonstrates that it offers genuine benefits to libraries, and indeed to any organization or department on a modest budget that wishes to make use of networking technologies. In terms of CD-ROM networking, this solution represents a cost-effective way of providing a local CD network for what might be a short period before the transience of the CD-ROM is finally proven. Even if demand for CD-ROM rises, the system is easily upgradable.

City University LIS is now exploring the other possible benefits of a Linux server. It could be used for file-sharing among library staff, and to run an internal, staff-only Web site efficiently and securely. Although little has been written about Linux use in libraries, at least one library has used it to build a Web server [7]. Unlike a turnkey product, the usefulness of the Linux solution will certainly not end with CD-ROM subscriptions.


Glossary

IPX: Internetwork Packet Exchange. A communications protocol used to route messages from one node to another.
KDE: K Desktop Environment. A graphical user interface for UNIX workstations. It provides a complete desktop environment like Windows with unique features. The source code is freely distributed and is widely used with Linux.
NetBEUI: NetBIOS Extended User Interface. The transport layer for NetBIOS. NetBIOS and NetBEUI were originally part of a single protocol suite that was later separated.
NetBIOS: The native networking protocol in DOS and Windows networks. It was originally combined with NetBEUI but now provides a programming interface for applications.
SAMBA: A freeware version of the SMB (Server Message Block) protocol for UNIX machines. It allows a Windows client to share files and printers on a UNIX server.
TCP/IP: Transmission Control Protocol/Internet Protocol. A communications protocol that has become the global standard for communications.

REFERENCES

[1] McSean, T. and Law, D. "Is CD-ROM a Transient Technology?" Library Association Record 92 (11), (1990): pp. 214-215.

[2] Ma, Wei. "The Near Future Trend: Combining Web Access and Local CD Networks." The Electronic Library 16(1), (February 1998): pp. 49-54.

[3] Turner, Anna. "Thin Client Architecture for Networking CD-ROMs in a Medium-Sized Public Library System." Computers in Libraries, 17(8), (September 1997): pp. 73-75.

[4] "Linux: Your Next OS?" InternetWeek, 763, (3 May 1999). URL: http://www.techweb.com/se/directlink.cgi?INW19990503S0047

[5] Lynas, Christopher. "Linux Without Tears." The Guardian (Online supplement), (Thursday, September 9, 1999): p. 10.

[6] Panders and Octane. "SuSE 6.0: The Next Distribution?" ArsTechnica, (March 1999). URL: http://www.ars-technica.com/linux/reviews/1q99/suse-1.html

[7] Orr, G. "Building a Library Web Server on a Budget." Library Software Review, 17(3), (September 1998): pp. 171-176.


Simon Bains (s.j.bains@city.ac.uk) and Howard Richardson (h.richardson@city.ac.uk) are Electronic Information Librarian and IT Support Officer, respectively, at City University's University Library in London.

Comments? Email letters to the Editor at editor@infotoday.com.
