The Lost Art of Sourcing
by Barbara Quint
Editor, Searcher Magazine
We live in an age of information affluence, an age in which the average user would more likely be willing to pay for services that reduce information flow than for ones that increase it. Oddly enough, that kind of fits into the skill set and professional orientation of us information professionals. Despite our commitment to gathering and preserving information (all information), in practice we try to gather the “good stuff.” In the previous era of information scarcity, we might have made token statements of regret that we could not afford all the information (and I use the “infotainment” definition of information) that we might otherwise wish to amass. But in our heart of hearts and while ALA’s Intellectual Freedom troops weren’t watching, we really felt professional satisfaction from picking out the quality and rejecting the dross.
Speaking of those Intellectual Freedom fighters, let me reiterate the standard excuse for foot-dragging. It’s not that any of us want to see any piece of information completely destroyed, torn from the face of the earth, lost forever in the mists of time. We all believe in the preservation of all information items, that nothing should ever be lost completely. In fact, if we had been at Alexandria when Julius Caesar’s troops let the world’s largest library burn, we all would have joined the bucket brigade. Even today, if any of us watches George Bernard Shaw’s play, Caesar and Cleopatra, even the movie version, we all would respond to the anguished cry of the librarian — “The mind of man is burning!”
Nonetheless, while our profession defends the right of all information to continue its existence, we spend much of our professional lives engaged in what some might call censorship. Our duties as professionals require us to bring the best we can find and/or afford to our clients: the truest for our knowledge-seeking clients, the funniest or dreamiest or most thrilling for our clients seeking entertainment. And in following this service goal, we judge our success by measuring how many of our content choices reach users: how often our books circulate to patrons, how many journal articles patrons photocopy or download, and so forth.
However, our rationale for this filtering of information is often still tied to the needs and realities of an age that has passed, the age of information scarcity. No one can afford to buy everything everyone might want, so you have to buy the best. In fact, the public library could be defined as a community’s attempt to share access to expensive resources. But in this era when Google Books carries more than 10 million items, at least 2 million of which are downloadable for free, well, how necessary is it to buy the classics anymore? Project Gutenberg’s tens of thousands of free sources constitute a filtered collection of content. Want to view movie or television archives? Hello, Hulu! Heck, hello, YouTube. Or pay Netflix its monthly subscription minimum and get access to tens of thousands of streaming video items, 24/7 too.
So has the role of librarians and information professionals in grading and judging content become a thing of the past, a skill no longer needed or wanted? No, but the rationale has changed. The justification is no longer driven mainly by the need for prudent expenditures of money. Instead, it should be driven by the prudent expenditure of time, users’ time and our own.
We need to build or at least contribute to the building of advanced search engines that recommend the best information for specific tasks. And that doesn’t mean just “rounding up the usual suspects” by only licensing content from traditional information industry sources. I don’t know if anyone’s noticed, but some of those sources have suffered an erosion that probably cannot be stopped. Electronic collections of periodicals may seem as massive as when they began decades ago, but the periodicals themselves may have shrunk or even been discontinued. In any case, most traditional archival collections, sadly and perhaps even shamefully, have failed to keep up with the enhanced digital content offered by publishers. More than a decade of The New York Times’ digital supplementary content is gone and lost forever.
What we need are engines that can deliver all the best content for all needed purposes and tailored to the needs and abilities of different types of users. Watching that Jeopardy contest with IBM’s Watson really made the situation pop for me. It’s not just a matter of tweaking Google’s relevancy-ranking algorithms. Watson only used “respectable” sources, but it plumbed them at speeds and with algorithmic sophistication that managed to beat the best human Trivial Pursuit mentalities. A little more tweaking might have even prevented some ludicrous gaffes. For example, someone should have taught Watson that the Final Jeopardy category is always worded very accurately, no cute wordplay. So if it says U.S. cities, you really shouldn’t name a Canadian one. But Watson’s performance was devastatingly impressive.
So what about a Watson loaded by librarians? We could even enter the questions and grade the answers in tweaking tests. We could identify the best sources: online and offline, pay-me or given away, traditional or web/app only. We could analyze the different types of users and their abilities to absorb information. On the entertainment side of the infotainment arena, we could enhance recommender services, perhaps even managing to teach outfits like Netflix how to identify not the most popular but the most rarely seen items for those who have already tapped all the major items in a genre. We could bring together all the best sources and make them jump through the hoops of user needs and desires.
Wow! Wouldn’t that be fabulous?! And all the results would bear the proud brand of “Powered by Info Pros,” or maybe “Approved by ALA.” For when I say “we,” I mean “we.” This task would be one for all of us, working together in networked glory. It would need teams of librarians and info pros working together, connected by social networks, and not restrained by any ZIP code constituencies or institutional employee lists. Vendors would be involved too, with us, the consumers of their current services, pushing and prodding them to get with the program. We’d convince them that dragging their feet on delivering their products would do them more harm than good, since those products would no longer bear the librarian seal of approval, and that our super search engines had built-in options for reimbursement and promotion.
Hey, Big Blue, not that I’m personally very interested or anything, but just for discussion’s sake, what did you say was the sticker price on that Watson over there? Does that include all the options?