DEPARTMENTS                         
                        Letters to the Editor                          
                         Another View of De-Duplication
In the opinion of MuseGlobal, "The Truth About Federated
                            Searching" in your October 2003 issue contains a
                            number of statements that are erroneous. In the interest
                            of
                            presenting your readership a more balanced view of
                            federated search technology, I'd like to correct
                            some of the misimpressions left by the article.  
 1. De-duplication does work.
                          WebFeat asserts that de-duplication doesn't work.
                          Their argument is that because a single search returns
                          a very large number of hits, say 100,000, you can't
                          claim to de-dupe unless you de-dupe every one of those
                          hits. This is like saying that a search doesn't work
                          unless you view all 100,000 of those hits.
                          Sure, it would take a long time to process all of
                          those records. In federated searching as in everything
                          else in life, there are trade-offs. Most searchers
                          initially retrieve a limited set of records from each
                          source. This allows the searcher to check their general
                          usefulness (and possibly decide to re-run the search
                          on fewer, more relevant sources). Not only does this
                          save time, but the searcher is not needlessly clogging
                          up servers by delivering thousands of records that
                          will almost immediately be discarded. In this way,
                          the de-duplication performance issue of dealing with
                          impossibly large numbers has been resolved. If the
                          first set of results is found wanting, the waiting
                          mass of results is still there and can be tackled
                          in manageable bites. With the right technology, the
                          next "bite" of results can be processed like the first,
                          and the new results can be quickly de-duped against
                          those left over from the first set of results.
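The batch-by-batch approach described above can be illustrated with a minimal Python sketch. It is not MuseSearch's actual method: the match key (normalized title plus year), the class name MergedResults, and the sample source names and records are all assumptions made for illustration. It shows only the general idea of merging each new "bite" of results into one integrated set and checking it against what has already been kept.

```python
import re


def record_key(record):
    """Hypothetical match key: lowercased, punctuation-free title plus year."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record.get("year"))


class MergedResults:
    """One integrated result set, de-duplicated as batches arrive."""

    def __init__(self):
        self.records = []   # unique records, in arrival order
        self._seen = {}     # match key -> index into self.records

    def add_batch(self, source, batch):
        """Merge one batch from one source; return how many records were new."""
        added = 0
        for record in batch:
            key = record_key(record)
            if key in self._seen:
                # Duplicate: keep the existing record, note the extra source.
                sources = self.records[self._seen[key]]["sources"]
                if source not in sources:
                    sources.append(source)
            else:
                self._seen[key] = len(self.records)
                self.records.append({**record, "sources": [source]})
                added += 1
        return added


# First "bite": a limited set of records from each (made-up) source.
results = MergedResults()
results.add_batch("catalog_a", [
    {"title": "Federated Searching", "year": 2003},
    {"title": "Metasearch Basics", "year": 2002},
])
results.add_batch("catalog_b", [
    {"title": "federated searching!", "year": 2003},
])

# Next "bite": de-duped against what is left from the first.
results.add_batch("catalog_a", [
    {"title": "Metasearch basics", "year": 2002},
    {"title": "De-duplication in Practice", "year": 2003},
])
```

Because only the keys of records already retrieved are held in memory, the cost of each new batch depends on the size of that batch rather than on the total number of hits reported by the sources. A real product would need a far more forgiving match rule than this exact-key comparison.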
                          It seems reasonable that you should be able to recognize
                          that a new record being added to the set is the same
                          as an existing one. Of course, to do this you have to
                          merge the results from multiple sources and process them
                          all in one integrated results set (as our
                          product, MuseSearch, and several other products do),
                          rather than maintain them in separate groups by source (as
                          WebFeat does). We've
                          been de-duping since day one, but we would be the first
                          to say that de-duping isn't perfect by any means. In
                          fact, we often state publicly that metasearching is
                          an 80/20 solution: you're better off with metasearching
                          than without it, and it will only improve over time.
                          De-duping is one of the many differentiators among
                          federated search products; in fact, the ability to
                          de-duplicate results is one of the key requirements
                          articulated by users. Don't take our word for it; see
                          the detailed study sponsored by the National Library
                          of New Zealand that concludes "the consensus about
                          the role of a common user interface is that it should
                          be able to broadcast a single search to a variety of
                          databases in different locations and in different formats
                          and to unify the results from these databases, then
                          present them in a useful order and de-duplicate the
                          results" (emphasis added). This is just one of the
                          reasons the study awarded MuseSearch top ratings. The
                          full study can be downloaded at http://www.natlib.govt.nz/en/whatsnew/4initiatives.html#review.
                          2. Federated search can be software or a service.
                          The WebFeat article asserts that federated searching
                          is best when offered as a service, and that this is
                          the only approach that avoids downtime for software
                          or source connector updates. The truth is, a centralized
                          service is not necessary in order to incorporate frequent
                          software updates without downtime.
                          Our Source Factory distributes software and source
                          connector package updates seamlessly, allowing extremely
                          high levels of service with very little local administration
                          effort. Updates can be made automatically, without
                          service disruption. Most of our technology partners
                          (COMPanion, Endeavor, Innovative Interfaces, Mandarin,
                          Sirsi, etc.) offer both local software implementation
                          and hosted service options. Most customers opt for
                          a local software implementation. Our experience has
                          been that local customization and security requirements
                          are best served in this way. The bottom line is, the
                          best option is flexibility to implement in the way
                          that is most effective for each user.
                          3. You do get better results with a federated search engine.
                          The aim of MuseGlobal is to provide better results
                            with less effort. In general, you can get better
                            results with federated search than by using native
                            database search because, practically speaking, few
                            searchers would have the time or patience to do these
                            searches repetitively via individual search interfaces.
                            In the real world, federated searching can exponentially
                            improve the efficiency and quality of results.
                          We invite your readers to try Muse federated search
                          technologies for themselves with MuseSeek, our new
                          consumer-oriented Web metasearch engine (http://www.museseek.com).
                          Cheryl Wright
                          Vice President, Marketing
                          MuseGlobal, Inc.
                          Where in the World?                        
                         I have been a subscriber to Information Today for
                          some time now, and generally look forward to Barbara
                          Quint's articles, which are usually very pithy and
                          informative.
                          I must tell you, however, that I was a bit bothered
                          by something she wrote in her October 2003 [Up Front]
                          article, in which she referenced Earth Station 5 as
                          being "reportedly based in Palestine."
                          Perhaps I shouldn't assume that she is aware that
                          there currently is no nation or state in the world
                          by that name, which was last used by the British during
                          the Mandate period?
                          A quick Internet search revealed that Earth Station
                          5 is located in Jenin, one of the autonomous areas
                          under the control of the Palestinian Authority, which would
                          have been a more accurate way to describe the location.
                          As a sophisticated journalist, she is surely aware
                          of the significance of names and of the importance
                          of accuracy. And to give her the benefit of the doubt,
                          I will assume the reference was an oversight and not
                          intentional, for to inject one's own politics into
                          a professional journal article is most unfortunate,
                          and unprofessional, as I'm sure you'd agree.
                          Thank you for your time and consideration.
                          Glenn Ferdman
                          Director, Asher Library
                          Spertus Institute of Jewish Studies
                          Chicago
                                                                                                  
                                                 