SRCH2: Out-Googling Google
by Stephen E. Arnold
Have you taken Samuru for a search test-drive? What about Ark, Q-Sensei, Shodan, or YaCy?
With Google dominating search, it is fair to ask whether there is room for other search systems. blekko, DuckDuckGo, and Yandex have captured more media attention than some of the search newcomers. If I include companies positioning their technology as discovery, there’s another layer of options, such as Agilex, Connotate, Palantir, Recorded Future, and Visual Analytics.
There is a perception that search and related disciplines such as content processing, discovery, and predictive analytics are stagnating. It’s true that Google has a grip on the popular imagination. The company is even the star of a motion picture; Hollywood actors (Vince Vaughn and Owen Wilson) play salesmen who land internships at Google in 20th Century Fox’s The Internship. And yet search is still a bustling discipline.
In Spring 2013, an aggressive public relations and marketing campaign pushed SRCH2 into my overflight monitoring system. The Next Web’s coverage of SRCH2 stressed that the firm was founded by former Google professionals. Investors in the company include a who’s who of savvy groups, including Data Collective, Horizon Ventures, TenOneTen Ventures (Gil Elbaz’s company), Clark Landry (SHIFT), Taher Haveliwala (Kalfix Corp.), as well as some high-profile angels. “SRCH2’s goal is to make enterprise search ‘instant’,” according to The Next Web. We know that snappy response time can be difficult in web and enterprise settings; network bandwidth, computing resources, and inefficient software can make some search systems trigger high blood pressure among users.
A Killer Combination
SRCH2 reports it is ready for prime time. Intrigued, I talked to the founders about the company and interviewed Chen Li, a graduate of Stanford University and a former Google employee, in May 2013.
Chen Li explained the problem he wanted to solve:
In many ways, Google’s unveiling of “Instant Search” in 2010 provided consumers with a powerful example of what search could do for them. But extending that idea out to different internal and external corporate search boxes, on a wide variety of platforms and devices, has been difficult. Customizing existing search tools to deliver some version of “rich search” functionality has proven to produce an end-product which is complex, brittle, and slow. SRCH2 offers clear differentiation when you also consider complexity and time to market. When you add in-memory performance to this, SRCH2 offers a killer combination for these use cases.
The idea is that SRCH2 is purposefully built to deliver a Google-style solution to enterprise licensees. Google tried to put “Google in a box” with its Google Search Appliance. Though the appliances are still available, the features have lagged behind what Google offers to its consumer customers.
The ideas that infuse SRCH2’s system have been percolating for 10-plus years. Chen Li first published work in the field while he was a Ph.D. candidate at Stanford more than a decade ago. His Ph.D. advisor was Jeffrey Ullman, who is now an investor and advisor at SRCH2. Chen Li’s own research focused on the challenges around fast access to data. In 2012, the ACM Special Interest Group on Management of Data honored him with its Test of Time Award, and, in early May 2013, Chen Li received the 10-Year Best Paper award from DASFAA (Database Systems for Advanced Applications) in Wuhan, China. Boiling down SRCH2’s research efforts is difficult. “We have been investing in research related to databases, search, and big data,” according to Chen Li, who was guarded about the licensees of SRCH2’s technology and noted the following:
SRCH2 clients are using it in a wide variety of contexts and devices. One of the things we’re very proud of is the breadth of uses and use cases. A major global handset manufacturer is porting it to the kernel, across millions of handsets. Uses and platforms include mobile, e-commerce, social, and alternative devices, like cable set-top boxes.
I surmised that SRCH2’s handset partner is HTC Corp., a Taiwan company producing Android and Windows phones. HTC is experiencing some challenges, including management turmoil and staff turnover, according to a May 2013 article in TechCrunch. SRCH2’s caution is appropriate.
Chen Li noted that there are specific reasons for this wide variety of end-use cases. “First is the relative compactness of SRCH2’s binary and index,” he says. “Second, and equally important, is the configurability of the engine. Using SRCH2, a developer can configure error tolerance by edit-distance. She can prioritize within relevant type forward results by profitability metrics, or by location, or any other metric. An e-commerce retailer might use this to hash results toward items in stock, or nearby, or related metric,” he continues. Chen Li says that while SRCH2 is handing out the tools, the answers are in the hands of the great, forward-thinking developers who use it.
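To make the two knobs Chen Li describes more concrete, here is a minimal Python sketch of error tolerance by edit distance combined with a developer-supplied ranking metric. It is an illustration only and not SRCH2’s code or API: the Item fields, the search function, and the max_edits and rank_key parameters are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical catalog record; the field names are assumptions for this sketch.
    name: str
    in_stock: bool
    margin: float  # stand-in for a "profitability metric"


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def search(query, items, max_edits=1, rank_key=None):
    """Keep items whose name is within max_edits of the query,
    then order them by the caller's metric (higher values first)."""
    q = query.lower()
    hits = [it for it in items if edit_distance(q, it.name.lower()) <= max_edits]
    if rank_key is None:
        return hits
    return sorted(hits, key=rank_key, reverse=True)


catalog = [
    Item("tablet", in_stock=False, margin=0.30),
    Item("tables", in_stock=True, margin=0.10),
    Item("cables", in_stock=True, margin=0.50),
]

# A retailer that prefers in-stock items first, then higher margin.
results = search("tablet", catalog, max_edits=1,
                 rank_key=lambda it: (it.in_stock, it.margin))
print([it.name for it in results])
```

In this sketch the tolerance and the ranking metric are independent settings, which is the point of the quote: the engine decides what matches, and the developer decides what comes first.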
The Speed of Search
The University of California–Irvine uses the SRCH2 system. A query results screen displays hits matching the user’s query.
A key innovation in the SRCH2 method concerns the speed with which content can be processed and then accessed to generate a response to a user’s or a subsystem’s query. Speed, particularly in mobile applications, is essential. Latency can drag down response time, so a search system must make use of predictive or “smart” caching, as well as highly optimized algorithms. There are computational boundaries that limit what some search vendors can do with their numerical recipes. Adding efficiency to basic methods can make a significant competitive difference. SRCH2, like Google, knows that speed is often more important than some other considerations. Iterative processes that attempt to understand the meaning of a particular query can cause unacceptable delays in a search system, as Chen Li explains:
Type forward coupled with error correction offers users the chance to minimize character entry on a touch screen. The fat finger problem is just a universal pain point in mobile, and the topic comes up often when we talk with clients in the handset business. People mistype on their phones, a lot, and SRCH2 minimizes the pain by generating great, relevant type forward results and tolerating their typos. Finally, this feature also proves quite valuable in non-traditional contexts. Many of our users talk about the problem they have when ordering a movie or TV show using search on a set-top box. They’re using remote control devices to enter characters by pointing. Quite painful. And SRCH2 can help with that.
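One common way to combine type forward with typo tolerance is a prefix edit distance: the smallest number of edits needed to turn what the user has typed so far into some prefix of a candidate title. The Python sketch below shows that general technique under stated assumptions; it is not a description of SRCH2’s internal algorithm, and the function names, the max_edits parameter, and the sample titles are hypothetical.

```python
def prefix_edit_distance(typed: str, candidate: str) -> int:
    """Fewest edits turning `typed` into any prefix of `candidate`."""
    # prev[j] = distance between the typed text seen so far and candidate[:j]
    prev = list(range(len(candidate) + 1))
    for i, ct in enumerate(typed, start=1):
        curr = [i]
        for j, cc in enumerate(candidate, start=1):
            cost = 0 if ct == cc else 1
            curr.append(min(prev[j] + 1,          # drop a typed character
                            curr[j - 1] + 1,      # add a missing character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    # The last row covers every prefix of the candidate; take the best one.
    return min(prev)


def suggest(typed, titles, max_edits=1):
    """Return titles whose beginning matches the keystrokes within max_edits,
    closest matches first and alphabetical within a tie."""
    t = typed.lower()
    scored = [(prefix_edit_distance(t, title.lower()), title) for title in titles]
    return [title for dist, title in sorted(scored) if dist <= max_edits]


titles = ["star wars", "star trek", "stargate", "superman"]

# "stsr" is a fat-finger version of "star": one substitution away.
print(suggest("stsr", titles))
```

Because only a prefix has to match, a viewer entering characters with a remote control gets useful suggestions after three or four keystrokes, even when one of them is wrong.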
For organizations that want to use the SRCH2 technology, the company follows the methods that Google, HP’s Autonomy, Microsoft, and Oracle Endeca Commerce use. A licensee downloads the SRCH2 software, configures the engine on the data, starts the SRCH2 engine, and develops the front-end user interface.
Licensees use SRCH2’s RESTful API to do search queries and update data. With SRCH2, licensees are able to provide a very powerful search interface with many Google-like features. What struck me was the simplicity of configuring the system; the goal is to deliver a “launch in an afternoon” solution.
SRCH2 is more like the original Google Search Appliance approach. The complexity that has been imposed on the market has been “stifling,” from Chen Li’s view. “We want to free people up to work on improving the experience, which is an iterative process which can even begin after launch,” he says. “Imagine launching in a day, then spending the rest of your time figuring out what configuration of results is best for your particular needs in your particular industry.”
Search-based applications have become more important than brute force search and retrieval in many organizations. Chen Li explained the process of embedding search in other enterprise applications:
SRCH2 believes end-users and search integrators should be able to tap search within their applications, broadly. And the applications can be both software, and hardware embedded applications, like GPS devices. The limit should not be imposed by the search software. One factor which is limiting the size and scope of the search market is the limit imposed by the actual physical size of the search software and search index itself. We feel that in a market where the cost of memory is shrinking rapidly, where devices are getting smarter, and where data-driven apps and uses are exploding, search software has to also be accessible. For our part, we’re developing to keep ahead of these trends by constantly focusing on simplicity, elegance, and configurability. Put tools into the hands of application developers, and let them decide how to implement great search.
A Place for Search
I was curious about where search would fit in the world that is evolving around those looking for information: “Mobile applications, location-based services, and cloud-based search,” according to Chen Li. “The current search tools and services, including Google and several open source solutions, are not optimized for such applications and technologies. The shortcoming lies far deeper than product design. It lies in the very foundational algorithms.” He says that SRCH2’s search engine is built on the world’s leading research concerning the type of search that is uniquely tooled for mobile, location, and cloud applications. “We don’t merely respond to these areas,” he says. “We lead.” He says the higher and tougher the demands from new applications, the more advantage SRCH2’s search technology would have over other solutions.
Search and retrieval as a discipline presents a paradox. On one hand, the methods used by many vendors are not substantially different from those in use for many years. For example, Google is more than a decade old. Autonomy and Endeca are more modern than the ISYS Search Software, now owned by Lexmark. ISYS’s technology dates from the 1980s. On the other hand, the volume of audio, image, and video content keeps growing. Searching for nontext objects is less advanced than processing traditional text.
Some search vendors have repositioned themselves to serve niche markets. It is too early to determine if Coveo’s shift to customer support or Digital Reasoning’s pursuit of the financial analytics market will pay off in revenues and profits. Other vendors have shifted their business from dependence on one system to emerge as a platform upon which applications can be built. BA Insight and Dassault Systèmes’ Exalead are taking this approach. The jury is still out on whether these strategic shifts will pay off.
SRCH2 is a thoroughbred in the Search Derby. At the Kentucky Derby this year, a veteran of the betting windows told me, “Any horse can win unless there is a second entry.” SRCH2 is off to the races.