ON THE NET
blekko: A New Search Approach
By Greg R. Notess
The search engine marketplace has been contracting for the past few years. Google’s domination equates to vastly higher barriers to entry for new search engine companies. One company, blekko (www.blekko.com), hopes to break into this difficult market. Despite its rather unpleasant name, blekko takes a different approach to web search, offering slashtags and many options for the data-hungry searcher. The use of slashtags, user input, and open ranking statistics all make blekko a significant information resource.
Marydee Ojala wrote an informative overview in an Information Today, Inc. NewsBreak (http://snipr.com/1xaoyb) describing blekko’s features after it launched to the public on Nov. 1, 2010. However, in the past 3 months, it has already expanded with new features. Will blekko suddenly become everyone’s first-choice, go-to search engine for popular and obscure topics? No. But for professional searchers, data geeks, and search engine marketers, blekko offers rich sources of data about searching, ranking, and webpages that make it a fascinating destination site.
Bill of Rights
The company takes a bold stance with its Web Search Bill of Rights. While this builds on ideals espoused by other new search attempts (such as Wikia Search, which is now defunct), it takes obvious aim at the closed algorithmic approaches of Google and Bing, not to mention a direct hit on their personal data collection practices.
1. Search shall be open
2. Search results shall involve people
3. Ranking data shall not be kept secret
4. Web data shall be readily available
5. There is no one-size-fits-all for search
6. Advanced search shall be accessible
7. Search engine tools shall be open to all
8. Search and community go hand-in-hand
9. Spam does not belong in search results
10. Privacy of searchers shall not be violated
It is nice PR, but how many people really care about the principles of a web search engine? The contrast between blekko and the market leaders lies much more in the way they operate.
The main web search engines, well exemplified by Google, have long since moved beyond just a single database of webpages. With image, video, news, and other specialized databases now integrated into search results, you may find a purely textual results set of webpages a bit surprising—even a bit retro. To get to image, video, and news content through blekko, you must choose or enter a slashtag. Contrast that with the experience at Bing or Google that integrates results from other databases into an initial search results set.
Compare, for example, a search on that overly cute, diminutive primate, the tarsier. A search at blekko gives 20 textual results, starting with the ever-dominant Wikipedia entry. Search tarsier at Bing, and four videos appear at the top of the results (and just mouse over any of them to get a preview of the video) while seven images appear near the bottom. Google puts its seven images at the top, even prior to the Wikipedia text link, and shows snapshots from two videos after the fourth text result. Both Bing and Google list related searches.
With a search on a news topic such as congress, both Google and Bing have several headline news stories linked and embedded in the results. Again, blekko returns just its 20 web results. Local information, quick answers, and embedded map links are available at Bing, Google, and Yahoo! for many searches but are noticeably absent at blekko.
In some ways, blekko harkens back to the days of 10 blue links (even though 20 are displayed). Remember, the search engine is still in beta, and it is beginning to develop some of the features long trumpeted by its older, bigger cousins. You can find flight status (DL 12 shows the flight details for Delta Flight 12), and entering a UPS tracking number results in a quick answer link pointing directly to “Track UPS package” for the number entered. My main point in these quick comparisons is that when average users (are there such creatures?) try blekko for a search or two, I doubt that there will be much to keep them coming back. Then again, it is unlikely that Avery Average will even hear about or try blekko, since only above-average users read columns such as this one.
It is for just such users that blekko really has much to offer, with its ability to turn and tune search results with a wide assortment of slashtags. It offers huge amounts of data about searching and enables customization and social graph connections. With blekko’s unique approaches and advanced features, it is easy to write at length about its many options. Unfortunately, it is just that complexity and abundance of options that can get in the way of effective and broad use of its capabilities. So I will take a more detailed look at a few of its capabilities that I think will be of most interest to information professionals (or at least to me in my work environment).
Calling itself “slashtag search,” blekko repeatedly emphasizes its uses of slashtags. In a nutshell, slashtags are a way to limit results. They go in the search box, usually after the query terms, so that a search can be quickly targeted to a particular topic or approach. According to Rich Skrenta, blekko’s CEO, “A slashtag is a tool to filter search results. Rather than searching the entire Web, a slashtag allows you to search just the sites you want searched.”
The slashtag approach is easiest to understand by trying some examples. These do seem to follow item No. 6 in blekko’s bill of rights about making advanced search accessible. As blekko likes to point out, these types of searches could be difficult to achieve with other search engines.
Try "family values" /liberal to search for references to the phrase “family values” (and yes, blekko supports the quotation marks to identify phrases) limited to sources identified as liberal. Similarly, a search on “public option” /conservative can be used to search blekko-identified conservative websites for pages discussing the phrase “public option.” The slashtags on both of these can then be reversed to find opinions from the other side of the ever-sharpening political divide.
Looking at topical slashtags, try a general search such as cookies /glutenfreeblogs that searches a collection of 125 sites focused on a gluten-free diet. On the scientific side is a search such as greatest stars /physics with results coming from 65 well-known physics sites. Contrast those results to greatest stars /gossip for an example of the ways that use of slashtags can completely change the results set.
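For readers who like to see the pattern spelled out, the query syntax in the examples above can be sketched in a few lines of Python. This is only an illustration, not blekko's own parser: it assumes that any whitespace-separated token beginning with a slash is a slashtag, and the function names split_slashtags and swap_slashtag are hypothetical.

```python
def split_slashtags(query):
    """Split a query into (search terms, list of slashtag names).

    Assumption: a slashtag is any whitespace-separated token that
    starts with "/" and has at least one character after it.
    Runs of whitespace in the terms are collapsed to single spaces.
    """
    terms, tags = [], []
    for token in query.split():
        if token.startswith("/") and len(token) > 1:
            tags.append(token[1:])
        else:
            terms.append(token)
    return " ".join(terms), tags


def swap_slashtag(query, old, new):
    """Return the query with slashtag `old` replaced by `new`."""
    terms, tags = split_slashtags(query)
    tags = [new if tag == old else tag for tag in tags]
    parts = ([terms] if terms else []) + ["/" + tag for tag in tags]
    return " ".join(parts)


if __name__ == "__main__":
    print(split_slashtags("greatest stars /physics"))
    print(swap_slashtag('"public option" /conservative',
                        "conservative", "liberal"))
```

Swapping /conservative for /liberal in this way mirrors the reversal described earlier for the “family values” and “public option” searches; the search terms stay the same and only the filter changes.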
The people slashtag (try something such as “patent law” /people) is a built-in option that limits results to “pages associated with a person.” It seems that the identification of people-oriented sites works well, at least for the top results. The ranking, however, seems to be based more on web presence and links than on professional reputation.
Speaking of links, there is also a /links slashtag, one of the many data and command slashtags. While blekko uses the slashtag for many of its functions, it does not always require the searcher to manually enter the slashtag. For links to a specific URL, entering www.loc.gov /links works, but so does finding the Library of Congress site in a search results list and then clicking on the small, blue “links” link, which is the third option listed just under the title.
For the professional searcher who lost access to Yahoo!’s extensive link searching when Yahoo! moved to Bing’s database (even though Yahoo! Site Explorer still offers limited link searching), blekko adds another source for looking at link patterns on the web. While I have not yet found a way to do combined link searching, such as excluding links from a specific domain or a set of domains, searchers can at least find the pages that blekko has identified as linking to a specific URL.
The “links” link is just one of the several data and command options listed on the line below the title of each result. The others include like (for logged-in users to add the site to a slashtag or link to a Facebook like; initially called “tag”); seo (detailed data I’ll describe next); cache (yes, blekko also caches copies of pages); ip (looks for other sites with the same numeric IP address, which can find interesting relationships between sites or disparate sites hosted on a shared server); info (produces a pop-up with brief descriptive information on the site or topic and links to a Wikipedia article); and spam (for logged-in users to remove results from that host for all of their own future searches).
What is available in the seo (search engine optimization) data? It brings up a whole collection of information, both for the specific URL and for the whole domain. This includes inbound links, crawl statistics, pages on the site, ranking scores, a compare function to compare that page with another one, and a “duplicate content” feature that tries to identify other domains that have exactly the same or very similar textual content. The last of these can be a great tool to explore intellectual property theft and investigate disparate sites that may be getting the same textual information but do not otherwise appear to be connected. Other links show the page source and let users visualize the differences among up to four URLs with charts showing links, rank, and crawl statistics.
When exploring blekko, be sure to mouse over or click on links and icons, especially when you are not sure what they might do. At the top of the results page, for example, next to the /relevance and /date sort options is a small yellow star icon that links to detailed ranking data for the search. (Or just enter /rank in a search query to go directly to the ranking data.) This data shows where search terms (or variants—another fascinating look into the alternate terms that the search engine also searches) occur on the pages found. It highlights domains, anchor, title, and URLs and then gives a final ranking score for the search. Even more detailed criteria can be seen (such as language, redirection, and other as-yet-unexplained criteria) by clicking the “more detail” link near the top.
Another interesting collection of data available from blekko is the ability to see what sites share the same Google AdSense code. Presumably, those sites are owned by the same person or organization. Within the search results, look for an “adsense” link next to or in place of the info link. Clicking on it will reveal the AdSense number, and then you can find other pages using that same account to display Google-run AdSense advertising on other hosts and pages. The slashtag of /adsense=number works as well. This data has been available at sites such as Reverse Internet (www.reverseinternet.com), but blekko provides yet another source and can turn up additional sites.
Recent Slashtag Approaches
Beyond a data-rich search environment, blekko has been exploring other new approaches to search, including the automatic application of slashtags and entering into the social graph of Facebook. Recognizing, I suppose, that training searchers to use slashtags on a regular basis will be a Sisyphean task, blekko automatically applies some slashtags. Since the launch, clicking some of the data links, the ranking choices, or the suggestions on the left would all automatically apply a slashtag. More recently, any query identified as a health-related query automatically has the /health slashtag added. At least users can choose to broaden their search from the (as of now) 77 approved /health sites by clicking on the message stating something such as, “Showing results for ‘headaches /health.’ Show web results for ‘headaches?’” or just by adding /web to the search.
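The automatic slashtag behavior can be pictured with a short Python sketch. It rests on stated assumptions rather than on anything blekko has published: the keyword set HEALTH_WORDS and the function apply_auto_slashtag are hypothetical, and the rule shown (add /health only when the searcher typed no slashtag of their own, so that an explicit /web is left alone) is a simplification of what the column describes.

```python
# Hypothetical keyword list for illustration only. How blekko actually
# decides that a query is health-related is not described in the column.
HEALTH_WORDS = {"headaches", "migraine", "flu"}


def apply_auto_slashtag(query):
    """Append /health to a health-looking query that has no slashtag.

    Assumption: a slashtag is any whitespace-separated token that
    starts with "/". If the searcher already supplied one (including
    /web), the query is returned with its tokens unchanged.
    """
    tokens = query.split()
    has_slashtag = any(t.startswith("/") and len(t) > 1 for t in tokens)
    if has_slashtag:
        return " ".join(tokens)
    if any(t.lower().strip('"') in HEALTH_WORDS for t in tokens):
        return " ".join(tokens + ["/health"])
    return " ".join(tokens)


if __name__ == "__main__":
    for q in ("headaches", "headaches /web", "tarsier"):
        print(q, "->", apply_auto_slashtag(q))
```

Under these assumptions, a bare search on headaches gains the /health limit, while headaches /web and tarsier pass through untouched.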
With Facebook Connect, blekko users can connect to their Facebook accounts and use the new /likes slashtag. By tapping into what Facebook calls its social graph, blekko searchers can find pages from sites “liked” by Facebook friends. Look for the /likes filter with the Facebook logo next to it on a blekko results page. Click on that, and log in to your Facebook account (and you must also grant blekko permission to access your Facebook information). Then use the /likes limit to see sites that “you and your friends have liked via the Facebook like button.” blekko does note that “only you can see and search your /likes slashtag.” It is a fascinating approach to finding out more about your friends in your own social graph and their interests, or at least the interests of those who actively use the Facebook like button.
While I earlier contrasted blekko with the major search engines in how it fails to integrate image, video, news, and other database content into regular results, that does not mean that blekko has no access to such results. The company has partnered with others to supplement its own web crawl. Try a blekko search on tarsier /images, and a collection of image results “powered by Bing” appears. Similarly, a search of tarsier /video gets results “powered by YouTube,” and biloxi /map displays a Bing map. If blekko finds no results or very few results, it pulls “additional web results” from Yahoo! (which is powered by Bing). More external searches are available from third-party API slashtags (www.blekko.com/tag/show). There is also a list of topics slashtags and a searchable database of user-built slashtags.
I am finding that, rather than using blekko to search the web and quickly jump to destination sites, I have been using blekko to research the web: to explore the data from Facebook friends’ likes, link patterns, and ranking data. I spend more time on blekko’s site than I do using its results. Others have praised blekko for allowing users to create slashtags that essentially set their own limits. It’s certainly another reason to take time to learn blekko’s capabilities. It has plenty to offer. Don’t feel the need to learn everything, but explore blekko to see what it might help with in your work environment.