ONLINE Magazine
THE LEADING MAGAZINE FOR 
INFORMATION PROFESSIONALS 
VOLUME 26 NUMBER 5 SEPTEMBER/OCTOBER 2002 
ON THE NET  
The Blog Realm: 
News Sources, Searching with Daypop, and Content Management
by Greg R. Notess
Reference Librarian, Montana State University

Like to browse, surf, wander, and explore the Web? It is easy to find many interesting nooks and crannies during the process, yet remembering the URLs for each is a daunting task. Bookmark lists and favorites can help, but too often they just expand too quickly, soon becoming unmanageable. And they are individual, designed for a single user to keep track of items, rather than a tool for adding and sharing notes and commentary with the entire Net community. 

Enter the Web log. Quickly conjugated to "Weblog," the shift of a space makes "we blog," and the shortened version is "blog." It has become the "in" technology of the moment on the Net. While at first glance, a blog may appear to be little more than an online diary of oft-uninteresting personal opinion and seemingly random links, there is much more to the blogging world of which the information professional should be aware.

Despite the many purely personal-focused blogs and opinionated pontificating of others, Weblogs offer access to breaking news, rumors, evaluations, and other information that might not otherwise be readily available from our traditional databases. Above and beyond their information value, the software for creating blogs is basic content management software, and it can fulfill purposes well beyond the keeping of an online diary.

WHAT'S A BLOG?

The world of Weblogs is a strange one that many an information professional may appropriately choose to ignore. Blogs are as varied and diverse as their creators. Typically, a blog is a Web site with frequent, dated entries listed in reverse chronological order. The entries have links and commentary and often an opportunity for others to comment.

Many blogs have a high link-to-word ratio, so reading a blog can involve frequent meandering to other pages to fully understand the references and comments. Depending on the author, a blog can function as a teen's diary, a political pundit's soapbox, or even simply as one person's view of the news.

For a more detailed introduction to blogs, see Darlene Fichter's "Internet Librarian" column, "Blogging Your Life Away," in the May/June 2001 ONLINE. She covers the history of blogs, usage on an intranet, and basics about the software. She also lists several library-oriented blogs. Or just explore a few on your own such as www.theshiftedlibrarian.com.

EXPLORING THE BLOGOSPHERE

A quick look at a few Weblogs may leave you rather unimpressed and uninterested in the whole blogging phenomenon, especially if the bloggers' interests do not match yours. Why would a busy information professional want to spend time reading the thoughts and opinions of someone who seemingly only spends time reading news, linking to it, and commenting on it? With thousands of blogs online and new comments added at almost every moment, a blog-reading addict can quickly find that there is little time left for anything else.

But the profusion of blogs allows the Internet community to provide an interesting and very up-to-the-moment news source. Rumors and inside information along with blatant errors fill the Blogosphere. Yet when you need current information about a just-breaking topic, or would like some commentary on recently released software or Web sites, check the blogs. For every hundred blogs of absolutely no interest to you, there will be one or two with top-notch information and commentary.

The problem, of course, is how to find a relevant blog, especially when you are on a deadline. Check the Open Directory listings [www.dmoz.org/Computers/Internet/On_the_Web/Weblogs/] to try to find a few of interest. But the real need is for searchable access to the blog content. Fortunately, Daypop provides that opportunity.

DAYPOP FOR SEARCHING BLOGS

Daypop is a specialty search engine that crawls and indexes only Weblogs and news sites. It does not try to get every blog out there. Instead it focuses on what it calls the best of the blogs and news sites. Even so, it covers over 7,500 of them and refreshes its entire index once or more per day. The news sites include both English and non-English language sources.

Daypop is a great search engine for getting news from beyond the traditional media. While it does not have sophisticated advanced search features, it does offer a few important options. Its search defaults to a Boolean AND and supports phrase searching with quotation marks. Use a + to force a stop word search or a - to exclude terms. No full Boolean searching or OR searching is available.
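
To make the query rules concrete, here is a minimal Python sketch of how an engine with this syntax might interpret a search: terms are ANDed by default, quotation marks mark a phrase, and a leading - excludes a term. It is an illustration only, not Daypop's code. The function names parse_query and matches are invented for this example, and the sketch treats a leading + as simply "required" rather than modeling a stop word list.

```python
import re

# Hypothetical illustration of the query syntax described above;
# this is NOT Daypop's actual implementation.
TOKEN = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')


def parse_query(query):
    """Split a query into required and excluded terms or phrases."""
    required, excluded = [], []
    for sign, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        if sign == "-":
            excluded.append(term)
        else:
            # Plain terms and +terms are both required (implicit AND).
            required.append(term)
    return required, excluded


def contains(text, term):
    """True if the term or phrase appears in text as whole words."""
    return re.search(r"\b" + re.escape(term) + r"\b", text) is not None


def matches(text, query):
    """Every required term must appear; no excluded term may appear."""
    text = text.lower()
    required, excluded = parse_query(query)
    return (all(contains(text, t) for t in required)
            and not any(contains(text, t) for t in excluded))
```

With this sketch, the query "content management" blog -diary would match a page that contains the exact phrase and the word blog, but would reject the same page if it also mentioned a diary. Because there is no OR, there is no way to ask for either of two terms in a single search.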

Daypop has an advanced search page, but both the basic and advanced offer four content-type limits: the default news and blogs, just news, just blogs, or RSS headlines. The advanced search also has language limits, country limits, and the choice of how many results to display.

Daypop takes after Google in several ways. The search results use a keyword-in-context display, have a link to a cached copy of the page, and include the size of the page. The results are labeled with an N for the news sources and a W for Weblogs. The blog hits also have a link to "citations," which then finds other blogs that link to the original hit.

The ability to browse sideways with the "citations" link to see which other blogs are providing commentary is one way to use Daypop to get more than a single blog's viewpoint. Daypop also has a couple of special pages that use their own link analysis to identify top interests in the indexed blogging community. Their "Top 40" page is a ranked list of links that are most frequently linked by bloggers in their daily section (not just anywhere on their Web pages). The page even offers graphic depictions of the rise and decline of each link's popularity. A similar "Top News" page shows the most-linked-to news stories.
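
The idea behind both the "citations" link and the "Top 40" page can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Daypop's method: the input format (a dictionary mapping each blog to the URLs found in its daily entries) and the function names citations and top_links are hypothetical, and the sketch counts each blog only once per link.

```python
from collections import Counter

# Hypothetical sketch of link analysis across a set of blogs;
# the data layout and function names are assumptions for illustration.


def citations(blog_links, url):
    """Return the blogs whose daily entries link to the given URL."""
    return sorted(blog for blog, links in blog_links.items() if url in links)


def top_links(blog_links, n=40):
    """Rank URLs by how many different blogs link to them."""
    counts = Counter()
    for links in blog_links.values():
        # set() ensures a blog that repeats a link is counted once.
        counts.update(set(links))
    return counts.most_common(n)


sample = {
    "blog-a": ["http://example.com/story", "http://example.com/gadget"],
    "blog-b": ["http://example.com/story"],
    "blog-c": ["http://example.com/story", "http://example.com/gadget",
               "http://example.com/story"],
}
```

Here top_links(sample, 2) ranks the story link first with three linking blogs and the gadget link second with two. Tracking those counts day by day would give the rise-and-decline graphs that the Top 40 page displays.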

SEARCH ISSUES

Blog writers have certain areas of common interest. So blog searching with Daypop can be an effective tool for finding rumors and controversy on current news topics and political stories. Information technology topics like Linux, cascading style sheets, blogs of course, content syndication, the latest technology gadgets, and Microsoft are popular. Do not expect to find the latest in geological research, geriatric psychology, or in-depth industry analysis. On the other hand, Daypop is a great source to troll for hints, rumors, inside information, and consumer reaction to specific businesses and products.

Beyond Daypop, blogs are indexed in the regular search engines. Google even tries to identify blogs and similar sites with frequently updated content for near-daily crawls. Even so, most general search engines do not include the most current content from a blog. Even Daypop is not completely up-to-date, but it comes the closest. However, due to the design of most blogs, Daypop, Google, and the other search engines have additional difficulties with indexing them.

Blogs consist of many postings, typically featured prominently in the center of the page. Because the postings are organized in reverse chronological order, the most recent entry is displayed at the top. To keep the page manageable, older posts are shuttled off the front page and into an archive section. With some bloggers so active and prolific that they post dozens of entries per day, the posting that was indexed yesterday may now already be at the URL of the archive, even though the search engine still points to the front page. Even if it points to the correct URL, it may be difficult to quickly find exactly where on the page the section of interest is.

Also note that unlike traditional online news databases, the search engines and Daypop do not index each entry separately. Instead, the words on the page are indexed as they appear at the time the search engine visits the page. So be prepared for a bit of digging to get from the search engine link to the content you want. One way around this problem at Daypop is to choose to search the RSS headlines that do provide entry-level indexing; unfortunately, the RSS headlines are not as complete as the other kinds of searching.
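
The difference between page-level and entry-level indexing is easy to see in a small Python sketch. The URLs and entry text below are made up for illustration, and build_index is a hypothetical helper that stands in for what a real crawler does on a far larger scale.

```python
from collections import defaultdict

# Hypothetical sketch: the same two postings indexed two different ways.


def build_index(documents):
    """Map each lowercase word to the set of document URLs containing it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


# Two invented postings, each with its own permanent archive URL.
entries = {
    "/archive/001": "Daypop adds RSS headline search",
    "/archive/002": "New Linux kernel released",
}

# Page-level: the crawler sees the whole front page as one document.
page_index = build_index({"/": " ".join(entries.values())})

# Entry-level: each posting is indexed under its own URL.
entry_index = build_index(entries)
```

A search for linux in page_index leads only to the front page "/", which may no longer show that posting by the time you click. The same search in entry_index leads straight to "/archive/002".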

CONTENT MANAGEMENT

OK, you have explored the Blogosphere, tried Daypop, read more vehement opinions than percolate around the water cooler, and you are still unimpressed? Despite the hype from bloggers and journalists about Weblogs becoming the new media, they are nowhere near that level yet, and many information professionals can safely ignore the whole phenomenon. But do not turn your back on the world of Weblogs completely until you also consider the content management angle.

Part of the popularity of Weblogging is due to the simplicity of creating one. Blog software is easy to use and may even come bundled with free hosting of the blog. No HTML, scripting, direct Web editing, XML, RSS, or CSS knowledge is required. Simply run through the setup, and then start creating entries. One or more people can be given posting privileges. They just log into a special Web page, and then type in (or copy and paste) their content. Posting it is as simple as clicking a submit button.

The software automatically formats and posts the entry. It also automatically archives older ones on separate pages. If categories are used in the creation of entries, the software can also create subject-specific archives based on the keywords used. The site design can be edited within the blog software using pre-defined settings or more sophisticated redesigns.
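
What the software does behind that submit button can be outlined in a short Python sketch. It is a simplified illustration, not the workings of any particular blogging product: the publish function, the entry fields, and the choice of monthly archives are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sketch of a blog tool's publishing step.


def publish(entries, front_page_size=2):
    """Split entries into a front page, monthly archives, and category lists."""
    # Newest first, as on a typical blog.
    ordered = sorted(entries, key=lambda e: e["date"], reverse=True)
    front_page = ordered[:front_page_size]

    # Older entries move off the front page into archives keyed by month.
    archives = defaultdict(list)
    for entry in ordered[front_page_size:]:
        archives[entry["date"].strftime("%Y-%m")].append(entry)

    # Every entry is also listed under each of its categories.
    categories = defaultdict(list)
    for entry in ordered:
        for name in entry.get("categories", []):
            categories[name].append(entry)

    return front_page, dict(archives), dict(categories)


posts = [
    {"title": "Summer hours", "date": date(2002, 7, 30), "categories": ["news"]},
    {"title": "Incoming titles", "date": date(2002, 9, 2), "categories": ["books"]},
    {"title": "New database trial", "date": date(2002, 8, 15),
     "categories": ["news", "books"]},
]
front_page, archives, categories = publish(posts)
```

With these three sample posts, the two newest stay on the front page, the July post moves to a 2002-07 archive, and the news and books lists each hold two entries. The person posting supplies only the text and the categories; everything else is generated.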

And while the intent of the blogging software is to create a Weblog, it can be used for many other content management needs. Especially for those without a full content management system, the blog software can be an opportunity to get more people involved in posting content on a Web site. Use it for maintaining a news page, a What's New section, librarian's favorite books, incoming titles, or any other periodically updated page. It can even be used for more static sections on a site.

Weblogs are a fascinating Net development. They have been around for years. If, like me, you have avoided or dismissed them in the past, take another look. With the search capability of Daypop and the content management capabilities of the blogging software, Weblogs are worth considering in a new light.


Greg R. Notess (greg@notess.com; www.notess.com/) is a reference librarian at Montana State University and founder of SearchEngineShowdown.com.

Comments? Email the editor at marydee@infotoday.com.

