Information Today
Volume 18, Issue 10 — November 2001
IT Interview
Sentius Corp.'s RichLink Embeds Content in Context
CEO and founder Marc Bookman talks about his company's technology and markets
by Paula J. Hane

Sentius Corp. is the developer of RichLink, a patented database linking and embedding technology that automatically adds content to Web sites. RichLink embeds layers of relevant content that pops up at the reader's request in a window called a "Knowledge Burst." The company recently launched its e-publishing suite, which is designed to increase advertising and licensing revenues for publishers. I talked with Marc Bookman, Sentius Corp.'s CEO and founder, about the company's technology, markets, and plans for the future.

Q Content enrichment or enhancement on Web sites is usually a manual process done by editors who code or write annotations and decide upon and provide links. Let's talk about the concept of "automated content enrichment," for readers who might not be familiar with this technology.

A The goal of automatic content enrichment is to enable publishers to rapidly layer additional content and functionality into their documents in a highly scalable manner. The enrichment focuses on helping readers to understand, decide, and act more effectively on that information. One example of content enrichment is enabling English documents with foreign-language assistance to make non-native readers proficient. Another example would be to enable a customer-support person to read a technical document and answer customer inquiries more quickly. Ultimately, Web sites should be able to publish documents to a worldwide audience with the best references based on the language, proficiency, and relationship that they have with the end-user.

The process starts with an analysis of the HTML or XML document to determine the important parts of the page, and then automates and executes the rules for linking. Following this structural analysis, we do a linguistic analysis of words, sentences, and terms. Though it may sound simple, it's quite complex. We've invested tens of thousands of man-hours in the technology to make this work.
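To make the two-stage idea concrete, here is a minimal sketch of structural analysis followed by term linking. It is an illustration only and not Sentius's implementation: the glossary, the `enrich` function, the `burst` class name, and the choice of tags to skip are all assumptions made for this example. It also stands in a plain case-insensitive match for the linguistic analysis the interview describes, so it does no stemming or sentence parsing.

```python
import html
import re

# Hypothetical glossary standing in for a licensed reference database.
GLOSSARY = {
    "stemming": "Reducing an inflected word to its base form.",
    "taxonomy": "A classification scheme for organizing content.",
}

# Assumption: text inside these elements should never receive new links.
SKIP_TAGS = {"a", "script", "style", "title"}

# Capturing group so re.split keeps the tags in the result list.
TAG_RE = re.compile(r"(<[^>]+>)")
TERM_RE = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(GLOSSARY, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def _link(match):
    """Wrap one matched term in an anchor that carries its definition."""
    term = match.group(0)
    definition = html.escape(GLOSSARY[term.lower()], quote=True)
    return f'<a class="burst" title="{definition}">{term}</a>'


def enrich(document: str) -> str:
    """Link glossary terms in text nodes, leaving tags and skipped elements alone."""
    out = []
    skip_depth = 0
    for piece in TAG_RE.split(document):
        if piece.startswith("<"):
            # Structural step: track whether we are inside a skipped element.
            name_match = re.match(r"</?\s*([A-Za-z0-9]+)", piece)
            name = name_match.group(1).lower() if name_match else ""
            if name in SKIP_TAGS:
                if piece.startswith("</"):
                    skip_depth = max(0, skip_depth - 1)
                else:
                    skip_depth += 1
            out.append(piece)
        elif skip_depth > 0:
            out.append(piece)
        else:
            # Linking step: exact, case-insensitive term match only.
            out.append(TERM_RE.sub(_link, piece))
    return "".join(out)


page = '<h1>Site taxonomy</h1><p>We use Stemming. <a href="/x">taxonomy</a></p>'
result = enrich(page)
print(result)
```

In this sketch the term inside the existing link is left untouched, while the heading and paragraph text each gain one annotated anchor. A production system would need a real HTML parser and morphological analysis rather than regular expressions.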

Q And is this proprietary technology that was entirely developed by Sentius?

A We built everything from scratch: all the components, the architecture, and our own English-language parser. Recently, we added Inxight's LinguistX package into our system. It's great for analyzing inflections or morphology and doing things like stemming.

Q What kind of content is included in your pop-up Knowledge Bursts besides definitions and descriptions?

A We work closely with our customers to help them crystallize their ideas of what to incorporate into their Knowledge Bursts. That can mean company information, term definitions, marketing messages, and so on. Lately, we've been hearing more requests from our customers and prospects for RichLink-enabled product information that would show the most relevant product info when a reader clicks on a product name. We think that's going to be a big area. We also see great interest in Web sites being able to link their key terms to the latest news items on their site. There are images in some of our content sources and many of our customers intend to include movies down the road. When the tools improve for audio and visual search access, we will be able to link more effectively to multimedia materials.

Q Let's return to our discussion of the term "automated." It would seem that a great deal of human involvement would be required in addition to the automated process, especially in the initial setup.

A Correct, though our foreign-language application is at the stage where it's pretty easy for us to set up an e-globalization solution for someone. Even if customers have their own glossaries, it can be done fairly quickly. For other RichLink applications on sites that are set up well with a decent search engine and taxonomy, we can basically implement an out-of-the-box solution. But if we really want to go in and evaluate the content and the layers of information, it does take some planning and solid editorial work. Once the system is implemented, however, the process is totally automated from then on.

Q Tell us about the background of starting the company and developing RichLink. Was it originally developed just to deal with language translations?

A I was in a sea of Japanese-language documents and e-mail when I was living in Japan and working for Sony, and that's when the need became clear to me. When I started the company, I wasn't sure exactly which application to pursue first. When I showed prototypes to people, I heard, "I wish I had had this for my chemistry textbook," or "I wish I could read technical documents this way." I chose foreign-language annotation as the first application since the need for it is so clear. We immediately moved on knowing that foreign-language applications were just the tip of the iceberg.

Q We all like information to be available at the point of need—you call it "at the point of impact." Visually what RichLink provides is different: Clicking doesn't send the user off to another page or site or browser window; the information pops up in a box. How does your automated technology actually differ from other competitors' technologies (such as Atomica), both technically and from the users' experience? You've patented your technologies, but why couldn't competitors just copy the concept of a pop-up box?

A Our intellectual property relates to taking a textual document with no structure, no links, no tagging, and ending up with a document with enriched links that guide the user as to what to click on. And when the user clicks on those anchor points in the document, pop-up information is provided. It's the combination of all those things that we patented. We provide direct linkages between a term on a page and content in a database or in multiple database sources. We give the user database content within context.

We thought this entire process through some time ago, and there have been a few companies that have tried to employ a similar process. Flyswat, the company that NBCi purchased, is one. There are some other companies now releasing technology that drives links onto Web pages and documents via the contents of databases or external reference sources.

What Atomica is trying to do is very interesting. The goal of aggregating information pop-up portals based on words and phrases is a similarity between our two companies. Right now they seem to be focused on corporate intranet applications. They have a desktop client that a user installs, and then when a user presses ALT and clicks on a term, a pop-up window appears. It's a neat application and is actually an approach that we toyed with many years ago, but we really wanted to get out of trying to sell a search tool to end-users—either to the corporate market, which we felt would be a difficult sell and require dealing with IT departments, or to consumers, where the ceiling on how much they are willing to spend on a reference work is pretty low. Our customer is a Web site or a publisher that is syndicating or publishing content to the Web. We are trying to help them turn their documents into richer portals of information.

Q In the databases of information that you have available for licensing, I understand that you have over 3 million terms in 15 industry areas. Who are the content partners that provide these databases and what are some of the databases that our readers might know?

A In Japan, we're working with Kenkyusha, a dictionary provider, and also with Nova, a large translation software company, so we have top-of-the-line Japanese reference information. For European languages, we're working with Lernout and Hauspie, as well as some smaller vendors who are doing more specialized European language dictionaries for medical and other scientific fields. For each category, we try to identify who the top-branded source is and work out a partnership. For some of our U.S. domestic content, we've worked out relationships with Academic Press (The Dictionary of Science and Technology), Merriam-Webster, Facts and Comparisons and Lexi-Comp (both drug database providers), MedicineNet (see Figure 1), as well as some general sources like American Heritage and Columbia Encyclopedia. These sources serve as a base for our customers to work with, if they choose. They can then add custom layers or work through us to add other content.

Q RichLink is available either as a hosted service or for on-site installation, isn't it?

A Yes, we offer the software and a full end-to-end hosted service, and also hybrids of these. For example, people will use our software but have us manage their databases for them.

Q With such a range of possible installations, is it possible to quote an average cost or a price range for this?

A Yes, the cost can start as low as $25,000 and range up to about $250,000, depending on what is licensed, how many servers, and how many databases are licensed.

Q You seem to have a fairly large range of target markets: content publishers, portals, corporate extranets, etc. As a small company, how do you sell to and support this diverse group?

A Our focus today is clearly on three vertical markets: medical and pharmaceutical, information technology, and the financial services industry. Our strategy is to work with top-line publishers in each space, licensing either our technology or services to them. We're also interested in working with other solutions providers, either those providing content management systems or content technology to these marketplaces. We want to embed our solution within these other platforms. We have worked with quite a few enterprise customers ourselves, but we think a more effective approach for us is to work through these channel partners, which have larger sales forces and already have relationships with enterprise customers.

Content companies that make their databases available to affiliate content sites on the Web can, by working with us, offer the ability to embed their content directly in their customers' pages, and can expect to greatly increase the viewership of that content on those pages. An example of this is Facts and Comparisons, which saw the opportunity to complement the drug look-up functionality of their data on their customers' sites with direct access to their content at the word and phrase level using RichLink. (See image on page 1.)

A publisher like Reuters Health has a news feed that they deliver to a large number of sites, and they want to expand the number of sites internationally as well as provide value-added solutions to those sites. RichLink is helping them build new audiences and build more functionality into their news.

Q What prompted this discussion today was the recent announcement of the launch of your e-publishing suite, with Word Burst for increasing ad revenue, and License Burst for embedding licensed content and increasing syndication revenue. Let's focus for a minute on the targeted ads popping up in these windows. As a user, I strongly dislike ads that pop up automatically; they are annoying and intrusive. In your case, though, the user has to opt to activate an ad by clicking on a word, so I would guess users don't object and advertisers and publishers must love it.

A That's right. We encourage customers not to build in pure advertising as pop-ups, since people will just stop clicking. When users are rewarded with high-value content, they can then be offered other messages within that same window. This is not about in-your-face advertising; it's about providing high-value information within context to a user. We get the end-user at the point when they want to know more—and for an advertiser to get a user at that point is a powerful way to get your message across. Our point-of-impact slogan was inspired by the education world, which talks about the "teachable moment." If you can talk to someone at that teachable moment, your effectiveness is much higher.

Q Publishers and editors have long been concerned about the lines of distinction between editorial content and advertising. Moving to the Web, we've been even more observant and careful. From what I've seen though, in your examples, it's very clear when a piece of database information is presented and when the user is invited to click on an advertisement for more information.

A Yes, it is. We're very active now in the medical and pharmaceutical space where the sensitivity is very high on this topic. We enable our customers to customize the pop-up window however they like.

Q How does the License Burst system actually work? Your press release mentioned that it "works seamlessly with any content management system." What exactly is required?

A This answer actually applies to RichLink in general and not just to License Burst. We will plug into almost any system: a homegrown CMS; a flat Apache Web server environment; or a more complex environment, such as the Documentum content management system. A site running databases with great work flow and a taxonomy in place will be able to do more with RichLink than a site that doesn't have those things.

For License Burst, we want to work with companies that are already selling their content on the Web to other sites and affiliates. We want to enable them to offer another version of their content, not just a search capability but a RichLink capability to embed their content in another site. It's up to the customer to figure out how to license it and charge for it. For example, because Reuters Health has a version of their medical content RichLink-enabled with Japanese-language annotations, they expect to be able to double their content licensing in Japan.

Q So License Burst doesn't enable someone to find an article and request to license it for another site—something like Qpass, which allows one to purchase an article.

A Someone could certainly integrate Qpass with RichLink, and use the RichLink capability to direct people to the other content. We have people who are thinking about doing that. We do have people exploring the use of RichLink to drive archive sales. It's one of the next things we'll be focusing on.

Q You are a privately held company, so I don't know how you've been doing in these difficult market conditions. Has this been tough for you? You launch a new company, get off to a good start with your technology, and then watch potential customers scale back their software and technology purchases as they struggle with their bottom lines.

A The most difficult times were actually in January and February of this year. The sales pipeline in December of last year was very promising, but by the end of February folks were disappearing from companies. Projects were being canceled. There were layoffs and delays in Web site plans; our customers were actually disassembling themselves. We are now seeing interest pick up within the publishing space, so we've put considerable effort into our e-publishing suite. We're offering publishers new ways to monetize their content. In this kind of market, companies want to be offered ways to increase their revenue, create new products, and raise customer satisfaction. We're getting good traction and we're incredibly busy right now.

Q In fact, a solution like RichLink could really help companies that are struggling now to enable their communication and their e-commerce to be more efficient and productive. Maybe you are in the right place at the right time.

A It's not just the economy. I think there's also been a natural progression of electronic media technologies. Web technology is starting to stabilize and people are looking to build greater functionality and improve customer satisfaction, just as happened earlier with CD-ROM publishers. Web sites know their end-users better now and are looking for ways to add real value.

Sentius Corp. is based in Palo Alto, California. For more information, visit RichLink Word Burst can be seen in action on MedicineNet at, and RichLink e-globalization can be seen at RichLink License Burst as being sold by Facts and Comparisons can be seen at

Paula J. Hane is editor of NewsBreaks, contributing editor of Information Today, a former reference librarian, and a longtime online searcher. Her e-mail address is

© 2001 Information Today, Inc.