 




Information Today
Vol. 20 No. 1 — January 2003
CONFERENCE CIRCUIT
The New Business Intelligence
by Hugh McKellar

Every year, when you walk through the aisles of the KMWorld conference, you hear a bevy of buzz phrases bandied about: "value proposition," "win-win situation," "pain points." Then there are the acronyms. You've heard them all: CM, DM, EIP, ERP, CRM. The list goes on and on, but let's not forget these: EAI, AI. Oh.

This year's acronym was UDM. It's not techno-jive hip-hop for "you da man," but "unstructured data management." You'll be hearing that term a lot over the next 12 months, as companies previously associated with search, taxonomy development, and categorization software bring analytic capabilities to their solutions.

For example, in a somewhat complicated deal involving technology assets and human intellectual capital, Intelliseek and Inxight are deploying technology from the former WhizBang! Labs into their individual solutions. For Intelliseek, the deal means its deep search analysis and other robust technology will be enhanced through the extraction of information from multiple, disparate data sources in the form of what it calls "facts." Intelliseek will produce ASP solutions that create analytic tools and structured reports from vast amounts of unstructured data, such as Web pages, chat rooms, Microsoft Office documents, and e-mail. In its Smart Discovery software, Inxight will use WhizBang's fact-extraction technology, which crawls even dynamically generated Web pages, classifies them, extracts the entities, and associates them into a database record.

With its emphasis on ontologies and enhanced metadata, Semagix takes a different approach to UDM. It aggregates information from any internal or external source: Web site, content repository, or relational database. With the help of human experts and trusted sources, it builds a domain-specific ontology, which it calls a "superset" of a taxonomy, with classes, attributes, relationships, and the like, all connected through a semantic network. The software then enhances the content with metadata inferred from the ontology. Powerful stuff.

ClearForest takes the auto-tagging approach to mining the riches of unstructured data. The company's ClearTags software tags content semantically, structurally, and statistically, greatly increasing the number and value of the tags. The process is automatic and results in the discovery of relevant and related information, both inside individual documents and between documents in large document repositories. These richly tagged XML files are ideally suited to being repurposed or repackaged for use in other applications. They can be stored in a ClearForest knowledgebase, where they can be further leveraged by other ClearForest analytic software such as ClearResearch, ClearCharts, and ClearLab.

Stratify automates the process of organizing unstructured information by using the structure implicit in documents to construct a taxonomy customized for a business. The company says that when users already employ custom, industry-standard, or third-party taxonomies, or organize their information using a file server or Web server, the system can directly import that existing work and automatically extend it. The system runs multiple classification technologies in parallel, which the company says classifies documents more accurately than systems that depend on a single technology. Stratify adds that its technology compares and combines the results from each classifier to produce the best possible results.
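For readers who want a feel for what "combining the results from each classifier" can mean, here is a minimal sketch in Python. It is not Stratify's method, which the company does not detail here. It assumes the simplest possible combination rule, a majority vote, and the three toy classifiers, their rules, and the category names are all hypothetical.

```python
from collections import Counter
from typing import Callable, List

# A classifier is any function that maps a document's text to a category label.
Classifier = Callable[[str], str]


def keyword_classifier(doc: str) -> str:
    """Hypothetical classifier: looks for telltale words anywhere in the text."""
    text = doc.lower()
    if "invoice" in text or "payment" in text:
        return "finance"
    if "contract" in text or "liability" in text:
        return "legal"
    return "general"


def length_classifier(doc: str) -> str:
    """Hypothetical classifier: treats long documents as legal, short as general."""
    return "legal" if len(doc.split()) > 8 else "general"


def title_classifier(doc: str) -> str:
    """Hypothetical classifier: looks only at the first word of the document."""
    words = doc.split()
    first_word = words[0].lower() if words else ""
    return {"invoice": "finance", "contract": "legal"}.get(first_word, "general")


def combine_by_vote(doc: str, classifiers: List[Classifier]) -> str:
    """Run every classifier on the document and return the most common label.

    Ties go to the label that was voted for first (assumed tie-break rule).
    """
    votes = Counter(clf(doc) for clf in classifiers)
    return votes.most_common(1)[0][0]


CLASSIFIERS = [keyword_classifier, length_classifier, title_classifier]

if __name__ == "__main__":
    print(combine_by_vote("Invoice 1042: payment due in 30 days", CLASSIFIERS))
```

The point of the vote is that one classifier's blind spot (here, the length rule calling a short invoice "general") can be outvoted by the others. A production system would more likely weight each classifier by its measured accuracy.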

Unstructured data management is arguably the most interesting technology to watch these days. UDM's value proposition is a win-win situation guaranteed to eliminate any organization's pain point.

Hey, man, you know, just keeping it real. It's all good.

 


Hugh McKellar is executive editor of KMWorld magazine. His e-mail address is hugh_mckellar@kmworld.com.