Automatic Classification in Indexing and Searching
Recent developments in Indexing, Searching and Information Retrieval Technologies


This page is updated regularly. Please send your suggestions to:

Automatic Classification

IKEM Toolkit
IKEM Toolkit is a hybrid knowledge-based platform for thesaurus-oriented electronic document management. The project was sponsored by IWT. IKEM Toolkit contains various tools to manage your hybrid documents in an intelligent and user-oriented way.

Willpower Information. Information Management Consultants
Thesauri and vocabulary control: Principles and practice
Software for building and editing thesauri

CMU Text Learning Group
The goal is to develop new machine learning algorithms for text and hypertext data. Applications of these algorithms include information filtering systems for the Internet and software agents that make decisions based on text information.

CMU World Wide Knowledge Base (WebKB) project
The goal is to develop a probabilistic, symbolic knowledge base that mirrors the content of the World Wide Web. If successful, this will make text information on the web available in computer-understandable form, enabling much more sophisticated information retrieval and problem solving.

Bow: A Toolkit for Statistical Language Modeling, Text Retrieval, Classification and Clustering
Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (arrow) and document clustering (crossbow).
The library and its front-ends were designed and written by Andrew McCallum.
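
As a rough illustration of the kind of statistical document classification such a toolkit supports, here is a minimal naive Bayes sketch in Python. It is not Bow's or rainbow's API. The function names (tokenize, train, classify) and the tiny training set are hypothetical, chosen only for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


def train(labeled_docs):
    """Collect per-class document counts and word counts from (label, text) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in labeled_docs:
        class_docs[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab


def classify(model, text):
    """Return the class with the highest log-probability under naive Bayes."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior: fraction of training documents carrying this label.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, invented for illustration only.
TRAINING = [
    ("thesaurus", "controlled vocabulary thesaurus terms"),
    ("thesaurus", "thesaurus software for editing vocabulary"),
    ("learning", "machine learning algorithms for text"),
    ("learning", "statistical learning and clustering algorithms"),
]

model = train(TRAINING)
print(classify(model, "editing a controlled vocabulary"))
```

A real toolkit would add proper tokenization, stop-word handling and feature selection. The sketch only shows the core idea of scoring a document against per-class word statistics.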

Homepage of Andrew McCallum
Contains a great deal of information on learning classification algorithms for text recognition.

Reinforcement Learning with Selective Perception and Hidden State. PhD Thesis, by Andrew Kachites McCallum
The method uses memory-based learning and a robust statistical test on reward to learn a structured policy representation that makes perceptual and memory distinctions only where needed for the task at hand. It can also be understood as a method of value function approximation. The model learned is an order-n partially observable Markov decision process. It handles noisy observations, actions and rewards.

WWW -- Wealth, Weariness or Waste: Controlled vocabulary and thesauri in support of online information access
David Batty

Using Automated Classification for Summarizing and Selecting Heterogeneous Information Sources
R. Dolin, D. Agrawal, A. El Abbadi, J. Pearlman
