Automatic Classification in Indexing and Searching
Recent developments in Indexing, Searching and Information Retrieval Technologies


 

This page is updated regularly. Please send your suggestions to: demchenko@terena.nl


Automatic Classification

IKEM Toolkit
http://bikit.rug.ac.be:80/ikem/
IKEM Toolkit is a hybrid knowledge-based platform for thesaurus-oriented electronic document management. The project was sponsored by IWT. IKEM Toolkit contains various tools to manage your hybrid documents in an intelligent and user-oriented way.

Willpower Information. Information Management Consultants
www.willpower.demon.co.uk
Thesauri and vocabulary control: Principles and practice
http://www.willpower.demon.co.uk/thesprin.htm
Software for building and editing thesauri
http://www.willpower.demon.co.uk/thessoft.htm

CMU Text Learning Group
http://www.cs.cmu.edu/afs/cs/project/theo-4/text-learning/www/index.html
The group's goal is to develop new machine learning algorithms for text and hypertext data. Applications of these algorithms include information filtering systems for the Internet and software agents that make decisions based on text information.

CMU World Wide Knowledge Base (WebKB) project
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
The project's goal is to develop a probabilistic, symbolic knowledge base that mirrors the content of the World Wide Web. If successful, this will make text information on the Web available in computer-understandable form, enabling much more sophisticated information retrieval and problem solving.

Bow: A Toolkit for Statistical Language Modeling, Text Retrieval, Classification and Clustering
Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (arrow) and document clustering (crossbow).
The library and its front-ends were designed and written by Andrew McCallum.
http://www.cs.cmu.edu/~mccallum/bow/rainbow/
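
To make the document classification task concrete, here is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing, written in Python. It is not Bow's C API and is not taken from the rainbow front-end; the function names (tokenize, train, classify) and the whitespace tokenizer are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Count documents per class and word occurrences per class.

    labeled_docs is an iterable of (text, label) pairs.
    """
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        class_doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_doc_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    class_doc_counts, word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_doc_counts.items():
        # Log prior: fraction of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            count = word_counts[label][word]
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A model is built with train() from a list of (text, label) pairs and then applied to new text with classify(model, text). Log probabilities are summed rather than multiplying raw probabilities, which avoids numeric underflow on long documents.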

Homepage of Andrew McCallum
http://www.cs.cmu.edu/~mccallum/
Contains extensive material on learning and classification algorithms for text recognition.


Reinforcement Learning with Selective Perception and Hidden State. PhD Thesis, by Andrew Kachites McCallum
http://www.cs.rochester.edu/u/mccallum/phd-thesis/
The method uses memory-based learning and a robust statistical test on reward to learn a structured policy representation that makes perceptual and memory distinctions only where the task at hand requires them. It can also be understood as a method of value function approximation. The learned model is an order-n partially observable Markov decision process. The method handles noisy observations, actions and rewards.

WWW -- Wealth, Weariness or Waste: Controlled vocabulary and thesauri in support of online information access
David Batty
http://www.dlib.org/dlib/november98/11contents.html

Using Automated Classification for Summarizing and Selecting Heterogeneous Information Sources
R. Dolin, D. Agrawal, A. El Abbadi, J. Pearlman
http://www.dlib.org/dlib/january98/dolin/01dolin.html
 


