Search Engine Overview
Recent developments in Indexing, Searching and Information Retrieval Technologies

SE news
  • New largest Search Engine launched by Fast Search & Transfer
  • NREN Search and Index Services
    Special purposes Search Engines
    SE Special Services
    SE Technologies
  • Report on the 1999 Search Engines Meeting by Avi Rappoport
  • Search Engines Tools
  • Free Indexing and Searching Software
  • Commercial SW
  • SE tips and links
    Search Engine Projects
    Search Engines Papers
  • Research Papers related to Google!
  • Research Papers related to IBM CLEVER Searching Project
  • Other SE papers
  • SE Legal issues
    Other Sections Index page

    This page is updated regularly, please send your suggestions to:

    SE news

    Search Engines News

    Current Search Engine Report

    Search Engine Size

    News at Web Site Search Tools

    Results from our Site Search Tools Survey!
    The first results from our search tools survey are in, and they're interesting! Most web administrators who haven't installed a site search say it's because they don't have time or the applications are too complex. Those who have installed one cite improved navigation as their number one reason, by far. More surprising results come from sites aimed at information professionals (many don't have search) and sites with three or more languages (they have search).

    New largest Search Engine launched by FAST Search & Transfer

    August 2, 1999: FAST (Fast Search & Transfer) has launched a new site called Alltheweb ("FAST Search: All the Web, All the Time"). The announced size of its index is more than 200 million pages, which is estimated to be 25% of the whole web. See more information.

    NREN Search and Index Services

    German Web Index
    Metagenerator -
    Metadata scheme -
    Fireball was developed by FLP/KIT -
    KIT -

    Swiss search service
    Allows metadata search -

    Nordic Web index

    Special purposes Search Engines

    US Government Search Engine Launched
    A new search engine that focuses on information from US government sources was opened in May. Called Gov.Search, the service is jointly produced by search engine Northern Light and the U.S. Commerce Department's National Technical Information Service through a five-year agreement.
    The service is unusual for the web in that searching is not free. Those wishing to use it must pay for access, which costs US $15 for a day pass, $30 for a monthly pass, or $250 for a year. Special pricing is also available to companies and organizations that require multiple accounts.
    Northern Light has now indexed about 4 million web pages located on more than 20,000 US government servers, which also include military and some educational sites. In addition to this information, it has also indexed about 2 million specialty records from the NTIS.

    Google US Government Search
    Google has its own US government search service. Test queries show it to be much smaller than Northern Light's index, yielding only 10 to 50 percent of Northern Light's counts. But the relevancy of some of the matches was impressive. Definitely worth a visit.

    Cora Search Engine
    Cora is a special-purpose search engine covering computer science research papers.

    SE Special Services

    Northern Light Adds Research Options
    Northern Light now also operates a "research" version of its service, where the default is to search within its Special Collection index. This index has information from over 5,400 publications, much of which is not available on the web. Searching is free; documents can then be purchased for between $1 and $4.
    Titles can be downloaded from Northern Light Research Version
    Northern Light Special Editions

    Research Service at HotBot

    "Invisible Web" Revealed
    Lycos and IntelliSeek have teamed up to produce an index of search databases to help users find information that is invisible to search engines. The "Invisible Web Catalog" provides links to more than 7,000 specialty search resources. Users can browse the listings or search the Lycos index.
    Lycos Invisible Web Catalog


    Direct Search
    Catalog of specialty databases. Search inside a particular database.

    Guide to searchable databases. Browse or search through listings.

    Northern Light Adds clustering
    Clustering prevents a single site from dominating the results.
    In addition to its page index, Northern Light provides a list of Custom Search Folders™ generated from clustered search data, grouped by server or by type of page.
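
    To illustrate the idea of keeping one site from dominating a result list, here is a minimal Python sketch. It is not Northern Light's method: the function name cluster_by_host, the max_per_host limit, and the example URLs are all assumptions made for this illustration. It caps the number of results shown per host and moves the rest into per-host folders.

```python
from collections import OrderedDict
from urllib.parse import urlsplit


def cluster_by_host(results, max_per_host=2):
    """Split ranked result URLs into a capped top list and per-host folders.

    Hypothetical helper for illustration only. `top` keeps at most
    `max_per_host` URLs per host, in rank order; `folders` maps each host
    to the URLs that were held back.
    """
    shown = {}                      # host -> number of URLs already in `top`
    top, folders = [], OrderedDict()
    for url in results:
        host = urlsplit(url).hostname or ""
        if shown.get(host, 0) < max_per_host:
            top.append(url)
            shown[host] = shown.get(host, 0) + 1
        else:
            folders.setdefault(host, []).append(url)
    return top, folders


# Example ranked list (made-up URLs): a.example appears three times.
ranked = [
    "http://a.example/1",
    "http://a.example/2",
    "http://b.example/1",
    "http://a.example/3",
]
top, folders = cluster_by_host(ranked, max_per_host=2)
print(top)
print(dict(folders))
```

    With a limit of two, the third a.example page moves into that host's folder, so b.example stays visible in the main list.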

    Navigate web smarter and easier with Alexa

    Netscape's keywords service

    SE Technologies

    Report on the 1999 Search Engines Meeting
    by Avi Rappoport, Search Tools Consulting

    The main questions discussed:

    For more information read Abstracts of the Report.

    Natural Language Processing & Information Retrieval (NLPIR) group of ITL NIST
    Valuable information. Publications

    Information on DARPA TIPSTER Text Program

    IBM Patents Network -

    Lycos holds patent 5,748,954, which covers roughly any kind of web spider that heuristically downloads "better" documents before "worse" documents, and explicitly includes a reference to looking at how often a document is linked as a goodness heuristic.
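
    The general idea of fetching "better" documents first can be sketched as a priority queue keyed on the number of known in-links. This is only an illustration of best-first crawl ordering, not the method claimed in the patent; the Frontier class and its method names are hypothetical.

```python
import heapq
from collections import defaultdict


class Frontier:
    """Crawl frontier that hands out the URL with the most known in-links first."""

    def __init__(self):
        self._inlinks = defaultdict(int)  # url -> links seen pointing at it
        self._heap = []                   # entries: (-inlinks, insertion order, url)
        self._done = set()                # URLs already handed out
        self._order = 0

    def add_link(self, target):
        """Record one more link pointing at `target` and (re)queue it."""
        if target in self._done:
            return
        self._inlinks[target] += 1
        heapq.heappush(self._heap, (-self._inlinks[target], self._order, target))
        self._order += 1

    def pop(self):
        """Return the best unvisited URL, or None when the frontier is empty."""
        while self._heap:
            neg_count, _, url = heapq.heappop(self._heap)
            # Skip stale entries: already visited, or superseded by a higher count.
            if url in self._done or -neg_count != self._inlinks[url]:
                continue
            self._done.add(url)
            return url
        return None


# Made-up example: y.example is linked twice, x.example once.
frontier = Frontier()
for target in ["http://x.example/", "http://y.example/", "http://y.example/"]:
    frontier.add_link(target)
first = frontier.pop()
second = frontier.pop()
print(first, second)
```

    Each new link pushes a fresh heap entry instead of updating the old one; outdated entries are discarded when popped, which keeps the sketch short at the cost of some extra heap entries.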

    TUSTEP (TUebingen System of Text Processing Programs)
    Multilingual Text Data Processing and Fuzzy Searching

    Search Engines Tools

    Web Site Search Tools

    Web Site Search Tools - Related Topics

    Search Tools Product Listings

    Free Indexing and Searching Software

    Harvest, an open-source project, has been re-implemented in Perl and can summarize documents in SOIF (Summary Object Interchange Format). This version saves the data in a database file and does not include a Broker or search engine, but it is entirely extensible.

    The Combine System for distributed indexing

    Zebra Information Server
    Powerful free-text indexing and retrieval system, combined with a Z39.50 server. The Zebra server is freely available for noncommercial applications.

    Framework for Advanced Search (ASF)
    ASF Freeware

    OCLC Z39.50 freely reusable code (C and Java)

    Perlfect Search 3.01

    PLWeb Turbo has released a new version, 3.0 for Windows NT with improved performance, customization, web-crawling capability, and a browser-based interface.
    PLWeb and all PLS products are now freeware from AOL.

    AltaVista (Windows NT and Unix search tool) has just introduced a free version of AltaVista Search Intranet, Entry Level, which will index up to 3,000 pages.

    Commercial SW

    Ultraseek on Linux
    The Ultraseek search engine and the Content Classification Engine now run on Linux (Red Hat Linux 5.1 on a PC, kernel 2.0.34 or better, or glibc 2.0.7-19 or better). Commercial.
    Download free trial version

    Ultraseek Content Classification Engine Product Information

    Super Site Searcher: a Perl CGI that works with other modules to create a searchable site directory. Commercial.

    Extense - a powerful search engine developed in France which uses the inflected forms of French words (masculine/feminine and singular/plural). Commercial.

    Inxight LinguistX code library - provides language identification, stemming and tokenization, among other features.
    A collection of components for many languages that provide word and phrase analysis, stemming, tokenization, part-of-speech analysis, noun phrase extraction, language identification, summarization, etc.
    Platform: Windows 95 and NT, Solaris Sparc (will port to other Unix systems). Commercial.

    Verify products

    Knowledge Retrieval products

    SE tips and links

    Search Engines links
    Contains the following sections:

    Search Tips and Tricks Advanced Searching

    Information Retrieval systems

    Top search words and terms

    Ask Jeeves Peek Through The Keyhole

    Weekly Search Engine Keyword Statistics For Web and Internet Marketing

    Dogpile Top 200 Search Words
    Top words from the meta-search engine Dogpile from January to July 1997. Unfortunately, the actual keyword phrases are not shown.

    Search Spy
    This is a database of search terms available for desktop use. You enter a term, and the program scans to find matches. You can sort results by count or by keyword. Data is gathered from various live search displays.

    Life on the Internet, Finding Things Jakob Nielsen's Website
    He formulated a new approach in SE called LSD: Logo, Search, Directory.

    Search Engine Projects

    IBM's CLEVER Searching

    Web Archeology Project at Digital Research
    Contains sections:

    The MetaWeb Project
    The aim of the Metadata Tools and Services project, known as MetaWeb, is to develop indexing services, tools, and metadata element sets in order to promote the use and exploitation of metadata on the Internet.

    DFN Indexing and Searching projects -
    MetaGer (subject meta search), MESA (email address meta search), Level3 (search service for the DFN-Expo project), and

    X.500 Directory E-mail Addresses Search (AMBIX-D) -

    Search Engines Papers

    Research Papers related to Google!

    Research Papers related to IBM CLEVER Searching Project

    Jon Kleinberg Homepage
    Research and publications related to IBM's CLEVER Searching project.

    Other SE papers

    TREC Publications
    TREC (the Text REtrieval Conference), sponsored by NIST, provides a set of realistic test collections, uniform scoring, unbiased evaluators, and a chance to see the changes and improvements of search engines over time.
    Results are published in the materials of the annual conferences at

    Retrieval Performance in FERRET: A Conceptual Information Retrieval System
    Michael L. Mauldin
    Appeared at The 14th International Conference on Research and Development in Information Retrieval, Chicago, October 1991, ACM SIGIR.

    Enhancing the World Wide Web
    Social Software for the Evolution of Knowledge

    Learning Webs by J. Bollen & F. Heylighen
    Hebbian learning can be implemented on the web by changing the strength of links depending on how often they are used. The paper explores the "brain" metaphor for making the web more intelligent. The basic idea is that web links are similar to associations in the brain, as supported by synapses connecting neurons. The strength of the links, like the connection strength of synapses, can change depending on the frequency of use of the link. This allows the network to "learn" automatically from the way it is used.
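
    The frequency-based strengthening described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' algorithm: the LearningWeb class, the fixed rate increment, and the page names are assumptions made for this example.

```python
from collections import defaultdict


class LearningWeb:
    """Toy link graph whose link strengths grow each time a link is followed."""

    def __init__(self, rate=0.1):
        self.rate = rate                      # assumed fixed reinforcement per use
        self.strength = defaultdict(float)    # (source, target) -> link strength

    def follow(self, source, target):
        """Reinforce the link from `source` to `target` after one traversal."""
        self.strength[(source, target)] += self.rate

    def ranked_links(self, source):
        """Return the targets linked from `source`, strongest link first."""
        links = [(t, w) for (s, t), w in self.strength.items() if s == source]
        return [t for t, _ in sorted(links, key=lambda item: -item[1])]


# Made-up usage log: the link A -> B is followed twice, A -> C once.
web = LearningWeb()
web.follow("A", "B")
web.follow("A", "C")
web.follow("A", "B")
order = web.ranked_links("A")
print(order)
```

    Because A -> B was used more often, B is listed ahead of C; a page could use such an ordering to present its most-travelled links first.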

    Identification, location and versioning of web-resources. URI Discussion paper. Version 1.0. 12 March 1999
    Titia van der Werf-Davelaar
    This document is a discussion document for use in developing a consensus on practical approaches to be pursued for better information management techniques and methods on the Web.
    This work is done in the context of the following projects: DONOR, DESIRE, NEDLIB.

    Report on the WWW8 conference by Nicky Ferguson

    Semantic Web vision paper
    Alexander Chislenko. - Version 0.28 - 29 June, 1997

    SE Legal issues

