Synonyms: document indexer
Related Terms: search indices, spider, import/export, index (noun), index (verb)
Before a search engine can search documents quickly, it must first build search indices that list every word in every document, along with each document's metadata. The program that performs this task is often called an indexer.
Usage: "indexer" is the older term and is typically used when indexing is fairly simple and can be run from the command line; for more complicated web crawling, the term "spider" is preferred.
A program that identifies all of the terms in a document or corpus and builds a table that indicates where the terms are used.
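A table of this kind is commonly called an inverted index. The following is a minimal sketch, not any particular engine's implementation: the function name build_index, the sample corpus, and the simple letters-and-digits tokenizer are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def build_index(corpus):
    """Map each term to {doc_id: [word positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        # Assumption: a term is a lowercase run of letters or digits.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        for position, term in enumerate(tokens):
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)

# Hypothetical two-document corpus.
corpus = {
    "doc1": "The spider downloads the page",
    "doc2": "The indexer processes the page",
}
index = build_index(corpus)
print(index["the"])   # {'doc1': [0, 3], 'doc2': [0, 3]}
print(index["page"])  # {'doc1': [4], 'doc2': [4]}
```

Recording word positions, rather than only document names, is one design choice among several; it is what lets a table "indicate where the terms are used" within a document.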
When a search engine spiders (downloads) a page on the web, it must process the page before storing it. The spider is responsible for downloading, while the indexer is responsible for processing the page. A search engine indexer typically processes a page by removing all HTML tags, checking for and storing links, often compressing the page by pulling out filter words, looking for stop words, and finally storing the page in an online searchable database.
The part of the search engine that processes and places spidered, or crawled, web documents into a database. The indexer typically processes a document by removing all tags, storing links in a queue, removing filter words, looking for stop words, and storing the document in a searchable database.
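The processing steps described above can be sketched with Python's standard library. This is an illustrative outline under stated assumptions: the names PageParser, index_page, and STOP_WORDS are hypothetical, the stop-word list is a tiny sample, filter words and stop words are treated as one list, and an in-memory dict stands in for the searchable database.

```python
import re
from collections import deque
from html.parser import HTMLParser

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is"}  # illustrative sample only

class PageParser(HTMLParser):
    """Collects text content and link targets while discarding the tags."""
    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)

def index_page(url, html, database, link_queue):
    """Strip tags, queue links, drop stop words, and store the remaining terms."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    link_queue.extend(parser.links)            # links wait here for the spider
    words = re.findall(r"[a-z0-9]+", " ".join(parser.text_parts).lower())
    terms = [w for w in words if w not in STOP_WORDS]
    for term in terms:
        database.setdefault(term, set()).add(url)
    return terms

database = {}          # stand-in for the searchable database
link_queue = deque()
html = ('<html><body><h1>The Indexer</h1>'
        '<p>Storing a page. <a href="/next">Next</a></p></body></html>')
terms = index_page("http://example.com/", html, database, link_queue)
print(terms)             # ['indexer', 'storing', 'page', 'next']
print(list(link_queue))  # ['/next']
```

A production indexer would also skip script and style content, resolve relative links, and persist the result; those steps are omitted here to keep the sketch short.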
The software that compiles a searchable database, containing billions of web pages and documents, that is consulted every time a user enters a query. The indexes, or indices, of search engines are built from the pages gathered by the spider (crawler) and automatically sorted into rankings according to the rules of the particular algorithms. (See Algorithm and Spider).
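As a rough illustration of how such a database might be consulted at query time, the sketch below looks up each query term and orders the matching documents. The ranking rule used here (sum of term counts) is only a stand-in assumption; real search engines apply their own, far more elaborate algorithms. The function name search and the sample index are hypothetical.

```python
def search(index, query):
    """Return documents containing every query term, highest score first."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Keep only documents that contain all of the query terms.
    matches = set(postings[0])
    for posting in postings[1:]:
        matches &= set(posting)
    # Stand-in ranking rule: total occurrences of the query terms.
    scores = {doc: sum(p[doc] for p in postings) for doc in matches}
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

# Hypothetical index: term -> {document: number of occurrences}.
index = {
    "search": {"a.html": 3, "b.html": 1},
    "engine": {"a.html": 1, "b.html": 1, "c.html": 2},
}
print(search(index, "search engine"))  # ['a.html', 'b.html']
```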
Data collected by the crawler is read by the indexer, which creates an index based on the words contained in each document.