- Windows Search Overview
- Introduction
- Windows Search Service
- Development Platform
- User Interface
- Technical Prerequisites
- SDK Download and Contents
- Windows Search SDK Documentation
- History of Windows Search
- Search engine
- How to access a search engine
- How a search engine works
- Do all search engines give the same results?
- What is the best search engine?
Windows Search Overview
Windows Search is a desktop search platform that has instant search capabilities for most common file types and data types, and third-party developers can extend these capabilities to new file types and data types.
This topic is organized as follows:
Introduction
Windows Search is a standard component of Windows 7 and Windows Vista, and is enabled by default. Windows Search replaces Windows Desktop Search (WDS), which was available as an add-in for Windows XP and Windows Server 2003.
Windows Search is composed of three components:
Windows Search Service
The Windows Search Service (WSS) organizes the features extracted from a collection of documents. The Windows Search Protocol enables a client to communicate with a server that hosts a WSS, both to issue queries and to let an administrator manage the indexing server. When processing files, WSS analyzes a set of documents, extracts useful information, and organizes it so that properties of those documents can be returned efficiently in response to queries.
A collection of documents that can be queried makes up a catalog, which is the highest-level unit of organization in Windows Search. A catalog consists of a properties table in which the text or value and its corresponding location (locale) are stored in columns. Each row of the table corresponds to a separate document in the scope of the catalog, and each column corresponds to a property. A catalog may also contain an inverted index (for quick word matching) and a property cache (for quick retrieval of property values).
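The catalog structure described above can be pictured with a small Python sketch. This is only an illustration of the idea (one row per document, one column per property, plus an inverted index for word matching); the `Catalog` class and its method names are hypothetical and are not part of any Windows Search API.

```python
from collections import defaultdict


class Catalog:
    """Toy catalog: one row per document, one column per property."""

    def __init__(self):
        self.rows = {}                    # document id -> {property name: value}
        self.inverted = defaultdict(set)  # word -> ids of documents containing it

    def add(self, doc_id, properties, text):
        # Store the property row and index every word of the document text.
        self.rows[doc_id] = dict(properties)
        for word in text.lower().split():
            self.inverted[word].add(doc_id)

    def query(self, word, column):
        """Return one property for every document that contains the word."""
        ids = sorted(self.inverted.get(word.lower(), set()))
        return [self.rows[i].get(column) for i in ids]


catalog = Catalog()
catalog.add(1, {"name": "trip.docx", "size": 2048}, "Arizona vacation notes")
catalog.add(2, {"name": "budget.xlsx", "size": 512}, "vacation budget")
print(catalog.query("vacation", "name"))
```

The inverted index answers "which documents contain this word", and the properties table then supplies the requested column for those rows without reopening the documents.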
The indexer process is implemented as a Windows service running in the LocalSystem account and is always running for all users (even if no user is logged in), which permits Windows Search to accomplish the following:
- Maintain one index that is shared among all users.
- Maintain security restrictions on content access.
- Process remote queries from client computers on the network.
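As a rough illustration of how one shared index can still respect per-user access, the following Python sketch filters query results against a hypothetical `readers` set stored with each entry. This is a simplified model for explanation only; the field names and the `search` function are assumptions, and it does not reproduce how the service actually enforces security.

```python
# One index shared by all users; each entry records who may read the item.
# The "readers" field is a hypothetical stand-in for real access control data.
SHARED_INDEX = [
    {"path": "alice/notes.txt", "text": "vacation plans", "readers": {"alice"}},
    {"path": "bob/todo.txt", "text": "vacation packing list", "readers": {"bob"}},
    {"path": "public/faq.txt", "text": "vacation policy", "readers": {"alice", "bob"}},
]


def search(index, term, user):
    """Return paths of matching items that the given user is allowed to read."""
    return [
        item["path"]
        for item in index
        if term in item["text"].split() and user in item["readers"]
    ]


print(search(SHARED_INDEX, "vacation", "alice"))
```

Both users query the same index, but each one only sees the items they are permitted to read.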
The Search service is designed to protect the user experience and system performance when indexing. The following conditions cause the service to throttle back or pause indexing:
- High CPU usage by non-search-related processes.
- High system I/O rate including file reads and writes, page file and file cache I/O, and mapped file I/O.
- Low memory availability.
- Low battery life.
- Low disk space on the drive that stores the index.
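The back-off conditions listed above can be sketched as a simple check in Python. Every threshold below is a made-up placeholder, and `SystemState` and `backoff_reasons` are hypothetical names; the actual limits and logic used by the Search service are not given in this topic.

```python
from dataclasses import dataclass


@dataclass
class SystemState:
    other_cpu_percent: float    # CPU used by non-search-related processes
    io_ops_per_sec: float       # combined system I/O rate
    free_memory_mb: float
    on_battery: bool
    battery_percent: float
    index_drive_free_mb: float  # free space on the drive that stores the index


def backoff_reasons(state):
    """List the reasons to throttle or pause indexing (thresholds are assumed)."""
    reasons = []
    if state.other_cpu_percent > 80:
        reasons.append("high CPU usage")
    if state.io_ops_per_sec > 500:
        reasons.append("high I/O rate")
    if state.free_memory_mb < 256:
        reasons.append("low memory")
    if state.on_battery and state.battery_percent < 20:
        reasons.append("low battery")
    if state.index_drive_free_mb < 200:
        reasons.append("low disk space")
    return reasons


idle = SystemState(5, 20, 4096, False, 100, 50_000)
busy = SystemState(95, 20, 4096, True, 10, 50_000)
print(backoff_reasons(idle))
print(backoff_reasons(busy))
```

An empty list means indexing can proceed at full speed; any entry is a reason to slow down or pause.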
Development Platform
The preferred way to access the Search APIs and create Windows Search applications is through a Shell data source. A Shell data source is a component that is used to extend the Shell namespace and expose items in a data store. A data store is a repository of data. A data store can be exposed to the Shell programming model as a container that uses a Shell data source. The items in a data store can be indexed by the Windows Search system using a protocol handler.
For example, ISearchFolderItemFactory is a component that can create instances of the search folder data source, which is a sort of "virtual" data source provided by the Shell that can execute queries over other data sources in the Shell namespace and enumerate results. It can do so either by using the indexer or by manually enumerating and inspecting items in the specified scopes. This interface permits you to set up the parameters of the search by using methods that create and modify search folders. If methods of this interface are not called, default values are used instead.
Accessing the Windows Search capability indirectly through the Shell data model is preferred because it provides access to full Shell functionality at the level of the Shell data model. For example, you can set the scope of a search to a library (a feature available in Windows 7 and later) to use the library folders as the scope of the query. Windows Search then aggregates the search results from those locations if they are in different indexes (if the folders are on different computers). The Shell data layer also creates a more complete view of items’ properties, synthesizing some property values. It also provides access to search features for data stores that are not indexed by Windows Search. For example, you can search a Universal Serial Bus (USB) storage device, a portable device that uses the MTP protocol, or a File Transfer Protocol (FTP) server through the Shell data sources that provide access to those storage systems. Doing so ensures a better user experience.
Windows Search has a cache of property values that is used in the implementation of the Windows Search Service (WSS). These property values can be programmatically queried by using the Windows Search OLE DB provider, or through ISearchFolderItemFactory, which represents items in search results and query-based views. Windows Search then collects and stores properties emitted by filter handlers or property handlers when an item such as a Word document is indexed. This store is discarded and rebuilt when the index is rebuilt.
Third-party developers can create applications that consume the data in the index through programmatic queries, and can extend the data in the index for custom file and item types to be indexed by Windows Search. If you want to show query results in Windows Explorer, you must implement a Shell data source before you can create a protocol handler to extend the index. However, if all queries are programmatic (through OLE DB for example) and interpreted by the application’s code rather than the Shell, a Shell namespace is still preferred but not required.
A protocol handler is required for Windows to obtain information about file contents, such as items in databases or custom file types. While Windows Search can index the name and properties of the file, Windows has no information about the content of the file. As a result, such items cannot be indexed or exposed in the Windows Shell. By implementing a custom protocol handler, you can expose these items. For a list of handlers identified by the developer scenario you are trying to achieve, see "Overview of Handlers" in Windows Search as a Development Platform.
A Shell data source is sometimes known as a Shell namespace extension. A handler is sometimes known as a Shell extension or a Shell extension handler.
User Interface
In Windows Vista and later, Windows Search is integrated into all Windows Explorer windows for instant access to search. This enables users to quickly search for files and items by file name, properties, and full-text contents. Results can also be filtered further to refine the search. Here are some more features of Windows Search:
- An instant search box in every window enables instant filtering of all items currently in view. Instant search boxes appear in the Start menu to search for programs or files, and in the upper-right corner of all Windows Explorer windows to filter the results shown. Instant search is also integrated into some other Windows features, such as Windows Media Player, to find related files.
- Documents can be tagged with keywords to group them by custom criteria that are defined by the user. Tags are metadata items that are assigned by the user or applications to make it easier to find files based on keywords that may not be in the item name or contents. For example, a set of pictures might be tagged as "Arizona Vacation 2009" so that they can be retrieved quickly later by searching for any of the included words.
- Enhanced column headers in Windows Explorer views enable sorting and grouping documents in different ways. For example, files can be sorted according to name, date modified, type, size, and tags. Documents can also be grouped according to any of these properties and each group can be filtered (hidden or displayed) as desired.
- Documents can be stacked according to name, date modified, type, size, and tags. Stacks include all documents that have the specified property and are located within any subfolder of the selected folder.
- Searches can be saved (to be retrieved later) by clicking the Save Search button in the search pane in Windows Explorer. The results will be dynamically repopulated based on the original criteria when the saved search is opened. For instructions, see Save Your Search Results.
- Preview handlers and thumbnail handlers enable users to preview documents in Windows Explorer without having to open the application that created them.
Technical Prerequisites
Before you start reading the Windows Search SDK documentation, you should have a fundamental understanding of the following concepts:
- How to implement a Shell data source.
- How to implement a handler.
- How to work in native code.
A Shell data source is a component that is used to extend the Shell namespace and expose items in a data store. In the past, the Shell data source was referred to as the Shell namespace extension. A handler is a Component Object Model (COM) object that provides functionality for a Shell item. For a list of handlers identified by the developer scenario you are trying to achieve, see "Overview of Handlers" in Windows Search as a Development Platform.
For more information about the Windows Search SDK interoperability assembly for working with COM objects that are exposed by Windows Search and other programs that use managed code, see Using Managed Code with Shell Data and Windows Search. However, note that filters, property handlers, and protocol handlers must be written in native code. This is due to potential common language runtime (CLR) versioning issues with the process that multiple add-ins run in. Developers who are new to C++ can get started with the Visual C++ Developer Center and Windows Development Getting Started.
SDK Download and Contents
In addition to meeting the listed technical prerequisites, you must also download the Windows SDK to get the Windows Search libraries. The Windows Search SDK Samples contain useful code samples and an interoperability assembly for developing with managed code. For more information on using the code samples, see Windows Search Code Samples.
Windows Search SDK Documentation
The contents of the Windows Search SDK documentation are as follows:
- Outlines the main development scenarios in Windows Search. Provides a list of handlers identified by the development scenario you are trying to achieve, add-in installer guidelines, and implementation notes.
- Describes the Search API code samples that are available.
- Describes Windows 7 support for search federation to remote data stores using OpenSearch technologies that enable users to access and interact with their remote data from within Windows Explorer.
- Lists technologies related to Windows Search: Enterprise Search, SharePoint Enterprise Search, and legacy applications such as Windows Desktop Search 2.x and Platform SDK: Indexing Service.
- Defines essential terms used in Windows Search and Shell technologies.
History of Windows Search
Windows Search replaces Windows Desktop Search (WDS), which was available as an add-in for Windows XP and Windows Server 2003. WDS replaced the legacy Indexing Service from previous versions of Windows with enhancements to performance, usability, and extensibility. The new development platform supports requirements that produce a more secure and stable system. While the new querying platform is not compatible with Microsoft Windows Desktop Search (WDS) 2.x, filters and protocol handlers written for previous versions of WDS can be updated to work with Windows Search. Windows Search also supports a new property system. For information on filters, property handlers, and protocol handlers, see Extending the Index.
Windows Search is built into Windows Vista and later, and is available as a redistributable update to WDS 2.x, to support the following operating systems:
- 32-bit versions of Windows XP with Service Pack 2 (SP2).
- All x64-based versions of Windows XP.
- Windows Server 2003 with Service Pack 1 (SP1) and later.
- All x64-based versions of Windows Server 2003.
Systems running these operating systems must have Windows Search installed in order to run applications written for Windows Search. For more information, see KB article 917013: Description of Windows Desktop Search 3.01 and the Multilingual User Interface Pack for Windows Desktop Search 3.01.
Search engine
A search engine is software, accessed over the Internet, that searches a database of information according to the user’s query and returns a list of the results that best match it. Many search engines are available today, each with its own abilities and features. Archie, which was used to search for files on FTP servers, is generally considered the first search engine, and Veronica is considered the first text-based search engine. Currently, the most popular and well-known search engine is Google. Other popular search engines include AOL, Ask.com, Baidu, Bing, DuckDuckGo, and Yahoo.
How to access a search engine
Users access a search engine through a browser on their computer, smartphone, tablet, or other device. Most modern browsers have an omnibox, a text box at the top of the browser that accepts either a URL or a search query. You can also visit the home page of one of the major search engines to perform a search.
How a search engine works
Because large search engines contain millions and sometimes billions of pages, many search engines display the results depending on their importance. This importance is commonly determined using various algorithms.
The data behind a search engine is collected by a spider or crawler, a program that visits pages on the Internet and collects their information.
Once a page is crawled, the data contained in the page is processed and indexed. Often, this can involve the steps below.
- Strip out stop words.
- Record the remaining words on the page and how often each occurs.
- Record links to other pages.
- Record information about any images, audio, and embedded media on the page.
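The processing steps above can be sketched in Python with the standard library's `html.parser`. This is a minimal illustration, not how any particular search engine works: the `PageIndexer` class, the `index_page` helper, and the tiny stop-word list are all assumptions made for the example.

```python
from collections import Counter
from html.parser import HTMLParser

# A deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


class PageIndexer(HTMLParser):
    """Collects word frequencies, outgoing links, and media from one page."""

    def __init__(self):
        super().__init__()
        self.words = Counter()
        self.links = []
        self.media = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])               # link to another page
        elif tag in ("img", "audio", "video", "embed") and attrs.get("src"):
            self.media.append((tag, attrs["src"]))         # image, audio, or embedded media

    def handle_data(self, data):
        for raw in data.lower().split():
            word = raw.strip(".,;:!?\"'()")
            if word and word not in STOP_WORDS:            # strip out stop words
                self.words[word] += 1                      # record word frequency


def index_page(html):
    parser = PageIndexer()
    parser.feed(html)
    parser.close()
    return parser


page = index_page(
    '<p>The cat and the hat. A cat!</p>'
    '<a href="/next">next page</a><img src="cat.png">'
)
print(dict(page.words), page.links, page.media)
```

Each bullet in the list maps to one branch of the parser: stop words are skipped, remaining words are counted, and links and media are recorded separately.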
The data collected is used to rank each page. These rankings then determine which pages to show in the search results and in what order.
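As a toy illustration of ranking, the Python sketch below scores each page by how often it contains the query terms plus the number of other pages that link to it. The scoring formula and the `rank` function are assumptions chosen for simplicity; real search engines use proprietary and far more elaborate algorithms.

```python
def rank(pages, query):
    """Order matching pages by a toy score: term frequency plus inbound links.

    pages maps a URL to {"words": {word: count}, "links": [url, ...]}.
    """
    # Count how many other pages link to each page.
    inbound = {url: 0 for url in pages}
    for url, page in pages.items():
        for target in page["links"]:
            if target in inbound and target != url:
                inbound[target] += 1

    terms = query.lower().split()
    scores = {}
    for url, page in pages.items():
        frequency = sum(page["words"].get(term, 0) for term in terms)
        if frequency:  # pages without any query term are not shown
            scores[url] = frequency + inbound[url]

    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


pages = {
    "a": {"words": {"cat": 3}, "links": ["b"]},
    "b": {"words": {"cat": 1, "hat": 2}, "links": ["a"]},
    "c": {"words": {"dog": 5}, "links": ["a"]},
}
print(rank(pages, "cat"))
```

The score decides both which pages appear (only those containing a query term) and the order in which they are listed.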
Finally, once the data is processed, it’s broken up into files, inserted into a database, or loaded into memory where it’s accessed when a search is performed.
Do all search engines give the same results?
Not necessarily. Search engines use proprietary algorithms to index and correlate data, so each one has its own approach to finding what you’re looking for. Results may be based on where you’re located, what else you’ve searched for, and which results other users preferred for the same search. Each search engine weights these factors differently, so each offers you different results.
What is the best search engine?
There isn’t one search engine that is better than all the others. Many people could argue that Google’s search engine is the best, and it is the most popular and well-known. It’s so popular that people often use it as a verb when telling someone to search for their question.
Microsoft’s Bing search engine is also popular and used by many people. Bing does an excellent job of finding information and answering questions. Bing is also what powers the search in Windows 10 and the Yahoo search engine.
Users concerned with privacy often prefer DuckDuckGo. This search engine keeps its users anonymous and is an excellent option for anyone concerned about how much information Google and Bing collect on their users.