Journal of Information Retrieval, Kluwer-Academic, vol. 8 (January 2005)
The heterogeneous, distributed, and voluminous nature of many government and corporate data sources imposes severe constraints on meeting the diverse requirements of users who analyze the data. Additionally, communication bandwidth limitations, time constraints, and the multiplicity of data formats impose further restrictions on users of these distributed data sources. What is required is a reliable, robust, and efficient data retrieval technique that can access data from distributed data sources while maintaining the autonomy of individual sources. In this paper, we present an Agent-based Complex QUerying and Information Retrieval Engine (ACQUIRE) for large, heterogeneous, and distributed data sources. ACQUIRE acts as a softbot or interface agent by presenting users with the appearance of a single, unified, homogeneous data source, against which users can pose high-level declarative queries. ACQUIRE translates each such user query into a set of sub-queries by employing a combination of planning and traditional database query optimization techniques. For each sub-query, ACQUIRE then spawns a corresponding mobile agent, which retrieves data from the appropriate data source. These mobile agents carry with them data-processing code that can be executed at the remote site, thus reducing the size of the data returned by the agent. When all mobile agents have returned, ACQUIRE filters and merges the retrieved data and presents the results to the user. Validation experiments on simulated NASA Distributed Active Archive Centers (DAACs) have demonstrated that complex queries can be effectively decomposed and retrieved by this approach, resulting in the twin benefits of improved ease of use and significantly reduced query retrieval times.
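The decompose, dispatch, and merge flow described in the abstract can be illustrated with a minimal Python sketch. This is not ACQUIRE's implementation: the in-memory `SOURCES` dictionary, the `min_temp` query field, and the functions `decompose`, `run_agent`, and `acquire` are all hypothetical names chosen for illustration, and worker threads stand in for mobile agents. The planner here is trivial (one sub-query per source), whereas the paper combines planning with database query optimization.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for two autonomous remote data sources.
SOURCES = {
    "daac_a": [{"id": 1, "temp": 280}, {"id": 2, "temp": 301}],
    "daac_b": [{"id": 3, "temp": 310}, {"id": 4, "temp": 275}],
}


def decompose(query):
    """Split one user query into one sub-query per source (trivial planner)."""
    return [
        {"source": name, "min_temp": query["min_temp"]}
        for name in sorted(SOURCES)
    ]


def run_agent(sub_query):
    """Simulated agent: filter at the source so only matching rows return."""
    rows = SOURCES[sub_query["source"]]
    return [row for row in rows if row["temp"] >= sub_query["min_temp"]]


def acquire(query):
    """Dispatch one agent per sub-query, wait for all, then merge results."""
    sub_queries = decompose(query)
    with ThreadPoolExecutor() as pool:
        partial_results = list(pool.map(run_agent, sub_queries))
    merged = [row for part in partial_results for row in part]
    return sorted(merged, key=lambda row: row["id"])


result = acquire({"min_temp": 300})
```

Because the filter runs inside `run_agent`, each simulated agent returns only the rows that satisfy the sub-query rather than the full contents of its source, which mirrors the abstract's point about reducing the size of the returned data.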
For More Information
(Please include your name, address, organization, and the paper reference. Requests without this information will not be honored.)