Eller College of Management
Artificial Intelligence Laboratory

Digital Libraries Projects - DGPort

DGPort - Digital Government Portal

DGPort is an online search system designed to provide efficient and precise searching of web documents that may be relevant to researchers in the Digital Government domain.

See a PDF of the poster presentation.

DGPort Concept Space:
DGPort Concept Space is a "thesaurus" of terms relevant to the Digital Government domain, created by the Artificial Intelligence Lab of the MIS department at the University of Arizona. The thesaurus was built using a text mining approach that extracts terms and their weighted relationships from a collection of documents pertaining to Digital Government. Its primary purpose is to give users of the DGPort search system a list of keywords that are highly related to their original search term and that may help them find the information they are seeking.
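As a rough illustration of what "terms and their weighted relationships" could look like in practice, the sketch below builds a tiny concept space from co-occurrence counts. The page does not state the weighting formula DGPort uses, so the asymmetric weight here (documents containing both terms divided by documents containing the first term) is an assumption, and the names `build_concept_space` and `related_terms` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_concept_space(docs):
    """Build {term: {related_term: weight}} from an iterable of term sets.

    Assumed weight(a -> b) = (#docs containing a and b) / (#docs containing a).
    This is an illustrative formula, not the one used by DGPort.
    """
    term_df = Counter()   # number of documents each term appears in
    pair_df = Counter()   # number of documents each ordered pair co-occurs in
    for terms in docs:
        terms = set(terms)
        term_df.update(terms)
        pair_df.update(permutations(sorted(terms), 2))

    space = {}
    for (a, b), n_ab in pair_df.items():
        space.setdefault(a, {})[b] = n_ab / term_df[a]
    return space


def related_terms(space, term, k=3):
    """Return up to k terms most strongly related to `term` (ties broken by name)."""
    neighbours = space.get(term, {})
    return sorted(neighbours, key=lambda t: (-neighbours[t], t))[:k]
```

A search interface could call `related_terms(space, user_query_term)` to suggest follow-up keywords; the weights are asymmetric, so a rare term can point strongly at a common one without the reverse being true.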

The current version of the DGPort Concept Space was created from a collection of 300,000 web documents. As the DGPort portal expands, however, the DGPort Concept Space will be broadened to reflect the new documents added to DGPort. In the future, the DGPort Concept Space will reflect the relationships between keywords contained in over 1 million web documents.

DGPort Search Engine:
The DGPort search engine is a vertical search engine created specifically for the domain of Digital Government. The current prototype is designed to search a collection of over 300,000 quality web documents pertinent to Digital Government researchers. The current document collection was generated using the AI Lab’s Search Engine Toolkit and Meta Search module. In conjunction with these tools, an advanced page collecting methodology is used to ensure the quality and coverage of the collection. Furthermore, a content analysis algorithm and a link analysis algorithm are used to rank the search results and to filter out unrelated pages. Microsoft SQL Server serves as the backend database server.
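The page does not describe the content analysis or link analysis algorithms in detail, so the following is only a minimal sketch of how two such signals might be combined for ranking and filtering. It assumes a content score based on overlap with a set of domain terms and a link score based on normalized in-link counts; the function names, the `alpha` weight, and the `min_content` threshold are all hypothetical.

```python
def content_score(text, domain_terms):
    """Fraction of domain terms that occur in the page text (assumed measure).

    Assumes `domain_terms` is a non-empty set of lowercase words.
    """
    words = set(text.lower().split())
    return len(words & domain_terms) / len(domain_terms)


def link_scores(links):
    """Scale in-link counts to [0, 1]. `links` maps page -> pages it links to."""
    counts = {page: 0 for page in links}
    for src, targets in links.items():
        for dst in set(targets):
            if dst in counts and dst != src:  # ignore unknown pages and self-links
                counts[dst] += 1
    top = max(counts.values(), default=0) or 1
    return {page: c / top for page, c in counts.items()}


def rank_pages(pages, links, domain_terms, alpha=0.5, min_content=0.2):
    """Drop pages with low content scores, then rank by a weighted combination."""
    link = link_scores(links)
    scored = []
    for url, text in pages.items():
        c = content_score(text, domain_terms)
        if c < min_content:
            continue  # treated as unrelated to the domain
        scored.append((alpha * c + (1 - alpha) * link.get(url, 0.0), url))
    return [url for _, url in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

In this sketch the content score does double duty as the relevance filter and as half of the ranking signal; a production system would more likely use separate, better-calibrated measures for each.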

Future plans for DGPort include the expansion of the DGPort document collection to over 1 million documents as well as the inclusion of the Stanford .gov collection - a collection of almost 17 million web documents relating to over 5,500 government web sites.

Meta Search:
The DGPort search system supports meta searching of quality sites related to Digital Government. Meta searching sends a query to multiple search engines, literature databases, and online journals, and collates only the highest-ranking subset of results from each data source, thus increasing precision. Meta search also provides a simple, uniform user interface that helps users cope with information overload and low-precision results.
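The collation step can be sketched as follows, under the assumption that each data source is wrapped in a function returning a ranked list of URLs. The `meta_search` function, the `top_k` parameter, and the round-robin merge order are illustrative choices, not DGPort's documented behavior.

```python
def meta_search(query, sources, top_k=2):
    """Query every source, keep each one's top_k results, and merge them.

    `sources` maps a source name to a callable(query) -> ranked list of URLs.
    Results are interleaved by rank, and duplicate URLs are kept only once.
    """
    top_lists = {name: fetch(query)[:top_k] for name, fetch in sources.items()}

    merged, seen = [], set()
    for rank in range(top_k):
        for name, results in top_lists.items():
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append((results[rank], name))
    return merged
```

Interleaving by rank avoids comparing relevance scores across sources, which are generally not on a common scale.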

Document Categorization and Visualization:
An ideal Information Retrieval (IR) system should categorize retrieved documents automatically and give the user rapid access to various aspects of the subject of interest. DGPort works toward this goal with two tools that categorize returned documents: the Document Organizer and the Self Organizing Map. The Document Organizer classifies the documents retrieved from a meta search into categories based on the occurrence of keywords extracted from the documents. The Self Organizing Map visualization tool, in turn, helps the user make sense of the returned collection as a whole.
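A keyword-based organizer of the kind described for the Document Organizer might look roughly like the sketch below. The keyword extraction (most frequent non-stopword terms), the tiny stopword list, and the `min_docs` threshold are simplifying assumptions, and the function names are hypothetical; the Self Organizing Map is not covered here.

```python
from collections import Counter, defaultdict

# Deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to", "on"}


def extract_keywords(text, n=5):
    """Return the n most frequent non-stopword terms (ties broken alphabetically)."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def organize(docs, min_docs=2):
    """Group documents under each keyword shared by at least min_docs documents.

    `docs` maps a document id to its text. A document may land in several groups.
    """
    folders = defaultdict(list)
    for doc_id, text in docs.items():
        for keyword in extract_keywords(text):
            folders[keyword].append(doc_id)
    return {kw: ids for kw, ids in folders.items() if len(ids) >= min_docs}
```

Keywords that occur in only one document are dropped, so the remaining groups each describe a theme shared across the result set.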
