ET-Space - Visualizing Entertainment
In 1995, a testbed of 110,000 Internet homepages from the entertainment section of Yahoo! was gathered by an Internet spider. An automatic indexing algorithm was applied to the homepages, and a concept space (ET-Space) and a multi-layered Kohonen self-organizing map (SOM), named ET-Map, were created.
The Concept Space interface allowed users to search for documents directly with keywords or to consult the concept space for search term suggestions. The concept space was an automatically generated thesaurus based on word co-occurrence, both within individual documents and across the collection as a whole. Because the system suggested search terms as they appeared in the documents, it could improve recall. The top 40 terms were displayed in an order determined by a weighting/ranking algorithm. Each concept space term had an identifier that linked it to the source search term. Users could enter a search term as a single word or a phrase, and could enter multiple search terms. Clicking the box adjacent to a concept space term added that term to the previous search. The user could then choose to see more concept space terms or to search for documents.
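The term-suggestion idea described above can be illustrated with a minimal sketch. It assumes a plain document-level co-occurrence count as the weight; the actual ET-Space weighting/ranking algorithm is not described here, and the names build_concept_space and suggest_terms are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_concept_space(docs):
    """Map each term to a Counter of the terms it co-occurs with.

    Assumption: the weight is the number of documents in which two
    terms appear together. The original system's weighting may differ.
    """
    space = defaultdict(Counter)
    for doc in docs:
        # Use each distinct term once per document.
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            space[a][b] += 1
            space[b][a] += 1
    return space


def suggest_terms(space, query_terms, top_n=40):
    """Return up to top_n (suggested term, source search term, weight) tuples.

    Each suggestion keeps the search term it came from, mirroring the
    identifier that linked a concept space term to its source term.
    """
    candidates = []
    for source in query_terms:
        for term, weight in space.get(source, {}).items():
            if term not in query_terms:
                candidates.append((term, source, weight))
    # Highest weight first; ties are broken alphabetically for stable output.
    candidates.sort(key=lambda c: (-c[2], c[0], c[1]))
    return candidates[:top_n]


if __name__ == "__main__":
    docs = ["jazz music festival", "jazz music radio", "film festival awards"]
    space = build_concept_space(docs)
    for term, source, weight in suggest_terms(space, ["jazz"]):
        print(term, source, weight)
```

In this sketch, selecting a suggested term would simply append it to the list of search terms before the next query, which corresponds to clicking the box next to a concept space term.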
A 2-D multi-layered SOM was generated from the entertainment collection. Text labels and colors were used to demarcate regions in the SOM; the colors had no specific meaning. SOMs cluster similar categories near each other on the map. Clicking on a map region took the user down a layer in the multi-layered map or, if the region contained fewer than 200 documents, showed the documents associated with that category.
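The click behavior of the multi-layered map can be sketched as a simple rule. This is an illustration only, not the ET-Map implementation: the Region class, its fields, and the click function are hypothetical, and the fallback for a large region with no lower layer is an assumption. Only the 200-document threshold comes from the description above.

```python
from dataclasses import dataclass, field

# From the description: regions with fewer than 200 documents list them directly.
DOC_THRESHOLD = 200


@dataclass
class Region:
    """A labeled region of the map (hypothetical structure)."""
    label: str
    documents: list = field(default_factory=list)  # documents assigned to the region
    submap: list = field(default_factory=list)     # child regions on the next layer


def click(region):
    """Decide what a click on a region shows.

    Returns ("documents", [...]) when the region is small enough, otherwise
    ("submap", [...]) to move down one layer. Assumption: a large region
    without a lower layer also falls back to listing its documents.
    """
    if len(region.documents) < DOC_THRESHOLD or not region.submap:
        return ("documents", region.documents)
    return ("submap", region.submap)


if __name__ == "__main__":
    jazz = Region("jazz", documents=[f"doc{i}" for i in range(50)])
    music = Region("music", documents=[f"doc{i}" for i in range(300)], submap=[jazz])
    print(click(music)[0])
    print(click(jazz)[0])
```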