Document (#26556)

Author
Dodge, M.
Title
A map of Yahoo!
Source
http://mappa.mundi.net/maps/maps_009/
Year
2000
Content
"Introduction
Yahoo! is the undisputed king of the Web directories, providing one of the key information navigation tools on the Internet. It has maintained its popularity over many Internet-years as the most visited Web site, against intense competition. This is because it does a good job of sifting, cataloguing and organising the Web [1]. But what would a map of Yahoo!'s hierarchical classification of the Web look like? Would an interactive map of Yahoo!, rather than the conventional listing of sites, be more useful as a navigational tool?
We can get some idea of what a map of Yahoo! might be like by taking a look at ET-Map, a prototype developed by Hsinchun Chen and colleagues in the Artificial Intelligence Lab [2] at the University of Arizona. ET-Map was developed in 1995 as part of innovative research in automatic Internet homepage categorization. It charts a large chunk of Yahoo!: the entertainment section, representing some 110,000 different Web links. The map is a two-dimensional, multi-layered category map; its aim is to provide an intuitive visual information browsing tool. ET-Map can be browsed interactively, explored and queried, using the familiar point-and-click navigation style of the Web to find information of interest.
The View From Above
Browsing for a particular piece of information on the Web can often feel like being stuck in an unfamiliar part of town, walking around at street level looking for a particular store. You know the store is around there somewhere, but your viewpoint at ground level is constrained. What you really want is to get above the streets, hovering half a mile or so up in the air, to see the whole neighbourhood. This kind of bird's-eye view function has been memorably described by David D. Clark, Senior Research Scientist at MIT's Laboratory for Computer Science and the Chairman of the Invisible Worlds Protocol Advisory Board, as the missing "up button" on the browser [3]. ET-Map is a nice example of a prototype for Clark's "up button" view of an information space. The goal of information maps like ET-Map is to provide the browser with a sense of the lie of the information landscape: what is where, the location of clusters and hotspots, and what is related to what. Ideally, this 'big-picture' all-in-one visual summary needs to fit on a single standard computer screen. ET-Map is one of my favourite examples, but there are many other interesting information maps being developed by other researchers and companies (see inset at the bottom of this page).
How does ET-Map work?
Here is a sequence of screenshots of a typical browsing session with ET-Map, which ends with access to Web pages on jazz musician Miles Davis. You can also try out ET-Map for yourself, using a fully working demo on the AI Lab's website [4]. We begin with the top-level map showing forty-odd broad entertainment 'subject regions' represented by regularly shaped tiles. Each tile is a visual summary of a group of Web pages with similar content. The tiles are shaded in different colours to differentiate them; a label identifies the subject of each tile, and the number in brackets tells you how many individual Web page links it contains.
ET-Map uses two important, but common-sense, spatial concepts in its organisation and representation of the Web. Firstly, the size of each 'subject region' is directly related to the number of Web pages in that category. For example, the 'MUSIC' subject area contains over 11,000 pages and so has a much larger area than the neighbouring area of 'LIVE', which has only 4,300-odd pages. This is intuitively meaningful, as the largest tiles are visually more prominent on the map and are likely to be more significant because they contain the most links. Secondly, the concept of neighbourhood proximity is applied, so 'subject regions' closely related in terms of content are plotted close to each other on the map. For example, 'FILM' and 'YEAR'S OSCARS', at the bottom left, are neighbours in both semantic and spatial space. This makes sense, as many things in the real world are ordered in this way, with things that are alike being spatially close together (e.g. the layout of goods in a store, or books in a library). Importantly, ET-Map is also a multi-layer map, with sub-maps showing greater informational resolution through a finer degree of categorization. For any subject region that contains more than two hundred Web pages, a second-level map with more detailed categories is generated. This subdivision of information space is repeated down the hierarchy as far as necessary. In the example, the user selected the 'MUSIC' subject region which, not surprisingly, contained many thousands of pages. A second-level map with numerous different music categories is then presented to the user. Delving deeper, the user wants to learn more about jazz music, so clicking on the 'JAZZ' tile leads to a third-level map, a fine-grained map of jazz-related Web pages. Finally, selecting the 'MILES DAVIS' subject region leads to a more conventional-looking ranking of pages, from which the user selects one to download.
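The two layout rules just described (tile size tied to page count, and a finer sub-map for any region above two hundred pages) can be illustrated with a small sketch. This is not ET-Map's code: the function names are hypothetical, strict proportionality between area and page count is an assumption (the text says only that size is "directly related" to it), and the page counts are rounded or invented for illustration.

```python
# Hypothetical sketch of two ideas from the text: tile area tied to page
# count, and a sub-map for any region holding more than 200 pages.
SUBMAP_THRESHOLD = 200  # "more than two hundred Web pages", per the article


def tile_area_shares(page_counts):
    """Return each region's share of the map area.

    Assumption: area is strictly proportional to page count.
    """
    total = sum(page_counts.values())
    return {name: count / total for name, count in page_counts.items()}


def needs_submap(page_count, threshold=SUBMAP_THRESHOLD):
    """A region gets a finer-grained sub-map only above the threshold."""
    return page_count > threshold


# MUSIC and LIVE are rounded from the figures quoted in the text;
# SMALL REGION is an invented entry to show the below-threshold case.
regions = {"MUSIC": 11000, "LIVE": 4300, "SMALL REGION": 150}
shares = tile_area_shares(regions)
for name, count in regions.items():
    print(f"{name}: {shares[name]:.1%} of area, sub-map: {needs_submap(count)}")
```

Applying `needs_submap` again to each category of a sub-map would give the repeated subdivision "down the hierarchy" that the text describes.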
ET-Map was created using a sophisticated AI technique called the Kohonen self-organizing map, a neural network approach that has been used for automatic analysis and classification of the semantic content of text documents like Web pages. I do not pretend to fully understand how this technique works; I tend to think of it as a clever 'black box' that groups together things that are alike [5]. It is a real challenge to automatically classify pages from a very heterogeneous information collection like the Web into categories that will match the conceptions of a typical user. Directories like Yahoo! tend to rely on the skill of human editors to achieve this. ET-Map is an interesting prototype that I think highlights well the potential for a map-based approach to Web browsing. I am surprised none of the major search engines or directories have introduced the option of mapping results, although I am sure many are working on ideas. People certainly need all the help they can get, as Web growth shows no sign of slowing. Just last month it was reported that the Web had surpassed one billion indexable pages [6].
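For readers who want to look inside the 'black box', the core loop of a self-organizing map is short. The following is a minimal, generic sketch and not ET-Map's implementation: the grid size, learning-rate schedule, Gaussian neighbourhood function and the four-term toy "documents" are all assumptions chosen for illustration. Each grid node holds a weight vector; for every document the closest node is found, and that node and its grid neighbours are nudged toward the document, so that similar documents tend to settle on the same or nearby nodes.

```python
import math
import random


def best_matching_unit(weights, v):
    """Grid position of the node whose weight vector is closest to v."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], v)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a tiny self-organizing map; returns {(row, col): weight list}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink over time.
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        radius = max(radius0 * (1.0 - frac), 0.5)
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                # Nodes near the winner on the grid are pulled more strongly.
                grid_dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist2 / (2.0 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (v[i] - w[i])
    return weights


# Hypothetical toy "documents": weights for four made-up terms
# (jazz, trumpet, film, oscar). Real term vectors are far larger.
docs = {
    "jazz page A": [1.0, 0.9, 0.0, 0.0],
    "jazz page B": [0.9, 1.0, 0.1, 0.0],
    "film page A": [0.0, 0.0, 1.0, 0.9],
    "film page B": [0.0, 0.1, 0.9, 1.0],
}
som = train_som(list(docs.values()), rows=2, cols=2)
for title, vec in docs.items():
    print(title, "->", best_matching_unit(som, vec))
```

The printed grid positions play the role of map tiles: pages that land on the same node would share a 'subject region', and nodes that end up with similar weights sit next to each other on the grid.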
Information Maps
There are many other fascinating examples that employ two-dimensional interactive maps to provide a 'bird's-eye' view of information. They use various underlying techniques of textual analysis and clustering to turn the mass of information into a useful summary map (see "Mining in Textual Mountains" in Mappa.Mundi Magazine). In terms of visual representation they can be divided into two groups: those that generate smooth surfaces and those that produce regular, tiled maps. Unfortunately, we don't have space to examine them in detail, but they are well worth spending some time exploring. I will be covering some of them in future columns.
Research Prototypes
Visual SiteMap: Developed by Xia Lin, based at the College of Library and Information Science, Drexel University.
CVG: Cyberspace geography visualization, developed by Luc Girardin at The Graduate Institute of International Studies, Switzerland.
WEBSOM: Maps the thousands of articles posted on Usenet newsgroups. It is being developed by researchers at the Neural Networks Research Centre, Helsinki University of Technology in Finland.
TreeMaps: Developed by Brian Johnson, Ben Shneiderman and colleagues in the Human-Computer Interaction Lab at the University of Maryland.
Commercial Information Maps
NewsMaps: Provides interactive information landscapes summarizing daily news stories, developed by Cartia, Inc.
Web Squirrel: Creates maps known as information farms. It is developed by Eastgate Systems, Inc.
Umap: Produces interactive maps of Web searches.
Map of the Market: An interactive map of the market performance of the stocks of major US corporations, developed by SmartMoney.com."
Theme
Search engines
Object
Yahoo

Similar documents (content)

  1. dpa: Yahoo schließt nun ein Bündnis mit Google : Fusionsgespräche mit dem Software-Riesen Microsoft wurden endgültig abgesagt (2008) 1.66
    1.6554284 = sum of:
      1.6554284 = weight(abstract_txt:yahoo in 1927) [ClassicSimilarity], result of:
        1.6554284 = fieldWeight in 1927, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.6217136 = idf(docFreq=159, maxDocs=44218)
          0.25 = fieldNorm(doc=1927)
    
  2. Charlier, M.: Netgeschichten: http://sexygirls.com (1997) 1.66
    1.6554284 = sum of:
      1.6554284 = weight(abstract_txt:yahoo in 1216) [ClassicSimilarity], result of:
        1.6554284 = fieldWeight in 1216, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.6217136 = idf(docFreq=159, maxDocs=44218)
          0.25 = fieldNorm(doc=1216)
    
  3. Interactive magazine combines Web, print and CD-ROM (1996) 1.43
    1.433643 = sum of:
      1.433643 = weight(abstract_txt:yahoo in 4991) [ClassicSimilarity], result of:
        1.433643 = fieldWeight in 4991, product of:
          1.7320508 = tf(freq=3.0), with freq of:
            3.0 = termFreq=3.0
          6.6217136 = idf(docFreq=159, maxDocs=44218)
          0.125 = fieldNorm(doc=4991)
    
  4. Callery, A.; Tracy-Proulx, D.: Yahoo! : Cataloging the Web (1997) 1.43
    1.433643 = sum of:
      1.433643 = weight(abstract_txt:yahoo in 3405) [ClassicSimilarity], result of:
        1.433643 = fieldWeight in 3405, product of:
          1.7320508 = tf(freq=3.0), with freq of:
            3.0 = termFreq=3.0
          6.6217136 = idf(docFreq=159, maxDocs=44218)
          0.125 = fieldNorm(doc=3405)
    
  5. Siering, F.: Barfuß zum Erfolg (1999) 1.24
    1.2415713 = sum of:
      1.2415713 = weight(abstract_txt:yahoo in 8504) [ClassicSimilarity], result of:
        1.2415713 = fieldWeight in 8504, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.6217136 = idf(docFreq=159, maxDocs=44218)
          0.1875 = fieldNorm(doc=8504)
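
The score breakdowns above are labelled ClassicSimilarity, Lucene's TF-IDF scoring. As a rough check, the sketch below recomputes three of the listed scores, assuming the classic formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical, and the fieldNorm values are taken as given from the listing rather than derived from field lengths. Lucene works in single-precision floats, so the last digits may differ slightly.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency, assuming 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as in the breakdowns above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# (freq, fieldNorm) pairs copied from entries 1, 3 and 5 of the listing;
# docFreq=159 and maxDocs=44218 are the same for every entry.
for rank, freq, norm in [(1, 1.0, 0.25), (3, 3.0, 0.125), (5, 1.0, 0.1875)]:
    score = field_weight(freq, doc_freq=159, max_docs=44218, field_norm=norm)
    print(f"entry {rank}: {score:.4f}")
```

Since the single query term "yahoo" has the same idf for every document, the ranking here is driven entirely by term frequency and the field norm: entry 3 mentions the term three times, but its lower fieldNorm places it below entries 1 and 2.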