Search (5 results, page 1 of 1)

  • classification_ss:"025.04"
  1. Huberman, B.: The laws of the Web : patterns in the ecology of information (2001) 0.04
    0.036574703 = product of:
      0.09143676 = sum of:
        0.06385853 = weight(_text_:link in 6123) [ClassicSimilarity], result of:
          0.06385853 = score(doc=6123,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.23549749 = fieldWeight in 6123, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.03125 = fieldNorm(doc=6123)
        0.027578231 = weight(_text_:22 in 6123) [ClassicSimilarity], result of:
          0.027578231 = score(doc=6123,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.15476047 = fieldWeight in 6123, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=6123)
      0.4 = coord(2/5)
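    The explain output above can be checked by hand: with Lucene's ClassicSimilarity, each term contributes queryWeight * fieldWeight, where queryWeight = idf * queryNorm and fieldWeight = sqrt(termFreq) * idf * fieldNorm; the per-term contributions are summed and multiplied by the coordination factor coord(2/5). A minimal sketch in plain Python, using only the numbers shown above (the helper names are illustrative, not part of Lucene's API):

    import math

    def idf(doc_freq, max_docs):
        # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
        return 1.0 + math.log(max_docs / (doc_freq + 1))

    def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
        # queryWeight = idf * queryNorm; fieldWeight = sqrt(freq) * idf * fieldNorm
        term_idf = idf(doc_freq, max_docs)
        query_weight = term_idf * query_norm
        field_weight = math.sqrt(freq) * term_idf * field_norm
        return query_weight * field_weight

    query_norm, field_norm, max_docs = 0.05088753, 0.03125, 44218
    link = term_score(2.0, 582, max_docs, query_norm, field_norm)   # ~0.0639 for _text_:link
    t_22 = term_score(2.0, 3622, max_docs, query_norm, field_norm)  # ~0.0276 for _text_:22
    print((link + t_22) * 2 / 5)                                    # coord(2/5) -> ~0.0366

    The same arithmetic reproduces the scores of the remaining results; only freq, docFreq, fieldNorm, and the coordination factor change.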
    
    Date
    22.10.2006 10:22:33
    Footnote
    Review in: nfd 54(2003) H.8, S.497 (T. Mandl): "Laws of digital anarchy - hyperlinks on the Internet arise as the result of social processes and can also be interpreted as a formal graph in the mathematical sense. The topic of hyperlinks is highly relevant in information retrieval, since search engines take the link structure into account when computing their results. Algorithms that estimate the 'reputation' of a page, such as Google's PageRank, weight a page more highly when many links point to it. Two very readable books present the latest findings on the network structure of the Internet. The author of the first book, the economist Huberman, heads a research department at Hewlett Packard. In his book, Huberman first describes the history of the Internet as a technological revolution and then quickly turns to its evolution and the probability distributions that prevail within it. Surprisingly, power-law probability distributions, which resemble the Zipf distribution, occur frequently on the Internet. The book's title refers to these highly unequal distributions, for instance of incoming hypertext links or of visitors per page. These recurring probability distributions almost seem to constitute a law of the Internet: there are, for example, many sites with very few pages and a few with millions of pages; some pages are rarely visited while others attract a large share of all Internet traffic; most pages are pointed to by very few links, while a few popular pages are the target of millions of links. Incidentally, both authors devote their penultimate chapters to markets on the Internet; here, at the latest, the economic aspects of networks become apparent. Both titles introduce the reader to the new research on the structure of the Internet as a network and are easy to read. Both are scholarly books, but they also address the interested lay reader. Barabási's book is somewhat more recent, more conversational, longer, more comprehensive, and somewhat more popular-scientific."
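    The link-based ranking the review refers to (a page gains weight when many pages link to it) can be sketched in a few lines of Python; the three-page graph below is an invented toy example, not data from either book:

    # PageRank-style power iteration on a made-up three-page graph.
    links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    damping = 0.85

    for _ in range(50):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            share = rank[page] / len(outgoing)          # spread a page's rank over its outgoing links
            for target in outgoing:
                new_rank[target] += damping * share
        rank = new_rank

    print(rank)  # "C" receives the most incoming links and ends up ranked highest

    Which pages accumulate rank depends only on the link structure; on real Web graphs the in-link counts follow the power-law distributions the review describes.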
  2. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.02
    0.022121247 = product of:
      0.11060623 = sum of:
        0.11060623 = weight(_text_:link in 7) [ClassicSimilarity], result of:
          0.11060623 = score(doc=7,freq=6.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.40789366 = fieldWeight in 7, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.2 = coord(1/5)
    
    Abstract
    The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example the addition of a new chapter on link-structure algorithms used in search engines such as Google. The chapter on user interface has been rewritten to specifically focus on search engine usability. In addition the authors have added new recommendations for further reading and expanded the bibliography, and have updated and streamlined the index to make it more reader friendly.
    Content
    Contents: Introduction (Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format); Document File Preparation (Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures); Vector Space Models (Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations); Matrix Decompositions (QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques); Query Management (Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries); Ranking and Relevance Feedback (Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback); Searching by Link Structure (HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary); User Interface Considerations (General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations); Further Reading
  3. Manning, C.D.; Raghavan, P.; Schütze, H.: Introduction to information retrieval (2008) 0.01
    0.012771706 = product of:
      0.06385853 = sum of:
        0.06385853 = weight(_text_:link in 4041) [ClassicSimilarity], result of:
          0.06385853 = score(doc=4041,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.23549749 = fieldWeight in 4041, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.03125 = fieldNorm(doc=4041)
      0.2 = coord(1/5)
    
    Content
    Contents: Boolean retrieval - The term vocabulary & postings lists - Dictionaries and tolerant retrieval - Index construction - Index compression - Scoring, term weighting & the vector space model - Computing scores in a complete search system - Evaluation in information retrieval - Relevance feedback & query expansion - XML retrieval - Probabilistic information retrieval - Language models for information retrieval - Text classification & Naive Bayes - Vector space classification - Support vector machines & machine learning on documents - Flat clustering - Hierarchical clustering - Matrix decompositions & latent semantic indexing - Web search basics - Web crawling and indexes - Link analysis. Cf. the digital version at: http://nlp.stanford.edu/IR-book/pdf/irbookprint.pdf.
  4. Anderson, J.D.; Perez-Carballo, J.: Information retrieval design : principles and options for information description, organization, display, and access in information retrieval databases, digital libraries, catalogs, and indexes (2005) 0.00
    0.003447279 = product of:
      0.017236395 = sum of:
        0.017236395 = weight(_text_:22 in 1833) [ClassicSimilarity], result of:
          0.017236395 = score(doc=1833,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.09672529 = fieldWeight in 1833, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1833)
      0.2 = coord(1/5)
    
    Content
    Contents: Chapters 2 to 5: Scopes, Domains, and Display Media (pp. 47-102) - Chapters 6 to 8: Documents, Analysis, and Indexing (pp. 103-176) - Chapters 9 to 10: Exhaustivity and Specificity (pp. 177-196) - Chapters 11 to 13: Displayed/Nondisplayed Indexes, Syntax, and Vocabulary Management (pp. 197-364) - Chapters 14 to 16: Surrogation, Locators, and Surrogate Displays (pp. 365-390) - Chapters 17 and 18: Arrangement and Size of Displayed Indexes (pp. 391-446) - Chapters 19 to 21: Search Interface, Record Format, and Full-Text Display (pp. 447-536) - Chapter 22: Implementation and Evaluation (pp. 537-541)
  5. Information science in transition (2009) 0.00
    0.003447279 = product of:
      0.017236395 = sum of:
        0.017236395 = weight(_text_:22 in 634) [ClassicSimilarity], result of:
          0.017236395 = score(doc=634,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.09672529 = fieldWeight in 634, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.01953125 = fieldNorm(doc=634)
      0.2 = coord(1/5)
    
    Date
    22.02.2013 11:35:35