Document (#36042)

Author
Manning, C.D.
Raghavan, P.
Schütze, H.
Title
Introduction to information retrieval
Imprint
Cambridge : Cambridge University Press
Year
2008
Pages
XXI, 482 pp.
ISBN
978-0-521-86571-5
Abstract
Class-tested and coherent, this textbook teaches information retrieval, including web search, text classification, and text clustering, starting from basic concepts. Ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students. Slides and additional exercises are available for lecturers. - This book provides what Salton and Van Rijsbergen both failed to achieve. Even more important, unlike some other books in IR, the authors appear to care about making the theory as accessible as possible to the reader, on occasion including short primers to certain topics or choosing to explain difficult concepts using simplified approaches. Its coverage [is] excellent, the quality of writing high and I was surprised how much I learned from reading it. I think the online resources are impressive.
Content
Contents: Boolean retrieval - The term vocabulary & postings lists - Dictionaries and tolerant retrieval - Index construction - Index compression - Scoring, term weighting & the vector space model - Computing scores in a complete search system - Evaluation in information retrieval - Relevance feedback & query expansion - XML retrieval - Probabilistic information retrieval - Language models for information retrieval - Text classification & Naive Bayes - Vector space classification - Support vector machines & machine learning on documents - Flat clustering - Hierarchical clustering - Matrix decompositions & latent semantic indexing - Web search basics - Web crawling and indexes - Link analysis. See the digital version at: http://nlp.stanford.edu/IR-book/pdf/irbookprint.pdf.
LCSH
Text processing (Computer science)
Information retrieval
Document clustering
Semantic Web
RSWK
Dokumentverarbeitung / Information Retrieval / Abfrageverarbeitung (GBV)
Information Retrieval / Einführung (BVB)
Semantic Web (BVB)
Textverarbeitung (BVB)
World Wide Web / Suchmaschine (HBZ)
BK
54.64 / Datenbanken
DDC
025.04
GHBS
AZE (PB)
TWP (PB)
LCC
QA76.9.T48
RVK
ST 306
ST 205
ST 270
ST 515

Similar documents (author)

  1. Manning, R.W.: The Anglo-American Cataloguing Rules and their future (1999) 2.26
    2.256227 = sum of:
      2.256227 = product of:
        4.512454 = sum of:
          4.512454 = weight(author_txt:manning in 6809) [ClassicSimilarity], result of:
            4.512454 = score(doc=6809,freq=1.0), product of:
              0.78375393 = queryWeight, product of:
                1.1233603 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.07573692 = queryNorm
              5.7574883 = fieldWeight in 6809, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.625 = fieldNorm(doc=6809)
        0.5 = coord(1/2)
    
  2. Manning, R.W.: The Anglo American Cataloguing Rules and their future (2000) 2.26
    2.256227 = sum of:
      2.256227 = product of:
        4.512454 = sum of:
          4.512454 = weight(author_txt:manning in 189) [ClassicSimilarity], result of:
            4.512454 = score(doc=189,freq=1.0), product of:
              0.78375393 = queryWeight, product of:
                1.1233603 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.07573692 = queryNorm
              5.7574883 = fieldWeight in 189, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.625 = fieldNorm(doc=189)
        0.5 = coord(1/2)
    
  3. Manning, C.D.: Part-of-Speech Tagging from 97% to 100% : is it time for some linguistics? (2011) 2.26
    2.256227 = sum of:
      2.256227 = product of:
        4.512454 = sum of:
          4.512454 = weight(author_txt:manning in 1121) [ClassicSimilarity], result of:
            4.512454 = score(doc=1121,freq=1.0), product of:
              0.78375393 = queryWeight, product of:
                1.1233603 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.07573692 = queryNorm
              5.7574883 = fieldWeight in 1121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.625 = fieldNorm(doc=1121)
        0.5 = coord(1/2)
    
  4. Mallett, J.; Manning, C.: Multimedia and database design : a discussion of database technology and its use in multimedia (1993) 1.80
    1.8049816 = sum of:
      1.8049816 = product of:
        3.6099632 = sum of:
          3.6099632 = weight(author_txt:manning in 6277) [ClassicSimilarity], result of:
            3.6099632 = score(doc=6277,freq=1.0), product of:
              0.78375393 = queryWeight, product of:
                1.1233603 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.07573692 = queryNorm
              4.6059904 = fieldWeight in 6277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.5 = fieldNorm(doc=6277)
        0.5 = coord(1/2)
    
  5. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 1.80
    1.8049816 = sum of:
      1.8049816 = product of:
        3.6099632 = sum of:
          3.6099632 = weight(author_txt:manning in 1060) [ClassicSimilarity], result of:
            3.6099632 = score(doc=1060,freq=1.0), product of:
              0.78375393 = queryWeight, product of:
                1.1233603 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.07573692 = queryNorm
              4.6059904 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.5 = fieldNorm(doc=1060)
        0.5 = coord(1/2)
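
The score trees above are labelled [ClassicSimilarity], Lucene's TF-IDF scoring. As a minimal sketch, assuming the standard ClassicSimilarity definitions idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq), the Python below recomputes the first author match (doc 6809) from the raw inputs listed in its tree. The function names are illustrative and not part of any library. Lucene works in 32-bit floats, so the last digits may differ slightly from the 4.512454 and 2.256227 reported in the tree.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming ClassicSimilarity's formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq: float) -> float:
    """Term frequency factor, assuming ClassicSimilarity's square root."""
    return math.sqrt(freq)

def term_score(boost, doc_freq, max_docs, query_norm, freq, field_norm):
    """Product of queryWeight and fieldWeight, as laid out in the tree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Inputs copied from the tree for author_txt:manning in doc 6809.
weight = term_score(
    boost=1.1233603,
    doc_freq=11,
    max_docs=44218,
    query_norm=0.07573692,
    freq=1.0,
    field_norm=0.625,
)

# coord(1/2): one of two query clauses matched.
score = weight * (1 / 2)

print(f"weight={weight:.6f} score={score:.6f}")
```

The only input that differs for results 4 and 5 is fieldNorm (0.5 instead of 0.625), which is why those entries score 1.80 rather than 2.26.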
    

Similar documents (content)

  1. Colomb, R.M.: Information spaces : the architecture of cyberspace (2002) 0.16
    0.16061792 = sum of:
      0.16061792 = product of:
        0.80308956 = sum of:
          0.09313245 = weight(abstract_txt:exercises in 262) [ClassicSimilarity], result of:
            0.09313245 = score(doc=262,freq=1.0), product of:
              0.16095416 = queryWeight, product of:
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.021731686 = queryNorm
              0.57862717 = fieldWeight in 262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.078125 = fieldNorm(doc=262)
          0.037793893 = weight(abstract_txt:including in 262) [ClassicSimilarity], result of:
            0.037793893 = score(doc=262,freq=1.0), product of:
              0.11115421 = queryWeight, product of:
                1.1752408 = boost
                4.352168 = idf(docFreq=1547, maxDocs=44218)
                0.021731686 = queryNorm
              0.34001315 = fieldWeight in 262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.352168 = idf(docFreq=1547, maxDocs=44218)
                0.078125 = fieldNorm(doc=262)
          0.043278746 = weight(abstract_txt:concepts in 262) [ClassicSimilarity], result of:
            0.043278746 = score(doc=262,freq=1.0), product of:
              0.12166378 = queryWeight, product of:
                1.2295454 = boost
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.021731686 = queryNorm
              0.35572416 = fieldWeight in 262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.078125 = fieldNorm(doc=262)
          0.59508246 = weight(subject_txt:dokumentverarbeitung in 262) [ClassicSimilarity], result of:
            0.59508246 = score(doc=262,freq=1.0), product of:
              0.27898505 = queryWeight, product of:
                1.3165561 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.021731686 = queryNorm
              2.1330264 = fieldWeight in 262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.21875 = fieldNorm(doc=262)
          0.033801977 = weight(abstract_txt:information in 262) [ClassicSimilarity], result of:
            0.033801977 = score(doc=262,freq=3.0), product of:
              0.10318255 = queryWeight, product of:
                1.961226 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.021731686 = queryNorm
              0.32759392 = fieldWeight in 262, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=262)
        0.2 = coord(5/25)
    
  2. Kuropka, D.: Modelle zur Repräsentation natürlichsprachlicher Dokumente : Ontologie-basiertes Information-Filtering und -Retrieval mit relationalen Datenbanken (2004) 0.15
    0.15445429 = sum of:
      0.15445429 = product of:
        0.96533936 = sum of:
          0.8415737 = weight(subject_txt:dokumentverarbeitung in 4325) [ClassicSimilarity], result of:
            0.8415737 = score(doc=4325,freq=2.0), product of:
              0.27898505 = queryWeight, product of:
                1.3165561 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.021731686 = queryNorm
              3.0165548 = fieldWeight in 4325, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.21875 = fieldNorm(doc=4325)
          0.03638114 = weight(abstract_txt:text in 4325) [ClassicSimilarity], result of:
            0.03638114 = score(doc=4325,freq=1.0), product of:
              0.14394596 = queryWeight, product of:
                1.6379825 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.021731686 = queryNorm
              0.25274166 = fieldWeight in 4325, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4325)
          0.022079358 = weight(abstract_txt:information in 4325) [ClassicSimilarity], result of:
            0.022079358 = score(doc=4325,freq=2.0), product of:
              0.10318255 = queryWeight, product of:
                1.961226 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.021731686 = queryNorm
              0.21398345 = fieldWeight in 4325, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=4325)
          0.065305166 = weight(abstract_txt:retrieval in 4325) [ClassicSimilarity], result of:
            0.065305166 = score(doc=4325,freq=2.0), product of:
              0.21260835 = queryWeight, product of:
                2.8152351 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.021731686 = queryNorm
              0.3071618 = fieldWeight in 4325, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=4325)
        0.16 = coord(4/25)
    
  3. Kuropka, D.: Modelle zur Repräsentation natürlichsprachlicher Dokumente : Ontologie-basiertes Information-Filtering und -Retrieval mit relationalen Datenbanken (2004) 0.15
    0.15445429 = sum of:
      0.15445429 = product of:
        0.96533936 = sum of:
          0.8415737 = weight(subject_txt:dokumentverarbeitung in 4385) [ClassicSimilarity], result of:
            0.8415737 = score(doc=4385,freq=2.0), product of:
              0.27898505 = queryWeight, product of:
                1.3165561 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.021731686 = queryNorm
              3.0165548 = fieldWeight in 4385, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.21875 = fieldNorm(doc=4385)
          0.03638114 = weight(abstract_txt:text in 4385) [ClassicSimilarity], result of:
            0.03638114 = score(doc=4385,freq=1.0), product of:
              0.14394596 = queryWeight, product of:
                1.6379825 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.021731686 = queryNorm
              0.25274166 = fieldWeight in 4385, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4385)
          0.022079358 = weight(abstract_txt:information in 4385) [ClassicSimilarity], result of:
            0.022079358 = score(doc=4385,freq=2.0), product of:
              0.10318255 = queryWeight, product of:
                1.961226 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.021731686 = queryNorm
              0.21398345 = fieldWeight in 4385, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=4385)
          0.065305166 = weight(abstract_txt:retrieval in 4385) [ClassicSimilarity], result of:
            0.065305166 = score(doc=4385,freq=2.0), product of:
              0.21260835 = queryWeight, product of:
                2.8152351 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.021731686 = queryNorm
              0.3071618 = fieldWeight in 4385, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=4385)
        0.16 = coord(4/25)
    
  4. Belew, R.K.: Finding out about : a cognitive perspective on search engine technology and the WWW (2001) 0.12
    0.1204782 = sum of:
      0.1204782 = product of:
        0.37649438 = sum of:
          0.055879474 = weight(abstract_txt:exercises in 3346) [ClassicSimilarity], result of:
            0.055879474 = score(doc=3346,freq=1.0), product of:
              0.16095416 = queryWeight, product of:
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.021731686 = queryNorm
              0.3471763 = fieldWeight in 3346, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
          0.09807357 = weight(abstract_txt:textbook in 3346) [ClassicSimilarity], result of:
            0.09807357 = score(doc=3346,freq=2.0), product of:
              0.18587688 = queryWeight, product of:
                1.0746365 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.021731686 = queryNorm
              0.5276265 = fieldWeight in 3346, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
          0.022676336 = weight(abstract_txt:including in 3346) [ClassicSimilarity], result of:
            0.022676336 = score(doc=3346,freq=1.0), product of:
              0.11115421 = queryWeight, product of:
                1.1752408 = boost
                4.352168 = idf(docFreq=1547, maxDocs=44218)
                0.021731686 = queryNorm
              0.20400788 = fieldWeight in 3346, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.352168 = idf(docFreq=1547, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
          0.024639813 = weight(abstract_txt:semantic in 3346) [ClassicSimilarity], result of:
            0.024639813 = score(doc=3346,freq=1.0), product of:
              0.11748136 = queryWeight, product of:
                1.2082266 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.021731686 = queryNorm
              0.20973381 = fieldWeight in 3346, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
          0.034483954 = weight(abstract_txt:making in 3346) [ClassicSimilarity], result of:
            0.034483954 = score(doc=3346,freq=1.0), product of:
              0.14699031 = queryWeight, product of:
                1.3514757 = boost
                5.0048037 = idf(docFreq=805, maxDocs=44218)
                0.021731686 = queryNorm
              0.23460017 = fieldWeight in 3346, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0048037 = idf(docFreq=805, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
          0.05457171 = weight(abstract_txt:text in 3346) [ClassicSimilarity], result of:
            0.05457171 = score(doc=3346,freq=4.0), product of:
              0.14394596 = queryWeight, product of:
                1.6379825 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.021731686 = queryNorm
              0.37911248 = fieldWeight in 3346, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
          0.026182897 = weight(abstract_txt:information in 3346) [ClassicSimilarity], result of:
            0.026182897 = score(doc=3346,freq=5.0), product of:
              0.10318255 = queryWeight, product of:
                1.961226 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.021731686 = queryNorm
              0.25375316 = fieldWeight in 3346, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
          0.05998663 = weight(abstract_txt:retrieval in 3346) [ClassicSimilarity], result of:
            0.05998663 = score(doc=3346,freq=3.0), product of:
              0.21260835 = queryWeight, product of:
                2.8152351 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.021731686 = queryNorm
              0.28214616 = fieldWeight in 3346, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.046875 = fieldNorm(doc=3346)
        0.32 = coord(8/25)
    
  5. Salton, G.; Rijsbergen, C.J. van; Maron, M.E.: Panel on key issues in information retrieval (1983) 0.12
    0.11726889 = sum of:
      0.11726889 = product of:
        0.7329306 = sum of:
          0.23308441 = weight(abstract_txt:salton in 7410) [ClassicSimilarity], result of:
            0.23308441 = score(doc=7410,freq=1.0), product of:
              0.23707822 = queryWeight, product of:
                1.2136536 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.021731686 = queryNorm
              0.98315406 = fieldWeight in 7410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.109375 = fieldNorm(doc=7410)
          0.2580538 = weight(abstract_txt:rijsbergen in 7410) [ClassicSimilarity], result of:
            0.2580538 = score(doc=7410,freq=1.0), product of:
              0.25372097 = queryWeight, product of:
                1.25553 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.021731686 = queryNorm
              1.0170772 = fieldWeight in 7410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.109375 = fieldNorm(doc=7410)
          0.061093427 = weight(abstract_txt:information in 7410) [ClassicSimilarity], result of:
            0.061093427 = score(doc=7410,freq=5.0), product of:
              0.10318255 = queryWeight, product of:
                1.961226 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.021731686 = queryNorm
              0.5920907 = fieldWeight in 7410, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.109375 = fieldNorm(doc=7410)
          0.18069895 = weight(abstract_txt:retrieval in 7410) [ClassicSimilarity], result of:
            0.18069895 = score(doc=7410,freq=5.0), product of:
              0.21260835 = queryWeight, product of:
                2.8152351 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.021731686 = queryNorm
              0.8499146 = fieldWeight in 7410, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.109375 = fieldNorm(doc=7410)
        0.16 = coord(4/25)
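
When several query terms match, the per-term weights are summed and then scaled by coord, the fraction of query terms found in the document. A minimal sketch, assuming the standard ClassicSimilarity formulas (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)) and using illustrative names that are not a library API, recomputes the last content match (doc 7410) from the values in its tree. Results should land close to, but not necessarily bit-identical with, the 32-bit figures shown above.

```python
import math

MAX_DOCS = 44218          # maxDocs from the tree
QUERY_NORM = 0.021731686  # queryNorm shared by all terms of this query
QUERY_TERMS = 25          # denominator of coord(4/25)

def idf(doc_freq: int) -> float:
    """Inverse document frequency, assuming ClassicSimilarity's formula."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    """queryWeight * fieldWeight for one matching term."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# (term, boost, docFreq, freq, fieldNorm) copied from the tree for doc 7410.
matches = [
    ("salton",      1.2136536, 14,    1.0, 0.109375),
    ("rijsbergen",  1.25553,   10,    1.0, 0.109375),
    ("information", 1.961226,  10677, 5.0, 0.109375),
    ("retrieval",   2.8152351, 3720,  5.0, 0.109375),
]

per_term = {
    term: term_score(boost, doc_freq, freq, field_norm)
    for term, boost, doc_freq, freq, field_norm in matches
}

raw_sum = sum(per_term.values())
coord = len(matches) / QUERY_TERMS  # 4 of 25 query terms matched
score = raw_sum * coord

for term, value in per_term.items():
    print(f"{term:12s} {value:.6f}")
print(f"sum={raw_sum:.6f} coord={coord:.2f} score={score:.6f}")
```

The rare terms (salton, rijsbergen) contribute most of the sum despite occurring once each, because idf enters the product twice, once in queryWeight and once in fieldWeight.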