Search (492 results, page 1 of 25)

  • Active filter: theme_ss:"Computerlinguistik"
  1. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.34
    0.34402242 = product of:
      0.5406067 = sum of:
        0.045470655 = weight(_text_:web in 563) [ClassicSimilarity], result of:
          0.045470655 = score(doc=563,freq=8.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.43268442 = fieldWeight in 563, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.15343313 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.15343313 = score(doc=563,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.0065784254 = weight(_text_:information in 563) [ClassicSimilarity], result of:
          0.0065784254 = score(doc=563,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.116372846 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.019532489 = weight(_text_:retrieval in 563) [ClassicSimilarity], result of:
          0.019532489 = score(doc=563,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.20052543 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.15343313 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.15343313 = score(doc=563,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.15343313 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.15343313 = score(doc=563,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.008725693 = product of:
          0.02617708 = sum of:
            0.02617708 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.02617708 = score(doc=563,freq=2.0), product of:
                0.11276386 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032201413 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.33333334 = coord(1/3)
      0.6363636 = coord(7/11)
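    The indented breakdown above is a Lucene explain() trace for the ClassicSimilarity (TF-IDF) ranking formula: each clause score is queryWeight x fieldWeight, with tf(f) = sqrt(f) and idf = 1 + ln(maxDocs/(docFreq+1)). A minimal Python sketch that reproduces the first clause of result 1 from the constants in the trace (the formulas are standard Lucene ClassicSimilarity; nothing else is assumed):

      import math

      # ClassicSimilarity building blocks, reproducing the clause
      # weight(_text_:web in 563) with freq=8.0 from the trace above.
      def idf(doc_freq: int, max_docs: int) -> float:
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def tf(freq: float) -> float:
          return math.sqrt(freq)

      query_norm = 0.032201413              # copied from the trace
      field_norm = 0.046875                 # copied from the trace

      idf_web = idf(4597, 44218)                     # -> 3.2635105
      query_weight = idf_web * query_norm            # -> 0.10508965
      field_weight = tf(8.0) * idf_web * field_norm  # -> 0.43268442
      print(query_weight * field_weight)             # -> 0.045470655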
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language- and domain-independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the alignment process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization.
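    The abstract names the LocalMaxs algorithm without detail. In one common formulation, an n-gram is kept when its association ("glue") score is at least as high as the glue of its two (n-1)-gram parts and strictly higher than the glue of every (n+1)-gram containing it. A toy sketch under that assumption; the glue values are invented, since the thesis's own association measures are not given here:

      from typing import Dict, List, Tuple

      Ngram = Tuple[str, ...]

      def local_maxs(glue: Dict[Ngram, float]) -> List[Ngram]:
          # Keep n-grams whose glue is a local maximum among their
          # immediate sub- and super-n-grams (one common formulation).
          selected = []
          for ng, g in glue.items():
              subs = [ng[1:], ng[:-1]] if len(ng) > 2 else []
              supers = [s for s in glue if len(s) == len(ng) + 1
                        and (s[1:] == ng or s[:-1] == ng)]
              if all(g >= glue.get(s, 0.0) for s in subs) and \
                 all(g > glue[s] for s in supers):
                  selected.append(ng)
          return selected

      toy = {("information", "retrieval"): 0.9,
             ("effective", "information"): 0.3,
             ("effective", "information", "retrieval"): 0.4}
      print(local_maxs(toy))  # [('information', 'retrieval')]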
    Content
    A thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  2. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.30
    0.29612988 = product of:
      0.5429048 = sum of:
        0.051144376 = product of:
          0.15343313 = sum of:
            0.15343313 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.15343313 = score(doc=562,freq=2.0), product of:
                0.27300394 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.032201413 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.022735327 = weight(_text_:web in 562) [ClassicSimilarity], result of:
          0.022735327 = score(doc=562,freq=2.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.21634221 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.15343313 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.15343313 = score(doc=562,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.15343313 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.15343313 = score(doc=562,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.15343313 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.15343313 = score(doc=562,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.008725693 = product of:
          0.02617708 = sum of:
            0.02617708 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.02617708 = score(doc=562,freq=2.0), product of:
                0.11276386 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032201413 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.54545456 = coord(6/11)
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  3. Noever, D.; Ciolino, M.: The Turing deception (2022) 0.19
    0.18597956 = product of:
      0.5114438 = sum of:
        0.051144376 = product of:
          0.15343313 = sum of:
            0.15343313 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.15343313 = score(doc=862,freq=2.0), product of:
                0.27300394 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.032201413 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
        0.15343313 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.15343313 = score(doc=862,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
        0.15343313 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.15343313 = score(doc=862,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
        0.15343313 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.15343313 = score(doc=862,freq=2.0), product of:
            0.27300394 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032201413 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.36363637 = coord(4/11)
    
    Source
    https://arxiv.org/abs/2212.06721
  4. Jensen, N.: Evaluierung von mehrsprachigem Web-Retrieval : Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF) (2006) 0.05
    0.053686425 = product of:
      0.14763767 = sum of:
        0.039378747 = weight(_text_:web in 5964) [ClassicSimilarity], result of:
          0.039378747 = score(doc=5964,freq=6.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.37471575 = fieldWeight in 5964, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5964)
        0.0065784254 = weight(_text_:information in 5964) [ClassicSimilarity], result of:
          0.0065784254 = score(doc=5964,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.116372846 = fieldWeight in 5964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5964)
        0.033831265 = weight(_text_:retrieval in 5964) [ClassicSimilarity], result of:
          0.033831265 = score(doc=5964,freq=6.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.34732026 = fieldWeight in 5964, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5964)
        0.067849234 = weight(_text_:konstanz in 5964) [ClassicSimilarity], result of:
          0.067849234 = score(doc=5964,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.37373447 = fieldWeight in 5964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.046875 = fieldNorm(doc=5964)
      0.36363637 = coord(4/11)
    
    Abstract
    This article describes the experiments of the University of Hildesheim in the first Web Track of the CLEF initiative (WebCLEF) in 2005. Participation provided experience with a multilingual web corpus (EuroGOV) in preprocessing, topic and query development, language-independent indexing methods, and multilingual retrieval strategies. Because of the large size of the corpus and the time constraints, multilingual indexes were built. The article describes the University of Hildesheim's approach and the results of the officially submitted experiments as well as of further ones. For the Multilingual Task, the best result in CLEF was achieved.
    Imprint
    Konstanz : UVK Verlagsgesellschaft
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  5. Vichot, F.; Wolinski, F.; Tomeh, J.; Guennou, S.; Dillet, B.; Aydjian, S.: High precision hypertext navigation based on NLP automatic extractions (1997) 0.05
    0.051250994 = product of:
      0.1879203 = sum of:
        0.013156851 = weight(_text_:information in 733) [ClassicSimilarity], result of:
          0.013156851 = score(doc=733,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.23274569 = fieldWeight in 733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=733)
        0.039064977 = weight(_text_:retrieval in 733) [ClassicSimilarity], result of:
          0.039064977 = score(doc=733,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.40105087 = fieldWeight in 733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=733)
        0.13569847 = weight(_text_:konstanz in 733) [ClassicSimilarity], result of:
          0.13569847 = score(doc=733,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.74746895 = fieldWeight in 733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.09375 = fieldNorm(doc=733)
      0.27272728 = coord(3/11)
    
    Imprint
    Konstanz : Universitätsverlag
    Source
    Hypertext - Information Retrieval - Multimedia '97: Theorien, Modelle und Implementierungen integrierter elektronischer Informationssysteme. Proceedings HIM '97. Hrsg.: N. Fuhr u.a
  6. Experimentelles und praktisches Information Retrieval : Festschrift für Gerhard Lustig (1992) 0.04
    0.04465778 = product of:
      0.1637452 = sum of:
        0.016113786 = weight(_text_:information in 4) [ClassicSimilarity], result of:
          0.016113786 = score(doc=4,freq=12.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.2850541 = fieldWeight in 4, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4)
        0.051678102 = weight(_text_:retrieval in 4) [ClassicSimilarity], result of:
          0.051678102 = score(doc=4,freq=14.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.5305404 = fieldWeight in 4, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4)
        0.095953315 = weight(_text_:konstanz in 4) [ClassicSimilarity], result of:
          0.095953315 = score(doc=4,freq=4.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.5285404 = fieldWeight in 4, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.046875 = fieldNorm(doc=4)
      0.27272728 = coord(3/11)
    
    Content
    Contains the contributions: SALTON, G.: Effective text understanding in information retrieval; KRAUSE, J.: Intelligentes Information retrieval; FUHR, N.: Konzepte zur Gestaltung zukünftiger Information-Retrieval-Systeme; HÜTHER, H.: Überlegungen zu einem mathematischen Modell für die Type-Token-, die Grundform-Token und die Grundform-Type-Relation; KNORZ, G.: Automatische Generierung inferentieller Links in und zwischen Hyperdokumenten; KONRAD, E.: Zur Effektivitätsbewertung von Information-Retrieval-Systemen; HENRICHS, N.: Retrievalunterstützung durch automatisch generierte Wortfelder; LÜCK, W., W. RITTBERGER u. M. SCHWANTNER: Der Einsatz des Automatischen Indexierungs- und Retrieval-System (AIR) im Fachinformationszentrum Karlsruhe; REIMER, U.: Verfahren der Automatischen Indexierung. Benötigtes Vorwissen und Ansätze zu seiner automatischen Akquisition: Ein Überblick; ENDRES-NIGGEMEYER, B.: Dokumentrepräsentation: Ein individuelles prozedurales Modell des Abstracting, des Indexierens und Klassifizierens; SEELBACH, D.: Zur Entwicklung von zwei- und mehrsprachigen lexikalischen Datenbanken und Terminologiedatenbanken; ZIMMERMANN, H.: Der Einfluß der Sprachbarrieren in Europa und Möglichkeiten zu ihrer Minderung; LENDERS, W.: Wörter zwischen Welt und Wissen; PANYR, J.: Frames, Thesauri und automatische Klassifikation (Clusteranalyse); HAHN, U.: Forschungsstrategien und Erkenntnisinteressen in der anwendungsorientierten automatischen Sprachverarbeitung. Überlegungen zu einer ingenieurorientierten Computerlinguistik; KUHLEN, R.: Hypertext und Information Retrieval - mehr als Browsing und Suche.
    Imprint
    Konstanz : Univ.-Verlag Konstanz
  7. Kummer, N.: Indexierungstechniken für das japanische Retrieval (2006) 0.04
    0.040357746 = product of:
      0.1479784 = sum of:
        0.012404398 = weight(_text_:information in 5979) [ClassicSimilarity], result of:
          0.012404398 = score(doc=5979,freq=4.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.21943474 = fieldWeight in 5979, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=5979)
        0.04510835 = weight(_text_:retrieval in 5979) [ClassicSimilarity], result of:
          0.04510835 = score(doc=5979,freq=6.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.46309367 = fieldWeight in 5979, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=5979)
        0.09046565 = weight(_text_:konstanz in 5979) [ClassicSimilarity], result of:
          0.09046565 = score(doc=5979,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.49831262 = fieldWeight in 5979, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.0625 = fieldNorm(doc=5979)
      0.27272728 = coord(3/11)
    
    Abstract
    This article describes the challenges that the Japanese language poses to information retrieval owing to the special structure of its writing system, and presents strategies and approaches for indexing Japanese documents. Particular attention is paid to the effectiveness of pronunciation-based (yomi-based) indexing and to the fusion of several individual indexing approaches.
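    The abstract names the core difficulty (Japanese is written without whitespace word boundaries) but not the concrete methods. A standard baseline for such scripts is overlapping character n-gram indexing; a toy sketch, where the function and the bigram choice are illustrative rather than taken from the article:

      def char_ngrams(text: str, n: int = 2):
          # Overlapping character n-grams as index terms, a common
          # workaround for scripts without whitespace word boundaries.
          text = "".join(text.split())
          return [text[i:i + n] for i in range(len(text) - n + 1)]

      print(char_ngrams("情報検索システム"))
      # ['情報', '報検', '検索', '索シ', 'シス', 'ステ', 'テム']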
    Imprint
    Konstanz : UVK Verlagsgesellschaft
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  8. Hahn, U.: Informationslinguistik : I: Einführung in das linguistische Information Retrieval (1985) 0.04
    0.038958173 = product of:
      0.14284663 = sum of:
        0.009806538 = weight(_text_:information in 3115) [ClassicSimilarity], result of:
          0.009806538 = score(doc=3115,freq=10.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.1734784 = fieldWeight in 3115, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3115)
        0.03189642 = weight(_text_:retrieval in 3115) [ClassicSimilarity], result of:
          0.03189642 = score(doc=3115,freq=12.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.32745665 = fieldWeight in 3115, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=3115)
        0.10114367 = weight(_text_:konstanz in 3115) [ClassicSimilarity], result of:
          0.10114367 = score(doc=3115,freq=10.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.55713046 = fieldWeight in 3115, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.03125 = fieldNorm(doc=3115)
      0.27272728 = coord(3/11)
    
    Abstract
    As part of the curriculum of the postgraduate programme in information science at the University of Konstanz (cf. VOGEL 1984), a cycle of courses on information linguistics has been developed. Curricular planning for this subfield of information science was closely tied to the overall organisation of the postgraduate and diploma programme, and two factors in particular shaped the definition of the course content: the substantive requirements of the professional profile developed for information mediation and information management, and the limited time frame of the diploma programme in information science (2 years) combined with the considerable demands of the rest of the curriculum. Information linguistics has therefore been defined from a strongly functional perspective, one that ultimately emphasises its contribution to a comprehensive education in information science more than the discipline's own momentum. The combination of compulsory and elective courses on information linguistics now in place nevertheless allows students interested in these questions a sufficiently deep specialisation within the information science curriculum, which can be supplemented by courses offered by the university's various linguistics departments. Finally, one of the research foci of the Chair of Information Science, the automatic abstracting project TOPIC (HAHN/REIMER 1985), clearly belongs to information linguistics; for committed students it opens up further options for specialised training and, within tasks clearly delimited by term papers and diploma theses, independent research and development work. Information linguistics is now taught at the Chair of Information Science of the University of Konstanz in the following constellation:
    (1) "Informationslinguistik I: Einführung in das linguistische Information Retrieval"; (2) "Informationslinguistik II: linguistische und statistische Verfahren im experimentellen Information Retrieval"; (3) "Intelligente Informationssysteme: Verfahren der Künstlichen Intelligenz im experimentellen Information Retrieval", including a course segment on natural-language systems; (4) special courses on machine translation, indexing and retrieval, abstracting, etc., which serve to deepen special topics in information linguistics. Courses (1) and (3) belong to the pool of compulsory courses for all students of the diploma programme in information science, whereas (2) and (4) are electives; they are nonetheless obligatory for students of the diploma programme who set their focus (e.g. in the form of the diploma thesis) on information linguistics, while for all other students they count as supplementary course offerings.
    Content
    Part 2 published under the title: Linguistische und statistische Verfahren im experimentellen Information Retrieval
    Footnote
    Course script of the lecture of the same name, winter semester 1982/83, Chair of Information Science, University of Konstanz
    Imprint
    Konstanz : Universität / Fachgruppe Politik-/ Verwaltungswissenschaft / Informationswissenschaft
  9. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.04
    0.038544495 = product of:
      0.10599736 = sum of:
        0.034922563 = weight(_text_:wide in 1338) [ClassicSimilarity], result of:
          0.034922563 = score(doc=1338,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.24476713 = fieldWeight in 1338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
        0.018946107 = weight(_text_:web in 1338) [ClassicSimilarity], result of:
          0.018946107 = score(doc=1338,freq=2.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.18028519 = fieldWeight in 1338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
        0.012258172 = weight(_text_:information in 1338) [ClassicSimilarity], result of:
          0.012258172 = score(doc=1338,freq=10.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.21684799 = fieldWeight in 1338, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
        0.039870523 = weight(_text_:retrieval in 1338) [ClassicSimilarity], result of:
          0.039870523 = score(doc=1338,freq=12.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.40932083 = fieldWeight in 1338, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
      0.36363637 = coord(4/11)
    
    Abstract
    A user's query is considered an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations, which infer that two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique based on a formal, corpus-based model of word meaning that captures syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval, it demonstrates significant improvements in retrieval effectiveness over a strong baseline system based on a commercial search engine.
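    As a rough illustration of the syntagmatic side alone: rank candidate expansion terms by how often they co-occur with the query term inside a sliding window. This is a crude stand-in for the article's formal corpus-based model; the function, window size, and documents below are all invented:

      from collections import Counter

      def expand(query_term: str, docs, k: int = 3, window: int = 5):
          # Count windowed co-occurrences with the query term and
          # return the k most strongly associated candidate terms.
          counts = Counter()
          for doc in docs:
              toks = doc.lower().split()
              for i, t in enumerate(toks):
                  if t != query_term:
                      continue
                  for u in toks[max(0, i - window):i + window + 1]:
                      if u != query_term:
                          counts[u] += 1
          return [t for t, _ in counts.most_common(k)]

      docs = ["query expansion improves retrieval effectiveness",
              "automatic query expansion reformulates the original query"]
      print(expand("query", docs))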
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.8, S.1577-1596
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  10. Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.04
    0.037473563 = product of:
      0.08244184 = sum of:
        0.024445795 = weight(_text_:wide in 1616) [ClassicSimilarity], result of:
          0.024445795 = score(doc=1616,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.171337 = fieldWeight in 1616, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.026524551 = weight(_text_:web in 1616) [ClassicSimilarity], result of:
          0.026524551 = score(doc=1616,freq=8.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.25239927 = fieldWeight in 1616, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.0066465978 = weight(_text_:information in 1616) [ClassicSimilarity], result of:
          0.0066465978 = score(doc=1616,freq=6.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.11757882 = fieldWeight in 1616, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.019734902 = weight(_text_:retrieval in 1616) [ClassicSimilarity], result of:
          0.019734902 = score(doc=1616,freq=6.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.20260347 = fieldWeight in 1616, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.005089988 = product of:
          0.015269964 = sum of:
            0.015269964 = weight(_text_:22 in 1616) [ClassicSimilarity], result of:
              0.015269964 = score(doc=1616,freq=2.0), product of:
                0.11276386 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032201413 = queryNorm
                0.1354154 = fieldWeight in 1616, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1616)
          0.33333334 = coord(1/3)
      0.45454547 = coord(5/11)
    
    Abstract
    The information available in languages other than English on the World Wide Web is increasing significantly. According to a report from Computer Economics in 1999, 54% of Internet users are English speakers ("English Will Dominate Web for Only Three More Years," Computer Economics, July 9, 1999, http://www.computereconomics.com/new4/pr/pr990610.html). However, it is predicted that there will be only a 60% increase in Internet users among English speakers versus 150% growth among non-English speakers over the next five years. By 2005, 57% of Internet users will be non-English speakers. A report by CNN.com in 2000 showed that the number of Internet users in China had increased from 8.9 million to 16.9 million between January and June 2000 ("Report: China Internet users double to 17 million," CNN.com, July, 2000, http://cnn.org/2000/TECH/computing/07/27/china.internet.reut/index.html). According to Nielsen/NetRatings, there was a dramatic leap from 22.5 million to 56.6 million Internet users from 2001 to 2002; China had become the second-largest global at-home Internet population in 2002 (the US's Internet population was 166 million) (Robyn Greenspan, "China Pulls Ahead of Japan," Internet.com, April 22, 2002, http://cyberatlas.internet.com/big-picture/geographics/article/0,,5911_1013841,00.html). All of this evidence reveals the importance of cross-lingual research to satisfy needs in the near future. Digital library research has in the past focused on structural and semantic interoperability. Searching and retrieving objects across variations in protocols, formats, and disciplines have been widely explored (Schatz, B., & Chen, H. (1999). Digital libraries: technological advances and social impacts. IEEE Computer, Special Issue on Digital Libraries, February, 32(2), 45-50; Chen, H., Yen, J., & Yang, C.C. (1999). International activities: development of Asian digital libraries. IEEE Computer, Special Issue on Digital Libraries, 32(2), 48-49). However, research on crossing language boundaries, especially between European and Oriental languages, is still in its initial stage. In this proposal, we focus on cross-lingual semantic interoperability by developing automatic generation of a cross-lingual thesaurus based on an English/Chinese parallel corpus. When searchers encounter retrieval problems, professional librarians usually consult the thesaurus to identify other relevant vocabulary. For searching across language boundaries, a cross-lingual thesaurus, generated by co-occurrence analysis and a Hopfield network, can be used to suggest additional semantically relevant terms that cannot be obtained from a dictionary. In particular, the automatically generated cross-lingual thesaurus is able to capture unknown words that do not exist in a dictionary, such as names of persons, organizations, and events. Owing to Hong Kong's unique historical background, both English and Chinese are used as official languages in all legal documents. Therefore, English/Chinese cross-lingual information retrieval is critical for applications in the courts and the government. In this paper, we develop an automatic thesaurus using the Hopfield network based on a parallel corpus collected from the Web site of the Department of Justice of the Hong Kong Special Administrative Region (HKSAR) Government. Experiments are conducted to measure the precision and recall of the automatically generated English/Chinese thesaurus. The results show that such a thesaurus is a promising tool for retrieving relevant terms, especially in a language other than that of the input term. The direct translation of the input term can also be retrieved in most cases.
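    The Hopfield-network step can be pictured as spreading activation over a term-term association matrix learned from the parallel corpus: activate the input term, propagate activation through the weights, and keep the terms that end up above a threshold. A toy sketch under that reading; the vocabulary, weights, damping factor, and threshold are invented for illustration, not the paper's trained values:

      import numpy as np

      terms = ["court", "justice", "法庭", "司法"]
      W = np.array([[0.0, 0.6, 0.8, 0.2],   # made-up symmetric
                    [0.6, 0.0, 0.3, 0.7],   # co-occurrence weights
                    [0.8, 0.3, 0.0, 0.5],
                    [0.2, 0.7, 0.5, 0.0]])

      def activate(seed: str, steps: int = 2, theta: float = 0.3):
          # Clamp the seed term on, spread damped activation, and
          # return the other terms whose activation exceeds theta.
          a = np.zeros(len(terms))
          a[terms.index(seed)] = 1.0
          for _ in range(steps):
              a = np.clip(0.5 * (W @ a), 0.0, 1.0)
              a[terms.index(seed)] = 1.0
          return {t: round(float(v), 2) for t, v in zip(terms, a)
                  if v > theta and t != seed}

      print(activate("court"))  # e.g. {'justice': 0.4, '法庭': 0.47, '司法': 0.31}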
    Footnote
    Teil eines Themenheftes: "Web retrieval and mining: A machine learning perspective"
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.7, S.671-682
  11. Hahn, U.: Informationslinguistik : II: Einführung in das linguistische Information Retrieval (1985) 0.03
    0.03492163 = product of:
      0.12804598 = sum of:
        0.009399708 = weight(_text_:information in 3116) [ClassicSimilarity], result of:
          0.009399708 = score(doc=3116,freq=12.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.16628155 = fieldWeight in 3116, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3116)
        0.030145561 = weight(_text_:retrieval in 3116) [ClassicSimilarity], result of:
          0.030145561 = score(doc=3116,freq=14.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.30948192 = fieldWeight in 3116, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3116)
        0.088500716 = weight(_text_:konstanz in 3116) [ClassicSimilarity], result of:
          0.088500716 = score(doc=3116,freq=10.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.48748916 = fieldWeight in 3116, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3116)
      0.27272728 = coord(3/11)
    
    Abstract
    As part of the curriculum of the postgraduate programme in information science at the University of Konstanz (cf. VOGEL 1984), a cycle of courses on information linguistics has been developed. Curricular planning for this subfield of information science was closely tied to the overall organisation of the postgraduate and diploma programme, and two factors in particular shaped the definition of the course content: the substantive requirements of the professional profile developed for information mediation and information management, and the limited time frame of the diploma programme in information science (2 years) combined with the considerable demands of the rest of the curriculum. Information linguistics has therefore been defined from a strongly functional perspective, one that ultimately emphasises its contribution to a comprehensive education in information science more than the discipline's own momentum. The combination of compulsory and elective courses on information linguistics now in place nevertheless allows students interested in these questions a sufficiently deep specialisation within the information science curriculum, which can be supplemented by courses offered by the university's various linguistics departments. Finally, one of the research foci of the Chair of Information Science, the automatic abstracting project TOPIC (HAHN/REIMER 1985), clearly belongs to information linguistics; for committed students it opens up further options for specialised training and, within tasks clearly delimited by term papers and diploma theses, independent research and development work. Information linguistics is now taught at the Chair of Information Science of the University of Konstanz in the following constellation:
    (1) "Informationslinguistik I: Einführung in das linguistische Information Retrieval"; (2) "Informationslinguistik II: linguistische und statistische Verfahren im experimentellen Information Retrieval"; (3) "Intelligente Informationssysteme: Verfahren der Künstlichen Intelligenz im experimentellen Information Retrieval", including a course segment on natural-language systems; (4) special courses on machine translation, indexing and retrieval, abstracting, etc., which serve to deepen special topics in information linguistics. Courses (1) and (3) belong to the pool of compulsory courses for all students of the diploma programme in information science, whereas (2) and (4) are electives; they are nonetheless obligatory for students of the diploma programme who set their focus (e.g. in the form of the diploma thesis) on information linguistics, while for all other students they count as supplementary course offerings.
    The present script corresponds to the content of the course "Informationslinguistik II" in the summer semesters of 1983 and 1984. Its content was completed in July 1983 and only editorially revised in January 1985. The script was produced under an explicit mandate of the project "Informationsvermittlung", which provided for the development of suitable teaching materials for the postgraduate programme in information science. Owing to the tight project schedule (1982-84), the script cannot be as fully matured and polished as common standards would demand. Unlike the script "Informationslinguistik I" (HAHN 1985), the present script permits either a more method-oriented or a more system-oriented presentation of the information-linguistic concepts of experimental information retrieval (the narrow time frame of a single summer semester rules out covering both). Where possible, this choice should be made according to the composition of the course participants; insofar as the experience gathered so far can be generalised, the more system-oriented presentation has proved better for the understanding of information-linguistic questions and the corresponding solutions when the audience is heterogeneous and not exclusively interested in specialising in information linguistics. Within this nuance, the script already possesses acceptable substantive stability. Nevertheless, the publication of the script should above all serve as an invitation for critical comments, remarks, and additions to this curricular draft, in order to advance the further disciplinary clarification of information linguistics.
    Content
    Part 1 published under the title: Einführung in das linguistische Information Retrieval
    Footnote
    Course script of the lecture of the same name, winter semester 1982/83, Chair of Information Science, University of Konstanz
    Imprint
    Konstanz : Universität / Fachgruppe Politik-/ Verwaltungswissenschaft / Informationswissenschaft
  12. Chowdhury, G.G.: Natural language processing (2002) 0.03
    0.034752384 = product of:
      0.09556905 = sum of:
        0.041907072 = weight(_text_:wide in 4284) [ClassicSimilarity], result of:
          0.041907072 = score(doc=4284,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.29372054 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.022735327 = weight(_text_:web in 4284) [ClassicSimilarity], result of:
          0.022735327 = score(doc=4284,freq=2.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.21634221 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.011394167 = weight(_text_:information in 4284) [ClassicSimilarity], result of:
          0.011394167 = score(doc=4284,freq=6.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.20156369 = fieldWeight in 4284, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.019532489 = weight(_text_:retrieval in 4284) [ClassicSimilarity], result of:
          0.019532489 = score(doc=4284,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.20052543 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
      0.36363637 = coord(4/11)
    
    Abstract
    Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform desired tasks. The foundations of NLP lie in a number of disciplines, namely computer and information sciences, linguistics, mathematics, electrical and electronic engineering, artificial intelligence and robotics, and psychology. Applications of NLP include a number of fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems. One important application area that is relatively new and has not been covered in previous ARIST chapters on NLP relates to the proliferation of the World Wide Web and digital libraries.
    Source
    Annual review of information science and technology. 37(2003), S.51-90
  13. Nhongkai, S.N.; Bentz, H.-J.: Bilinguale Suche mittels Konzeptnetzen (2006) 0.03
    0.034167327 = product of:
      0.1252802 = sum of:
        0.008771234 = weight(_text_:information in 3914) [ClassicSimilarity], result of:
          0.008771234 = score(doc=3914,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.1551638 = fieldWeight in 3914, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3914)
        0.026043316 = weight(_text_:retrieval in 3914) [ClassicSimilarity], result of:
          0.026043316 = score(doc=3914,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.26736724 = fieldWeight in 3914, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3914)
        0.09046565 = weight(_text_:konstanz in 3914) [ClassicSimilarity], result of:
          0.09046565 = score(doc=3914,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.49831262 = fieldWeight in 3914, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.0625 = fieldNorm(doc=3914)
      0.27272728 = coord(3/11)
    
    Imprint
    Konstanz : UVK Verlagsgesellschaft
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  14. Tartakovski, O.; Shramko, M.: Implementierung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten (2006) 0.03
    0.033337705 = product of:
      0.12223825 = sum of:
        0.010853848 = weight(_text_:information in 5978) [ClassicSimilarity], result of:
          0.010853848 = score(doc=5978,freq=4.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.1920054 = fieldWeight in 5978, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5978)
        0.032226957 = weight(_text_:retrieval in 5978) [ClassicSimilarity], result of:
          0.032226957 = score(doc=5978,freq=4.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.33085006 = fieldWeight in 5978, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5978)
        0.07915744 = weight(_text_:konstanz in 5978) [ClassicSimilarity], result of:
          0.07915744 = score(doc=5978,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.43602353 = fieldWeight in 5978, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5978)
      0.27272728 = coord(3/11)
    
    Abstract
    Identifying the language or languages of text documents is one of the most important steps in machine text processing for information retrieval. This article presents LangIdent, a system for language identification in monolingual and multilingual electronic text documents. The system offers both a selection of established algorithms for language identification in monolingual text documents and a new algorithm for language identification in multilingual text documents.
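    The abstract does not say which established algorithms LangIdent includes; a classic choice for monolingual documents is the Cavnar/Trenkle rank-order character-n-gram profile method, sketched here on toy training strings (the tiny training data and all names are illustrative, not taken from the system):

      from collections import Counter

      def profile(text: str, n: int = 3, top: int = 100):
          # Rank-ordered list of the most frequent character n-grams.
          grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
          return [g for g, _ in grams.most_common(top)]

      def out_of_place(p, q):
          # Cavnar/Trenkle distance: summed rank displacement of the
          # document's n-grams within the language profile.
          return sum(abs(i - q.index(g)) if g in q else len(q)
                     for i, g in enumerate(p))

      train = {"de": "der und die das ist nicht ein eine sprache und texte",
               "en": "the and of to is not a language of the texts and the"}
      profiles = {lang: profile(txt) for lang, txt in train.items()}

      def identify(text: str) -> str:
          p = profile(text)
          return min(profiles, key=lambda lang: out_of_place(p, profiles[lang]))

      print(identify("das ist nicht die sprache"))  # -> 'de' on this toy data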
    Imprint
    Konstanz : UVK Verlagsgesellschaft
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  15. Semantik, Lexikographie und Computeranwendungen : Workshop ... (Bonn) : 1995.01.27-28 (1996) 0.03
    0.033013776 = product of:
      0.12105051 = sum of:
        0.0054820213 = weight(_text_:information in 190) [ClassicSimilarity], result of:
          0.0054820213 = score(doc=190,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.09697737 = fieldWeight in 190, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=190)
        0.10829708 = weight(_text_:kongress in 190) [ClassicSimilarity], result of:
          0.10829708 = score(doc=190,freq=4.0), product of:
            0.21127632 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.032201413 = queryNorm
            0.51258504 = fieldWeight in 190, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.0390625 = fieldNorm(doc=190)
        0.007271412 = product of:
          0.021814235 = sum of:
            0.021814235 = weight(_text_:22 in 190) [ClassicSimilarity], result of:
              0.021814235 = score(doc=190,freq=2.0), product of:
                0.11276386 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032201413 = queryNorm
                0.19345059 = fieldWeight in 190, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=190)
          0.33333334 = coord(1/3)
      0.27272728 = coord(3/11)
    
    Date
    14. 4.2007 10:04:22
    RSWK
    Lexikographie / Semantik / Kongress / Bonn <1995>
    Series
    Sprache und Information ; 33
    Subject
    Lexikographie / Semantik / Kongress / Bonn <1995>
  16. Linguistik und neue Medien (1998) 0.03
    0.031133225 = product of:
      0.34246546 = sum of:
        0.34246546 = weight(_text_:kongress in 5770) [ClassicSimilarity], result of:
          0.34246546 = score(doc=5770,freq=10.0), product of:
            0.21127632 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.032201413 = queryNorm
            1.6209363 = fieldWeight in 5770, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.078125 = fieldNorm(doc=5770)
      0.09090909 = coord(1/11)
    
    Footnote
    Publication from a congress held in Leipzig in 1997
    RSWK
    Lexikographie / Neue Medien / Kongress / Leipzig <1997> (2134)
    Syntaktische Analyse / Neue Medien / Kongress / Leipzig <1997> (2134)
    Subject
    Lexikographie / Neue Medien / Kongress / Leipzig <1997> (2134)
    Syntaktische Analyse / Neue Medien / Kongress / Leipzig <1997> (2134)
  17. Kreymer, O.: ¬An evaluation of help mechanisms in natural language information retrieval systems (2002) 0.03
    0.028087616 = product of:
      0.10298792 = sum of:
        0.041907072 = weight(_text_:wide in 2557) [ClassicSimilarity], result of:
          0.041907072 = score(doc=2557,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.29372054 = fieldWeight in 2557, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=2557)
        0.017404877 = weight(_text_:information in 2557) [ClassicSimilarity], result of:
          0.017404877 = score(doc=2557,freq=14.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.3078936 = fieldWeight in 2557, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2557)
        0.043675974 = weight(_text_:retrieval in 2557) [ClassicSimilarity], result of:
          0.043675974 = score(doc=2557,freq=10.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.44838852 = fieldWeight in 2557, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2557)
      0.27272728 = coord(3/11)
    
    Abstract
    The field of natural language processing (NLP) demonstrates rapid changes in the design of information retrieval systems and human-computer interaction. While natural language is regarded as the most effective tool for information retrieval in a contemporary information environment, the systems that use it are only beginning to emerge. This study attempts to evaluate the current state of NLP information retrieval systems from the user's point of view: what techniques are used by these systems to guide their users through the search process? The analysis focused on the structure and components of the systems' help mechanisms. The study showed that systems which claimed to be using natural language searching in fact used a wide range of information retrieval techniques, from real natural language processing to Boolean searching. As a result, the user assistance mechanisms of these systems also varied. While pseudo-NLP systems would suit a more traditional method of instruction, real NLP systems primarily utilised the methods of explanation and user-system dialogue.
    Source
    Online information review. 26(2002) no.1, S.30-39
  18. Strötgen, R.; Mandl, T.; Schneider, R.: Entwicklung und Evaluierung eines Question Answering Systems im Rahmen des Cross Language Evaluation Forum (CLEF) (2006) 0.03
    0.027832028 = product of:
      0.10205077 = sum of:
        0.0065784254 = weight(_text_:information in 5981) [ClassicSimilarity], result of:
          0.0065784254 = score(doc=5981,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.116372846 = fieldWeight in 5981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5981)
        0.02762311 = weight(_text_:retrieval in 5981) [ClassicSimilarity], result of:
          0.02762311 = score(doc=5981,freq=4.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.2835858 = fieldWeight in 5981, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5981)
        0.067849234 = weight(_text_:konstanz in 5981) [ClassicSimilarity], result of:
          0.067849234 = score(doc=5981,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.37373447 = fieldWeight in 5981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.046875 = fieldNorm(doc=5981)
      0.27272728 = coord(3/11)
    
    Abstract
    Question answering systems attempt to deliver a correct answer to a concrete question. To do so, they search a document collection and extract a small portion of a document. This paper describes the development of a modular system for multilingual question answering. The development strategy aimed at getting a modular system usable as quickly as possible while drawing on many freely available resources. The system integrates modules for named-entity recognition and for indexing and retrieval, electronic dictionaries, online translation tools, and text corpora for training and testing, and it implements its own approaches to question and answer taxonomies, to passage retrieval, and to the ranking of alternative answers.
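    As a rough illustration of the modular architecture sketched above, the following Python skeleton wires hypothetical modules together; every name and interface here is an assumption for exposition, not the authors' implementation.

    from dataclasses import dataclass

    @dataclass
    class Answer:
        text: str
        score: float

    @dataclass
    class QAPipeline:
        """Hypothetical wiring of the module types named in the abstract."""
        translate: callable          # online translation tool (cross-language step)
        classify_question: callable  # question/answer taxonomy, e.g. PERSON or DATE
        retrieve_passages: callable  # indexing-and-retrieval module
        extract_answers: callable    # e.g. named-entity recognition matched to the type
        rank: callable               # scores and orders alternative answers

        def answer(self, question, target_lang="en"):
            q = self.translate(question, target_lang)
            qtype = self.classify_question(q)
            candidates = [Answer(text, 0.0)
                          for passage in self.retrieve_passages(q)
                          for text in self.extract_answers(passage, qtype)]
            return self.rank(candidates)[:5]  # best few alternative answers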
    Imprint
    Konstanz : UVK Verlagsgesellschaft
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  19. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.03
    0.026862167 = product of:
      0.07387096 = sum of:
        0.03281562 = weight(_text_:web in 2541) [ClassicSimilarity], result of:
          0.03281562 = score(doc=2541,freq=6.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.3122631 = fieldWeight in 2541, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.0077527487 = weight(_text_:information in 2541) [ClassicSimilarity], result of:
          0.0077527487 = score(doc=2541,freq=4.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.13714671 = fieldWeight in 2541, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.023019256 = weight(_text_:retrieval in 2541) [ClassicSimilarity], result of:
          0.023019256 = score(doc=2541,freq=4.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.23632148 = fieldWeight in 2541, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.010283329 = product of:
          0.030849986 = sum of:
            0.030849986 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
              0.030849986 = score(doc=2541,freq=4.0), product of:
                0.11276386 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032201413 = queryNorm
                0.27358043 = fieldWeight in 2541, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
          0.33333334 = coord(1/3)
      0.36363637 = coord(4/11)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon, and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes the development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion, without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
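    The abstract leaves ChemSpell's similarity measure unspecified; Levenshtein edit distance is a common choice for spelling suggestion and is assumed in this illustrative Python sketch (the example words are invented, not TOXNET vocabulary).

    def levenshtein(a, b):
        """Classic dynamic-programming edit distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                  # deletion
                               cur[j - 1] + 1,               # insertion
                               prev[j - 1] + (ca != cb)))    # substitution
            prev = cur
        return prev[-1]

    def suggest(word, vocabulary, max_distance=2, limit=5):
        """Rank vocabulary entries by edit distance to a possibly misspelled query."""
        scored = sorted((levenshtein(word.lower(), v.lower()), v) for v in vocabulary)
        return [v for d, v in scored if d <= max_distance][:limit]

    # suggest("acetominophen", ["acetaminophen", "acetone", "aspirin"])
    # -> ["acetaminophen"]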
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  20. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.03
    0.026110893 = product of:
      0.071804956 = sum of:
        0.032152608 = weight(_text_:web in 4436) [ClassicSimilarity], result of:
          0.032152608 = score(doc=4436,freq=4.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.3059541 = fieldWeight in 4436, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.011394167 = weight(_text_:information in 4436) [ClassicSimilarity], result of:
          0.011394167 = score(doc=4436,freq=6.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.20156369 = fieldWeight in 4436, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.019532489 = weight(_text_:retrieval in 4436) [ClassicSimilarity], result of:
          0.019532489 = score(doc=4436,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.20052543 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.008725693 = product of:
          0.02617708 = sum of:
            0.02617708 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
              0.02617708 = score(doc=4436,freq=2.0), product of:
                0.11276386 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032201413 = queryNorm
                0.23214069 = fieldWeight in 4436, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4436)
          0.33333334 = coord(1/3)
      0.36363637 = coord(4/11)
    
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper-name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between speed performance and translation performance, and what form the translated result is presented in. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
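    A toy Python illustration of the dictionary-plus-monolingual-corpus approach to query translation described above; the bilingual entries and frequency counts are invented placeholders, not MTIR's resources.

    from collections import Counter

    def translate_query(query_terms, bilingual_dict, target_corpus_freq):
        """For each source term, keep the candidate translation best attested in
        a target-language corpus (a crude stand-in for corpus-based selection)."""
        translated = []
        for term in query_terms:
            candidates = bilingual_dict.get(term, [term])  # unknown terms pass through
            best = max(candidates, key=lambda c: target_corpus_freq.get(c, 0))
            translated.append(best)
        return translated

    bilingual_dict = {"yinhang": ["bank", "banking house"]}  # invented romanized entry
    corpus_freq = Counter({"bank": 120, "banking house": 3})
    # translate_query(["yinhang", "loan"], bilingual_dict, corpus_freq)
    # -> ["bank", "loan"]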
    Date
    16. 2.2000 14:22:39
    Source
    Journal of the American Society for Information Science. 51(2000) no.3, S.281-296

Types

  • a 418
  • m 41
  • el 30
  • s 22
  • x 11
  • p 3
  • d 2
  • b 1
  • r 1
