Search (22 results, page 1 of 2)

  • × theme_ss:"Sprachretrieval"
  1. Strötgen, R.; Mandl, T.; Schneider, R.: Entwicklung und Evaluierung eines Question Answering Systems im Rahmen des Cross Language Evaluation Forum (CLEF) (2006) 0.02
    0.016158216 = product of:
      0.07540501 = sum of:
        0.043931052 = weight(_text_:elektronische in 5981) [ClassicSimilarity], result of:
          0.043931052 = score(doc=5981,freq=2.0), product of:
            0.14013545 = queryWeight, product of:
              4.728978 = idf(docFreq=1061, maxDocs=44218)
              0.029633347 = queryNorm
            0.3134899 = fieldWeight in 5981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.728978 = idf(docFreq=1061, maxDocs=44218)
              0.046875 = fieldNorm(doc=5981)
        0.0060537956 = weight(_text_:information in 5981) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=5981,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 5981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5981)
        0.025420163 = weight(_text_:retrieval in 5981) [ClassicSimilarity], result of:
          0.025420163 = score(doc=5981,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 5981, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5981)
      0.21428572 = coord(3/14)
    
    Abstract
    Question Answering Systeme versuchen, zu konkreten Fragen eine korrekte Antwort zu liefern. Dazu durchsuchen sie einen Dokumentenbestand und extrahieren einen Bruchteil eines Dokuments. Dieser Beitrag beschreibt die Entwicklung eines modularen Systems zum multilingualen Question Answering. Die Strategie bei der Entwicklung zielte auf eine schnellstmögliche Verwendbarkeit eines modularen Systems, das auf viele frei verfügbare Ressourcen zugreift. Das System integriert Module zur Erkennung von Eigennamen, zu Indexierung und Retrieval, elektronische Wörterbücher, Online-Übersetzungswerkzeuge sowie Textkorpora zu Trainings- und Testzwecken und implementiert eigene Ansätze zu den Bereichen der Frage- und AntwortTaxonomien, zum Passagenretrieval und zum Ranking alternativer Antworten.
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  2. Jensen, N.: Evaluierung von mehrsprachigem Web-Retrieval : Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF) (2006) 0.02
    0.01573399 = product of:
      0.07342529 = sum of:
        0.036238287 = weight(_text_:web in 5964) [ClassicSimilarity], result of:
          0.036238287 = score(doc=5964,freq=6.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.37471575 = fieldWeight in 5964, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5964)
        0.0060537956 = weight(_text_:information in 5964) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=5964,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 5964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5964)
        0.031133216 = weight(_text_:retrieval in 5964) [ClassicSimilarity], result of:
          0.031133216 = score(doc=5964,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.34732026 = fieldWeight in 5964, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5964)
      0.21428572 = coord(3/14)
    
    Abstract
    Der vorliegende Artikel beschreibt die Experimente der Universität Hildesheim im Rahmen des ersten Web Track der CLEF-Initiative (WebCLEF) im Jahr 2005. Bei der Teilnahme konnten Erfahrungen mit einem multilingualen Web-Korpus (EuroGOV) bei der Vorverarbeitung, der Topic- bzw. Query-Entwicklung, bei sprachunabhängigen Indexierungsmethoden und multilingualen Retrieval-Strategien gesammelt werden. Aufgrund des großen Um-fangs des Korpus und der zeitlichen Einschränkungen wurden multilinguale Indizes aufgebaut. Der Artikel beschreibt die Vorgehensweise bei der Teilnahme der Universität Hildesheim und die Ergebnisse der offiziell eingereichten sowie weiterer Experimente. Für den Multilingual Task konnte das beste Ergebnis in CLEF erzielt werden.
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  3. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.01
    0.012914326 = product of:
      0.060266852 = sum of:
        0.036238287 = weight(_text_:web in 3455) [ClassicSimilarity], result of:
          0.036238287 = score(doc=3455,freq=6.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.37471575 = fieldWeight in 3455, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
        0.0060537956 = weight(_text_:information in 3455) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=3455,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 3455, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
        0.01797477 = weight(_text_:retrieval in 3455) [ClassicSimilarity], result of:
          0.01797477 = score(doc=3455,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.20052543 = fieldWeight in 3455, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
      0.21428572 = coord(3/14)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 an the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.6, S.571-583
  4. Srihari, R.K.: Using speech input for image interpretation, annotation, and retrieval (1997) 0.01
    0.011258726 = product of:
      0.05254072 = sum of:
        0.00856136 = weight(_text_:information in 764) [ClassicSimilarity], result of:
          0.00856136 = score(doc=764,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16457605 = fieldWeight in 764, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=764)
        0.03594954 = weight(_text_:retrieval in 764) [ClassicSimilarity], result of:
          0.03594954 = score(doc=764,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 764, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=764)
        0.008029819 = product of:
          0.024089456 = sum of:
            0.024089456 = weight(_text_:22 in 764) [ClassicSimilarity], result of:
              0.024089456 = score(doc=764,freq=2.0), product of:
                0.103770934 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029633347 = queryNorm
                0.23214069 = fieldWeight in 764, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=764)
          0.33333334 = coord(1/3)
      0.21428572 = coord(3/14)
    
    Abstract
    Explores the interaction of textual and photographic information in an integrated text and image database environment and describes 3 different applications involving the exploitation of linguistic context in vision. Describes the practical application of these ideas in working systems. PICTION uses captions to identify human faces in a photograph, wile Show&Tell is a multimedia system for semi automatic image annotation. The system combines advances in speech recognition, natural language processing and image understanding to assist in image annotation and enhance image retrieval capabilities. Presents an extension of this work to video annotation and retrieval
    Date
    22. 9.1997 19:16:05
    Imprint
    Urbana-Champaign, IL : Illinois University at Urbana-Champaign, Department of Library and Information Science
    Source
    Digital image access and retrieval: Proceedings of the 1996 Clinic on Library Applications of Data Processing, 24-26 Mar 1996. Ed.: P.B. Heidorn u. B. Sandore
  5. Sparck Jones, K.; Jones, G.J.F.; Foote, J.T.; Young, S.J.: Experiments in spoken document retrieval (1996) 0.01
    0.008935094 = product of:
      0.06254566 = sum of:
        0.0070627616 = weight(_text_:information in 1951) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=1951,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 1951, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1951)
        0.055482898 = weight(_text_:retrieval in 1951) [ClassicSimilarity], result of:
          0.055482898 = score(doc=1951,freq=14.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.61896384 = fieldWeight in 1951, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1951)
      0.14285715 = coord(2/14)
    
    Abstract
    Describes experiments in the retrieval of spoken documents in multimedia systems. Speech documents pose a particular problem for retrieval since their words as well as contents are unknown. Addresses this problem, for a video mail application, by combining state of the art speech recognition with established document retrieval technologies so as to provide an effective and efficient retrieval tool. Tests with a small spoken message collection show that retrieval precision for the spoken file can reach 90% of that obtained when the same file is used, as a benchmark, in text transcription form
    Footnote
    Wiederabdruck in: Readings in informatio retrieval. Ed.: K. Sparck Jones u. P. Willett. San Francisco: Morgan Kaufmann 1997. S.493-502.
    Source
    Information processing and management. 32(1996) no.4, S.399-417
  6. Kruschwitz, U.; AI-Bakour, H.: Users want more sophisticated search assistants : results of a task-based evaluation (2005) 0.01
    0.008026919 = product of:
      0.037458956 = sum of:
        0.017435152 = weight(_text_:web in 4575) [ClassicSimilarity], result of:
          0.017435152 = score(doc=4575,freq=2.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.18028519 = fieldWeight in 4575, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4575)
        0.0050448296 = weight(_text_:information in 4575) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=4575,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 4575, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4575)
        0.014978974 = weight(_text_:retrieval in 4575) [ClassicSimilarity], result of:
          0.014978974 = score(doc=4575,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.16710453 = fieldWeight in 4575, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4575)
      0.21428572 = coord(3/14)
    
    Abstract
    The Web provides a massive knowledge source, as do intranets and other electronic document collections. However, much of that knowledge is encoded implicitly and cannot be applied directly without processing into some more appropriate structures. Searching, browsing, question answering, for example, could all benefit from domain-specific knowledge contained in the documents, and in applications such as simple search we do not actually need very "deep" knowledge structures such as ontologies, but we can get a long way with a model of the domain that consists of term hierarchies. We combine domain knowledge automatically acquired by exploiting the documents' markup structure with knowledge extracted an the fly to assist a user with ad hoc search requests. Such a search system can suggest query modification options derived from the actual data and thus guide a user through the space of documents. This article gives a detailed account of a task-based evaluation that compares a search system that uses the outlined domain knowledge with a standard search system. We found that users do use the query modification suggestions proposed by the system. The main conclusion we can draw from this evaluation, however, is that users prefer a system that can suggest query modifications over a standard search engine, which simply presents a ranked list of documents. Most interestingly, we observe this user preference despite the fact that the baseline system even performs slightly better under certain criteria.
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.13, S.1377-1393
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  7. Voorhees, E.M.: Question answering in TREC (2005) 0.01
    0.006865305 = product of:
      0.04805713 = sum of:
        0.012107591 = weight(_text_:information in 6487) [ClassicSimilarity], result of:
          0.012107591 = score(doc=6487,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 6487, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6487)
        0.03594954 = weight(_text_:retrieval in 6487) [ClassicSimilarity], result of:
          0.03594954 = score(doc=6487,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 6487, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=6487)
      0.14285715 = coord(2/14)
    
    Source
    TREC: experiment and evaluation in information retrieval. Ed.: E.M. Voorhees, u. D.K. Harman
  8. Wittbrock, M.J.; Hauptmann, A.G.: Speech recognition for a digital video library (1998) 0.01
    0.00604503 = product of:
      0.042315207 = sum of:
        0.012357258 = weight(_text_:information in 873) [ClassicSimilarity], result of:
          0.012357258 = score(doc=873,freq=12.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23754507 = fieldWeight in 873, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=873)
        0.029957948 = weight(_text_:retrieval in 873) [ClassicSimilarity], result of:
          0.029957948 = score(doc=873,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33420905 = fieldWeight in 873, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=873)
      0.14285715 = coord(2/14)
    
    Abstract
    The standard method for making the full content of audio and video material searchable is to annotate it with human-generated meta-data that describes the content in a way that search can understand, as is done in the creation of multimedia CD-ROMs. However, for the huge amounts of data that could usefully be included in digital video and audio libraries, the cost of producing the meta-data is prohibitive. In the Informedia Digital Video Library, the production of the meta-data supporting the library interface is automated using techniques derived from artificial intelligence (AI) research. By applying speech recognition together with natural language processing, information retrieval, and image analysis, an interface has been prduced that helps users locate the information they want, and navigate or browse the digital video library more effectively. Specific interface components include automatc titles, filmstrips, video skims, word location marking, and representative frames for shots. Both the user interface and the information retrieval engine within Informedia are designed for use with automatically derived meta-data, much of which depends on speech recognition for its production. Some experimental information retrieval results will be given, supporting a basic premise of the Informedia project: That speech recognition generated transcripts can make multimedia material searchable. The Informedia project emphasizes the integration of speech recognition, image processing, natural language processing, and information retrieval to compensate for deficiencies in these individual technologies
    Source
    Journal of the American Society for Information Science. 49(1998) no.7, S.619-632
  9. Lin, J.; Katz, B.: Building a reusable test collection for question answering (2006) 0.01
    0.0059455284 = product of:
      0.041618697 = sum of:
        0.0104854815 = weight(_text_:information in 5045) [ClassicSimilarity], result of:
          0.0104854815 = score(doc=5045,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.20156369 = fieldWeight in 5045, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5045)
        0.031133216 = weight(_text_:retrieval in 5045) [ClassicSimilarity], result of:
          0.031133216 = score(doc=5045,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.34732026 = fieldWeight in 5045, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5045)
      0.14285715 = coord(2/14)
    
    Abstract
    In contrast to traditional information retrieval systems, which return ranked lists of documents that users must manually browse through, a question answering system attempts to directly answer natural language questions posed by the user. Although such systems possess language-processing capabilities, they still rely on traditional document retrieval techniques to generate an initial candidate set of documents. In this article, the authors argue that document retrieval for question answering represents a task different from retrieving documents in response to more general retrospective information needs. Thus, to guide future system development, specialized question answering test collections must be constructed. They show that the current evaluation resources have major shortcomings; to remedy the situation, they have manually created a small, reusable question answering test collection for research purposes. In this article they describe their methodology for building this test collection and discuss issues they encountered regarding the notion of "answer correctness."
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.7, S.851-861
  10. Tartakovski, O.; Shramko, M.: Implementierung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten (2006) 0.01
    0.0056635872 = product of:
      0.03964511 = sum of:
        0.009988253 = weight(_text_:information in 5978) [ClassicSimilarity], result of:
          0.009988253 = score(doc=5978,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1920054 = fieldWeight in 5978, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5978)
        0.029656855 = weight(_text_:retrieval in 5978) [ClassicSimilarity], result of:
          0.029656855 = score(doc=5978,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33085006 = fieldWeight in 5978, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5978)
      0.14285715 = coord(2/14)
    
    Abstract
    Die Identifikation der Sprache bzw. der Sprachen in Textdokumenten ist einer der wichtigsten Schritte maschineller Textverarbeitung für das Information Retrieval. Der vorliegende Artikel stellt Langldent vor, ein System zur Sprachidentifikation von mono- und multilingualen elektronischen Textdokumenten. Das System bietet sowohl eine Auswahl von gängigen Algorithmen für die Sprachidentifikation monolingualer Textdokumente als auch einen neuen Algorithmus für die Sprachidentifikation multilingualer Textdokumente.
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  11. Young, C.W.; Eastman, C.M.; Oakman, R.L.: ¬An analysis of ill-formed input in natural language queries to document retrieval systems (1991) 0.00
    0.0048545036 = product of:
      0.033981524 = sum of:
        0.00856136 = weight(_text_:information in 5263) [ClassicSimilarity], result of:
          0.00856136 = score(doc=5263,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16457605 = fieldWeight in 5263, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5263)
        0.025420163 = weight(_text_:retrieval in 5263) [ClassicSimilarity], result of:
          0.025420163 = score(doc=5263,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 5263, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5263)
      0.14285715 = coord(2/14)
    
    Abstract
    Natrual language document retrieval queries from the Thomas Cooper Library, South Carolina Univ. were analysed in oder to investigate the frequency of various types of ill-formed input, such as spelling errors, cooccurrence violations, conjunctions, ellipsis, and missing or incorrect punctuation. Users were requested to write out their requests for information in complete sentences on the form normally used by the library. The primary reason for analysing ill-formed inputs was to determine whether there is a significant need to study ill-formed inputs in detail. Results indicated that most of the queries were sentence fragments and that many of them contained some type of ill-formed input. Conjunctions caused the most problems. The next most serious problem was caused by punctuation errors. Spelling errors occured in a small number of queries. The remaining types of ill-formed input considered, allipsis and cooccurrence violations, were not found in the queries
    Source
    Information processing and management. 27(1991) no.6, S.615-622
  12. Burke, R.D.: Question answering from frequently asked question files : experiences with the FAQ Finder System (1997) 0.00
    0.0045768693 = product of:
      0.032038085 = sum of:
        0.008071727 = weight(_text_:information in 1191) [ClassicSimilarity], result of:
          0.008071727 = score(doc=1191,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1551638 = fieldWeight in 1191, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=1191)
        0.023966359 = weight(_text_:retrieval in 1191) [ClassicSimilarity], result of:
          0.023966359 = score(doc=1191,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.26736724 = fieldWeight in 1191, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=1191)
      0.14285715 = coord(2/14)
    
    Abstract
    Describes FAQ Finder, a natural language question-answering system that uses files of frequently asked questions as its knowledge base. Unlike information retrieval approaches that rely on a purely lexical metric of similarity between query and document, FAQ Finder uses a semantic knowledge base (Wordnet) to improve its ability to match question and answer. Includes results from an evaluation of the system's performance and shows that a combination of semantic and statistical techniques works better than any single approach
  13. Nhongkai, S.N.; Bentz, H.-J.: Bilinguale Suche mittels Konzeptnetzen (2006) 0.00
    0.0045768693 = product of:
      0.032038085 = sum of:
        0.008071727 = weight(_text_:information in 3914) [ClassicSimilarity], result of:
          0.008071727 = score(doc=3914,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1551638 = fieldWeight in 3914, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3914)
        0.023966359 = weight(_text_:retrieval in 3914) [ClassicSimilarity], result of:
          0.023966359 = score(doc=3914,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.26736724 = fieldWeight in 3914, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3914)
      0.14285715 = coord(2/14)
    
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  14. Galitsky, B.: Can many agents answer questions better than one? (2005) 0.00
    0.0034326524 = product of:
      0.024028566 = sum of:
        0.0060537956 = weight(_text_:information in 3094) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=3094,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 3094, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3094)
        0.01797477 = weight(_text_:retrieval in 3094) [ClassicSimilarity], result of:
          0.01797477 = score(doc=3094,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.20052543 = fieldWeight in 3094, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3094)
      0.14285715 = coord(2/14)
    
    Abstract
    The paper addresses the issue of how online natural language question answering, based on deep semantic analysis, may compete with currently popular keyword search, open domain information retrieval systems, covering a horizontal domain. We suggest the multiagent question answering approach, where each domain is represented by an agent which tries to answer questions taking into account its specific knowledge. The meta-agent controls the cooperation between question answering agents and chooses the most relevant answer(s). We argue that multiagent question answering is optimal in terms of access to business and financial knowledge, flexibility in query phrasing, and efficiency and usability of advice. The knowledge and advice encoded in the system are initially prepared by domain experts. We analyze the commercial application of multiagent question answering and the robustness of the meta-agent. The paper suggests that a multiagent architecture is optimal when a real world question answering domain combines a number of vertical ones to form a horizontal domain.
  15. Rösener, C.: ¬Die Stecknadel im Heuhaufen : Natürlichsprachlicher Zugang zu Volltextdatenbanken (2005) 0.00
    0.003419585 = product of:
      0.023937095 = sum of:
        0.0069903214 = weight(_text_:information in 548) [ClassicSimilarity], result of:
          0.0069903214 = score(doc=548,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1343758 = fieldWeight in 548, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=548)
        0.016946774 = weight(_text_:retrieval in 548) [ClassicSimilarity], result of:
          0.016946774 = score(doc=548,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.18905719 = fieldWeight in 548, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=548)
      0.14285715 = coord(2/14)
    
    Abstract
    Die Möglichkeiten, die der heutigen Informations- und Wissensgesellschaft für die Beschaffung und den Austausch von Information zur Verfügung stehen, haben kurioserweise gleichzeitig ein immer akuter werdendes, neues Problem geschaffen: Es wird für jeden Einzelnen immer schwieriger, aus der gewaltigen Fülle der angebotenen Informationen die tatsächlich relevanten zu selektieren. Diese Arbeit untersucht die Möglichkeit, mit Hilfe von natürlichsprachlichen Schnittstellen den Zugang des Informationssuchenden zu Volltextdatenbanken zu verbessern. Dabei werden zunächst die wissenschaftlichen Fragestellungen ausführlich behandelt. Anschließend beschreibt der Autor verschiedene Lösungsansätze und stellt anhand einer natürlichsprachlichen Schnittstelle für den Brockhaus Multimedial 2004 deren erfolgreiche Implementierung vor
    Content
    Enthält die Kapitel: 2: Wissensrepräsentation 2.1 Deklarative Wissensrepräsentation 2.2 Klassifikationen des BMM 2.3 Thesauri und Ontologien: existierende kommerzielle Software 2.4 Erstellung eines Thesaurus im Rahmen des LeWi-Projektes 3: Analysekomponenten 3.1 Sprachliche Phänomene in der maschinellen Textanalyse 3.2 Analysekomponenten: Lösungen und Forschungsansätze 3.3 Die Analysekomponenten im LeWi-Projekt 4: Information Retrieval 4.1 Grundlagen des Information Retrieval 4.2 Automatische Indexierungsmethoden und -verfahren 4.3 Automatische Indexierung des BMM im Rahmen des LeWi-Projektes 4.4 Suchstrategien und Suchablauf im LeWi-Kontext
  16. Schneider, R.: Question answering : das Retrieval der Zukunft? (2007) 0.00
    0.002420968 = product of:
      0.033893548 = sum of:
        0.033893548 = weight(_text_:retrieval in 5953) [ClassicSimilarity], result of:
          0.033893548 = score(doc=5953,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.37811437 = fieldWeight in 5953, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=5953)
      0.071428575 = coord(1/14)
    
    Abstract
    Der Artikel geht der Frage nach, ob und inwieweit Informations- und Recherchesysteme von der Technologie natürlich sprachlicher Frage-Antwortsysteme, so genannter Question Answering-Systeme, profitieren können. Nach einer allgemeinen Einführung in die Zielsetzung und die historische Entwicklung dieses Sonderzweigs der maschinellen Sprachverarbeitung werden dessen Abgrenzung von herkömmlichen Retrieval- und Extraktionsverfahren erläutert und die besondere Struktur von Question Answering-Systemen sowie einzelne Evaluierungsinitiativen aufgezeichnet. Zudem werden konkrete Anwendungsfelder im Bibliothekswesen vorgestellt.
  17. Hannabuss, S.: Dialogue and the search for information (1989) 0.00
    0.0012892094 = product of:
      0.01804893 = sum of:
        0.01804893 = weight(_text_:information in 2590) [ClassicSimilarity], result of:
          0.01804893 = score(doc=2590,freq=10.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3469568 = fieldWeight in 2590, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=2590)
      0.071428575 = coord(1/14)
    
    Abstract
    Knowledge of conversation theory and speech act assists us to understand how people search for information. Dialogue embodies meanings and intentionalities, and represents epistemic inquiry. There are implications for the information-processing model of cognitive psychology. Question formulation (erotetics) and turn-taking play important roles in eliciting information, while discourse analysis furnishes us with information about people's categorising, recall, and semantic skills
  18. Peters, B.F.: Online searching using speech as a man / machine interface (1989) 0.00
    0.0011531039 = product of:
      0.016143454 = sum of:
        0.016143454 = weight(_text_:information in 4637) [ClassicSimilarity], result of:
          0.016143454 = score(doc=4637,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3103276 = fieldWeight in 4637, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=4637)
      0.071428575 = coord(1/14)
    
    Source
    Information processing and management. 25(1989), S.391-406
  19. Lange, H.R.: Speech synthesis and speech recognition : tomorrow's human-computer interface? (1993) 0.00
    8.153676E-4 = product of:
      0.011415146 = sum of:
        0.011415146 = weight(_text_:information in 7224) [ClassicSimilarity], result of:
          0.011415146 = score(doc=7224,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.21943474 = fieldWeight in 7224, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=7224)
      0.071428575 = coord(1/14)
    
    Imprint
    Medford, NJ : Learned Information
    Source
    Annual review of information science and technology. 28(1993), S.153-185
  20. Thompson, L.A.; Ogden, W.C.: Visible speech improves human language understanding : implications for speech processing systems (1995) 0.00
    5.7655195E-4 = product of:
      0.008071727 = sum of:
        0.008071727 = weight(_text_:information in 3883) [ClassicSimilarity], result of:
          0.008071727 = score(doc=3883,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1551638 = fieldWeight in 3883, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3883)
      0.071428575 = coord(1/14)
    
    Theme
    Information