Search (418 results, page 2 of 21)

Korman, D.Z.; Mack, E.; Jett, J.; Renear, A.H.: Defining textual entailment (2018) 0.01

0.014362549 = product of:
  0.06702523 = sum of:
    0.03856498 = weight(_text_:wide in 4284) [ClassicSimilarity], result of:
      0.03856498 = score(doc=4284,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.29372054 = fieldWeight in 4284, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.046875 = fieldNorm(doc=4284)
    0.0104854815 = weight(_text_:information in 4284) [ClassicSimilarity], result of:
      0.0104854815 = score(doc=4284,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.20156369 = fieldWeight in 4284, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4284)
    0.01797477 = weight(_text_:retrieval in 4284) [ClassicSimilarity], result of:
      0.01797477 = score(doc=4284,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 4284, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4284)
  0.21428572 = coord(3/14)
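
The arithmetic in the explain tree above can be reproduced with a short sketch. It assumes the usual ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and takes queryNorm, fieldNorm, and coord directly from the tree as constants. The function names `idf` and `term_score` are illustrative only and are not part of any Lucene API.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, field_norm, query_norm):
    # score = queryWeight * fieldWeight, as laid out in the explain tree
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Constants copied from the explain tree for doc 4284
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.046875

# (termFreq, docFreq) for the terms "wide", "information", "retrieval"
terms = [(2.0, 1430), (6.0, 20772), (2.0, 5836)]

total = sum(term_score(f, df, MAX_DOCS, FIELD_NORM, QUERY_NORM) for f, df in terms)
score = total * (3 / 14)  # coord(3/14): 3 of 14 query clauses matched

# The explain tree reports 0.014362549 for this document
print(score)
```

The same steps apply to the other records; only freq, docFreq, fieldNorm, and the coord fraction change.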

Abstract: Textual entailment is a relationship that obtains between fragments of text when one fragment in some sense implies the other fragment. The automation of textual entailment recognition supports a wide variety of text-based tasks, including information retrieval, information extraction, question answering, text summarization, and machine translation. Much ingenuity has been devoted to developing algorithms for identifying textual entailments, but relatively little to saying what textual entailment actually is. This article is a review of the logical and philosophical issues involved in providing an adequate definition of textual entailment. We show that many natural definitions of textual entailment are refuted by counterexamples, including the most widely cited definition of Dagan et al. We then articulate and defend the following revised definition: T textually entails H =df typically, a human reading T would be justified in inferring the proposition expressed by H from the proposition expressed by T. We also show that textual entailment is context-sensitive, nontransitive, and nonmonotonic.
Source: Journal of the Association for Information Science and Technology. 69(2018) no.6, S.763-772

Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.01

0.013883932 = product of:
  0.06479168 = sum of:
    0.034870304 = weight(_text_:web in 2861) [ClassicSimilarity], result of:
      0.034870304 = score(doc=2861,freq=8.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.36057037 = fieldWeight in 2861, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2861)
    0.008737902 = weight(_text_:information in 2861) [ClassicSimilarity], result of:
      0.008737902 = score(doc=2861,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16796975 = fieldWeight in 2861, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2861)
    0.021183468 = weight(_text_:retrieval in 2861) [ClassicSimilarity], result of:
      0.021183468 = score(doc=2861,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 2861, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2861)
  0.21428572 = coord(3/14)

Abstract: Today's conventional search engines hardly provide the essential content relevant to the user's search query. This is because the context and semantics of the request made by the user are not analyzed to the full extent. So here the need for a semantic web search (SWS) arises. SWS is an upcoming area of web search which combines Natural Language Processing and Artificial Intelligence. The objective of the work done here is to design, develop and implement a semantic search engine, SIEU (Semantic Information Extraction in University Domain), confined to the university domain. SIEU uses ontology as a knowledge base for the information retrieval process. It is not just a mere keyword search. It is one layer above what Google or any other search engines retrieve by analyzing just the keywords. Here the query is analyzed both syntactically and semantically. The developed system retrieves the web results more relevant to the user query through keyword expansion. The results obtained here will be accurate enough to satisfy the request made by the user. The level of accuracy will be enhanced since the query is analyzed semantically. The system will be of great use to the developers and researchers who work on the web. The Google results are re-ranked and optimized for providing the relevant links. For ranking, an algorithm has been applied which fetches more apt results for the user query.

Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.01

0.012914326 = product of:
  0.060266852 = sum of:
    0.036238287 = weight(_text_:web in 3455) [ClassicSimilarity], result of:
      0.036238287 = score(doc=3455,freq=6.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.37471575 = fieldWeight in 3455, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=3455)
    0.0060537956 = weight(_text_:information in 3455) [ClassicSimilarity], result of:
      0.0060537956 = score(doc=3455,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.116372846 = fieldWeight in 3455, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=3455)
    0.01797477 = weight(_text_:retrieval in 3455) [ClassicSimilarity], result of:
      0.01797477 = score(doc=3455,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 3455, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=3455)
  0.21428572 = coord(3/14)

Abstract: Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
Source: Journal of the American Society for Information Science and Technology. 56(2005) no.6, S.571-583

Clark, M.; Kim, Y.; Kruschwitz, U.; Song, D.; Albakour, D.; Dignum, S.; Beresi, U.C.; Fasli, M.; Roeck, A De: Automatically structuring domain knowledge from text : an overview of current research (2012) 0.01

0.012026694 = product of:
  0.056124568 = sum of:
    0.029588435 = weight(_text_:web in 2738) [ClassicSimilarity], result of:
      0.029588435 = score(doc=2738,freq=4.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.3059541 = fieldWeight in 2738, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=2738)
    0.00856136 = weight(_text_:information in 2738) [ClassicSimilarity], result of:
      0.00856136 = score(doc=2738,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 2738, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2738)
    0.01797477 = weight(_text_:retrieval in 2738) [ClassicSimilarity], result of:
      0.01797477 = score(doc=2738,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 2738, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2738)
  0.21428572 = coord(3/14)

Abstract: This paper presents an overview of automatic methods for building domain knowledge structures (domain models) from text collections. Applications of domain models have a long history within knowledge engineering and artificial intelligence. In the last couple of decades they have surfaced noticeably as a useful tool within natural language processing, information retrieval and semantic web technology. Inspired by the ubiquitous propagation of domain model structures that are emerging in several research disciplines, we give an overview of the current research landscape and some techniques and approaches. We will also discuss trade-offs between different approaches and point to some recent trends.
Content: Contribution to a special issue "Soft Approaches to IA on the Web". Cf.: doi:10.1016/j.ipm.2011.07.002.
Source: Information processing and management. 48(2012) no.3, S.552-568

Gachot, D.A.; Lange, E.; Yang, J.: The SYSTRAN NLP browser : an application of machine translation technology in cross-language information retrieval (1998) 0.01

0.011891057 = product of:
  0.083237395 = sum of:
    0.020970963 = weight(_text_:information in 6213) [ClassicSimilarity], result of:
      0.020970963 = score(doc=6213,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.40312737 = fieldWeight in 6213, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=6213)
    0.06226643 = weight(_text_:retrieval in 6213) [ClassicSimilarity], result of:
      0.06226643 = score(doc=6213,freq=6.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.6946405 = fieldWeight in 6213, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=6213)
  0.14285715 = coord(2/14)

Series: The Kluwer International series on information retrieval
Source: Cross-language information retrieval. Ed.: G. Grefenstette

Liddy, E.D.: Natural language processing for information retrieval and knowledge discovery (1998) 0.01

0.01174667 = product of:
  0.054817792 = sum of:
    0.015792815 = weight(_text_:information in 2345) [ClassicSimilarity], result of:
      0.015792815 = score(doc=2345,freq=10.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3035872 = fieldWeight in 2345, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2345)
    0.029656855 = weight(_text_:retrieval in 2345) [ClassicSimilarity], result of:
      0.029656855 = score(doc=2345,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33085006 = fieldWeight in 2345, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2345)
    0.009368123 = product of:
      0.028104367 = sum of:
        0.028104367 = weight(_text_:22 in 2345) [ClassicSimilarity], result of:
          0.028104367 = score(doc=2345,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.2708308 = fieldWeight in 2345, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2345)
      0.33333334 = coord(1/3)
  0.21428572 = coord(3/14)

Abstract: Natural language processing (NLP) is a powerful technology for the vital tasks of information retrieval (IR) and knowledge discovery (KD) which, in turn, feed the visualization systems of the present and future and enable knowledge workers to focus more of their time on the vital tasks of analysis and prediction.
Date: 22. 9.1997 19:16:05
Imprint: Urbana-Champaign, IL : Illinois University at Urbana-Champaign, Graduate School of Library and Information Science
Source: Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al

Hmeidi, I.I.; Al-Shalabi, R.F.; Al-Taani, A.T.; Najadat, H.; Al-Hazaimeh, S.A.: A novel approach to the extraction of roots from Arabic words using bigrams (2010) 0.01

0.011625199 = product of:
  0.054250926 = sum of:
    0.032137483 = weight(_text_:wide in 3426) [ClassicSimilarity], result of:
      0.032137483 = score(doc=3426,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.24476713 = fieldWeight in 3426, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3426)
    0.0071344664 = weight(_text_:information in 3426) [ClassicSimilarity], result of:
      0.0071344664 = score(doc=3426,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.13714671 = fieldWeight in 3426, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3426)
    0.014978974 = weight(_text_:retrieval in 3426) [ClassicSimilarity], result of:
      0.014978974 = score(doc=3426,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.16710453 = fieldWeight in 3426, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3426)
  0.21428572 = coord(3/14)

Abstract: Root extraction is one of the most important topics in information retrieval (IR), natural language processing (NLP), text summarization, and many other important fields. In the last two decades, several algorithms have been proposed to extract Arabic roots. Most of these algorithms dealt with triliteral roots only, and some with fixed length words only. In this study, a novel approach to the extraction of roots from Arabic words using bigrams is proposed. Two similarity measures are used, the dissimilarity measure called the Manhattan distance, and Dice's measure of similarity. The proposed algorithm is tested on the Holy Qu'ran and on a corpus of 242 abstracts from the Proceedings of the Saudi Arabian National Computer Conferences. The two files used contain a wide range of data: the Holy Qu'ran contains most of the ancient Arabic words while the other file contains some modern Arabic words and some words borrowed from foreign languages in addition to the original Arabic words. The results of this study showed that combining N-grams with the Dice measure gives better results than using the Manhattan distance measure.
Source: Journal of the American Society for Information Science and Technology. 61(2010) no.3, S.583-591

Chandrasekar, R.; Bangalore, S.: Glean : using syntactic information in document filtering (2002) 0.01
```
0.0113330465 = product of:
  0.05288755 = sum of:
    0.017435152 = weight(_text_:web in 4257) [ClassicSimilarity], result of:
      0.017435152 = score(doc=4257,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.18028519 = fieldWeight in 4257, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4257)
    0.014268933 = weight(_text_:information in 4257) [ClassicSimilarity], result of:
      0.014268933 = score(doc=4257,freq=16.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.27429342 = fieldWeight in 4257, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4257)
    0.021183468 = weight(_text_:retrieval in 4257) [ClassicSimilarity], result of:
      0.021183468 = score(doc=4257,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 4257, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4257)
  0.21428572 = coord(3/14)
```
Abstract: In today's networked world, a huge amount of data is available in machine-processable form. Likewise, there are any number of search engines and specialized information retrieval (IR) programs that seek to extract relevant information from these data repositories. Most IR systems and Web search engines have been designed for speed and tend to maximize the quantity of information (recall) rather than the relevance of the information (precision) to the query. As a result, search engine users get inundated with information for practically any query, and are forced to scan a large number of potentially relevant items to get to the information of interest. The Holy Grail of IR is to somehow retrieve those and only those documents pertinent to the user's query. Polysemy and synonymy - the fact that often there are several meanings for a word or phrase, and likewise, many ways to express a concept - make this a very hard task. While conventional IR systems provide usable solutions, there are a number of open problems to be solved, in areas such as syntactic processing, semantic analysis, and user modeling, before we develop systems that "understand" user queries and text collections. Meanwhile, we can use tools and techniques available today to improve the precision of retrieval. In particular, using the approach described in this article, we can approximate understanding using the syntactic structure and patterns of language use that are latent in documents to make IR more effective.
Source: Encyclopedia of library and information science. Vol.71, [=Suppl.34]

Pirkola, A.; Hedlund, T.; Keskustalo, H.; Järvelin, K.: Dictionary-based cross-language information retrieval : problems, methods, and research findings (2001) 0.01

0.0113271745 = product of:
  0.07929022 = sum of:
    0.019976506 = weight(_text_:information in 3908) [ClassicSimilarity], result of:
      0.019976506 = score(doc=3908,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3840108 = fieldWeight in 3908, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.109375 = fieldNorm(doc=3908)
    0.05931371 = weight(_text_:retrieval in 3908) [ClassicSimilarity], result of:
      0.05931371 = score(doc=3908,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.6617001 = fieldWeight in 3908, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.109375 = fieldNorm(doc=3908)
  0.14285715 = coord(2/14)

Source: Information retrieval. 4(2001), S.209-230

Belbachir, F.; Boughanem, M.: Using language models to improve opinion detection (2018) 0.01
```
0.011008398 = product of:
  0.05137252 = sum of:
    0.013948122 = weight(_text_:web in 5044) [ClassicSimilarity], result of:
      0.013948122 = score(doc=5044,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.14422815 = fieldWeight in 5044, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=5044)
    0.008071727 = weight(_text_:information in 5044) [ClassicSimilarity], result of:
      0.008071727 = score(doc=5044,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.1551638 = fieldWeight in 5044, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=5044)
    0.029352674 = weight(_text_:retrieval in 5044) [ClassicSimilarity], result of:
      0.029352674 = score(doc=5044,freq=12.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.32745665 = fieldWeight in 5044, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=5044)
  0.21428572 = coord(3/14)
```
Abstract: Opinion mining is one of the most important research tasks in the information retrieval research community. With the huge volume of opinionated data available on the Web, approaches must be developed to differentiate opinion from fact. In this paper, we present a lexicon-based approach for opinion retrieval. Generally, opinion retrieval consists of two stages: relevance to the query and opinion detection. In our work, we focus on the second stage, which itself focuses on detecting opinionated documents. We compare the document to be analyzed with opinionated sources that contain subjective information. We hypothesize that a document with a strong similarity to opinionated sources is more likely to be opinionated itself. Typical lexicon-based approaches treat and choose their opinion sources according to their test collection, then calculate the opinion score based on the frequency of subjective terms in the document. In our work, we use different open opinion collections without any specific treatment and consider them as a reference collection. We then use language models to determine opinion scores. The analysis document and reference collection are represented by different language models (i.e., Dirichlet, Jelinek-Mercer and two-stage models). These language models are generally used in information retrieval to represent the relationship between documents and queries. However, in our study, we modify these language models to represent opinionated documents. We carry out several experiments using Text REtrieval Conference (TREC) Blogs 06 as our analysis collection and Internet Movie Database (IMDB), Multi-Perspective Question Answering (MPQA) and CHESLY as our reference collection. To improve opinion detection, we study the impact of using different language models to represent the document and reference collection alongside different combinations of opinion and retrieval scores. We then use this data to deduce the best opinion detection models. Using the best models, our approach improves on the best baseline of TREC Blog (baseline4) by 30%.
Source: Information processing and management. 54(2018) no.6, S.958-968

Schneider, R.: Web 3.0 ante portas? : Integration von Social Web und Semantic Web (2008) 0.01

0.01056412 = product of:
  0.07394883 = sum of:
    0.06458071 = weight(_text_:web in 4184) [ClassicSimilarity], result of:
      0.06458071 = score(doc=4184,freq=14.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.6677857 = fieldWeight in 4184, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4184)
    0.009368123 = product of:
      0.028104367 = sum of:
        0.028104367 = weight(_text_:22 in 4184) [ClassicSimilarity], result of:
          0.028104367 = score(doc=4184,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.2708308 = fieldWeight in 4184, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4184)
      0.33333334 = coord(1/3)
  0.14285715 = coord(2/14)

Abstract: The Internet as a medium is changing, and with it the conditions under which content is published and received. What opportunities do the visions of the future currently being discussed in parallel, the Social Web and the Semantic Web, offer? To answer this question, the article examines the foundations of both models with regard to application and technology, and also sheds light on their shortcomings as well as the added value of a combination suited to the medium. Using the online grammatical information system grammis as an example, a strategy for the integrative use of their respective strengths is outlined.
Date: 22. 1.2011 10:38:28
Source: Kommunikation, Partizipation und Wirkungen im Social Web, Vol. 1. Eds.: A. Zerfaß et al
Theme: Semantic Web

Nait-Baha, L.; Jackiewicz, A.; Djioua, B.; Laublet, P.: Query reformulation for information retrieval on the Web using the point of view methodology : preliminary results (2001) 0.01

0.010169638 = product of:
  0.047458313 = sum of:
    0.020922182 = weight(_text_:web in 249) [ClassicSimilarity], result of:
      0.020922182 = score(doc=249,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.21634221 = fieldWeight in 249, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=249)
    0.00856136 = weight(_text_:information in 249) [ClassicSimilarity], result of:
      0.00856136 = score(doc=249,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 249, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=249)
    0.01797477 = weight(_text_:retrieval in 249) [ClassicSimilarity], result of:
      0.01797477 = score(doc=249,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 249, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=249)
  0.21428572 = coord(3/14)

Abstract: The work we are presenting is devoted to the information collected on the WWW. By the term collected we mean the whole process of retrieving, extracting and presenting results to the user. This research is part of the RAP (Research, Analyze, Propose) project in which we propose to combine two methods: (i) query reformulation using linguistic markers according to a given point of view; and (ii) text semantic analysis by means of contextual exploration results (Descles, 1991). The general project architecture describing the interactions between the users, the RAP system and the WWW search engines is presented in Nait-Baha et al. (1998). We will focus this paper on showing how we use linguistic markers to reformulate the queries according to a given point of view

Rettinger, A.; Schumilin, A.; Thoma, S.; Ell, B.: Learning a cross-lingual semantic representation of relations expressed in text (2015) 0.01

0.0100695435 = product of:
  0.0704868 = sum of:
    0.06039714 = weight(_text_:web in 2027) [ClassicSimilarity], result of:
      0.06039714 = score(doc=2027,freq=6.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.6245262 = fieldWeight in 2027, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=2027)
    0.010089659 = weight(_text_:information in 2027) [ClassicSimilarity], result of:
      0.010089659 = score(doc=2027,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.19395474 = fieldWeight in 2027, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=2027)
  0.14285715 = coord(2/14)

Series: Information Systems and Applications, incl. Internet/Web, and HCI; Bd. 9088
Source: The Semantic Web: latest advances and new domains. 12th European Semantic Web Conference, ESWC 2015 Portoroz, Slovenia, May 31 -- June 4, 2015. Proceedings. Eds.: F. Gandon et al.
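
The score breakdown for this record can be checked by hand. The following is a minimal Python sketch, assuming the Lucene ClassicSimilarity conventions visible in the listing: tf is the square root of the term frequency, queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and coord is the fraction of query clauses that matched. The names `term_score` and `record_score` are illustrative and not part of any library; queryNorm, idf and fieldNorm are copied from the listing rather than derived.

```python
import math

QUERY_NORM = 0.029633347  # copied from the listing; depends on the whole query


def term_score(freq, idf, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def record_score(terms, matched, total):
    """Sum the per-term scores, then scale by coord(matched/total)."""
    return sum(term_score(*t) for t in terms) * (matched / total)


# Values for doc 2027: (freq, idf, fieldNorm) for "web" and "information".
doc_2027 = [(6.0, 3.2635105, 0.078125), (2.0, 1.7554779, 0.078125)]
score = record_score(doc_2027, matched=2, total=14)
print(score)
```

The result should agree with the listed 0.0100695435 up to single-precision rounding, since the listing reports 32-bit floats while Python computes in double precision.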

Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O.; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.01

0.010053988 = product of:
  0.07037791 = sum of:
    0.013536699 = weight(_text_:information in 2502) [ClassicSimilarity], result of:
      0.013536699 = score(doc=2502,freq=10.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.2602176 = fieldWeight in 2502, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2502)
    0.05684121 = weight(_text_:retrieval in 2502) [ClassicSimilarity], result of:
      0.05684121 = score(doc=2502,freq=20.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.63411707 = fieldWeight in 2502, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2502)
  0.14285715 = coord(2/14)

Abstract: Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system will lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform well when applied to this problem. Detailed results and analyses are included to support our conclusions.
Source: Journal of the American Society for Information Science and Technology. 55(2004) no.10, S.859-868

Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 0.01

0.009875985 = product of:
  0.046087928 = sum of:
    0.011415146 = weight(_text_:information in 7415) [ClassicSimilarity], result of:
      0.011415146 = score(doc=7415,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.21943474 = fieldWeight in 7415, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=7415)
    0.023966359 = weight(_text_:retrieval in 7415) [ClassicSimilarity], result of:
      0.023966359 = score(doc=7415,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.26736724 = fieldWeight in 7415, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=7415)
    0.010706427 = product of:
      0.032119278 = sum of:
        0.032119278 = weight(_text_:22 in 7415) [ClassicSimilarity], result of:
          0.032119278 = score(doc=7415,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.30952093 = fieldWeight in 7415, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=7415)
      0.33333334 = coord(1/3)
  0.21428572 = coord(3/14)

Abstract: State of the art review of natural language processing updating an earlier review published in ARIST 22(1987). Discusses important developments that have allowed for significant advances in the field of natural language processing: materials and resources; knowledge based systems and statistical approaches; and a strong emphasis on evaluation. Reviews some natural language processing applications and common problems still awaiting solution. Considers closely related applications such as language generation and the generation phase of machine translation, which face the same problems as natural language processing. Covers natural language methodologies for information retrieval only briefly.
Source: Annual review of information science and technology. 31(1996), S.83-119
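
The breakdown for this record contains a nested clause: the match on "22" is first scaled by its own coord(1/3) and only then added to the outer sum, which is in turn scaled by coord(3/14). A minimal Python sketch of that evaluation follows. The tuple layout and the function name `evaluate` are assumptions made for illustration, and the leaf values are copied from the listing.

```python
def evaluate(node):
    """Evaluate a nested score tree.

    A node is either a float (an already computed term score) or a tuple
    (children, matched, total): the children are summed and the sum is
    multiplied by coord = matched / total.
    """
    if isinstance(node, float):
        return node
    children, matched, total = node
    return sum(evaluate(child) for child in children) * (matched / total)


# Tree for doc 7415, leaf scores copied from the listing.
inner = ([0.032119278], 1, 3)                           # "22", coord(1/3)
doc_7415 = ([0.011415146, 0.023966359, inner], 3, 14)   # coord(3/14)

inner_score = evaluate(inner)
total_score = evaluate(doc_7415)
print(inner_score, total_score)
```

The two values should match the listed 0.010706427 and 0.009875985 up to single-precision rounding.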

McCune, B.P.; Tong, R.M.; Dean, J.S.: Rubric: a system for rule-based information retrieval (1985) 0.01

0.009709007 = product of:
  0.06796305 = sum of:
    0.01712272 = weight(_text_:information in 1945) [ClassicSimilarity], result of:
      0.01712272 = score(doc=1945,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3291521 = fieldWeight in 1945, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=1945)
    0.050840326 = weight(_text_:retrieval in 1945) [ClassicSimilarity], result of:
      0.050840326 = score(doc=1945,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.5671716 = fieldWeight in 1945, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=1945)
  0.14285715 = coord(2/14)

Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.440-445.

Sembok, T.M.T.; Rijsbergen, C.J. van: SILOL: a simple logical-linguistic document retrieval system (1990) 0.01

0.009539582 = product of:
  0.06677707 = sum of:
    0.008071727 = weight(_text_:information in 6684) [ClassicSimilarity], result of:
      0.008071727 = score(doc=6684,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.1551638 = fieldWeight in 6684, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=6684)
    0.05870535 = weight(_text_:retrieval in 6684) [ClassicSimilarity], result of:
      0.05870535 = score(doc=6684,freq=12.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.6549133 = fieldWeight in 6684, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=6684)
  0.14285715 = coord(2/14)

Abstract: Describes a system called SILOL which is based on a logical-linguistic model of document retrieval systems. SILOL uses a shallow semantic translation of natural language texts into a first order predicate representation in performing a document indexing and retrieval process. Some preliminary experiments have been carried out to test the retrieval effectiveness of this system. The results obtained show improvements in the level of retrieval effectiveness, which demonstrate that the approach of using a semantic theory of natural language and logic in document retrieval systems is a valid one.
Source: Information processing and management. 26(1990) no.1, S.111-134
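
The idf values in these breakdowns are consistent with the classic Lucene formula idf = 1 + ln(maxDocs / (docFreq + 1)). The Python sketch below checks this against the two terms matched for this record; the function name `classic_idf` is an illustrative choice, and the formula is inferred from the listed numbers rather than confirmed from the system's configuration.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


MAX_DOCS = 44218  # collection size reported in the listing

idf_information = classic_idf(20772, MAX_DOCS)  # listing shows 1.7554779
idf_retrieval = classic_idf(5836, MAX_DOCS)     # listing shows 3.024915
print(idf_information, idf_retrieval)
```

Because "information" occurs in almost half of the 44218 records, its idf stays close to 1, which is one reason the "retrieval" clause dominates the score of this record.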

Kunze, C.: Lexikalisch-semantische Wortnetze in Sprachwissenschaft und Sprachtechnologie (2006) 0.01

0.009520924 = product of:
  0.066646464 = sum of:
    0.058574736 = weight(_text_:elektronische in 6023) [ClassicSimilarity], result of:
      0.058574736 = score(doc=6023,freq=2.0), product of:
        0.14013545 = queryWeight, product of:
          4.728978 = idf(docFreq=1061, maxDocs=44218)
          0.029633347 = queryNorm
        0.41798657 = fieldWeight in 6023, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.728978 = idf(docFreq=1061, maxDocs=44218)
          0.0625 = fieldNorm(doc=6023)
    0.008071727 = weight(_text_:information in 6023) [ClassicSimilarity], result of:
      0.008071727 = score(doc=6023,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.1551638 = fieldWeight in 6023, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=6023)
  0.14285715 = coord(2/14)

Abstract: This article describes the structuring principles and application contexts of lexical-semantic wordnets, in particular the German wordnet GermaNet. Wordnets are currently especially popular electronic lexical resources that offer broad coverage of semantically structured data for various languages and language groups. Wordnets represent the most frequent and most important concepts of a language together with their elementary semantic relations. Central applications of wordnets include word sense disambiguation and information indexing. The article outlines the latest scenarios in which GermaNet is used: semantic information indexing and the integration of general-language wordnets with terminological resources against the background of data conversion to OWL.
Source: Information - Wissenschaft und Praxis. 57(2006) H.6/7, S.309-314

Frappaolo, C.: Artificial intelligence and text retrieval : a current perspective on the state of the art (1992) 0.01

0.009451089 = product of:
  0.066157624 = sum of:
    0.014268933 = weight(_text_:information in 7097) [ClassicSimilarity], result of:
      0.014268933 = score(doc=7097,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.27429342 = fieldWeight in 7097, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=7097)
    0.05188869 = weight(_text_:retrieval in 7097) [ClassicSimilarity], result of:
      0.05188869 = score(doc=7097,freq=6.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.5788671 = fieldWeight in 7097, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=7097)
  0.14285715 = coord(2/14)

Abstract: Brief discussion of the ways in which computerized information retrieval and database searching can be enhanced by integrating artificial intelligence with such search systems. Explores the possibility of integrating the powers and capabilities of artificial intelligence (specifically natural language processing) with text retrieval.
Imprint: Medford, NJ : Learned Information Inc.

Yannakoudakis, E.J.; Daraki, J.J.: Lexical clustering and retrieval of bibliographic records (1994) 0.01

0.009356101 = product of:
  0.065492705 = sum of:
    0.014125523 = weight(_text_:information in 1045) [ClassicSimilarity], result of:
      0.014125523 = score(doc=1045,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.27153665 = fieldWeight in 1045, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1045)
    0.05136718 = weight(_text_:retrieval in 1045) [ClassicSimilarity], result of:
      0.05136718 = score(doc=1045,freq=12.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.5730491 = fieldWeight in 1045, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1045)
  0.14285715 = coord(2/14)

Abstract: Presents a new system that enables users to retrieve catalogue entries on the basis of their lexical similarities and to cluster records in a dynamic fashion. Describes the information retrieval system developed by the Department of Informatics, Athens University of Economics and Business, Greece. The system also offers the means for cyclic retrieval of records from each cluster while allowing the user to define the field to be used in each case. The approach is based on logical keys which are derived from pertinent bibliographic fields and are used for all clustering and information retrieval functions.
Source: Information retrieval: new systems and current research. Proceedings of the 15th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Glasgow 1993. Ed.: Ruben Leon
