Search (7 results, page 1 of 1)

  • Active filter: author_ss:"Lin, J."
  1. Lin, J.; Katz, B.: Building a reusable test collection for question answering (2006) 0.01
    0.00609702 = product of:
      0.04267914 = sum of:
        0.01075265 = weight(_text_:information in 5045) [ClassicSimilarity], result of:
          0.01075265 = score(doc=5045,freq=6.0), product of:
            0.05334617 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030388402 = queryNorm
            0.20156369 = fieldWeight in 5045, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5045)
        0.03192649 = weight(_text_:retrieval in 5045) [ClassicSimilarity], result of:
          0.03192649 = score(doc=5045,freq=6.0), product of:
            0.091922335 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.030388402 = queryNorm
            0.34732026 = fieldWeight in 5045, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5045)
      0.14285715 = coord(2/14)
    
    Abstract
    In contrast to traditional information retrieval systems, which return ranked lists of documents that users must manually browse through, a question answering system attempts to directly answer natural language questions posed by the user. Although such systems possess language-processing capabilities, they still rely on traditional document retrieval techniques to generate an initial candidate set of documents. In this article, the authors argue that document retrieval for question answering represents a task different from retrieving documents in response to more general retrospective information needs. Thus, to guide future system development, specialized question answering test collections must be constructed. They show that the current evaluation resources have major shortcomings; to remedy the situation, they have manually created a small, reusable question answering test collection for research purposes. In this article they describe their methodology for building this test collection and discuss issues they encountered regarding the notion of "answer correctness."
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.7, pp.851-861
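    The breakdown above is Lucene's ClassicSimilarity (TF-IDF) explain output. As a minimal sketch, the printed factors can be reproduced from the standard ClassicSimilarity formulas (tf = sqrt(freq); idf = 1 + ln(maxDocs/(docFreq+1)); fieldWeight = tf * idf * fieldNorm); queryNorm is taken directly from the trace, since it is computed over all 14 query clauses and only the two matching ones are shown here:

        import math

        def idf(doc_freq, max_docs):
            # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
            return 1.0 + math.log(max_docs / (doc_freq + 1))

        def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
            # weight(term) = queryWeight * fieldWeight, where
            #   queryWeight = idf * queryNorm   (queryNorm read off the trace)
            #   fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
            term_idf = idf(doc_freq, max_docs)
            return (term_idf * query_norm) * (math.sqrt(freq) * term_idf * field_norm)

        # The two matching clauses of result 1 (doc 5045):
        info = term_score(6.0, 20772, 44218, 0.030388402, 0.046875)  # ~0.01075265
        retr = term_score(6.0, 5836, 44218, 0.030388402, 0.046875)   # ~0.03192649

        # coord(2/14): only 2 of the 14 query clauses matched this document,
        # so the summed clause weights are scaled by 2/14.
        print((info + retr) * 2 / 14)  # ~0.00609702

    The same arithmetic, with freq, docFreq, fieldNorm, and the coord ratio swapped in, reproduces every score in this result list.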
  2. Lin, J.; DiCuccio, M.; Grigoryan, V.; Wilbur, W.J.: Navigating information spaces : a case study of related article search in PubMed (2008) 0.01
    0.005707069 = product of:
      0.03994948 = sum of:
        0.013881613 = weight(_text_:information in 2124) [ClassicSimilarity], result of:
          0.013881613 = score(doc=2124,freq=10.0), product of:
            0.05334617 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030388402 = queryNorm
            0.2602176 = fieldWeight in 2124, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2124)
        0.026067868 = weight(_text_:retrieval in 2124) [ClassicSimilarity], result of:
          0.026067868 = score(doc=2124,freq=4.0), product of:
            0.091922335 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.030388402 = queryNorm
            0.2835858 = fieldWeight in 2124, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2124)
      0.14285715 = coord(2/14)
    
    Abstract
    The concept of an "information space" provides a powerful metaphor for guiding the design of interactive retrieval systems. We present a case study of related article search, a browsing tool designed to help users navigate the information space defined by results of the PubMed® search engine. This feature leverages content-similarity links that tie MEDLINE® citations together in a vast document network. We examine the effectiveness of related article search from two perspectives: a topological analysis of networks generated from information needs represented in the TREC 2005 genomics track and a query log analysis of real PubMed users. Together, these data suggest that related article search is a useful feature and that browsing related articles has become an integral part of how users interact with PubMed.
    Source
    Information Processing and Management. 44(2008) no.5, pp.1771-1783
    Theme
    Semantic environment in indexing and retrieval
  3. Lin, J.: User simulations for evaluating answers to question series (2007) 0.00
    0.0035201162 = product of:
      0.024640812 = sum of:
        0.0062080454 = weight(_text_:information in 914) [ClassicSimilarity], result of:
          0.0062080454 = score(doc=914,freq=2.0), product of:
            0.05334617 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030388402 = queryNorm
            0.116372846 = fieldWeight in 914, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=914)
        0.018432766 = weight(_text_:retrieval in 914) [ClassicSimilarity], result of:
          0.018432766 = score(doc=914,freq=2.0), product of:
            0.091922335 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.030388402 = queryNorm
            0.20052543 = fieldWeight in 914, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=914)
      0.14285715 = coord(2/14)
    
    Abstract
    Recently, question series have become one focus of research in question answering. These series consist of individual factoid, list, and "other" questions organized around a central topic, and represent abstractions of user-system dialogs. Existing evaluation methodologies have yet to catch up with this richer task model, as they fail to take into account contextual dependencies and different user behaviors. This paper presents a novel simulation-based methodology for evaluating answers to question series that addresses some of these shortcomings. Using this methodology, we examine two different behavior models: a "QA-styled" user and an "IR-styled" user. Results suggest that an off-the-shelf document retrieval system is competitive with state-of-the-art QA systems in this task. Advantages and limitations of evaluations based on user simulations are also discussed.
    Source
    Information Processing and Management. 43(2007) no.3, pp.717-729
  4. Yang, C.C.; Lin, J.; Wei, C.-P.: Retaining knowledge for document management : category-tree integration by exploiting category relationships and hierarchical structures (2010) 0.00
    5.225894E-4 = product of:
      0.0073162518 = sum of:
        0.0073162518 = weight(_text_:information in 3581) [ClassicSimilarity], result of:
          0.0073162518 = score(doc=3581,freq=4.0), product of:
            0.05334617 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030388402 = queryNorm
            0.13714671 = fieldWeight in 3581, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3581)
      0.071428575 = coord(1/14)
    
    Abstract
    The category-tree document-classification structure is widely used by enterprises and information providers to organize, archive, and access documents for effective knowledge management. However, category trees from various sources use different hierarchical structures, which usually make mappings between categories in different category trees difficult. In this work, we propose a category-tree integration technique. We develop a method to learn the relationships between any two categories and define operations such as mapping, splitting, and insertion for this integration. According to the parent-child relationship of the integrating categories, the developed decision rules use integration operations to integrate categories from the source category tree with those from the master category tree. A unified category tree can accumulate knowledge from multiple resources without forfeiting the knowledge in individual category trees. Experiments have been conducted to measure the performance of the integration operations and the accuracy of the integrated category trees. The proposed category-tree integration technique achieves greater than 80% integration accuracy, and the insert operation is the most frequently used, followed by map and split. The insert operation achieves an F1 of 77%, while the map and split operations achieve F1 scores of 86% and 29%, respectively.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.7, pp.1313-1331
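    Doc 3581 shows the degenerate case of the coord factor: only one of the 14 query clauses ("information") matches, so the summed weight is scaled by 1/14 (0.0073162518 * 0.071428575 ≈ 5.225894E-4). Its lower fieldNorm (0.0390625, versus the 0.046875 seen above) is ClassicSimilarity's length normalization, roughly 1/sqrt(field length) quantized to 8 bits, indicating a longer abstract field.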
  5. Zajic, D.; Dorr, B.J.; Lin, J.; Schwartz, R.: Multi-candidate reduction : sentence compression as a tool for document summarization tasks (2007) 0.00
    5.173371E-4 = product of:
      0.0072427196 = sum of:
        0.0072427196 = weight(_text_:information in 944) [ClassicSimilarity], result of:
          0.0072427196 = score(doc=944,freq=2.0), product of:
            0.05334617 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030388402 = queryNorm
            0.13576832 = fieldWeight in 944, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=944)
      0.071428575 = coord(1/14)
    
    Source
    Information Processing and Management. 43(2007) no.6, pp.1549-1570
  6. Zajic, D.M.; Dorr, B.J.; Lin, J.: Single-document and multi-document summarization techniques for email threads using sentence compression (2008) 0.00
    5.173371E-4 = product of:
      0.0072427196 = sum of:
        0.0072427196 = weight(_text_:information in 2105) [ClassicSimilarity], result of:
          0.0072427196 = score(doc=2105,freq=2.0), product of:
            0.05334617 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030388402 = queryNorm
            0.13576832 = fieldWeight in 2105, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2105)
      0.071428575 = coord(1/14)
    
    Source
    Information Processing and Management. 44(2008) no.4, pp.1600-1610
  7. Hawes, T.; Lin, J.; Resnik, P.: Elements of a computational model for multi-party discourse : the turn-taking behavior of Supreme Court justices (2009) 0.00
    5.173371E-4 = product of:
      0.0072427196 = sum of:
        0.0072427196 = weight(_text_:information in 3087) [ClassicSimilarity], result of:
          0.0072427196 = score(doc=3087,freq=2.0), product of:
            0.05334617 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030388402 = queryNorm
            0.13576832 = fieldWeight in 3087, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3087)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.8, pp.1607-1615
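    Results 5 through 7 tie at exactly 5.173371E-4 because their explain traces are identical: a single matching clause ("information" with freq=2.0), the same fieldNorm (0.0546875), and the same coord(1/14). Under ClassicSimilarity, no other property of these documents enters the score.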