Search (443 results, page 23 of 23)

  • × theme_ss:"Retrievalstudien"
  • × type_ss:"a"
  1. Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints (2018) 0.00
    4.3479266E-4 = product of:
      0.004782719 = sum of:
        0.004782719 = weight(_text_:a in 4309) [ClassicSimilarity], result of:
          0.004782719 = score(doc=4309,freq=12.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.15602624 = fieldWeight in 4309, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4309)
      0.09090909 = coord(1/11)
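The explain tree above is Lucene's ClassicSimilarity (TF-IDF) breakdown: the term score is tf (square root of the term frequency) times idf times fieldNorm for the field weight, times queryWeight (idf times queryNorm), scaled by the coordination factor. A minimal sketch that reproduces the displayed numbers, using only the constants printed in the tree (the function name is illustrative, not Lucene API):

```python
import math

def classic_similarity(freq, idf, field_norm, query_norm, coord):
    """Recompute a single-term ClassicSimilarity score as in the explain tree."""
    tf = math.sqrt(freq)                  # 3.4641016 = tf(freq=12.0)
    field_weight = tf * idf * field_norm  # 0.15602624 = fieldWeight
    query_weight = idf * query_norm       # 0.030653298 = queryWeight
    return query_weight * field_weight * coord

# Constants from the explain tree for result 1 (doc 4309):
score = classic_similarity(freq=12.0, idf=1.153047,
                           field_norm=0.0390625,
                           query_norm=0.026584605,
                           coord=1 / 11)  # 0.09090909 = coord(1/11)
# score matches the 4.3479266E-4 shown at the top of the tree
```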
    
    Abstract
    Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to the relevance of particular subjects, disregarding indicators of insufficient content representation at the document-level. Therefore, we propose a novel approach that detects documents rather than concepts where quality criteria are met. Our approach uses a deep, multi-layered regression architecture, which comprises a variety of content-based indicators. We evaluated multiple configurations using text collections from law and economics, where the available content is restricted to very short texts. Notably, we demonstrate that the proposed quality estimation technique can determine subsets of the previously unseen data where considerable gains in document-level recall can be achieved, while upholding precision at the same time. Hence, the approach effectively performs a filtering that ensures high data quality standards in operative information retrieval systems.
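The filtering the abstract describes amounts to thresholding a document-level quality score: accept the largest subset of documents whose precision still meets the constraint. A hypothetical sketch of that thresholding step (the function, toy data, and selection rule are illustrative assumptions, not the paper's deep regression architecture, which would supply the scores):

```python
def select_threshold(scores, correct, min_precision):
    """Return the lowest score threshold such that accepting every
    document scoring >= threshold keeps precision >= min_precision."""
    # Walk documents from highest to lowest estimated quality.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best, hits = None, 0
    for rank, i in enumerate(order, start=1):
        hits += correct[i]           # correct[i] = 1 if doc i was indexed correctly
        if hits / rank >= min_precision:
            best = scores[i]         # this larger accepted set still satisfies the constraint
    return best

# Toy example: four documents with estimated quality scores and gold judgements.
threshold = select_threshold(scores=[0.9, 0.8, 0.7, 0.6],
                             correct=[1, 1, 0, 1],
                             min_precision=0.75)
# threshold == 0.6: accepting all four documents yields precision 3/4 = 0.75
```

Lowering the threshold this way grows the accepted subset (raising document-level recall) while the precision constraint is upheld, which is the trade-off the abstract claims.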
    Content
    This is an authors' manuscript version of a paper accepted for the proceedings of TPDL-2018, Porto, Portugal, Sept 10-13. The final authenticated publication is available online at https://doi.org/ (the DOI will be added as soon as available).
    Type
    a
  2. Tague-Sutcliffe, J.: Information retrieval experimentation (2009) 0.00
    4.0164427E-4 = product of:
      0.0044180867 = sum of:
        0.0044180867 = weight(_text_:a in 3801) [ClassicSimilarity], result of:
          0.0044180867 = score(doc=3801,freq=4.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.14413087 = fieldWeight in 3801, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=3801)
      0.09090909 = coord(1/11)
    
    Abstract
    Jean Tague-Sutcliffe was an important figure in information retrieval experimentation. Here, she reviews the history of IR research, and provides a description of the fundamental paradigm of information retrieval experimentation that continues to dominate the field.
    Type
    a
  3. Voorhees, E.M.: Text REtrieval Conference (TREC) (2009) 0.00
    4.0164427E-4 = product of:
      0.0044180867 = sum of:
        0.0044180867 = weight(_text_:a in 3890) [ClassicSimilarity], result of:
          0.0044180867 = score(doc=3890,freq=4.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.14413087 = fieldWeight in 3890, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=3890)
      0.09090909 = coord(1/11)
    
    Abstract
    This entry summarizes the history, results, and impact of the Text REtrieval Conference (TREC), a workshop series designed to support the information retrieval community by building the infrastructure necessary for large-scale evaluation of retrieval technology.
    Type
    a
