Search (4 results, page 1 of 1)

Yusuff, A.: Automatisches Indexing and Abstracting : Grundlagen und Beispiele (2002) 0.00

0.00334869 = product of:
  0.00669738 = sum of:
    0.00669738 = product of:
      0.01339476 = sum of:
        0.01339476 = weight(_text_:a in 1577) [ClassicSimilarity], result of:
          0.01339476 = score(doc=1577,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.25222903 = fieldWeight in 1577, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=1577)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Imprint: Potsdam : Fachhochschule, FB A-B-D

Salton, G.; Allan, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine readable texts (1994) 0.00

0.0023919214 = product of:
  0.0047838427 = sum of:
    0.0047838427 = product of:
      0.009567685 = sum of:
        0.009567685 = weight(_text_:a in 1949) [ClassicSimilarity], result of:
          0.009567685 = score(doc=1949,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.18016359 = fieldWeight in 1949, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=1949)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.00
```
0.0020714647 = product of:
  0.0041429293 = sum of:
    0.0041429293 = product of:
      0.008285859 = sum of:
        0.008285859 = weight(_text_:a in 5400) [ClassicSimilarity], result of:
          0.008285859 = score(doc=5400,freq=12.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.15602624 = fieldWeight in 5400, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.

Type

a
Jones, S.; Paynter, G.W.: Automatic extractionof document keyphrases for use in digital libraries : evaluations and applications (2002) 0.00
```
0.0011959607 = product of:
  0.0023919214 = sum of:
    0.0023919214 = product of:
      0.0047838427 = sum of:
        0.0047838427 = weight(_text_:a in 601) [ClassicSimilarity], result of:
          0.0047838427 = score(doc=601,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.090081796 = fieldWeight in 601, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.

Type

a

Search (4 results, page 1 of 1)

Authors

Years

Languages

Types