Search (55 results, page 3 of 3)

Mansour, N.; Haraty, R.A.; Daher, W.; Houri, M.: ¬An auto-indexing method for Arabic text (2008) 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 2103) [ClassicSimilarity], result of:
          0.018809535 = score(doc=2103,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 2103, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=2103)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
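
The explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the older Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which match the figures shown. The function names are illustrative and are not part of Lucene; queryNorm and fieldNorm are simply copied from the output rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor, assuming tf = sqrt(freq)."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    """Recompute the top-level score of one explain tree.

    query_norm and field_norm are taken as given (copied from the output);
    coords holds the coord(1/2) factors applied at the two outer levels.
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm  # fieldWeight
    score = query_weight * field_weight                 # weight(_text_:m ...)
    for c in coords:
        score *= c
    return score


# Values copied from the explain tree for doc 2103.
score = explain_score(
    freq=2.0,
    doc_freq=9980,
    max_docs=44218,
    query_norm=0.045820985,
    field_norm=0.046875,
)
print(f"{score:.7f}")
```

Lucene computes these values in single precision, so the last digits may differ slightly from the double-precision result. Across the results on this page only fieldNorm changes (0.046875, 0.0390625, 0.03125, 0.02734375), which accounts for the differences in the final scores.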

Correa, C.A.; Kobashi, N.Y.: ¬A hybrid model of automatic indexing based on paraconsistent logic 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 3537) [ClassicSimilarity], result of:
          0.018809535 = score(doc=3537,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 3537, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=3537)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Paradigms and conceptual systems in knowledge organization: Proceedings of the Eleventh International ISKO conference, Rome, 23-26 February 2010, ed. Claudio Gnoli, Indeks, Frankfurt M

Banerjee, K.; Johnson, M.: Improving access to archival collections with automated entity extraction (2015) 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 2144) [ClassicSimilarity], result of:
          0.018809535 = score(doc=2144,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 2144, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=2144)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Fauzi, F.; Belkhatir, M.: Multifaceted conceptual image indexing on the world wide web (2013) 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 2721) [ClassicSimilarity], result of:
          0.018809535 = score(doc=2721,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 2721, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=2721)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Snajder, J.; Dalbelo Basic, B.D.; Tadic, M.: Automatic acquisition of inflectional lexica for morphological normalisation (2008) 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 2910) [ClassicSimilarity], result of:
          0.018809535 = score(doc=2910,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 2910, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=2910)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Zhitomirsky-Geffet, M.; Prebor, G.; Bloch, O.: Improving proverb search and retrieval with a generic multidimensional ontology (2017) 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 3320) [ClassicSimilarity], result of:
          0.018809535 = score(doc=3320,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 3320, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=3320)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 720) [ClassicSimilarity], result of:
          0.018809535 = score(doc=720,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 720, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=720)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract

This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining operation task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are fundamental especially for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately, we achieved good results with the toolkit BERT (Bidirectional Encoder Representations from Transformers). The project and its workflows represent a preview of what may lie ahead in the future of crafting metadata using text mining techniques to enhance discoverability.

Asula, M.; Makke, J.; Freienthal, L.; Kuulmets, H.-A.; Sirel, R.: Kratt: developing an automatic subject indexing tool for the National Library of Estonia : how to transfer metadata information among work cluster members (2021) 0.00

0.0047023837 = product of:
  0.009404767 = sum of:
    0.009404767 = product of:
      0.018809535 = sum of:
        0.018809535 = weight(_text_:m in 723) [ClassicSimilarity], result of:
          0.018809535 = score(doc=723,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.1649624 = fieldWeight in 723, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.046875 = fieldNorm(doc=723)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.00

0.003918653 = product of:
  0.007837306 = sum of:
    0.007837306 = product of:
      0.015674612 = sum of:
        0.015674612 = weight(_text_:m in 2918) [ClassicSimilarity], result of:
          0.015674612 = score(doc=2918,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.13746867 = fieldWeight in 2918, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2918)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Donahue, J.; Hendricks, L.A.; Guadarrama, S.; Rohrbach, M.; Venugopalan, S.; Saenko, K.; Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description (2014) 0.00

0.003918653 = product of:
  0.007837306 = sum of:
    0.007837306 = product of:
      0.015674612 = sum of:
        0.015674612 = weight(_text_:m in 1873) [ClassicSimilarity], result of:
          0.015674612 = score(doc=1873,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.13746867 = fieldWeight in 1873, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1873)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Golub, K.; Soergel, D.; Buchanan, G.; Tudhope, D.; Lykke, M.; Hiom, D.: ¬A framework for evaluating automatic indexing or classification in the context of retrieval (2016) 0.00

0.003918653 = product of:
  0.007837306 = sum of:
    0.007837306 = product of:
      0.015674612 = sum of:
        0.015674612 = weight(_text_:m in 3311) [ClassicSimilarity], result of:
          0.015674612 = score(doc=3311,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.13746867 = fieldWeight in 3311, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3311)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints 0.00

0.003918653 = product of:
  0.007837306 = sum of:
    0.007837306 = product of:
      0.015674612 = sum of:
        0.015674612 = weight(_text_:m in 4309) [ClassicSimilarity], result of:
          0.015674612 = score(doc=4309,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.13746867 = fieldWeight in 4309, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4309)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Ahmed, M.: Automatic indexing for agriculture : designing a framework by deploying Agrovoc, Agris and Annif (2023) 0.00

0.003918653 = product of:
  0.007837306 = sum of:
    0.007837306 = product of:
      0.015674612 = sum of:
        0.015674612 = weight(_text_:m in 1024) [ClassicSimilarity], result of:
          0.015674612 = score(doc=1024,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.13746867 = fieldWeight in 1024, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1024)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.00

0.0031349224 = product of:
  0.0062698447 = sum of:
    0.0062698447 = product of:
      0.012539689 = sum of:
        0.012539689 = weight(_text_:m in 3081) [ClassicSimilarity], result of:
          0.012539689 = score(doc=3081,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.10997493 = fieldWeight in 3081, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Needham, R.M.; Sparck Jones, K.: Keywords and clumps (1985) 0.00

0.002743057 = product of:
  0.005486114 = sum of:
    0.005486114 = product of:
      0.010972228 = sum of:
        0.010972228 = weight(_text_:m in 3645) [ClassicSimilarity], result of:
          0.010972228 = score(doc=3645,freq=2.0), product of:
            0.114023164 = queryWeight, product of:
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.045820985 = queryNorm
            0.09622806 = fieldWeight in 3645, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4884486 = idf(docFreq=9980, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3645)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract

The selection that follows was chosen as it represents "a very early paper on the possibilities allowed by computers in documentation." In the early 1960s computers were being used to provide simple automatic indexing systems wherein keywords were extracted from documents. The problem with such systems was that they lacked vocabulary control, thus documents related in subject matter were not always collocated in retrieval. To improve retrieval by improving recall is the raison d'être of vocabulary control tools such as classifications and thesauri. The question arose whether it was possible by automatic means to construct classes of terms which, when substituted one for another, could be used to improve retrieval performance. One of the first theoretical approaches to this question was initiated by R. M. Needham and Karen Sparck Jones at the Cambridge Language Research Institute in England. The question was later pursued using experimental methodologies by Sparck Jones, who, as a Senior Research Associate in the Computer Laboratory at the University of Cambridge, has devoted her life's work to research in information retrieval and automatic natural language processing. Based on the principles of numerical taxonomy, automatic classification techniques start from the premise that two objects are similar to the degree that they share attributes in common. When these two objects are keywords, their similarity is measured in terms of the number of documents they index in common. Step 1 in automatic classification is to compute mathematically the degree to which two terms are similar. Step 2 is to group together those terms that are "most similar" to each other, forming equivalence classes of intersubstitutable terms. The technique for forming such classes varies and is the factor that characteristically distinguishes different approaches to automatic classification.
The technique used by Needham and Sparck Jones, that of clumping, is described in the selection that follows. Questions that must be asked are whether the use of automatically generated classes really does improve retrieval performance and whether there is a true economic advantage in substituting mechanical for manual labor. Several years after her work with clumping, Sparck Jones was to observe that while it was not wholly satisfactory in itself, it was valuable in that it stimulated research into automatic classification. To this it might be added that it was valuable in that it introduced to library/information science the methods of numerical taxonomy, thus stimulating us to think again about the fundamental nature and purpose of classification. In this connection it might be useful to review how automatically derived classes differ from those of manually constructed classifications: 1) the manner of their derivation is purely a posteriori, the ultimate operationalization of the principle of literary warrant; 2) the relationship between members forming such classes is essentially statistical; the members of a given class are similar to each other not because they possess the class-defining characteristic but by virtue of sharing a family resemblance; and finally, 3) automatically derived classes are not related meaningfully one to another, that is, they are not ordered in traditional hierarchical and precedence relationships.
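
The two steps named in the abstract can be illustrated with a small sketch. It is not the clumping technique of Needham and Sparck Jones: as an assumption, similarity is taken to be the Tanimoto (Jaccard) coefficient over the sets of documents each keyword indexes, and grouping is done by simple threshold linking (connected components). The postings data, function names, and threshold are all hypothetical.

```python
from itertools import combinations


def term_similarity(postings, a, b):
    """Step 1: similarity of two keywords from the documents they index in common.

    Assumption: Tanimoto coefficient = |shared docs| / |docs indexed by either term|.
    """
    docs_a, docs_b = postings[a], postings[b]
    union = len(docs_a | docs_b)
    return len(docs_a & docs_b) / union if union else 0.0


def group_terms(postings, threshold):
    """Step 2: link terms whose similarity reaches the threshold and
    return the resulting connected components as sorted lists."""
    parent = {term: term for term in postings}

    def find(term):
        # Follow parent links to the representative, compressing the path.
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for a, b in combinations(sorted(postings), 2):
        if term_similarity(postings, a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for term in postings:
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical keyword -> document-id postings.
postings = {
    "indexing": {1, 2, 3},
    "thesaurus": {2, 3, 4},
    "retrieval": {1, 2, 3, 4},
    "network": {7, 8},
    "neural": {7, 8, 9},
}

classes = group_terms(postings, threshold=0.5)
print(classes)
# [['indexing', 'retrieval', 'thesaurus'], ['network', 'neural']]
```

As the abstract notes, the method of forming classes is what distinguishes one approach from another. Threshold linking is used here only because it is the shortest to state.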
