Search (29 results, page 1 of 2)

Plaunt, C.; Norgard, B.A.: ¬An association-based method for automatic indexing with a controlled vocabulary (1998) 0.04
```
0.039735876 = product of:
  0.07947175 = sum of:
    0.07947175 = sum of:
      0.0444101 = weight(_text_:cataloging in 1794) [ClassicSimilarity], result of:
        0.0444101 = score(doc=1794,freq=2.0), product of:
          0.20397975 = queryWeight, product of:
            3.9411201 = idf(docFreq=2334, maxDocs=44218)
            0.051756795 = queryNorm
          0.21771818 = fieldWeight in 1794, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.9411201 = idf(docFreq=2334, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1794)
      0.035061657 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
        0.035061657 = score(doc=1794,freq=2.0), product of:
          0.18124348 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.051756795 = queryNorm
          0.19345059 = fieldWeight in 1794, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1794)
  0.5 = coord(1/2)
```
Abstract

In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and controlled vocabulary subject headings assigned to those records by human indexers, using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial match information retrieval problem. We consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document.

Date

11. 9.2000 19:53:22
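
The score breakdown for the first result above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and taking queryNorm, fieldNorm, and coord directly from the listing rather than deriving them. The function names `idf` and `term_score` are illustrative and not part of any Lucene API.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 1794.
MAX_DOCS = 44218
QUERY_NORM = 0.051756795
FIELD_NORM = 0.0390625
COORD = 0.5  # coord(1/2) from the listing

cataloging = term_score(2.0, 2334, MAX_DOCS, QUERY_NORM, FIELD_NORM)
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)
total = (cataloging + twenty_two) * COORD

print(f"cataloging={cataloging:.6f} 22={twenty_two:.6f} total={total:.6f}")
```

Small differences in the last digits are to be expected, since Lucene computes these values in single precision. The later results that match only one term follow the same pattern, with an additional inner coord(1/2) factor.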

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03

0.028049326 = product of:
  0.05609865 = sum of:
    0.05609865 = product of:
      0.1121973 = sum of:
        0.1121973 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.1121973 = score(doc=402,freq=2.0), product of:
            0.18124348 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051756795 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information processing and management. 22(1986) no.6, S.465-476

Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02

0.024543159 = product of:
  0.049086317 = sum of:
    0.049086317 = product of:
      0.098172635 = sum of:
        0.098172635 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.098172635 = score(doc=6265,freq=2.0), product of:
            0.18124348 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051756795 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information outlook. 9(2005) no.8, S.22-23

Junger, U.: Can indexing be automated? : the example of the Deutsche Nationalbibliothek (2014) 0.02

0.021981878 = product of:
  0.043963756 = sum of:
    0.043963756 = product of:
      0.08792751 = sum of:
        0.08792751 = weight(_text_:cataloging in 1969) [ClassicSimilarity], result of:
          0.08792751 = score(doc=1969,freq=4.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.43106002 = fieldWeight in 1969, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1969)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: The German Integrated Authority File (Gemeinsame Normdatei, GND), provides a broad controlled vocabulary for indexing documents on all subjects. Traditionally used for intellectual subject cataloging primarily for books, the Deutsche Nationalbibliothek (DNB, German National Library) has been working on developing and implementing procedures for automated assignment of subject headings for online publications. This project, its results, and problems are outlined in this article.
Source: Cataloging and classification quarterly. 52(2014) no.1, S.102-109

Lichtenstein, A.; Plank, M.; Neumann, J.: TIB's portal for audiovisual media : combining manual and automatic indexing (2014) 0.02
```
0.021981878 = product of:
  0.043963756 = sum of:
    0.043963756 = product of:
      0.08792751 = sum of:
        0.08792751 = weight(_text_:cataloging in 1981) [ClassicSimilarity], result of:
          0.08792751 = score(doc=1981,freq=4.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.43106002 = fieldWeight in 1981, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1981)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The German National Library of Science and Technology (TIB) developed a Web-based platform for audiovisual media. The audiovisual portal optimizes access to scientific videos such as computer animations and lecture and conference recordings. TIB's AV-Portal combines traditional cataloging and automatic indexing of audiovisual media. The article describes metadata standards for audiovisual media and introduces the TIB's metadata schema in comparison to other metadata standards for non-textual materials. Additionally, we give an overview of multimedia retrieval technologies used for the Portal and present the AV-Portal in detail as well as the additional value for libraries and their users.

Source

Cataloging and classification quarterly. 52(2014) no.5, S.562-577

Short, M.: Text mining and subject analysis for fiction; or, using machine learning and information extraction to assign subject headings to dime novels (2019) 0.02
```
0.021981878 = product of:
  0.043963756 = sum of:
    0.043963756 = product of:
      0.08792751 = sum of:
        0.08792751 = weight(_text_:cataloging in 5481) [ClassicSimilarity], result of:
          0.08792751 = score(doc=5481,freq=4.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.43106002 = fieldWeight in 5481, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5481)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows.

Source

Cataloging and classification quarterly. 57(2019) no.5, S.315-336

Moulaison-Sandy, H.; Adkins, D.; Bossaller, J.; Cho, H.: ¬An automated approach to describing fiction : a methodology to use book reviews to identify affect (2021) 0.02
```
0.021981878 = product of:
  0.043963756 = sum of:
    0.043963756 = product of:
      0.08792751 = sum of:
        0.08792751 = weight(_text_:cataloging in 710) [ClassicSimilarity], result of:
          0.08792751 = score(doc=710,freq=4.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.43106002 = fieldWeight in 710, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=710)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Subject headings and genre terms are notoriously difficult to apply, yet are important for fiction. The current project functions as a proof of concept, using a text-mining methodology to identify affective information (emotion and tone) about fiction titles from professional book reviews as a potential first step in automating the subject analysis process. Findings are presented and discussed, comparing results to the range of aboutness and isness information in library cataloging records. The methodology is likewise presented, and how future work might expand on the current project to enhance catalog records through text-mining is explored.

Source

Cataloging and classification quarterly. 59(2021) no.8, p.794-814

Oliver, C.: Leveraging KOS to extend our reach with automated processes (2021) 0.02

0.01776404 = product of:
  0.03552808 = sum of:
    0.03552808 = product of:
      0.07105616 = sum of:
        0.07105616 = weight(_text_:cataloging in 722) [ClassicSimilarity], result of:
          0.07105616 = score(doc=722,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.3483491 = fieldWeight in 722, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0625 = fieldNorm(doc=722)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Cataloging and classification quarterly. 59(2021) no.8, p.868-874

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: ¬The automatic indexing system AIR/PHYS : from research to application (1988) 0.02

0.017530829 = product of:
  0.035061657 = sum of:
    0.035061657 = product of:
      0.070123315 = sum of:
        0.070123315 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.070123315 = score(doc=1952,freq=2.0), product of:
            0.18124348 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051756795 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 16. 8.1998 12:51:22

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02

0.017530829 = product of:
  0.035061657 = sum of:
    0.035061657 = product of:
      0.070123315 = sum of:
        0.070123315 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.070123315 = score(doc=4157,freq=2.0), product of:
            0.18124348 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051756795 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill

Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.02

0.017530829 = product of:
  0.035061657 = sum of:
    0.035061657 = product of:
      0.070123315 = sum of:
        0.070123315 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
          0.070123315 = score(doc=2759,freq=2.0), product of:
            0.18124348 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051756795 = queryNorm
            0.38690117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 1. 2.2016 18:25:22

Keller, A.: Attitudes among German- and English-speaking librarians toward (automatic) subject indexing (2015) 0.02

0.015543535 = product of:
  0.03108707 = sum of:
    0.03108707 = product of:
      0.06217414 = sum of:
        0.06217414 = weight(_text_:cataloging in 2629) [ClassicSimilarity], result of:
          0.06217414 = score(doc=2629,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.30480546 = fieldWeight in 2629, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2629)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Cataloging and classification quarterly. 53(2015) no.8, S.895-904

Golub, K.: Automated subject indexing : an overview (2021) 0.02

0.015543535 = product of:
  0.03108707 = sum of:
    0.03108707 = product of:
      0.06217414 = sum of:
        0.06217414 = weight(_text_:cataloging in 718) [ClassicSimilarity], result of:
          0.06217414 = score(doc=718,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.30480546 = fieldWeight in 718, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=718)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Cataloging and classification quarterly. 59(2021) no.8, p.702-719

Chou, C.; Chu, T.: ¬An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.02

0.015543535 = product of:
  0.03108707 = sum of:
    0.03108707 = product of:
      0.06217414 = sum of:
        0.06217414 = weight(_text_:cataloging in 1139) [ClassicSimilarity], result of:
          0.06217414 = score(doc=1139,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.30480546 = fieldWeight in 1139, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1139)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Cataloging and classification quarterly. 60(2022) no.8, p.807-835

Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.01

0.014024663 = product of:
  0.028049326 = sum of:
    0.028049326 = product of:
      0.05609865 = sum of:
        0.05609865 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
          0.05609865 = score(doc=4709,freq=2.0), product of:
            0.18124348 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051756795 = queryNorm
            0.30952093 = fieldWeight in 4709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 31. 7.1996 9:22:19

Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01

0.014024663 = product of:
  0.028049326 = sum of:
    0.028049326 = product of:
      0.05609865 = sum of:
        0.05609865 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
          0.05609865 = score(doc=6752,freq=2.0), product of:
            0.18124348 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051756795 = queryNorm
            0.30952093 = fieldWeight in 6752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 6. 3.1997 16:22:15

Bloomfield, M.: Indexing : neglected and poorly understood (2001) 0.01

0.0133230295 = product of:
  0.026646059 = sum of:
    0.026646059 = product of:
      0.053292118 = sum of:
        0.053292118 = weight(_text_:cataloging in 5439) [ClassicSimilarity], result of:
          0.053292118 = score(doc=5439,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.26126182 = fieldWeight in 5439, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.046875 = fieldNorm(doc=5439)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Cataloging and classification quarterly. 33(2001) no.1, S.63-75

Medelyan, O.; Witten, I.H.: Domain-independent automatic keyphrase indexing with small training sets (2008) 0.01
```
0.0133230295 = product of:
  0.026646059 = sum of:
    0.026646059 = product of:
      0.053292118 = sum of:
        0.053292118 = weight(_text_:cataloging in 1871) [ClassicSimilarity], result of:
          0.053292118 = score(doc=1871,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.26126182 = fieldWeight in 1871, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.046875 = fieldNorm(doc=1871)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Keyphrases are widely used in both physical and digital libraries as a brief, but precise, summary of documents. They help organize material based on content, provide thematic access, represent search results, and assist with navigation. Manual assignment is expensive because trained human indexers must reach an understanding of the document and select appropriate descriptors according to defined cataloging rules. We propose a new method that enhances automatic keyphrase extraction by using semantic information about terms and phrases gleaned from a domain-specific thesaurus. The key advantage of the new approach is that it performs well with very little training data. We evaluate it on a large set of manually indexed documents in the domain of agriculture, compare its consistency with a group of six professional indexers, and explore its performance on smaller collections of documents in other domains and of French and Spanish documents.

Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.01

0.0133230295 = product of:
  0.026646059 = sum of:
    0.026646059 = product of:
      0.053292118 = sum of:
        0.053292118 = weight(_text_:cataloging in 720) [ClassicSimilarity], result of:
          0.053292118 = score(doc=720,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.26126182 = fieldWeight in 720, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.046875 = fieldNorm(doc=720)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Cataloging and classification quarterly. 59(2021) no.8, p.815-834

Asula, M.; Makke, J.; Freienthal, L.; Kuulmets, H.-A.; Sirel, R.: Kratt: developing an automatic subject indexing tool for the National Library of Estonia : how to transfer metadata information among work cluster members (2021) 0.01

0.0133230295 = product of:
  0.026646059 = sum of:
    0.026646059 = product of:
      0.053292118 = sum of:
        0.053292118 = weight(_text_:cataloging in 723) [ClassicSimilarity], result of:
          0.053292118 = score(doc=723,freq=2.0), product of:
            0.20397975 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.051756795 = queryNorm
            0.26126182 = fieldWeight in 723, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.046875 = fieldNorm(doc=723)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Cataloging and classification quarterly. 59(2021) no.8, p.775-793
