Search (152 results, page 3 of 8)

  • language_ss:"e"
  • year_i:[2020 TO 2030}
  1. Lopes Martins, D.; Silva Lemos, D.L. da; Rosa de Oliveira, L.F.; Siqueira, J.; Carmo, D. do; Nunes Medeiros, V.: Information organization and representation in digital cultural heritage in Brazil : systematic mapping of information infrastructure in digital collections for data science applications (2023) 0.03
    0.029886894 = product of:
      0.08966068 = sum of:
        0.08966068 = weight(_text_:systematic in 968) [ClassicSimilarity], result of:
          0.08966068 = score(doc=968,freq=2.0), product of:
            0.28397155 = queryWeight, product of:
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.049684696 = queryNorm
            0.31573826 = fieldWeight in 968, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.0390625 = fieldNorm(doc=968)
      0.33333334 = coord(1/3)
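Each hit is followed by Lucene's "explain" output for its TF-IDF score under ClassicSimilarity. As a cross-check, here is a minimal Python sketch that reproduces the arithmetic of hit 1 from the values in the tree above; the formula is Lucene's standard ClassicSimilarity, and only the rounding commentary is ours.

```python
import math

# Values copied from the explain tree for hit 1 (term "systematic", doc 968).
freq       = 2.0          # termFreq: the term occurs twice in the field
idf        = 5.715473     # 1 + ln(maxDocs / (docFreq + 1)) = 1 + ln(44218 / 396)
query_norm = 0.049684696  # queryNorm
field_norm = 0.0390625    # fieldNorm(doc=968): encoded field-length normalization
coord      = 1 / 3        # coord(1/3): one of three query clauses matched

tf           = math.sqrt(freq)              # 1.4142135 = tf(freq=2.0)
query_weight = idf * query_norm             # 0.28397155 = queryWeight
field_weight = tf * idf * field_norm        # 0.31573826 = fieldWeight
score        = query_weight * field_weight  # 0.08966068 = weight(_text_:systematic ...)

print(score * coord)  # ~0.0298869, the displayed score up to float32 rounding
```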
    
  2. Tian, W.; Cai, R.; Fang, Z.; Geng, Y.; Wang, X.; Hu, Z.: Understanding co-corresponding authorship : a bibliometric analysis and detailed overview (2024) 0.03
    0.029886894 = product of:
      0.08966068 = sum of:
        0.08966068 = weight(_text_:systematic in 1196) [ClassicSimilarity], result of:
          0.08966068 = score(doc=1196,freq=2.0), product of:
            0.28397155 = queryWeight, product of:
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.049684696 = queryNorm
            0.31573826 = fieldWeight in 1196, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1196)
      0.33333334 = coord(1/3)
    
    Abstract
    The phenomenon of co-corresponding authorship is becoming increasingly common. To understand the practice of authorship credit sharing among multiple corresponding authors, we comprehensively analyzed the characteristics of the phenomenon of co-corresponding authorship from the perspectives of countries, disciplines, journals, and articles. This research was based on a dataset of nearly 8 million articles indexed in the Web of Science, which provides systematic, cross-disciplinary, and large-scale evidence for understanding the phenomenon of co-corresponding authorship for the first time. Our findings reveal that higher proportions of co-corresponding authorship exist in Asian countries, especially in China. From the perspective of disciplines, there is a relatively higher proportion of co-corresponding authorship in the fields of engineering and medicine, while a lower proportion exists in the humanities, social sciences, and computer science fields. From the perspective of journals, high-quality journals usually have higher proportions of co-corresponding authorship. At the level of the article, our findings show that, compared to articles with a single corresponding author, articles with multiple corresponding authors have a significant citation advantage.
  3. Cheti, A.; Viti, E.: Functionality and merits of a faceted thesaurus : the case of the Nuovo soggettario (2023) 0.03
    0.029550051 = product of:
      0.08865015 = sum of:
        0.08865015 = sum of:
          0.048260607 = weight(_text_:indexing in 1181) [ClassicSimilarity], result of:
            0.048260607 = score(doc=1181,freq=2.0), product of:
              0.19018644 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.049684696 = queryNorm
              0.2537542 = fieldWeight in 1181, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.046875 = fieldNorm(doc=1181)
          0.04038954 = weight(_text_:22 in 1181) [ClassicSimilarity], result of:
            0.04038954 = score(doc=1181,freq=2.0), product of:
              0.17398734 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049684696 = queryNorm
              0.23214069 = fieldWeight in 1181, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1181)
      0.33333334 = coord(1/3)
    
    Abstract
    The Nuovo soggettario, the official Italian subject indexing system edited by the National Central Library of Florence, is made up of interactive components, the core of which is a general thesaurus and some rules of a conventional syntax for subject string construction. The Nuovo soggettario Thesaurus complies with ISO 25964:2011-2013, IFLA LRM, and the FAIR principles (findability, accessibility, interoperability, and reusability). Its open data are available in Zthes, MARC21, and SKOS formats and allow for interoperability with library, archive, and museum databases. The Thesaurus's macrostructure is organized into four fundamental macro-categories, thirteen categories, and facets. The facets allow for the orderly development of hierarchies, thereby limiting polyhierarchies and promoting the grouping of homogeneous concepts. This paper addresses the main features and peculiarities which have characterized the consistent development of this categorical structure and its effects on the syntactic sphere in a predominantly pre-coordinated usage context.
    Date
    26.11.2023 18:59:22
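The abstract notes that the Thesaurus is published as open data in SKOS. A hedged sketch of reading such a dump with rdflib and listing preferred labels; the filename is a placeholder, not the actual Nuovo soggettario download.

```python
from itertools import islice

from rdflib import Graph
from rdflib.namespace import SKOS

g = Graph()
g.parse("nuovo_soggettario.rdf")  # placeholder path for the SKOS dump

# Print the first ten concepts with their skos:prefLabel.
for concept, label in islice(g.subject_objects(SKOS.prefLabel), 10):
    print(concept, label)
```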
  4. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.03
    0.026304156 = product of:
      0.07891247 = sum of:
        0.07891247 = product of:
          0.2367374 = sum of:
            0.2367374 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.2367374 = score(doc=862,freq=2.0), product of:
                0.4212274 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.049684696 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    https://arxiv.org/abs/2212.06721
  5. Hjoerland, B.: Table of contents (ToC) (2022) 0.02
    0.02462504 = product of:
      0.07387512 = sum of:
        0.07387512 = sum of:
          0.04021717 = weight(_text_:indexing in 1096) [ClassicSimilarity], result of:
            0.04021717 = score(doc=1096,freq=2.0), product of:
              0.19018644 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.049684696 = queryNorm
              0.21146181 = fieldWeight in 1096, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1096)
          0.033657953 = weight(_text_:22 in 1096) [ClassicSimilarity], result of:
            0.033657953 = score(doc=1096,freq=2.0), product of:
              0.17398734 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049684696 = queryNorm
              0.19345059 = fieldWeight in 1096, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1096)
      0.33333334 = coord(1/3)
    
    Abstract
    A table of contents (ToC) is a kind of document representation, as well as a paratext and a finding device for the document it represents. ToCs are very common in books and some other kinds of documents, but not in all kinds. This article discusses the definition and functions of ToCs, normative guidelines for their design, and the history and forms of ToCs in different kinds of documents and media. A main part of the article is about the role of ToCs in information searching, in current awareness services, and as items added to bibliographical records. The introduction and the conclusion focus on the core theoretical issues concerning ToCs: should they be document-oriented or request-oriented, neutral or policy-oriented, objective or subjective? It is concluded that, because of the special functions of ToCs, the arguments for the request-oriented (policy-oriented, subjective) view are weaker than they are in relation to indexing and knowledge organization in general. Apart from the level of granularity, the evaluation of a ToC is difficult to separate from the evaluation of the structuring and naming of the elements of the structure of the document it represents.
    Date
    18.11.2023 13:47:22
  6. Hobert, A.; Jahn, N.; Mayr, P.; Schmidt, B.; Taubert, N.: Open access uptake in Germany 2010-2018 : adoption in a diverse research landscape (2021) 0.02
    0.023909515 = product of:
      0.07172854 = sum of:
        0.07172854 = weight(_text_:systematic in 250) [ClassicSimilarity], result of:
          0.07172854 = score(doc=250,freq=2.0), product of:
            0.28397155 = queryWeight, product of:
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.049684696 = queryNorm
            0.2525906 = fieldWeight in 250, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.03125 = fieldNorm(doc=250)
      0.33333334 = coord(1/3)
    
    Content
    This study investigates the development of open access (OA) to journal articles from authors affiliated with German universities and non-university research institutions in the period 2010-2018. Beyond determining the overall share of openly available articles, a systematic classification of distinct categories of OA publishing allowed us to identify different patterns of adoption of OA. Taking into account the particularities of the German research landscape, variations in terms of productivity, OA uptake, and approaches to OA are examined at the meso-level, and possible explanations are discussed. The development of the OA uptake is analysed for the different research sectors in Germany (universities, non-university research institutes of the Helmholtz Association, Fraunhofer Society, Max Planck Society, Leibniz Association, and government research agencies). Combining several data sources (incl. Web of Science, Unpaywall, an authority file of standardised German affiliation information, the ISSN-Gold-OA 3.0 list, and OpenDOAR), the study confirms the growth of the OA share, mirroring the international trend reported in related studies. We found that 45% of all considered articles during the observed period were openly available at the time of analysis. Our findings show that subject-specific repositories are the most prevalent type of OA. However, the percentages for publication in fully OA journals and OA via institutional repositories show similarly steep increases. By enabling data-driven decision-making regarding the implementation of OA in Germany at the institutional level, the results of this study can furthermore serve as a baseline for assessing the impact that recent transformative agreements with major publishers are likely to have on scholarly communication.
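The study's core measurement is the share of OA articles per category and research sector. A toy pandas sketch of that computation; the records are invented stand-ins, not the study's combined Web of Science/Unpaywall data.

```python
import pandas as pd

# Invented example records; the real study combines several data sources.
articles = pd.DataFrame({
    "sector": ["university", "university", "Max Planck Society", "Leibniz Association"],
    "oa_category": ["repository", "closed", "fully OA journal", "repository"],
})

articles["is_oa"] = articles["oa_category"] != "closed"
print(articles.groupby("sector")["is_oa"].mean())            # OA share per sector
print(articles["oa_category"].value_counts(normalize=True))  # share per OA category
```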
  7. Bedford, D.: Knowledge architectures : structures and semantics (2021) 0.02
    0.019700034 = product of:
      0.0591001 = sum of:
        0.0591001 = sum of:
          0.032173738 = weight(_text_:indexing in 566) [ClassicSimilarity], result of:
            0.032173738 = score(doc=566,freq=2.0), product of:
              0.19018644 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.049684696 = queryNorm
              0.16916946 = fieldWeight in 566, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.03125 = fieldNorm(doc=566)
          0.026926363 = weight(_text_:22 in 566) [ClassicSimilarity], result of:
            0.026926363 = score(doc=566,freq=2.0), product of:
              0.17398734 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049684696 = queryNorm
              0.15476047 = fieldWeight in 566, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=566)
      0.33333334 = coord(1/3)
    
    Content
    Section 1 Context and purpose of knowledge architecture -- 1 Making the case for knowledge architecture -- 2 The landscape of knowledge assets -- 3 Knowledge architecture and design -- 4 Knowledge architecture reference model -- 5 Knowledge architecture segments -- Section 2 Designing for availability -- 6 Knowledge object modeling -- 7 Knowledge structures for encoding, formatting, and packaging -- 8 Functional architecture for identification and distinction -- 9 Functional architectures for knowledge asset disposition and destruction -- 10 Functional architecture designs for knowledge preservation and conservation -- Section 3 Designing for accessibility -- 11 Functional architectures for knowledge seeking and discovery -- 12 Functional architecture for knowledge search -- 13 Functional architecture for knowledge categorization -- 14 Functional architectures for indexing and keywording -- 15 Functional architecture for knowledge semantics -- 16 Functional architecture for knowledge abstraction and surrogation -- Section 4 Functional architectures to support knowledge consumption -- 17 Functional architecture for knowledge augmentation, derivation, and synthesis -- 18 Functional architecture to manage risk and harm -- 19 Functional architectures for knowledge authentication and provenance -- 20 Functional architectures for securing knowledge assets -- 21 Functional architectures for authorization and asset management -- Section 5 Pulling it all together - the big picture knowledge architecture -- 22 Functional architecture for knowledge metadata and metainformation -- 23 The whole knowledge architecture - pulling it all together
  8. Chou, C.; Chu, T.: ¬An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.02
    0.018768014 = product of:
      0.05630404 = sum of:
        0.05630404 = product of:
          0.11260808 = sum of:
            0.11260808 = weight(_text_:indexing in 1139) [ClassicSimilarity], result of:
              0.11260808 = score(doc=1139,freq=8.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.5920931 = fieldWeight in 1139, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1139)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    In light of AI (artificial intelligence) and NLP (natural language processing) technologies, this article examines the feasibility of using AI/NLP models to enhance the subject indexing of digital resources. While BERT (Bidirectional Encoder Representations from Transformers) models are widely used in scholarly communities, the authors assess whether BERT models can be used for machine-assisted indexing of the Project Gutenberg collection by suggesting Library of Congress subject headings filtered by certain Library of Congress Classification subclass labels. The findings of this study are informative for further research on BERT models to assist with automatic subject indexing for digital library collections.
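One way to approximate the suggestion step described above is zero-shot classification with a transformer model from Hugging Face; this is an illustrative sketch, not the authors' pipeline, and the sample text, model choice, and candidate headings are assumptions.

```python
from transformers import pipeline

# Zero-shot classification; the candidate labels stand in for LCSH headings
# pre-filtered by an LCC subclass (illustrative label set).
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

text = "A treatise on the cultivation of fruit trees in temperate climates."
candidate_headings = ["Fruit-culture", "Forestry", "Plant diseases"]

result = classifier(text, candidate_labels=candidate_headings)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```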
  9. Manzoni, L.: Nuovo Soggettario and semantic indexing of cartographic resources in Italy : an exploratory study (2022) 0.02
    0.018575516 = product of:
      0.055726547 = sum of:
        0.055726547 = product of:
          0.11145309 = sum of:
            0.11145309 = weight(_text_:indexing in 1138) [ClassicSimilarity], result of:
              0.11145309 = score(doc=1138,freq=6.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.5860202 = fieldWeight in 1138, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1138)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The paper focuses on the potential use of Nuovo soggettario, the semantic indexing tool adopted by the National Central Library of Florence (Biblioteca nazionale centrale di Firenze), for indexing cartographic resources. Particular attention is paid to the treatment of place names, the use of formal subjects, and the different ways of constructing subject strings for general and thematic maps.
  10. Golub, K.: Automated subject indexing : an overview (2021) 0.02
    0.016253578 = product of:
      0.04876073 = sum of:
        0.04876073 = product of:
          0.09752146 = sum of:
            0.09752146 = weight(_text_:indexing in 718) [ClassicSimilarity], result of:
              0.09752146 = score(doc=718,freq=6.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.5127677 = fieldWeight in 718, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=718)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    In the face of ever-increasing document volumes, libraries around the globe are increasingly exploring (semi-)automated approaches to subject indexing. This helps sustain bibliographic objectives, enrich metadata, and establish more connections across documents from various collections, effectively leading to improved information retrieval and access. However, generally accepted automated approaches that are functional in operational systems are lacking. This article aims to provide an overview of the basic principles used for automated subject indexing, major approaches in relation to their possible application in actual library systems, existing working examples, as well as related challenges calling for further research.
  11. Ali, C.B.; Haddad, H.; Slimani, Y.: Multi-word terms selection for information retrieval (2022) 0.01
    0.014988055 = product of:
      0.044964164 = sum of:
        0.044964164 = product of:
          0.08992833 = sum of:
            0.08992833 = weight(_text_:indexing in 900) [ClassicSimilarity], result of:
              0.08992833 = score(doc=900,freq=10.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.47284302 = fieldWeight in 900, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=900)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose
    A number of approaches and algorithms have been proposed over the years as a basis for automatic indexing. Many of these approaches suffer from precision inefficiency at low recall. The choice of indexing units has a great impact on search system effectiveness. The authors go beyond simple term indexing to propose a framework for multi-word term (MWT) filtering and indexing.
    Design/methodology/approach
    In this paper, the authors rely on ranking MWT to filter them, keeping the most effective ones for the indexing process. The proposed model is based on filtering MWT according to their ability to capture the document topic and distinguish between different documents from the same collection. The authors rely on the hypothesis that the best MWT are those that achieve the greatest association degree. The experiments are carried out with English- and French-language data sets.
    Findings
    The results indicate that this approach achieved precision enhancements at low recall, and it performed better than more advanced models based on term dependencies.
    Originality/value
    Using and testing different association measures to select the MWT that best describe the documents enhances precision in the first retrieved documents.
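The paper ranks multi-word terms by an "association degree". As one plausible instance of such a measure (the paper tests several; this particular choice is ours), a pointwise mutual information ranking of two-word terms:

```python
import math
from collections import Counter

def pmi(bigram_count, w1_count, w2_count, n_tokens, n_bigrams):
    """Pointwise mutual information of a two-word term: one possible
    association degree for ranking multi-word terms."""
    p_xy = bigram_count / n_bigrams
    p_x, p_y = w1_count / n_tokens, w2_count / n_tokens
    return math.log2(p_xy / (p_x * p_y))

tokens = ("information retrieval systems rank documents "
          "for information retrieval").split()
bigrams = list(zip(tokens, tokens[1:]))
tok_counts, big_counts = Counter(tokens), Counter(bigrams)

# Keep the top-ranked multi-word terms as indexing units.
ranked = sorted(
    big_counts,
    key=lambda b: pmi(big_counts[b], tok_counts[b[0]], tok_counts[b[1]],
                      len(tokens), len(bigrams)),
    reverse=True)
print(ranked[:3])
```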
  12. Asula, M.; Makke, J.; Freienthal, L.; Kuulmets, H.-A.; Sirel, R.: Kratt: developing an automatic subject indexing tool for the National Library of Estonia : how to transfer metadata information among work cluster members (2021) 0.01
    0.013931636 = product of:
      0.041794907 = sum of:
        0.041794907 = product of:
          0.083589815 = sum of:
            0.083589815 = weight(_text_:indexing in 723) [ClassicSimilarity], result of:
              0.083589815 = score(doc=723,freq=6.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.4395151 = fieldWeight in 723, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=723)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Manual subject indexing in libraries is a time-consuming and costly process, and the quality of the assigned subjects is affected by the cataloger's knowledge of the specific topics contained in the book. To address these issues, we exploited the opportunities arising from artificial intelligence to develop Kratt: a prototype of an automatic subject indexing tool. Kratt is able to subject index a book independently of its extent and genre with a set of keywords present in the Estonian Subject Thesaurus. It takes Kratt approximately one minute to subject index a book, outperforming human indexers by a factor of 10-15. Although the resulting keywords were not considered satisfactory by the catalogers, the ratings of a small sample of regular library users showed more promise. We also argue that the results can be enhanced by including a bigger corpus for training the model and applying more careful preprocessing techniques.
  13. Ahmed, M.; Mukhopadhyay, M.; Mukhopadhyay, P.: Automated knowledge organization : AI ML based subject indexing system for libraries (2023) 0.01
    0.0134057235 = product of:
      0.04021717 = sum of:
        0.04021717 = product of:
          0.08043434 = sum of:
            0.08043434 = weight(_text_:indexing in 977) [ClassicSimilarity], result of:
              0.08043434 = score(doc=977,freq=8.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.42292362 = fieldWeight in 977, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=977)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The research study reported here is an attempt to explore the possibilities of an AI/ML-based semi-automated indexing system in a library setup to handle large volumes of documents. It uses the Python virtual environment to install and configure an open-source AI environment (named Annif) and to feed it the LOD (Linked Open Data) dataset of Library of Congress Subject Headings (LCSH) as a standard KOS (Knowledge Organisation System). The framework deployed the Turtle format of LCSH after cleaning the file with Skosify, applied an array of backend algorithms (namely TF-IDF, Omikuji, and NN-Ensemble) to measure relative performance, and selected Snowball as an analyser. The training of Annif was conducted with a large set of bibliographic records populated with subject descriptors (MARC tag 650$a) and indexed by trained LIS professionals. The training dataset is first treated with MarcEdit to export it in a format suitable for OpenRefine, where it undergoes many steps to produce a bibliographic record set suitable for training Annif. The framework, after training, has been tested with a bibliographic dataset to measure indexing efficiencies, and finally, the automated indexing framework is integrated with data-wrangling software (OpenRefine) to produce suggested headings on a mass scale. The entire framework is based on open-source software, open datasets, and open standards.
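The TF-IDF backend named above, at its core, pairs document vectors with subject descriptors. A minimal scikit-learn sketch of the same idea on toy data; this is not Annif's implementation, and the records and headings are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Toy training records: text plus LCSH-style descriptors (as from MARC 650$a).
texts = ["library classification systems and catalogues",
         "machine learning for text categorization",
         "thesauri and controlled vocabularies in libraries"]
subjects = [["Classification"], ["Machine learning"], ["Thesauri"]]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(subjects)

vec = TfidfVectorizer()
X = vec.fit_transform(texts)

model = OneVsRestClassifier(LogisticRegression()).fit(X, y)

# Suggest headings for a new record, ranked by estimated probability.
new = vec.transform(["automatic text categorization with machine learning"])
scores = model.predict_proba(new)[0]
for label, s in sorted(zip(mlb.classes_, scores), key=lambda t: -t[1]):
    print(f"{label}: {s:.2f}")
```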
  14. Golub, K.; Tyrkkö, J.; Hansson, J.; Ahlström, I.: Subject indexing in humanities : a comparison between a local university repository and an international bibliographic service (2020) 0.01
    0.011609698 = product of:
      0.03482909 = sum of:
        0.03482909 = product of:
          0.06965818 = sum of:
            0.06965818 = weight(_text_:indexing in 5982) [ClassicSimilarity], result of:
              0.06965818 = score(doc=5982,freq=6.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3662626 = fieldWeight in 5982, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5982)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    As the humanities develop in the realm of increasingly more pronounced digital scholarship, it is important to provide quality subject access to a vast range of heterogeneous information objects in digital services. The study aims to paint a representative picture of the current state of affairs of the use of subject index terms in humanities journal articles with particular reference to the well-established subject access needs of humanities researchers, with the purpose of identifying which improvements are needed in this context. Design/methodology/approach The comparison of subject metadata on a sample of 649 peer-reviewed journal articles from across the humanities is conducted in a university repository, against Scopus, the former reflecting local and national policies and the latter being the most comprehensive international abstract and citation database of research output. Findings The study shows that established bibliographic objectives to ensure subject access for humanities journal articles are not supported in either the world's largest commercial abstract and citation database Scopus or the local repository of a public university in Sweden. The indexing policies in the two services do not seem to address the needs of humanities scholars for highly granular subject index terms with appropriate facets; no controlled vocabularies for any humanities discipline are used whatsoever. Originality/value In all, not much has changed since 1990s when indexing for the humanities was shown to lag behind the sciences. The community of researchers and information professionals, today working together on digital humanities projects, as well as interdisciplinary research teams, should demand that their subject access needs be fulfilled, especially in commercial services like Scopus and discovery services.
  15. Ahmed, M.: Automatic indexing for agriculture : designing a framework by deploying Agrovoc, Agris and Annif (2023) 0.01
    0.011609698 = product of:
      0.03482909 = sum of:
        0.03482909 = product of:
          0.06965818 = sum of:
            0.06965818 = weight(_text_:indexing in 1024) [ClassicSimilarity], result of:
              0.06965818 = score(doc=1024,freq=6.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3662626 = fieldWeight in 1024, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1024)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    There are several ways to employ machine learning for automating subject indexing. One popular strategy is to utilize a supervised learning algorithm to train a model on a set of documents that have been manually indexed by subject matter using a standard vocabulary. The resulting model can then predict the subject of new and previously unseen documents by identifying patterns learned from the training data. To do this, the first step is to gather a large dataset of documents and manually assign each document a set of subject keywords/descriptors from a controlled vocabulary (e.g., from Agrovoc). Next, the dataset (obtained from Agris) can be divided into (i) a training dataset and (ii) a test dataset. The training dataset is used to train the model, while the test dataset is used to evaluate the model's performance. Machine learning can be a powerful tool for automating the process of subject indexing. This research is an attempt to apply Annif (http://annif.org/), an open-source AI/ML framework, to autogenerate subject keywords/descriptors for documentary resources in the domain of agriculture. The training dataset is obtained from Agris, which applies the Agrovoc thesaurus as a vocabulary tool (https://www.fao.org/agris/download).
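The train/test workflow in the abstract maps directly onto a standard supervised-learning skeleton. A hedged sketch with invented stand-ins for Agris records and Agrovoc descriptors:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Invented stand-ins for Agris records labelled with Agrovoc descriptors.
docs = ["irrigation of rice paddies", "wheat crop rotation practices",
        "drip irrigation scheduling", "rotation of maize and wheat crops",
        "water management in rice irrigation", "crop rotation and soil fertility"]
labels = ["irrigation", "crop rotation", "irrigation",
          "crop rotation", "irrigation", "crop rotation"]

# (i) training dataset and (ii) test dataset, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    docs, labels, test_size=1/3, random_state=0, stratify=labels)

vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(X_train), y_train)

# Evaluate on documents the model has not seen during training.
pred = clf.predict(vec.transform(X_test))
print(accuracy_score(y_test, pred))
```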
  16. Bodoff, D.; Richter-Levin, Y.: Viewpoints in indexing term assignment (2020) 0.01
    0.011375135 = product of:
      0.034125403 = sum of:
        0.034125403 = product of:
          0.068250805 = sum of:
            0.068250805 = weight(_text_:indexing in 5765) [ClassicSimilarity], result of:
              0.068250805 = score(doc=5765,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3588626 = fieldWeight in 5765, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5765)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The literature on assigned indexing considers three possible viewpoints-the author's viewpoint as evidenced in the title, the users' viewpoint, and the indexer's viewpoint-and asks whether and which of those views should be reflected in an indexer's choice of terms to assign to an item. We study this question empirically, as opposed to normatively. Based on the literature that discusses whose viewpoints should be reflected, we construct a research model that includes those same three viewpoints as factors that might be influencing term assignment in actual practice. In the unique study design that we employ, the records of term assignments made by identified indexers in academic libraries are cross-referenced with the results of a survey that those same indexers completed on political views. Our results indicate that in our setting, variance in term assignment was best explained by indexers' personal political views.
  17. Araújo, P.C. de; Gutierres Castanha, R.C.; Hjoerland, B.: Citation indexing and indexes (2021) 0.01
    0.011375135 = product of:
      0.034125403 = sum of:
        0.034125403 = product of:
          0.068250805 = sum of:
            0.068250805 = weight(_text_:indexing in 444) [ClassicSimilarity], result of:
              0.068250805 = score(doc=444,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3588626 = fieldWeight in 444, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=444)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Theme
    Citation indexing
  18. Fugmann, R.: What is information? : an information veteran looks back (2022) 0.01
    0.011219318 = product of:
      0.033657953 = sum of:
        0.033657953 = product of:
          0.06731591 = sum of:
            0.06731591 = weight(_text_:22 in 1085) [ClassicSimilarity], result of:
              0.06731591 = score(doc=1085,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.38690117 = fieldWeight in 1085, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1085)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    18. 8.2022 19:22:57
  19. Grabus, S.; Logan, P.M.; Greenberg, J.: Temporal concept drift and alignment : an empirical approach to comparing knowledge organization systems over time (2022) 0.01
    0.009479279 = product of:
      0.028437834 = sum of:
        0.028437834 = product of:
          0.05687567 = sum of:
            0.05687567 = weight(_text_:indexing in 1100) [ClassicSimilarity], result of:
              0.05687567 = score(doc=1100,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29905218 = fieldWeight in 1100, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1100)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This research explores temporal concept drift and temporal alignment in knowledge organization systems (KOS). A comparative analysis is pursued using the 1910 Library of Congress Subject Headings, 2020 FAST Topical, and automatic indexing. The use case involves a sample of 90 nineteenth-century Encyclopedia Britannica entries. The entries were indexed using two approaches: 1) full-text indexing; and 2) named entity recognition performed on the entries with Stanza, Stanford's NLP toolkit, with the entities automatically indexed with the Helping Interdisciplinary Vocabulary application (HIVE), using both 1910 LCSH and FAST Topical. The analysis focused on three goals: 1) identifying results that were exclusive to the 1910 LCSH output; 2) identifying terms in the exclusive set that have been deprecated from the contemporary LCSH, demonstrating temporal concept drift; and 3) exploring the historical significance of these deprecated terms. Results confirm that historical vocabularies can be used to generate anachronistic subject headings representing conceptual drift across time in KOS and historical resources. A methodological contribution is made by demonstrating how to study changes in KOS over time and improve the contextualization of historical humanities resources.
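The NER step described above can be reproduced with Stanza's English pipeline; the sample sentence is invented, and the mapping of entities onto 1910 LCSH or FAST (as HIVE does) is left out.

```python
import stanza

stanza.download("en")  # one-time model download
nlp = stanza.Pipeline(lang="en", processors="tokenize,ner")

# Invented stand-in for a nineteenth-century encyclopedia entry.
entry = "The Crimean War altered the balance of power in Europe."
doc = nlp(entry)

# Entities found by the pipeline; these would be fed to a vocabulary
# service for automatic subject indexing.
for ent in doc.ents:
    print(ent.text, ent.type)
```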
  20. Frické, M.: Boolean logic (2021) 0.01
    0.009384007 = product of:
      0.02815202 = sum of:
        0.02815202 = product of:
          0.05630404 = sum of:
            0.05630404 = weight(_text_:indexing in 231) [ClassicSimilarity], result of:
              0.05630404 = score(doc=231,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29604656 = fieldWeight in 231, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=231)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The article describes and explains Boolean logic (or Boolean algebra) in its two principal forms: that of truth-values and the Boolean connectives and, or, and not, and that of set membership and the set operations of intersection, union, and complement. The main application areas of Boolean logic to knowledge organization, namely post-coordinate indexing and search, are introduced and discussed. Some wider application areas are briefly mentioned, such as: propositional logic, the Shannon-style approach to electrical switching and logic gates, computer programming languages, probability theory, and database queries. An analysis is offered of shortcomings that Boolean logic has in terms of potential uses in knowledge organization.
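A minimal sketch of post-coordinate Boolean search with Python sets, following the abstract's mapping of connectives to set operations; the postings data is invented.

```python
# Post-coordinate indexing: each term maps to the set of documents
# indexed with it, and the Boolean connectives become set operations.
postings = {
    "cats":  {1, 2, 5},
    "dogs":  {2, 3, 5},
    "birds": {4, 5},
}
all_docs = {1, 2, 3, 4, 5}

print(postings["cats"] & postings["dogs"])   # AND -> intersection: {2, 5}
print(postings["cats"] | postings["birds"])  # OR  -> union: {1, 2, 4, 5}
print(all_docs - postings["dogs"])           # NOT -> complement: {1, 4}
```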

Types

  • a 146
  • el 9
  • p 5
  • m 2