Search (36 results, page 1 of 2)

  • year_i:[2020 TO 2030}
  • type_ss:"el"
  1. Suominen, O.; Koskenniemi, I.: Annif Analyzer Shootout : comparing text lemmatization methods for automated subject indexing (2022) 0.03
    0.02826747 = coord(3/14) × [0.03675035 weight(_text_:subject, tf=6.0, idf=3.5766) + 2 × 0.04758225 weight(_text_:classification, tf=16.0, idf=3.1847)], fieldNorm=0.0390625, queryNorm=0.03002521 (constant throughout), doc=658
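The line above condenses Lucene's ClassicSimilarity explanation: each term weight is (idf × queryNorm) × (√tf × idf × fieldNorm), the weights are summed, and the sum is scaled by the coordination factor. A minimal sketch recomputing this entry's score from the factors shown; the helper function is ours, not a Lucene API:

```python
import math

QUERY_NORM = 0.03002521  # queryNorm from the explain output

def term_score(freq: float, idf: float, field_norm: float) -> float:
    """ClassicSimilarity: (idf * queryNorm) * (sqrt(freq) * idf * fieldNorm)."""
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# Factors copied from the explanation for result 1 (doc=658).
subject = term_score(freq=6.0, idf=3.576596, field_norm=0.0390625)
classification = term_score(freq=16.0, idf=3.1847067, field_norm=0.0390625)

total = (subject + 2 * classification) * (3 / 14)  # coord(3/14)
print(round(subject, 8))         # ~0.03675035
print(round(classification, 8))  # ~0.04758225
print(round(total, 8))           # ~0.02826747
```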
    
    Abstract
    Automated text classification is an important function for many AI systems relevant to libraries, including automated subject indexing and classification. When implemented using the traditional natural language processing (NLP) paradigm, one key part of the process is the normalization of words using stemming or lemmatization, which reduces the amount of linguistic variation and often improves the quality of classification. In this paper, we compare the output of seven different text lemmatization algorithms as well as two baseline methods. We measure how the choice of method affects the quality of text classification using example corpora in three languages. The experiments were performed using the open source Annif toolkit for automated subject indexing and classification, but should also generalize to other NLP toolkits and similar text classification tasks. The results show that lemmatization methods in most cases outperform baseline methods in text classification, particularly for Finnish and Swedish text, but not for English, where the baseline methods are most effective. The differences between lemmatization methods are quite small. The systematic comparison will help optimize text classification pipelines and inform the further development of the Annif toolkit to incorporate a wider choice of normalization methods.
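To make the normalization step concrete, here is a minimal stemming-vs-lemmatization comparison for English, assuming NLTK with its WordNet data is installed; this illustrates the design choice the paper benchmarks, not Annif's own pipeline:

```python
# Stemming truncates by rule; lemmatization maps to dictionary forms.
# Assumes: pip install nltk (WordNet data fetched below).
import nltk
from nltk.stem import SnowballStemmer, WordNetLemmatizer

nltk.download("wordnet", quiet=True)

stemmer = SnowballStemmer("english")
lemmatizer = WordNetLemmatizer()  # defaults to noun readings

for word in ["classified", "classification", "indexing", "libraries"]:
    print(f"{word:15} stem={stemmer.stem(word):10} "
          f"lemma={lemmatizer.lemmatize(word)}")
```

For morphologically rich languages such as Finnish or Swedish, the gap between the two approaches is typically wider, which is consistent with the paper's findings.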
  2. Almeida, P. de; Gnoli, C.: Fiction in a phenomenon-based classification (2021) 0.03
    0.025019486 = coord(3/14) × [0.036007844 weight(_text_:subject, tf=4.0, idf=3.5766) + 2 × 0.04037488 weight(_text_:classification, tf=8.0, idf=3.1847)], fieldNorm=0.046875, doc=712
    
    Abstract
    In traditional classification, fictional works are indexed only by their form, genre, and language, while their subject content is believed to be irrelevant. However, recent research suggests that this may not be the best approach. We tested indexing of a small sample of selected fictional works with the Integrative Levels Classification (ILC2), a freely faceted system based on phenomena instead of disciplines, and considered the structure of the resulting classmarks. Issues in the process of subject analysis, such as the selection of relevant vs. non-relevant themes and the citation order of relevant ones, are identified and discussed. Some phenomena that are covered in scholarly literature can also be identified as relevant themes in fictional literature and expressed in classmarks. This can allow for hybrid search and retrieval systems covering both fiction and nonfiction, resulting in better leveraging of the knowledge contained in fictional works.
    Source
    Cataloging and classification quarterly. 59(2021) no.5, p.477-491
  3. Hudon, M.: The status of knowledge organization in library and information science master's programs (2021) 0.02
    0.020640023 = coord(3/14) × [0.029704956 weight(_text_:subject, tf=2.0, idf=3.5766) + 2 × 0.033307575 weight(_text_:classification, tf=4.0, idf=3.1847)], fieldNorm=0.0546875, doc=697
    
    Abstract
    The content of master's programs accredited by the American Library Association was examined to assess the status of knowledge organization (KO) as a subject in current training. Data collected show that KO remains very visible in a majority of programs, mainly in the form of required and elective courses focusing on descriptive cataloging, classification, and metadata. Observed tendencies include, however, the recent elimination of the required KO course in several programs, the reality that one-third of the KO electives listed in course catalogs have not been scheduled in the past three years, and the fact that two-thirds of those teaching KO specialize in other areas of information science.
    Source
    Cataloging and classification quarterly. 59(2021) no.6, p.576-596
  4. Broughton, V.: Faceted classification in support of diversity : the role of concepts and terms in representing religion (2020) 0.02
    0.017691448 = coord(3/14) × [0.02546139 weight(_text_:subject, tf=2.0, idf=3.5766) + 2 × 0.028549349 weight(_text_:classification, tf=4.0, idf=3.1847)], fieldNorm=0.046875, doc=5992
    
    Abstract
    The paper examines the development of facet analysis as a methodology and the role it plays in building classifications and other knowledge-organization tools. The use of categorical analysis in areas other than library and information science is also considered. The suitability of the faceted approach for humanities documentation is explored through a critical description of the FATKS (Facet Analytical Theory in Managing Knowledge Structure for Humanities) project carried out at University College London. This research focused on building a conceptual model for the subject of religion together with a relational database and search-and-browse interfaces that would support some degree of automatic classification. The paper concludes with a discussion of the differences between the conceptual model and the vocabulary used to populate it, and how, in the case of religion, the choice of terminology can create an apparent bias in the system.
  5. Aitchison, C.R.: Cataloging virtual reality artworks: challenges and future prospects (2021) 0.02
    0.017635275 = coord(3/14) × [2 × 0.023552012 weight(_text_:classification, tf=2.0, idf=3.1847) + 0.035193928 weight(_text_:bibliographic, tf=2.0, idf=3.8930)], fieldNorm=0.0546875, doc=711
    
    Abstract
    In 2019, Pepperdine Libraries acquired two virtual reality artworks by filmmaker and artist Paisley Smith: Homestay and Unceded Territories. To bring awareness to these pieces, Pepperdine Libraries added them to the library catalog, creating bibliographic records for both films. Cataloging virtual reality art raised many challenges and considerations, including the nature of the works, the limits of Resource Description and Access (RDA) and MARC, and how to provide access to the works. This paper discusses these topics and provides recommendations for potential future standards for cataloging virtual works.
    Source
    Cataloging and classification quarterly. 59(2021) no.5, p.492-509
  6. Rockelle Strader, C.: Cataloging to support information literacy : the IFLA Library Reference Model's user tasks in the context of the Framework for Information Literacy for Higher Education (2021) 0.02
    0.01511595 = coord(3/14) × [2 × 0.02018744 weight(_text_:classification, tf=2.0, idf=3.1847) + 0.030166224 weight(_text_:bibliographic, tf=2.0, idf=3.8930)], fieldNorm=0.046875, doc=713
    
    Abstract
    Cataloging practices, as exemplified by the five user tasks of the IFLA Library Reference Model, can support information literacy practices. The six frames of the Framework for Information Literacy for Higher Education are used as lenses to examine the user tasks. Two themes emerge from this examination: context matters, and catalogers must tailor bibliographic descriptions to meet users' expectations and information needs. Catalogers need to solicit feedback from various user communities to reform cataloging practices to remain current and viable. Such conversations will enrich the catalog and enhance (reclaim?) its position as a primary tool for research and learning. Supplemental data for this article is available online at https://doi.org/10.1080/01639374.2021.1939828.
    Source
    Cataloging and classification quarterly. 59(2021) no.5, p.442-476
  7. Dietz, K.: en.wikipedia.org > 6 Mio. Artikel (2020) 0.01
    0.014192857 = coord(2/14) × [coord(1/3) × 0.11921998 + coord(1/2) × 0.11921998], with 0.11921998 = weight(_text_:3a, tf=2.0, idf=8.4780), fieldNorm=0.0390625, doc=5669
    
    Content
    "Die Englischsprachige Wikipedia verfügt jetzt über mehr als 6 Millionen Artikel. An zweiter Stelle kommt die deutschsprachige Wikipedia mit 2.3 Millionen Artikeln, an dritter Stelle steht die französischsprachige Wikipedia mit 2.1 Millionen Artikeln (via Researchbuzz: Firehose <https://rbfirehose.com/2020/01/24/techcrunch-wikipedia-now-has-more-than-6-million-articles-in-english/> und Techcrunch <https://techcrunch.com/2020/01/23/wikipedia-english-six-million-articles/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Techcrunch+%28TechCrunch%29&guccounter=1&guce_referrer=aHR0cHM6Ly9yYmZpcmVob3NlLmNvbS8yMDIwLzAxLzI0L3RlY2hjcnVuY2gtd2lraXBlZGlhLW5vdy1oYXMtbW9yZS10aGFuLTYtbWlsbGlvbi1hcnRpY2xlcy1pbi1lbmdsaXNoLw&guce_referrer_sig=AQAAAK0zHfjdDZ_spFZBF_z-zDjtL5iWvuKDumFTzm4HvQzkUfE2pLXQzGS6FGB_y-VISdMEsUSvkNsg2U_NWQ4lwWSvOo3jvXo1I3GtgHpP8exukVxYAnn5mJspqX50VHIWFADHhs5AerkRn3hMRtf_R3F1qmEbo8EROZXp328HMC-o>). 250120 via digithek ch = #fineBlog s.a.: Angesichts der Veröffentlichung des 6-millionsten Artikels vergangene Woche in der englischsprachigen Wikipedia hat die Community-Zeitungsseite "Wikipedia Signpost" ein Moratorium bei der Veröffentlichung von Unternehmensartikeln gefordert. Das sei kein Vorwurf gegen die Wikimedia Foundation, aber die derzeitigen Maßnahmen, um die Enzyklopädie gegen missbräuchliches undeklariertes Paid Editing zu schützen, funktionierten ganz klar nicht. *"Da die ehrenamtlichen Autoren derzeit von Werbung in Gestalt von Wikipedia-Artikeln überwältigt werden, und da die WMF nicht in der Lage zu sein scheint, dem irgendetwas entgegenzusetzen, wäre der einzige gangbare Weg für die Autoren, fürs erste die Neuanlage von Artikeln über Unternehmen zu untersagen"*, schreibt der Benutzer Smallbones in seinem Editorial <https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2020-01-27/From_the_editor> zur heutigen Ausgabe."
  8. Hobert, A.; Jahn, N.; Mayr, P.; Schmidt, B.; Taubert, N.: Open access uptake in Germany 2010-2018 : adoption in a diverse research landscape (2021) 0.01
    0.009405181 = coord(3/14) × [0.016974261 weight(_text_:subject, tf=2.0, idf=3.5766) + 2 × 0.013458292 weight(_text_:classification, tf=2.0, idf=3.1847)], fieldNorm=0.03125, doc=250
    
    Content
    This study investigates the development of open access (OA) to journal articles from authors affiliated with German universities and non-university research institutions in the period 2010-2018. Beyond determining the overall share of openly available articles, a systematic classification of distinct categories of OA publishing allowed us to identify different patterns of adoption of OA. Taking into account the particularities of the German research landscape, variations in terms of productivity, OA uptake, and approaches to OA are examined at the meso-level, and possible explanations are discussed. The development of the OA uptake is analysed for the different research sectors in Germany (universities, non-university research institutes of the Helmholtz Association, Fraunhofer Society, Max Planck Society, Leibniz Association, and government research agencies). Combining several data sources (incl. Web of Science, Unpaywall, an authority file of standardised German affiliation information, the ISSN-Gold-OA 3.0 list, and OpenDOAR), the study confirms the growth of the OA share, mirroring the international trend reported in related studies. We found that 45% of all considered articles during the observed period were openly available at the time of analysis. Our findings show that subject-specific repositories are the most prevalent type of OA. However, the percentages for publication in fully OA journals and OA via institutional repositories show similarly steep increases. Enabling data-driven decision-making regarding the implementation of OA in Germany at the institutional level, the results of this study can furthermore serve as a baseline to assess the impact that recent transformative agreements with major publishers will likely have on scholarly communication.
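The exact decision rules behind the OA categories are not given here, so the following is only a sketch of what such a categorization step might look like; the field names and the order of the rules are our assumptions, not the authors' method:

```python
# Hypothetical OA categorization in the spirit of the study; field names
# and decision order are invented for illustration.
def oa_category(record: dict) -> str:
    if record.get("journal_is_fully_oa"):      # e.g. via an ISSN gold-OA list
        return "gold (fully OA journal)"
    if record.get("has_repository_copy"):
        if record.get("repository_type") == "subject":
            return "green (subject repository)"
        return "green (institutional repository)"
    if record.get("free_on_publisher_site"):
        return "hybrid/bronze"
    return "closed"

print(oa_category({"has_repository_copy": True, "repository_type": "subject"}))
# -> green (subject repository)
```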
  9. Franz, S.; Lopatka, T.; Kunze, G.; Meyn, N.; Strupler, N.: Un/Doing Classification : Bibliothekarische Klassifikationssysteme zwischen Universalitätsanspruch und reduktionistischer Wissensorganisation (2022) 0.01
    0.007690453 = coord(2/14) × [2 × 0.026916584 weight(_text_:classification, tf=2.0, idf=3.1847)], fieldNorm=0.0625, doc=675
    
  10. Tramullas, J.; Garrido-Picazo, P.; Sánchez-Casabón, A.I.: Use of Wikipedia categories on information retrieval research : a brief review (2020) 0.01
    0.00576784 = coord(2/14) × [2 × 0.02018744 weight(_text_:classification, tf=2.0, idf=3.1847)], fieldNorm=0.046875, doc=5365
    
    Abstract
    Wikipedia categories, a classification scheme built for organizing and describing Wikipedia articles, are being applied in computer science research. This paper adopts a systematic literature review approach to identify the different approaches to, and uses of, Wikipedia categories in information retrieval research. Several types of work are identified, depending on whether they study the category structure itself or use it as a tool for processing and analyzing documentary corpora other than Wikipedia. Information retrieval is identified as one of the major areas of use, in particular the refinement and improvement of search expressions and the construction of textual corpora. The available work shows, however, that in many cases the research approaches applied and the results obtained can be integrated into a comprehensive and inclusive concept of information retrieval.
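As a concrete starting point for this line of work, an article's categories can be fetched from the MediaWiki API; a minimal sketch (the article title is just an example):

```python
# Fetch one article's categories from the English Wikipedia API.
import requests

response = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={"action": "query", "prop": "categories",
            "titles": "Information retrieval", "format": "json"},
    timeout=10,
)
page = next(iter(response.json()["query"]["pages"].values()))
for category in page.get("categories", []):
    print(category["title"])  # e.g. "Category:Information retrieval"
```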
  11. Machado, L.; Martínez-Ávila, D.; Barcellos Almeida, M.; Borges, M.M.: Towards a moderate realistic foundation for ontological knowledge organization systems : the question of the naturalness of classifications (2023) 0.01
    0.00576784 = coord(2/14) × [2 × 0.02018744 weight(_text_:classification, tf=2.0, idf=3.1847)], fieldNorm=0.046875, doc=894
    
    Abstract
    Several authors emphasize the need for a change in classification theory due to the influence of a dogmatic and monistic ontology supported by an outdated essentialism. These claims tend to focus on the fallibility of knowledge, the need for a pluralistic view, and the theory-ladenness of observation. Regardless of the legitimacy of these concerns, immoderate versions of them risk falling into the opposite, relativistic extreme. Based on a narrative review of the literature, we reflectively discuss the theoretical foundations that can serve as a basis for a realist position supporting pluralistic ontological classifications. The goal is to show that, contrary to rather conventional solutions, objective, scientifically based approaches to natural classification are viable, allowing a proper distinction between ontological and taxonomic questions. Supported by critical scientific realism, we consider such an approach suitable for the development of ontological Knowledge Organization Systems (KOS). We believe that ontological perspectivism can provide the necessary adaptation to the different granularities of reality.
  12. Daquino, M.; Peroni, S.; Shotton, D.; Colavizza, G.; Ghavimi, B.; Lauscher, A.; Mayr, P.; Romanello, M.; Zumstein, P.: The OpenCitations Data Model (2020) 0.00
    0.003047249 = coord(1/14) × 0.042661484 weight(_text_:bibliographic, tf=4.0, idf=3.8930), fieldNorm=0.046875, doc=38
    
    Abstract
    A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations. This diversity, and the reuse of the same ontology terms with different nuances, generates inconsistencies in data. Adoption of a single data model would facilitate data integration tasks regardless of the data supplier or context application. In this paper we present the OpenCitations Data Model (OCDM), a generic data model for describing bibliographic entities and citations, developed using Semantic Web technologies. We also evaluate the effective reusability of OCDM according to ontology evaluation practices, mention existing users of OCDM, and discuss the use and impact of OCDM in the wider open science community.
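OCDM builds on the SPAR ontologies; the following rdflib sketch shows the kind of citation statement it models. The entity IRIs are invented for illustration, and this is a simplification, not the full OCDM shape:

```python
# A minimal SPAR/CiTO-style citation triple; IRIs are placeholders.
from rdflib import Graph, Namespace, URIRef

CITO = Namespace("http://purl.org/spar/cito/")

g = Graph()
citing = URIRef("https://example.org/br/1")  # citing bibliographic resource
cited = URIRef("https://example.org/br/2")   # cited bibliographic resource
g.add((citing, CITO.cites, cited))

print(g.serialize(format="turtle"))
```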
  13. Positionspapier der DMV zur Verwendung bibliometrischer Daten (2020) 0.00
    0.0024926048 = coord(1/14) × coord(1/2) × 0.069792934 weight(_text_:texts, tf=2.0, idf=5.4823), fieldNorm=0.0546875, doc=5738
    
    Abstract
    Bibliometric data are increasingly used in the evaluation of research output. These applications range from (indirect) use in the peer evaluation of third-party funding proposals, through the assessment of applications in appointment committees or of requests for research bonuses, to the systematic collection of research-oriented indicators for institutions. With this document, the DMV wants to provide its members with a basis for discussion on the use of bibliometric data in connection with the evaluation of individuals and institutions in the field of mathematics, in particular also in comparison with other disciplines. A glossary at the end of the text briefly explains the most important terms.
  14. Collard, J.; Paiva, V. de; Fong, B.; Subrahmanian, E.: Extracting mathematical concepts from text (2022) 0.00
    0.0024926048 = coord(1/14) × coord(1/2) × 0.069792934 weight(_text_:texts, tf=2.0, idf=5.4823), fieldNorm=0.0546875, doc=668
    
    Abstract
    We investigate different systems for extracting mathematical entities from English texts in the mathematical field of category theory as a first step for constructing a mathematical knowledge graph. We consider four different term extractors and compare their results. This small experiment showcases some of the issues with the construction and evaluation of terms extracted from noisy domain text. We also make available two open corpora in research mathematics, in particular in category theory: a small corpus of 755 abstracts from the journal TAC (3188 sentences), and a larger corpus from the nLab community wiki (15,000 sentences).
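The four extractors compared in the paper are not reproduced here; as a stand-in, a toy frequency-based bigram extractor shows the flavor of the task on a category-theory snippet:

```python
# Toy term extraction: count stopword-free bigrams as candidate terms.
import re
from collections import Counter

text = ("A functor between categories preserves composition. "
        "A natural transformation relates two functors between categories.")
tokens = re.findall(r"[a-z]+", text.lower())
stopwords = {"a", "the", "two", "between"}

candidates = [" ".join(pair) for pair in zip(tokens, tokens[1:])
              if not set(pair) & stopwords]
print(Counter(candidates).most_common(3))
# e.g. [('categories preserves', 1), ('preserves composition', 1), ...]
```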
  15. Hausser, R.: Grammatical disambiguation : the linear complexity hypothesis for natural language (2020) 0.00
    0.0018186709 = coord(1/14) × 0.02546139 weight(_text_:subject, tf=2.0, idf=3.5766), fieldNorm=0.046875, doc=22
    
    Abstract
    DBS uses a strictly time-linear derivation order. Therefore the basic computational complexity degree of DBS is linear time. The only way to increase DBS complexity above linear is repeating ambiguity. In natural language, however, repeating ambiguity is prevented by grammatical disambiguation. A classic example of a grammatical ambiguity is the 'garden path' sentence The horse raced by the barn fell. The continuation horse+raced introduces an ambiguity between horse which raced and horse which was raced, leading to two parallel derivation strands up to The horse raced by the barn. Depending on whether the continuation is interpunctuation or a verb, they are grammatically disambiguated, resulting in unambiguous output. A repeated ambiguity occurs in The man who loves the woman who feeds Lucy who Peter loves., with who serving as subject or as object. These readings are grammatically disambiguated by continuing after who with a verb or a noun.
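A toy rendering of that time-linear disambiguation step, keeping parallel readings and pruning them on the continuation token; this is our simplification for illustration, not the DBS formalism itself:

```python
# "The horse raced by the barn ..." keeps two derivation strands until
# the continuation token disambiguates them.
readings = {
    "active: the horse raced (intransitive) by the barn",
    "passive: the horse (which was) raced by the barn",
}

def disambiguate(next_token: str, readings: set) -> set:
    if next_token == "fell":  # a verb follows: the passive reading survives
        return {r for r in readings if r.startswith("passive")}
    if next_token == ".":     # interpunctuation: the active reading survives
        return {r for r in readings if r.startswith("active")}
    return readings           # still ambiguous: carry both strands forward

print(disambiguate("fell", readings))
```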
  16. Jaeger, L.: Wissenschaftler versus Wissenschaft (2020) 0.00
    0.0017434291 = coord(1/14) × coord(1/2) × 0.048816014 weight(_text_:22, tf=2.0, idf=3.5018), fieldNorm=0.09375, doc=4156
    
    Date
    2. 3.2020 14:08:22
  17. Ostrzinski, U.: Deutscher MeSH : ZB MED veröffentlicht aktuelle Jahresversion 2022 - freier Zugang und FAIRe Dateiformate (2022) 0.00
    0.0017146593 = coord(1/14) × 0.024005229 weight(_text_:subject, tf=4.0, idf=3.5766), fieldNorm=0.03125, doc=625
    
    Abstract
    The current 2022 release of the German edition of the Medical Subject Headings (MeSH) is now available for download in several FAIR file formats, as well as XML and CSV files, under a CC BY 4.0 license. The semantically FAIR formats offered by ZB MED include, for example, RDF/XML and JSON-LD. They allow software solutions for data analysis, including those using artificial intelligence, to use the data directly, without additional conversion or preprocessing.
    Content
    "ZB MED bietet für die deutschsprachigen MeSH-Begriffe einen Internationalized Resource Identifier (IRI) an. Der IRI-Service stellt auf einer HTML-Seite alle Informationen für einen deutschen MeSH-Term bereit und ermöglicht so die Versionierung. Die Sichtbarkeit veralteter, aber in der Vergangenheit genutzter Terme ist im Sinne der FAIR-Prinzipien dadurch weiterhin gewährleistet. Für die Übersetzung der Medical Subject Headings nutzt ZB MED den eigens entwickelten TermCurator. Das semiautomatische Übersetzungstool bietet unter anderem einen integrierten mehrstufigen Kuratierungsprozess. Der MeSH-Thesaurus als polyhierarchisches, konzeptbasiertes Schlagwortregister für biomedizinische Fachbegriffe umfasst das Vokabular, welches in den NLM-Datenbanken, beispielsweise MEDLINE oder PubMed, erscheint. Darüber hinaus ist er eine der wichtigsten Quellen für ein kontrolliertes biomedizinisches Fachvokabular - beispielsweise für die Kategorisierung und Analyse von Literatur- und Datenquellen.
  18. Zhai, X.: ChatGPT user experience : implications for education (2022) 0.00
    0.0015155592 = coord(1/14) × 0.021217827 weight(_text_:subject, tf=2.0, idf=3.5766), fieldNorm=0.0390625, doc=849
    
    Abstract
    ChatGPT, a general-purpose conversation chatbot released on November 30, 2022, by OpenAI, is expected to impact every aspect of society. However, the potential impacts of this NLP tool on education remain unknown. Such impact can be enormous, as the capacity of ChatGPT may drive changes to educational learning goals, learning activities, and assessment and evaluation practices. This study was conducted by piloting ChatGPT to write an academic paper, titled Artificial Intelligence for Education (see Appendix A). The piloting result suggests that ChatGPT is able to help researchers write a paper that is coherent, (partially) accurate, informative, and systematic. The writing is extremely efficient (2-3 hours) and involves very limited professional knowledge from the author. Drawing upon the user experience, I reflect on the potential impacts of ChatGPT, as well as similar AI tools, on education. The paper suggests adjusting learning goals: students should be able to use AI tools to conduct subject-domain tasks, and education should focus on improving students' creativity and critical thinking rather than general skills. To accomplish these learning goals, researchers should design AI-involved learning tasks to engage students in solving real-world problems. ChatGPT also raises concerns that students may outsource assessment tasks. The paper concludes that new formats of assessment are needed that focus on creativity and critical thinking, which AI cannot substitute.
  19. Wagner, E.: Über Impfstoffe zur digitalen Identität? (2020) 0.00
    0.0014528577 = coord(1/14) × coord(1/2) × 0.040680014 weight(_text_:22, tf=2.0, idf=3.5018), fieldNorm=0.078125, doc=5846
    
    Date
    4. 5.2020 17:22:40
  20. Engel, B.: Corona-Gesundheitszertifikat als Exitstrategie (2020) 0.00
    0.0014528577 = coord(1/14) × coord(1/2) × 0.040680014 weight(_text_:22, tf=2.0, idf=3.5018), fieldNorm=0.078125, doc=5906
    
    Date
    4. 5.2020 17:22:28