Search (9 results, page 1 of 1)

  • theme_ss:"Formalerschließung"
  • year_i:[2020 TO 2030}
  1. Das, S.; Paik, J.H.: Gender tagging of named entities using retrieval-assisted multi-context aggregation : an unsupervised approach (2023) 0.01
    0.005982068 = product of:
      0.020937236 = sum of:
        0.006214436 = product of:
          0.03107218 = sum of:
            0.03107218 = weight(_text_:retrieval in 941) [ClassicSimilarity], result of:
              0.03107218 = score(doc=941,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2835858 = fieldWeight in 941, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=941)
          0.2 = coord(1/5)
        0.0147228 = product of:
          0.0294456 = sum of:
            0.0294456 = weight(_text_:22 in 941) [ClassicSimilarity], result of:
              0.0294456 = score(doc=941,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23214069 = fieldWeight in 941, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=941)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Inferring the gender of named entities present in a text has several practical applications in information sciences. Existing approaches toward name gender identification rely exclusively on using the gender distributions from labeled data. In the absence of such labeled data, these methods fail. In this article, we propose a two-stage model that is able to infer the gender of names present in text without requiring explicit name-gender labels. We use coreference resolution as the backbone for our proposed model. To aid coreference resolution where the existing contextual information does not suffice, we use a retrieval-assisted context aggregation framework. We demonstrate that state-of-the-art name gender inference is possible without supervision. Our proposed method matches or outperforms several supervised approaches and commercially used methods on five English language datasets from different domains.
    Date
    22. 3.2023 12:00:14
  2. Zhang, L.; Lu, W.; Yang, J.: LAGOS-AND : a large gold standard dataset for scholarly author name disambiguation (2023) 0.00
    0.004639679 = product of:
      0.016238876 = sum of:
        0.003969876 = product of:
          0.01984938 = sum of:
            0.01984938 = weight(_text_:system in 883) [ClassicSimilarity], result of:
              0.01984938 = score(doc=883,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.17398985 = fieldWeight in 883, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=883)
          0.2 = coord(1/5)
        0.0122690005 = product of:
          0.024538001 = sum of:
            0.024538001 = weight(_text_:22 in 883) [ClassicSimilarity], result of:
              0.024538001 = score(doc=883,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.19345059 = fieldWeight in 883, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=883)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    In this article, we present a method to automatically build large labeled datasets for the author ambiguity problem in the academic world by leveraging the authoritative academic resources ORCID and DOI. Using the method, we built LAGOS-AND, two large, gold-standard sub-datasets for author name disambiguation (AND), of which LAGOS-AND-BLOCK is created for clustering-based AND research and LAGOS-AND-PAIRWISE is created for classification-based AND research. Our LAGOS-AND datasets are substantially different from the existing ones. The initial versions of the datasets (v1.0, released in February 2021) include 7.5 M citations authored by 798 K unique authors (LAGOS-AND-BLOCK) and close to 1 M instances (LAGOS-AND-PAIRWISE). Both datasets show close similarities to the whole Microsoft Academic Graph (MAG) across validations of six facets. In building the datasets, we reveal the degree of last-name variation in three literature databases, PubMed, MAG, and Semantic Scholar, by comparing the author names they host with the authors' official last names shown on their ORCID pages. Furthermore, we evaluate several baseline disambiguation methods as well as MAG's author ID system on our datasets, and the evaluation yields several interesting findings. We hope the datasets and findings will bring new insights for future studies. The code and datasets are publicly available.
    Date
    22. 1.2023 18:40:36
  3. Morris, V.: Automated language identification of bibliographic resources (2020) 0.00
    0.0028043431 = product of:
      0.0196304 = sum of:
        0.0196304 = product of:
          0.0392608 = sum of:
            0.0392608 = weight(_text_:22 in 5749) [ClassicSimilarity], result of:
              0.0392608 = score(doc=5749,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.30952093 = fieldWeight in 5749, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5749)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    2. 3.2020 19:04:22
  4. Kim, J.(im); Kim, J.(enna): Effect of forename string on author name disambiguation (2020) 0.00
    0.0017527144 = product of:
      0.0122690005 = sum of:
        0.0122690005 = product of:
          0.024538001 = sum of:
            0.024538001 = weight(_text_:22 in 5930) [ClassicSimilarity], result of:
              0.024538001 = score(doc=5930,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.19345059 = fieldWeight in 5930, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5930)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    11. 7.2020 13:22:58
  5. Díez Platas, M.L.; Muñoz, S.R.; González-Blanco, E.; Ruiz Fabo, P.; Álvarez Mellado, E.: Medieval Spanish (12th-15th centuries) named entity recognition and attribute annotation system based on contextual information (2021) 0.00
    0.0011342503 = product of:
      0.007939752 = sum of:
        0.007939752 = product of:
          0.03969876 = sum of:
            0.03969876 = weight(_text_:system in 93) [ClassicSimilarity], result of:
              0.03969876 = score(doc=93,freq=8.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.3479797 = fieldWeight in 93, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=93)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Abstract
    The recognition of named entities in Spanish medieval texts presents great complexity, involving specific challenges: first, the complex morphosyntactic characteristics of proper-noun use in medieval texts; second, the lack of strict orthographic standards; and finally, diachronic and geographical variations in Spanish from the 12th to the 15th century. In this period, named entities usually appear as complex text structures. For example, it was common to add nicknames and information about the person's role in society and geographic origin. To tackle this complexity, a named entity recognition and classification system has been implemented. The system uses contextual cues based on semantics to detect entities and assign a type. Given the occurrence of entities with attached attributes, entity contexts are also parsed to determine entity-type-specific dependencies for these attributes. Moreover, it uses a variant generator to handle the diachronic evolution of Spanish medieval terms from a phonetic and morphosyntactic viewpoint. The tool iteratively enriches its own lexica, dictionaries, and gazetteers. The system was evaluated on a corpus of over 3,000 manually annotated entities of different types and periods, obtaining F1 scores between 0.74 and 0.87. Attribute annotation was evaluated for person and role name attributes with an overall F1 of 0.75.
  6. Zakaria, M.S.: Measuring typographical errors in online catalogs of academic libraries using Ballard's list : a case study from Egypt (2023) 0.00
    9.0608327E-4 = product of:
      0.0063425824 = sum of:
        0.0063425824 = product of:
          0.031712912 = sum of:
            0.031712912 = weight(_text_:retrieval in 1184) [ClassicSimilarity], result of:
              0.031712912 = score(doc=1184,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.28943354 = fieldWeight in 1184, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1184)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Typographical errors in bibliographic records of online library catalogs are a common and troublesome phenomenon worldwide. They can affect the retrieval and identification of items in information retrieval systems and thus prevent users from finding the documents they need. The present study was conducted to measure typographical errors in the online catalog of the Egyptian Universities Libraries Consortium (EULC). The investigation relied on Terry Ballard's list of typographical error terms. The EULC catalog was searched to identify matching erroneous records. The study found that the total number of erroneous records reached 1686, whereas the mean error rate for each record is 11.24, which is very high. About 396 erroneous records (23.49%) were retrieved from Section C of Ballard's list (Moderate Probability). The typographical errors found within the abstracts of the study's sample records represented 35.82%. Omissions were the most common type of error at 54.51%, followed by transpositions at 17.08%. Regarding the analysis of parts of speech, the study found that 63.46% of errors occur in noun terms. The results of the study indicated that typographical errors still pose a serious challenge for information retrieval systems, especially for library systems in the Arab environment. The study proposes some solutions for Egyptian university libraries in order to avoid typographical mistakes in the future.
  7. Farmer, L.S.J.: Cataloging children's materials : issues and solutions (2021) 0.00
    7.9397525E-4 = product of:
      0.0055578267 = sum of:
        0.0055578267 = product of:
          0.027789133 = sum of:
            0.027789133 = weight(_text_:system in 701) [ClassicSimilarity], result of:
              0.027789133 = score(doc=701,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2435858 = fieldWeight in 701, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=701)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Library catalogs remain challenging for children to use, especially because children have difficulty with multi-step processes, have less semantic and technical knowledge, and often search differently from adults. Child-friendly catalogs should have clear, simple protocols and visual guides that are standardized yet include flexible options for differentiated manipulation. Materials should be described accurately and in ways that connect meaningfully to children. More fundamentally, cataloging children's materials needs to be done in light of children as potential users and limitations of the integrated library management system itself. Getting children's feedback in the process can optimize the results.
  8. Boruah, B.B.; Ravikumar, S.; Gayang, F.L.: Consistency, extent, and validation of the utilization of the MARC 21 bibliographic standard in the college libraries of Assam in India (2023) 0.00
    7.323784E-4 = product of:
      0.0051266486 = sum of:
        0.0051266486 = product of:
          0.025633242 = sum of:
            0.025633242 = weight(_text_:retrieval in 1183) [ClassicSimilarity], result of:
              0.025633242 = score(doc=1183,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23394634 = fieldWeight in 1183, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1183)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Abstract
    This paper sheds light on the existing practice of cataloging in the college libraries of Assam in terms of utilizing the MARC 21 standard and its structure, i.e., the tags, subfield codes, and indicators. Catalog records from six college libraries are collected and a survey is conducted to understand the local users' information requirements for the catalog. Areas where libraries have scope to improve, and the divisions of tags that could be more helpful to them in information retrieval, are identified and suggested. This study fulfilled the need for local-level assessment of the catalogs.
  9. Lackner, K.; Schilhan, L.: ¬Der Einzug der EDV im österreichischen Bibliothekswesen am Beispiel der Universitätsbibliothek Graz (2022) 0.00
    6.2775286E-4 = product of:
      0.00439427 = sum of:
        0.00439427 = product of:
          0.02197135 = sum of:
            0.02197135 = weight(_text_:retrieval in 476) [ClassicSimilarity], result of:
              0.02197135 = score(doc=476,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.20052543 = fieldWeight in 476, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=476)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Abstract
    The introduction of computer systems from the 1970s onward radically changed how university libraries were used and administered. Graz University Library was the first library in Austria to develop and deploy an electronic library system, which made it one of the pioneers in Europe. This article gives a historical overview of the beginnings, development, and spread of electronic library systems in general and at Graz University Library in particular. It presents the library systems used at the UB Graz over the decades (GRIBS, EMILE, FBInfo, BIBOS, ALEPH, and ALMA) as well as the development from the first online databases via CD-ROM databases to modern database retrieval.
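
Note on the score breakdowns: the explain trees above follow the usual Lucene ClassicSimilarity (TF-IDF) pattern, where tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and each clause is scaled by a coord factor. The sketch below recomputes the score of result 1 (doc 941) from the numbers printed in its tree. It is an illustration only: the function and variable names are hypothetical, queryNorm and fieldNorm are copied from the listing rather than derived, and small floating-point differences from the displayed values are to be expected.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term clause: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm      # idf * queryNorm
    field_weight = tf * term_idf * field_norm # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain tree of result 1 (doc 941).
MAX_DOCS = 44218
QUERY_NORM = 0.03622214
FIELD_NORM = 0.046875

retrieval = term_score(4.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Each inner clause is scaled by its own coord factor, and the sum
# is scaled by the outer coord(2/7), as in the listing.
total = (retrieval * (1 / 5) + term_22 * (1 / 2)) * (2 / 7)

print(f"retrieval clause: {retrieval:.8f}")  # listing shows 0.03107218
print(f"'22' clause:      {term_22:.8f}")    # listing shows 0.0294456
print(f"document score:   {total:.9f}")      # listing shows 0.005982068
```

The same helper applies to the other results by swapping in their freq, docFreq, fieldNorm, and coord values; only the tree shape (which clauses matched) changes from entry to entry.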