Search (33 results, page 2 of 2)

  • × theme_ss:"Automatisches Indexieren"
  • × type_ss:"a"
  • × year_i:[2010 TO 2020}
  1. Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.01
    0.009479279 = product of:
      0.028437834 = sum of:
        0.028437834 = product of:
          0.05687567 = sum of:
            0.05687567 = weight(_text_:indexing in 2918) [ClassicSimilarity], result of:
              0.05687567 = score(doc=2918,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29905218 = fieldWeight in 2918, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2918)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval. Design/methodology/approach - Over 11.000 Intute metadata records in politics were used. Totally, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was DDC comprising also mappings from the Library of Congress Subject Headings. Findings - The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology. Originality/value - No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.
  2. Vlachidis, A.; Tudhope, D.: ¬A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.01
    0.009479279 = product of:
      0.028437834 = sum of:
        0.028437834 = product of:
          0.05687567 = sum of:
            0.05687567 = weight(_text_:indexing in 2895) [ClassicSimilarity], result of:
              0.05687567 = score(doc=2895,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29905218 = fieldWeight in 2895, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2895)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The article presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection, and Word-Sense Disambiguation using hand-crafted rules and terminological resources for associating contextual abstractions with classes of the standard ontology CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH. Relation Extraction (RE) performance benefits from a syntactic-based definition of RE patterns derived from domain oriented corpus analysis. The evaluation also shows clear benefit in the use of assistive natural language processing (NLP) modules relating to Word-Sense Disambiguation, Negation Detection, and Noun Phrase Validation, together with controlled thesaurus expansion. The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include recognition of relevant entities using shallow parsing NLP techniques driven by a complimentary use of ontological and terminological domain resources and empirical derivation of context-driven RE rules for the recognition of semantic relationships from phrases of unstructured text.
  3. Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.01
    0.009479279 = product of:
      0.028437834 = sum of:
        0.028437834 = product of:
          0.05687567 = sum of:
            0.05687567 = weight(_text_:indexing in 3627) [ClassicSimilarity], result of:
              0.05687567 = score(doc=3627,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29905218 = fieldWeight in 3627, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3627)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).
  4. Lepsky, K.; Müller, T.; Wille, J.: Metadata improvement for image information retrieval (2010) 0.01
    0.009384007 = product of:
      0.02815202 = sum of:
        0.02815202 = product of:
          0.05630404 = sum of:
            0.05630404 = weight(_text_:indexing in 4995) [ClassicSimilarity], result of:
              0.05630404 = score(doc=4995,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29604656 = fieldWeight in 4995, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4995)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper discusses the goals and results of the research project Perseus-a as an attempt to improve information retrieval of digital images by automatically connecting them with text-based descriptions. The development uses the image collection of prometheus, the distributed digital image archive for research and studies, the articles of the digitized Reallexikon zur Deutschen Kunstgeschichte, art historical terminological resources and classification data, and an open source system for linguistic and statistic automatic indexing called lingo.
  5. Simões, M. da Graça; Machado, L.M.; Souza, R.R.; Almeida, M.B.; Tavares Lopes, A.: Automatic indexing and ontologies : the consistency of research chronology and authoring in the context of Information Science (2018) 0.01
    0.009384007 = product of:
      0.02815202 = sum of:
        0.02815202 = product of:
          0.05630404 = sum of:
            0.05630404 = weight(_text_:indexing in 5909) [ClassicSimilarity], result of:
              0.05630404 = score(doc=5909,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29604656 = fieldWeight in 5909, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5909)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
  6. Kasprzik, A.: Voraussetzungen und Anwendungspotentiale einer präzisen Sacherschließung aus Sicht der Wissenschaft (2018) 0.01
    0.007853523 = product of:
      0.023560567 = sum of:
        0.023560567 = product of:
          0.047121134 = sum of:
            0.047121134 = weight(_text_:22 in 5195) [ClassicSimilarity], result of:
              0.047121134 = score(doc=5195,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.2708308 = fieldWeight in 5195, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5195)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Große Aufmerksamkeit richtet sich im Moment auf das Potential von automatisierten Methoden in der Sacherschließung und deren Interaktionsmöglichkeiten mit intellektuellen Methoden. In diesem Kontext befasst sich der vorliegende Beitrag mit den folgenden Fragen: Was sind die Anforderungen an bibliothekarische Metadaten aus Sicht der Wissenschaft? Was wird gebraucht, um den Informationsbedarf der Fachcommunities zu bedienen? Und was bedeutet das entsprechend für die Automatisierung der Metadatenerstellung und -pflege? Dieser Beitrag fasst die von der Autorin eingenommene Position in einem Impulsvortrag und der Podiumsdiskussion beim Workshop der FAG "Erschließung und Informationsvermittlung" des GBV zusammen. Der Workshop fand im Rahmen der 22. Verbundkonferenz des GBV statt.
  7. Franke-Maier, M.: Anforderungen an die Qualität der Inhaltserschließung im Spannungsfeld von intellektuell und automatisch erzeugten Metadaten (2018) 0.01
    0.007853523 = product of:
      0.023560567 = sum of:
        0.023560567 = product of:
          0.047121134 = sum of:
            0.047121134 = weight(_text_:22 in 5344) [ClassicSimilarity], result of:
              0.047121134 = score(doc=5344,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.2708308 = fieldWeight in 5344, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5344)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Spätestens seit dem Deutschen Bibliothekartag 2018 hat sich die Diskussion zu den automatischen Verfahren der Inhaltserschließung der Deutschen Nationalbibliothek von einer politisch geführten Diskussion in eine Qualitätsdiskussion verwandelt. Der folgende Beitrag beschäftigt sich mit Fragen der Qualität von Inhaltserschließung in digitalen Zeiten, wo heterogene Erzeugnisse unterschiedlicher Verfahren aufeinandertreffen und versucht, wichtige Anforderungen an Qualität zu definieren. Dieser Tagungsbeitrag fasst die vom Autor als Impulse vorgetragenen Ideen beim Workshop der FAG "Erschließung und Informationsvermittlung" des GBV am 29. August 2018 in Kiel zusammen. Der Workshop fand im Rahmen der 22. Verbundkonferenz des GBV statt.
  8. Busch, D.: Domänenspezifische hybride automatische Indexierung von bibliographischen Metadaten (2019) 0.01
    0.0067315903 = product of:
      0.02019477 = sum of:
        0.02019477 = product of:
          0.04038954 = sum of:
            0.04038954 = weight(_text_:22 in 5628) [ClassicSimilarity], result of:
              0.04038954 = score(doc=5628,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.23214069 = fieldWeight in 5628, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5628)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    B.I.T.online. 22(2019) H.6, S.465-469
  9. Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.01
    0.0067028617 = product of:
      0.020108584 = sum of:
        0.020108584 = product of:
          0.04021717 = sum of:
            0.04021717 = weight(_text_:indexing in 3105) [ClassicSimilarity], result of:
              0.04021717 = score(doc=3105,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.21146181 = fieldWeight in 3105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3105)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
  10. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.01
    0.0067028617 = product of:
      0.020108584 = sum of:
        0.020108584 = product of:
          0.04021717 = sum of:
            0.04021717 = weight(_text_:indexing in 5400) [ClassicSimilarity], result of:
              0.04021717 = score(doc=5400,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.21146181 = fieldWeight in 5400, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5400)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.
  11. Junger, U.; Schwens, U.: ¬Die inhaltliche Erschließung des schriftlichen kulturellen Erbes auf dem Weg in die Zukunft : Automatische Vergabe von Schlagwörtern in der Deutschen Nationalbibliothek (2017) 0.01
    0.005609659 = product of:
      0.016828977 = sum of:
        0.016828977 = product of:
          0.033657953 = sum of:
            0.033657953 = weight(_text_:22 in 3780) [ClassicSimilarity], result of:
              0.033657953 = score(doc=3780,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.19345059 = fieldWeight in 3780, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3780)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    19. 8.2017 9:24:22
  12. Martins, A.L.; Souza, R.R.; Ribeiro de Mello, H.: ¬The use of noun phrases in information retrieval : proposing a mechanism for automatic classification (2014) 0.00
    0.0044877273 = product of:
      0.013463181 = sum of:
        0.013463181 = product of:
          0.026926363 = sum of:
            0.026926363 = weight(_text_:22 in 1441) [ClassicSimilarity], result of:
              0.026926363 = score(doc=1441,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.15476047 = fieldWeight in 1441, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1441)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  13. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.00
    0.0044877273 = product of:
      0.013463181 = sum of:
        0.013463181 = product of:
          0.026926363 = sum of:
            0.026926363 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
              0.026926363 = score(doc=5499,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.15476047 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    20. 1.2015 18:30:22