Search (15 results, page 1 of 1)

  • × year_i:[2010 TO 2020}
  • × theme_ss:"Automatisches Indexieren"
  1. Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.07
    0.07061749 = product of:
      0.14123498 = sum of:
        0.11509455 = weight(_text_:social in 2918) [ClassicSimilarity], result of:
          0.11509455 = score(doc=2918,freq=16.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.6230592 = fieldWeight in 2918, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2918)
        0.026140431 = product of:
          0.052280862 = sum of:
            0.052280862 = weight(_text_:aspects in 2918) [ClassicSimilarity], result of:
              0.052280862 = score(doc=2918,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.2496898 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2918)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose - The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval. Design/methodology/approach - Over 11,000 Intute metadata records in politics were used. In total, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was the DDC, which also comprised mappings from the Library of Congress Subject Headings. Findings - The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology. Originality/value - No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.
    Theme
    Social tagging
  2. Mesquita, L.A.P.; Souza, R.R.; Baracho Porto, R.M.A.: Noun phrases in automatic indexing : a structural analysis of the distribution of relevant terms in doctoral theses (2014) 0.03
    0.034468703 = product of:
      0.068937406 = sum of:
        0.056384586 = weight(_text_:social in 1442) [ClassicSimilarity], result of:
          0.056384586 = score(doc=1442,freq=6.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.30523545 = fieldWeight in 1442, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.03125 = fieldNorm(doc=1442)
        0.012552816 = product of:
          0.025105633 = sum of:
            0.025105633 = weight(_text_:22 in 1442) [ClassicSimilarity], result of:
              0.025105633 = score(doc=1442,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.15476047 = fieldWeight in 1442, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1442)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The main objective of this research was to analyze whether relevant terms show a characteristic distribution over a scientific text that could serve as a criterion for automatic indexing. The terms considered in this study were only full noun phrases contained in the texts themselves. The texts comprised a total of 98 doctoral theses from the eight areas of knowledge at a single university. Initially, 20 full noun phrases were automatically extracted from each text as candidates for the most relevant terms, and the author of each text assigned a relevance value from 0 (not relevant) to 6 (highly relevant) to each of the 20 noun phrases sent. Only 22.1% of the noun phrases were considered not relevant. The relevance values assigned by the authors were associated with the terms' positions in the text. Each full noun phrase found in the text was considered a valid linear position. The results show the values of this distribution for two types of position: linear, with values consolidated into ten equal consecutive parts; and structural, considering parts of the text (such as introduction, development and conclusion). Notably, all areas of knowledge related to the Natural Sciences showed one characteristic behavior in the distribution of relevant terms, and all areas related to the Social Sciences showed another, shared among themselves but distinct from that of the Natural Sciences. The difference in distribution behavior between the Natural and Social Sciences can be clearly visualized through graphs. All behaviors, including the general behavior of all areas of knowledge together, were characterized by polynomial equations and can be applied in the future as criteria for automatic indexing. To date, this work is novel for two reasons: it presents a method for characterizing the distribution of relevant terms in a scientific text, and, through this method, it points out a quantitative difference between the Natural and Social Sciences.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  3. Moreno, J.M.T.: Automatic text summarization (2014) 0.01
    0.010173016 = product of:
      0.040692065 = sum of:
        0.040692065 = weight(_text_:social in 1518) [ClassicSimilarity], result of:
          0.040692065 = score(doc=1518,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.22028469 = fieldWeight in 1518, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1518)
      0.25 = coord(1/4)
    
    Abstract
    This new textbook examines the motivations and the different algorithms for automatic document summarization (ADS). It surveys the recent state of the art. The book shows the main problems and difficulties of ADS and the solutions provided by the community. It presents recent advances in ADS, as well as current applications and trends. The approaches are statistical, linguistic and symbolic. Several examples are included in order to clarify the theoretical concepts. The books currently available in the area of Automatic Document Summarization are not recent. Powerful algorithms have been developed in recent years that include several applications of ADS. The development of recent technology has influenced the development of algorithms and their applications. The massive use of social networks and new forms of technology require the adaptation of the classical methods of text summarization. This is a new textbook on Automatic Text Summarization, based on teaching materials used in one- or two-semester courses. It presents an extensive state of the art and describes the new systems on the subject. Previous automatic summarization books have been either collections of specialized papers, or else authored books with only a chapter or two devoted to the field as a whole. On the other hand, the classic books on the subject are not recent.
  4. Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C.: On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages (2015) 0.01
    0.010173016 = product of:
      0.040692065 = sum of:
        0.040692065 = weight(_text_:social in 2161) [ClassicSimilarity], result of:
          0.040692065 = score(doc=2161,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.22028469 = fieldWeight in 2161, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2161)
      0.25 = coord(1/4)
    
    Abstract
    Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focused on determining whether the content of a text is objective or subjective, and in the latter case, if it conveys a positive or a negative opinion. Most polarity detection techniques tend to take into account individual terms in the text and even some degree of linguistic knowledge, but they do not usually consider syntactic relations between words. This article explores how relating lexical, syntactic, and psychometric information can be helpful to perform polarity classification on Spanish tweets. We provide an evaluation for both shallow and deep linguistic perspectives. Empirical results show an improved performance of syntactic approaches over pure lexical models when using large training sets to create a classifier, but this tendency is reversed when small training collections are used.
  5. Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.01
    0.010173016 = product of:
      0.040692065 = sum of:
        0.040692065 = weight(_text_:social in 3627) [ClassicSimilarity], result of:
          0.040692065 = score(doc=3627,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.22028469 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
      0.25 = coord(1/4)
    
    Abstract
    A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).
  6. Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019) 0.01
    0.009414612 = product of:
      0.03765845 = sum of:
        0.03765845 = product of:
          0.0753169 = sum of:
            0.0753169 = weight(_text_:22 in 5629) [ClassicSimilarity], result of:
              0.0753169 = score(doc=5629,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.46428138 = fieldWeight in 5629, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5629)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    B.I.T.online. 22(2019) H.2, S.163-166
  7. Wolfe, EW.: a case study in automated metadata enhancement : Natural Language Processing in the humanities (2019) 0.01
    0.009149151 = product of:
      0.036596604 = sum of:
        0.036596604 = product of:
          0.07319321 = sum of:
            0.07319321 = weight(_text_:aspects in 5236) [ClassicSimilarity], result of:
              0.07319321 = score(doc=5236,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.3495657 = fieldWeight in 5236, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5236)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The Black Book Interactive Project at the University of Kansas (KU) is developing an expanded corpus of novels by African American authors, with an emphasis on lesser known writers and a goal of expanding research in this field. Using a custom metadata schema with an emphasis on race-related elements, each novel is analyzed for a variety of elements such as literary style, targeted content analysis, historical context, and other areas. Librarians at KU have worked to develop a variety of computational text analysis processes designed to assist with specific aspects of this metadata collection, including text mining and natural language processing, automated subject extraction based on word sense disambiguation, harvesting data from Wikidata, and other actions.
  8. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.01
    0.007845511 = product of:
      0.031382043 = sum of:
        0.031382043 = product of:
          0.062764086 = sum of:
            0.062764086 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
              0.062764086 = score(doc=2759,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.38690117 = fieldWeight in 2759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2759)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    1. 2.2016 18:25:22
  9. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.01
    0.006276408 = product of:
      0.025105633 = sum of:
        0.025105633 = product of:
          0.050211266 = sum of:
            0.050211266 = weight(_text_:22 in 401) [ClassicSimilarity], result of:
              0.050211266 = score(doc=401,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.30952093 = fieldWeight in 401, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=401)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    11. 9.2012 19:43:22
  10. Kasprzik, A.: Voraussetzungen und Anwendungspotentiale einer präzisen Sacherschließung aus Sicht der Wissenschaft (2018) 0.01
    0.005491857 = product of:
      0.021967428 = sum of:
        0.021967428 = product of:
          0.043934856 = sum of:
            0.043934856 = weight(_text_:22 in 5195) [ClassicSimilarity], result of:
              0.043934856 = score(doc=5195,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.2708308 = fieldWeight in 5195, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5195)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Considerable attention is currently focused on the potential of automated methods in subject indexing and on the ways they can interact with intellectual methods. In this context, the present article addresses the following questions: What are the requirements for library metadata from the perspective of the research community? What is needed to serve the information needs of the subject communities? And what does this imply for automating the creation and maintenance of metadata? This article summarizes the position taken by the author in a keynote talk and the panel discussion at the workshop of the FAG "Erschließung und Informationsvermittlung" of the GBV. The workshop took place as part of the 22nd GBV Verbundkonferenz.
  11. Franke-Maier, M.: Anforderungen an die Qualität der Inhaltserschließung im Spannungsfeld von intellektuell und automatisch erzeugten Metadaten (2018) 0.01
    0.005491857 = product of:
      0.021967428 = sum of:
        0.021967428 = product of:
          0.043934856 = sum of:
            0.043934856 = weight(_text_:22 in 5344) [ClassicSimilarity], result of:
              0.043934856 = score(doc=5344,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.2708308 = fieldWeight in 5344, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5344)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Since the Deutscher Bibliothekartag 2018 at the latest, the discussion about the automatic subject indexing procedures of the Deutsche Nationalbibliothek has changed from a politically driven discussion into a discussion about quality. The following article deals with questions of the quality of subject indexing in digital times, when heterogeneous products of different procedures meet, and attempts to define important quality requirements. This conference paper summarizes the ideas presented by the author as impulses at the workshop of the FAG "Erschließung und Informationsvermittlung" of the GBV on 29 August 2018 in Kiel. The workshop took place as part of the 22nd GBV Verbundkonferenz.
  12. Busch, D.: Domänenspezifische hybride automatische Indexierung von bibliographischen Metadaten (2019) 0.00
    0.004707306 = product of:
      0.018829225 = sum of:
        0.018829225 = product of:
          0.03765845 = sum of:
            0.03765845 = weight(_text_:22 in 5628) [ClassicSimilarity], result of:
              0.03765845 = score(doc=5628,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.23214069 = fieldWeight in 5628, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5628)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    B.I.T.online. 22(2019) H.6, S.465-469
  13. Junger, U.; Schwens, U.: Die inhaltliche Erschließung des schriftlichen kulturellen Erbes auf dem Weg in die Zukunft : Automatische Vergabe von Schlagwörtern in der Deutschen Nationalbibliothek (2017) 0.00
    0.0039227554 = product of:
      0.015691021 = sum of:
        0.015691021 = product of:
          0.031382043 = sum of:
            0.031382043 = weight(_text_:22 in 3780) [ClassicSimilarity], result of:
              0.031382043 = score(doc=3780,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.19345059 = fieldWeight in 3780, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3780)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    19. 8.2017 9:24:22
  14. Martins, A.L.; Souza, R.R.; Ribeiro de Mello, H.: The use of noun phrases in information retrieval : proposing a mechanism for automatic classification (2014) 0.00
    0.003138204 = product of:
      0.012552816 = sum of:
        0.012552816 = product of:
          0.025105633 = sum of:
            0.025105633 = weight(_text_:22 in 1441) [ClassicSimilarity], result of:
              0.025105633 = score(doc=1441,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.15476047 = fieldWeight in 1441, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1441)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  15. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.00
    0.003138204 = product of:
      0.012552816 = sum of:
        0.012552816 = product of:
          0.025105633 = sum of:
            0.025105633 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
              0.025105633 = score(doc=5499,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.15476047 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    20. 1.2015 18:30:22
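
The score breakdowns listed under each result can be re-derived by hand. The following is a minimal sketch for the first result (doc 2918), assuming the usual TF-IDF form that the listed numbers are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical helpers for illustration, not part of any library; queryNorm, fieldNorm, and the coord factors are copied directly from the listing.

```python
import math

def tf(freq):
    # Term-frequency factor: square root of the raw frequency
    # (matches 4.0 = tf(freq=16.0) in the listing).
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    # Assumed idf form; it reproduces the listed idf values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in each explain tree.
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight

# Values copied from the explain tree for doc 2918.
MAX_DOCS = 44218
QUERY_NORM = 0.046325076
FIELD_NORM = 0.0390625

# "social": freq=16, docFreq=2228
social = term_score(16.0, 2228, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# "aspects": freq=2, docFreq=1308, inside a nested clause with coord(1/2)
aspects = term_score(2.0, 1308, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 2)
# Outer sum, scaled by coord(2/4): two of four query clauses matched.
total = (social + aspects) * (2 / 4)

print(f"social={social:.6f} aspects={aspects:.6f} total={total:.6f}")
```

The result should agree with the listed 0.07061749 up to rounding, since the search engine computes in single precision while this sketch uses Python floats.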