Search (17 results, page 1 of 1)

  • × language_ss:"e"
  • × theme_ss:"Automatisches Indexieren"
  • × year_i:[2010 TO 2020}
  1. Fauzi, F.; Belkhatir, M.: Multifaceted conceptual image indexing on the world wide web (2013) 0.06
    0.056143478 = product of:
      0.16843043 = sum of:
        0.043577533 = weight(_text_:web in 2721) [ClassicSimilarity], result of:
          0.043577533 = score(doc=2721,freq=6.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.37471575 = fieldWeight in 2721, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2721)
        0.034899916 = weight(_text_:world in 2721) [ClassicSimilarity], result of:
          0.034899916 = score(doc=2721,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.25480178 = fieldWeight in 2721, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.046875 = fieldNorm(doc=2721)
        0.046375446 = weight(_text_:wide in 2721) [ClassicSimilarity], result of:
          0.046375446 = score(doc=2721,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.29372054 = fieldWeight in 2721, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=2721)
        0.043577533 = weight(_text_:web in 2721) [ClassicSimilarity], result of:
          0.043577533 = score(doc=2721,freq=6.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.37471575 = fieldWeight in 2721, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2721)
      0.33333334 = coord(4/12)
    
    Abstract
    In this paper, we describe a user-centered design of an automated multifaceted concept-based indexing framework which analyzes the semantics of the Web image contextual information and classifies it into five broad semantic concept facets: signal, object, abstract, scene, and relational; and identifies the semantic relationships between the concepts. An important aspect of our indexing model is that it relates to the users' levels of image descriptions. Also, a major contribution relies on the fact that the classification is performed automatically with the raw image contextual information extracted from any general webpage and is not solely based on image tags like state-of-the-art solutions. Human Language Technology techniques and an external knowledge base are used to analyze the information both syntactically and semantically. Experimental results on a human-annotated Web image collection and corresponding contextual information indicate that our method outperforms empirical frameworks employing tf-idf and location-based tf-idf weighting schemes as well as n-gram indexing in a recall/precision based evaluation framework.
  2. Daudaravicius, V.: ¬A framework for keyphrase extraction from scientific journals (2016) 0.05
    0.051175587 = product of:
      0.15352675 = sum of:
        0.02935275 = weight(_text_:web in 2930) [ClassicSimilarity], result of:
          0.02935275 = score(doc=2930,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 2930, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2930)
        0.040716566 = weight(_text_:world in 2930) [ClassicSimilarity], result of:
          0.040716566 = score(doc=2930,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.29726875 = fieldWeight in 2930, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2930)
        0.05410469 = weight(_text_:wide in 2930) [ClassicSimilarity], result of:
          0.05410469 = score(doc=2930,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.342674 = fieldWeight in 2930, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2930)
        0.02935275 = weight(_text_:web in 2930) [ClassicSimilarity], result of:
          0.02935275 = score(doc=2930,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 2930, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2930)
      0.33333334 = coord(4/12)
    
    Content
    Vortrag, "Semantics, Analytics, Visualisation: Enhancing Scholarly Data Workshop co-located with the 25th International World Wide Web Conference April 11, 2016 - Montreal, Canada", Montreal 2016.
  3. Gábor, K.; Zargayouna, H.; Tellier, I.; Buscaldi, D.; Charnois, T.: ¬A typology of semantic relations dedicated to scientific literature analysis (2016) 0.05
    0.051175587 = product of:
      0.15352675 = sum of:
        0.02935275 = weight(_text_:web in 2933) [ClassicSimilarity], result of:
          0.02935275 = score(doc=2933,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
        0.040716566 = weight(_text_:world in 2933) [ClassicSimilarity], result of:
          0.040716566 = score(doc=2933,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.29726875 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
        0.05410469 = weight(_text_:wide in 2933) [ClassicSimilarity], result of:
          0.05410469 = score(doc=2933,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.342674 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
        0.02935275 = weight(_text_:web in 2933) [ClassicSimilarity], result of:
          0.02935275 = score(doc=2933,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
      0.33333334 = coord(4/12)
    
    Content
    Vortrag, "Semantics, Analytics, Visualisation: Enhancing Scholarly Data Workshop co-located with the 25th International World Wide Web Conference April 11, 2016 - Montreal, Canada", Montreal 2016.
  4. Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.02
    0.015128385 = product of:
      0.18154062 = sum of:
        0.18154062 = weight(_text_:tagging in 2918) [ClassicSimilarity], result of:
          0.18154062 = score(doc=2918,freq=14.0), product of:
            0.21038401 = queryWeight, product of:
              5.9038734 = idf(docFreq=327, maxDocs=44218)
              0.035634913 = queryNorm
            0.8629013 = fieldWeight in 2918, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.9038734 = idf(docFreq=327, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2918)
      0.083333336 = coord(1/12)
    
    Abstract
    Purpose - The purpose of this paper is to explore the potential of applying the Dewey Decimal Classification (DDC) as an established knowledge organization system (KOS) for enhancing social tagging, with the ultimate purpose of improving subject indexing and information retrieval. Design/methodology/approach - Over 11.000 Intute metadata records in politics were used. Totally, 28 politics students were each given four tasks, in which a total of 60 resources were tagged in two different configurations, one with uncontrolled social tags only and another with uncontrolled social tags as well as suggestions from a controlled vocabulary. The controlled vocabulary was DDC comprising also mappings from the Library of Congress Subject Headings. Findings - The results demonstrate the importance of controlled vocabulary suggestions for indexing and retrieval: to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. The value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology. Originality/value - No research has investigated the enhancement of social tagging with suggestions from the DDC, an established KOS, in a user trial, comparing social tagging only and social tagging enhanced with the suggestions. This paper is a final reflection on all aspects of the study.
    Theme
    Social tagging
  5. Martins, A.L.; Souza, R.R.; Ribeiro de Mello, H.: ¬The use of noun phrases in information retrieval : proposing a mechanism for automatic classification (2014) 0.01
    0.010758134 = product of:
      0.0645488 = sum of:
        0.054892723 = weight(_text_:tagging in 1441) [ClassicSimilarity], result of:
          0.054892723 = score(doc=1441,freq=2.0), product of:
            0.21038401 = queryWeight, product of:
              5.9038734 = idf(docFreq=327, maxDocs=44218)
              0.035634913 = queryNorm
            0.2609168 = fieldWeight in 1441, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.9038734 = idf(docFreq=327, maxDocs=44218)
              0.03125 = fieldNorm(doc=1441)
        0.009656077 = product of:
          0.019312155 = sum of:
            0.019312155 = weight(_text_:22 in 1441) [ClassicSimilarity], result of:
              0.019312155 = score(doc=1441,freq=2.0), product of:
                0.12478739 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.035634913 = queryNorm
                0.15476047 = fieldWeight in 1441, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1441)
          0.5 = coord(1/2)
      0.16666667 = coord(2/12)
    
    Abstract
    This paper presents a research on syntactic structures known as noun phrases (NP) being applied to increase the effectiveness and efficiency of the mechanisms for the document's classification. Our hypothesis is the fact that the NP can be used instead of single words as a semantic aggregator to reduce the number of words that will be used for the classification system without losing its semantic coverage, increasing its efficiency. The experiment divided the documents classification process in three phases: a) NP preprocessing b) system training; and c) classification experiments. In the first step, a corpus of digitalized texts was submitted to a natural language processing platform1 in which the part-of-speech tagging was done, and them PERL scripts pertaining to the PALAVRAS package were used to extract the Noun Phrases. The preprocessing also involved the tasks of a) removing NP low meaning pre-modifiers, as quantifiers; b) identification of synonyms and corresponding substitution for common hyperonyms; and c) stemming of the relevant words contained in the NP, for similitude checking with other NPs. The first tests with the resulting documents have demonstrated its effectiveness. We have compared the structural similarity of the documents before and after the whole pre-processing steps of phase one. The texts maintained the consistency with the original and have kept the readability. The second phase involves submitting the modified documents to a SVM algorithm to identify clusters and classify the documents. The classification rules are to be established using a machine learning approach. Finally, tests will be conducted to check the effectiveness of the whole process.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  6. Junger, U.: Can indexing be automated? : the example of the Deutsche Nationalbibliothek (2012) 0.01
    0.0097842505 = product of:
      0.0587055 = sum of:
        0.02935275 = weight(_text_:web in 1717) [ClassicSimilarity], result of:
          0.02935275 = score(doc=1717,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 1717, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1717)
        0.02935275 = weight(_text_:web in 1717) [ClassicSimilarity], result of:
          0.02935275 = score(doc=1717,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 1717, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1717)
      0.16666667 = coord(2/12)
    
    Content
    Beitrag für die Tagung: Beyond libraries - subject metadata in the digital environment and semantic web. IFLA Satellite Post-Conference, 17-18 August 2012, Tallinn. Vgl.: http://http://www.nlib.ee/index.php?id=17763.
  7. Junger, U.: Can indexing be automated? : the example of the Deutsche Nationalbibliothek (2014) 0.01
    0.0097842505 = product of:
      0.0587055 = sum of:
        0.02935275 = weight(_text_:web in 1969) [ClassicSimilarity], result of:
          0.02935275 = score(doc=1969,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 1969, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1969)
        0.02935275 = weight(_text_:web in 1969) [ClassicSimilarity], result of:
          0.02935275 = score(doc=1969,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 1969, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1969)
      0.16666667 = coord(2/12)
    
    Footnote
    Contribution in a special issue "Beyond libraries: Subject metadata in the digital environment and Semantic Web" - Enthält Beiträge der gleichnamigen IFLA Satellite Post-Conference, 17-18 August 2012, Tallinn.
  8. Lichtenstein, A.; Plank, M.; Neumann, J.: TIB's portal for audiovisual media : combining manual and automatic indexing (2014) 0.01
    0.0097842505 = product of:
      0.0587055 = sum of:
        0.02935275 = weight(_text_:web in 1981) [ClassicSimilarity], result of:
          0.02935275 = score(doc=1981,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 1981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1981)
        0.02935275 = weight(_text_:web in 1981) [ClassicSimilarity], result of:
          0.02935275 = score(doc=1981,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.25239927 = fieldWeight in 1981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1981)
      0.16666667 = coord(2/12)
    
    Abstract
    The German National Library of Science and Technology (TIB) developed a Web-based platform for audiovisual media. The audiovisual portal optimizes access to scientific videos such as computer animations and lecture and conference recordings. TIB's AV-Portal combines traditional cataloging and automatic indexing of audiovisual media. The article describes metadata standards for audiovisual media and introduces the TIB's metadata schema in comparison to other metadata standards for non-textual materials. Additionally, we give an overview of multimedia retrieval technologies used for the Portal and present the AV-Portal in detail as well as the additional value for libraries and their users.
  9. Zhitomirsky-Geffet, M.; Prebor, G.; Bloch, O.: Improving proverb search and retrieval with a generic multidimensional ontology (2017) 0.01
    0.0083865 = product of:
      0.050318997 = sum of:
        0.025159499 = weight(_text_:web in 3320) [ClassicSimilarity], result of:
          0.025159499 = score(doc=3320,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.21634221 = fieldWeight in 3320, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3320)
        0.025159499 = weight(_text_:web in 3320) [ClassicSimilarity], result of:
          0.025159499 = score(doc=3320,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.21634221 = fieldWeight in 3320, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3320)
      0.16666667 = coord(2/12)
    
    Abstract
    The goal of this research is to develop a generic ontological model for proverbs that unifies potential classification criteria and various characteristics of proverbs to enable their effective retrieval and large-scale analysis. Because proverbs can be described and indexed by multiple characteristics and criteria, we built a multidimensional ontology suitable for proverb classification. To evaluate the effectiveness of the constructed ontology for improving search and retrieval of proverbs, a large-scale user experiment was arranged with 70 users who were asked to search a proverb repository using ontology-based and free-text search interfaces. The comparative analysis of the results shows that the use of this ontology helped to substantially improve the search recall, precision, user satisfaction, and efficiency and to minimize user effort during the search process. A practical contribution of this work is an automated web-based proverb search and retrieval system which incorporates the proposed ontological scheme and an initial corpus of ontology-based annotated proverbs.
  10. Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.01
    0.0069887503 = product of:
      0.0419325 = sum of:
        0.02096625 = weight(_text_:web in 3627) [ClassicSimilarity], result of:
          0.02096625 = score(doc=3627,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
        0.02096625 = weight(_text_:web in 3627) [ClassicSimilarity], result of:
          0.02096625 = score(doc=3627,freq=2.0), product of:
            0.11629491 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.035634913 = queryNorm
            0.18028519 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
      0.16666667 = coord(2/12)
    
    Abstract
    A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).
  11. Souza, R.R.; Gil-Leiva, I.: Automatic indexing of scientific texts : a methodological comparison (2016) 0.00
    0.0038777683 = product of:
      0.04653322 = sum of:
        0.04653322 = weight(_text_:world in 4913) [ClassicSimilarity], result of:
          0.04653322 = score(doc=4913,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.33973572 = fieldWeight in 4913, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0625 = fieldNorm(doc=4913)
      0.083333336 = coord(1/12)
    
    Source
    Knowledge organization for a sustainable world: challenges and perspectives for cultural, scientific, and technological sharing in a connected society : proceedings of the Fourteenth International ISKO Conference 27-29 September 2016, Rio de Janeiro, Brazil / organized by International Society for Knowledge Organization (ISKO), ISKO-Brazil, São Paulo State University ; edited by José Augusto Chaves Guimarães, Suellen Oliveira Milani, Vera Dodebei
  12. Golub, K.: Automatic subject indexing of text (2019) 0.00
    0.0032205172 = product of:
      0.038646206 = sum of:
        0.038646206 = weight(_text_:wide in 5268) [ClassicSimilarity], result of:
          0.038646206 = score(doc=5268,freq=2.0), product of:
            0.1578897 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.035634913 = queryNorm
            0.24476713 = fieldWeight in 5268, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5268)
      0.083333336 = coord(1/12)
    
    Abstract
    Automatic subject indexing addresses problems of scale and sustainability and can be at the same time used to enrich existing metadata records, establish more connections across and between resources from various metadata and resource collec-tions, and enhance consistency of the metadata. In this work, au-tomatic subject indexing focuses on assigning index terms or classes from established knowledge organization systems (KOSs) for subject indexing like thesauri, subject headings systems and classification systems. The following major approaches are dis-cussed, in terms of their similarities and differences, advantages and disadvantages for automatic assigned indexing from KOSs: "text categorization," "document clustering," and "document classification." Text categorization is perhaps the most wide-spread, machine-learning approach with what seems generally good reported performance. Document clustering automatically both creates groups of related documents and extracts names of subjects depicting the group at hand. Document classification re-uses the intellectual effort invested into creating a KOS for sub-ject indexing and even simple string-matching algorithms have been reported to achieve good results, because one concept can be described using a number of different terms, including equiv-alent, related, narrower and broader terms. Finally, applicability of automatic subject indexing to operative information systems and challenges of evaluation are outlined, suggesting the need for more research.
  13. Li, X.; Zhang, A.; Li, C.; Ouyang, J.; Cai, Y.: Exploring coherent topics by topic modeling with term weighting (2018) 0.00
    0.0024236054 = product of:
      0.029083263 = sum of:
        0.029083263 = weight(_text_:world in 5045) [ClassicSimilarity], result of:
          0.029083263 = score(doc=5045,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.21233483 = fieldWeight in 5045, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5045)
      0.083333336 = coord(1/12)
    
    Abstract
    Topic models often produce unexplainable topics that are filled with noisy words. The reason is that words in topic modeling have equal weights. High frequency words dominate the top topic word lists, but most of them are meaningless words, e.g., domain-specific stopwords. To address this issue, in this paper we aim to investigate how to weight words, and then develop a straightforward but effective term weighting scheme, namely entropy weighting (EW). The proposed EW scheme is based on conditional entropy measured by word co-occurrences. Compared with existing term weighting schemes, the highlight of EW is that it can automatically reward informative words. For more robust word weight, we further suggest a combination form of EW (CEW) with two existing weighting schemes. Basically, our CEW assigns meaningless words lower weights and informative words higher weights, leading to more coherent topics during topic modeling inference. We apply CEW to Dirichlet multinomial mixture and latent Dirichlet allocation, and evaluate it by topic quality, document clustering and classification tasks on 8 real world data sets. Experimental results show that weighting words can effectively improve the topic modeling performance over both short texts and normal long texts. More importantly, the proposed CEW significantly outperforms the existing term weighting schemes, since it further considers which words are informative.
  14. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.00
    0.0020116828 = product of:
      0.024140194 = sum of:
        0.024140194 = product of:
          0.048280388 = sum of:
            0.048280388 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
              0.048280388 = score(doc=2759,freq=2.0), product of:
                0.12478739 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.035634913 = queryNorm
                0.38690117 = fieldWeight in 2759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2759)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Date
    1. 2.2016 18:25:22
  15. Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.: ¬A picture is worth a thousand (coherent) words : building a natural description of images (2014) 0.00
    0.0016965236 = product of:
      0.020358283 = sum of:
        0.020358283 = weight(_text_:world in 1874) [ClassicSimilarity], result of:
          0.020358283 = score(doc=1874,freq=2.0), product of:
            0.13696888 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.035634913 = queryNorm
            0.14863437 = fieldWeight in 1874, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1874)
      0.083333336 = coord(1/12)
    
    Content
    "People can summarize a complex scene in a few words without thinking twice. It's much more difficult for computers. But we've just gotten a bit closer -- we've developed a machine-learning system that can automatically produce captions (like the three above) to accurately describe images the first time it sees them. This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images. Recent research has greatly improved object detection, classification, and labeling. But accurately describing a complex scene requires a deeper representation of what's going on in the scene, capturing how the various objects relate to one another and translating it all into natural-sounding language. Many efforts to construct computer-generated natural descriptions of images propose combining current state-of-the-art techniques in both computer vision and natural language processing to form a complete image description approach. But what if we instead merged recent computer vision and language models into a single jointly trained system, taking an image and directly producing a human readable sequence of words to describe it? This idea comes from recent advances in machine translation between languages, where a Recurrent Neural Network (RNN) transforms, say, a French sentence into a vector representation, and a second RNN uses that vector representation to generate a target sentence in German. Now, what if we replaced that first RNN and its input words with a deep Convolutional Neural Network (CNN) trained to classify objects in images? Normally, the CNN's last layer is used in a final Softmax among known classes of objects, assigning a probability that each object might be in the image. But if we remove that final layer, we can instead feed the CNN's rich encoding of the image into a RNN designed to produce phrases. We can then train the whole system directly on images and their captions, so it maximizes the likelihood that descriptions it produces best match the training descriptions for each image.
  16. Mesquita, L.A.P.; Souza, R.R.; Baracho Porto, R.M.A.: Noun phrases in automatic indexing: : a structural analysis of the distribution of relevant terms in doctoral theses (2014) 0.00
    8.046731E-4 = product of:
      0.009656077 = sum of:
        0.009656077 = product of:
          0.019312155 = sum of:
            0.019312155 = weight(_text_:22 in 1442) [ClassicSimilarity], result of:
              0.019312155 = score(doc=1442,freq=2.0), product of:
                0.12478739 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.035634913 = queryNorm
                0.15476047 = fieldWeight in 1442, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1442)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  17. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.00
    8.046731E-4 = product of:
      0.009656077 = sum of:
        0.009656077 = product of:
          0.019312155 = sum of:
            0.019312155 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
              0.019312155 = score(doc=5499,freq=2.0), product of:
                0.12478739 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.035634913 = queryNorm
                0.15476047 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Date
    20. 1.2015 18:30:22