Search (121 results, page 1 of 7)

  • Filter: theme_ss:"Automatisches Indexieren"
  1. Jones, S.; Paynter, G.W.: Automatic extraction of document keyphrases for use in digital libraries : evaluations and applications (2002) 0.04
    0.037442096 = product of:
      0.07488419 = sum of:
        0.064335 = weight(_text_:interfaces in 601) [ClassicSimilarity], result of:
          0.064335 = score(doc=601,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.28785467 = fieldWeight in 601, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
        0.010549186 = product of:
          0.031647556 = sum of:
            0.031647556 = weight(_text_:systems in 601) [ClassicSimilarity], result of:
              0.031647556 = score(doc=601,freq=4.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.24009174 = fieldWeight in 601, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=601)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time-consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.
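The indented blocks under each hit are Lucene "explain" trees for its classic TF-IDF similarity (ClassicSimilarity). Below is a minimal sketch of how the "interfaces" leaf in result 1 is derived, assuming Lucene's documented tf/idf/fieldWeight formulas; queryNorm and fieldNorm are copied from the tree, since both depend on data not shown on this page:

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq: float) -> float:
    # ClassicSimilarity tf: sqrt(termFreq)
    return math.sqrt(freq)

# Leaf for _text_:interfaces in doc 601 (result 1 above).
query_norm = 0.04289195                 # copied from the tree (query-dependent)
field_norm = 0.0390625                  # copied from the tree (field length norm)
idf_ifc = idf(655, 44218)               # -> 5.2107263

query_weight = idf_ifc * query_norm                # -> 0.22349821
field_weight = tf(2.0) * idf_ifc * field_norm      # -> 0.28785467
print(query_weight * field_weight)                 # -> 0.064335, the leaf score
```

Note that queryNorm is 1/sqrt of the summed squared weights over all query terms; the sum implied by 0.04289195 is far larger than what the four terms visible on this page contribute, so the underlying theme query evidently expands to many more terms.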
  2. Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web page categories (1997) 0.03
    0.028264118 = product of:
      0.11305647 = sum of:
        0.11305647 = sum of:
          0.031329483 = weight(_text_:systems in 2673) [ClassicSimilarity], result of:
            0.031329483 = score(doc=2673,freq=2.0), product of:
              0.13181444 = queryWeight, product of:
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.04289195 = queryNorm
              0.23767869 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2673)
          0.041048124 = weight(_text_:29 in 2673) [ClassicSimilarity], result of:
            0.041048124 = score(doc=2673,freq=2.0), product of:
              0.15088047 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04289195 = queryNorm
              0.27205724 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2673)
          0.040678866 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
            0.040678866 = score(doc=2673,freq=2.0), product of:
              0.15020029 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04289195 = queryNorm
              0.2708308 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2673)
      0.25 = coord(1/4)
    
    Date
    1. 8.1996 22:08:06
    Source
    Computer networks and ISDN systems. 29(1997) no.8, S.1147-1156
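Reading the trees bottom-up, each Boolean level sums the scores of its matching clauses and multiplies by a coordination factor coord(matching clauses / total clauses). A sketch that reassembles result 2's headline score from its three leaves, under that reading:

```python
# Result 2 (doc 2673): three matching leaves under one Boolean clause,
# with coord(1/4) applied at the top level.
leaves = [
    0.031329483,  # _text_:systems
    0.041048124,  # _text_:29
    0.040678866,  # _text_:22
]

def coord(overlap: int, max_overlap: int) -> float:
    # fraction of the enclosing Boolean query's clauses that matched
    return overlap / max_overlap

score = sum(leaves) * coord(1, 4)  # the tree prints 0.25 = coord(1/4)
print(round(score, 9))             # -> 0.028264118, as shown for hit 2
```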
  3. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02
    0.024002783 = product of:
      0.09601113 = sum of:
        0.09601113 = product of:
          0.1440167 = sum of:
            0.062658966 = weight(_text_:systems in 6265) [ClassicSimilarity], result of:
              0.062658966 = score(doc=6265,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.47535738 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
            0.08135773 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.08135773 = score(doc=6265,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
  4. Galvez, C.; Moya-Anegón, F. de: An evaluation of conflation accuracy using finite-state transducers (2006) 0.02
    0.019300502 = product of:
      0.07720201 = sum of:
        0.07720201 = weight(_text_:interfaces in 5599) [ClassicSimilarity], result of:
          0.07720201 = score(doc=5599,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.3454256 = fieldWeight in 5599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.046875 = fieldNorm(doc=5599)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - To evaluate the accuracy of conflation methods based on finite-state transducers (FSTs). Design/methodology/approach - Incorrectly lemmatized and stemmed forms may lead to the retrieval of inappropriate documents. Experimental studies to date have focused on retrieval performance, but very few on conflation performance. The process of normalization we used involved a linguistic toolbox that allowed us to construct, through graphic interfaces, electronic dictionaries represented internally by FSTs. The lexical resources developed were applied to a Spanish test corpus for merging term variants in canonical lemmatized forms. Conflation performance was evaluated in terms of an adaptation of recall and precision measures, based on accuracy and coverage, not actual retrieval. The results were compared with those obtained using a Spanish version of the Porter algorithm. Findings - The conclusion is that the main strength of lemmatization is its accuracy, whereas its main limitation is the underanalysis of variant forms. Originality/value - The report outlines the potential of transducers in their application to normalization processes.
  5. Zhitomirsky-Geffet, M.; Prebor, G.; Bloch, O.: Improving proverb search and retrieval with a generic multidimensional ontology (2017) 0.02
    0.019300502 = product of:
      0.07720201 = sum of:
        0.07720201 = weight(_text_:interfaces in 3320) [ClassicSimilarity], result of:
          0.07720201 = score(doc=3320,freq=2.0), product of:
            0.22349821 = queryWeight, product of:
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.04289195 = queryNorm
            0.3454256 = fieldWeight in 3320, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2107263 = idf(docFreq=655, maxDocs=44218)
              0.046875 = fieldNorm(doc=3320)
      0.25 = coord(1/4)
    
    Abstract
    The goal of this research is to develop a generic ontological model for proverbs that unifies potential classification criteria and various characteristics of proverbs to enable their effective retrieval and large-scale analysis. Because proverbs can be described and indexed by multiple characteristics and criteria, we built a multidimensional ontology suitable for proverb classification. To evaluate the effectiveness of the constructed ontology for improving search and retrieval of proverbs, a large-scale user experiment was arranged with 70 users who were asked to search a proverb repository using ontology-based and free-text search interfaces. The comparative analysis of the results shows that the use of this ontology helped to substantially improve the search recall, precision, user satisfaction, and efficiency and to minimize user effort during the search process. A practical contribution of this work is an automated web-based proverb search and retrieval system which incorporates the proposed ontological scheme and an initial corpus of ontology-based annotated proverbs.
  6. Salton, G.: Another look at automatic text-retrieval systems (1986) 0.02
    0.017232765 = product of:
      0.06893106 = sum of:
        0.06893106 = product of:
          0.10339659 = sum of:
            0.044756405 = weight(_text_:systems in 1356) [ClassicSimilarity], result of:
              0.044756405 = score(doc=1356,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.339541 = fieldWeight in 1356, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1356)
            0.058640182 = weight(_text_:29 in 1356) [ClassicSimilarity], result of:
              0.058640182 = score(doc=1356,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.38865322 = fieldWeight in 1356, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1356)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Source
    Communications of the Association for Computing Machinery. 29(1986), S.648-656
  7. Koryconski, C.; Newell, A.F.: Natural-language processing and automatic indexing (1990) 0.01
    0.013786211 = product of:
      0.055144843 = sum of:
        0.055144843 = product of:
          0.08271726 = sum of:
            0.03580512 = weight(_text_:systems in 2313) [ClassicSimilarity], result of:
              0.03580512 = score(doc=2313,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2716328 = fieldWeight in 2313, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2313)
            0.04691214 = weight(_text_:29 in 2313) [ClassicSimilarity], result of:
              0.04691214 = score(doc=2313,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.31092256 = fieldWeight in 2313, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2313)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    The task of producing satisfactory indexes by automatic means has been tackled on two fronts: by statistical analysis of text and by attempting content analysis of the text in much the same way as a human indexer does. Though statistical techniques have a lot to offer for free-text database systems, neither method has had much success with back-of-the-book indexing. This review examines some problems associated with the application of natural-language processing techniques to book texts. - See also the reply by K.P. Jones.
    Source
    Indexer. 17(1990), S.21-29
  8. Kim, P.K.: An automatic indexing of compound words based on mutual information for Korean text retrieval (1995) 0.01
    0.013786211 = product of:
      0.055144843 = sum of:
        0.055144843 = product of:
          0.08271726 = sum of:
            0.03580512 = weight(_text_:systems in 620) [ClassicSimilarity], result of:
              0.03580512 = score(doc=620,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2716328 = fieldWeight in 620, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=620)
            0.04691214 = weight(_text_:29 in 620) [ClassicSimilarity], result of:
              0.04691214 = score(doc=620,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.31092256 = fieldWeight in 620, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=620)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Presents an automatic indexing technique for compound words suitable for an agglutinative language, specifically Korean. Discusses construction conditions for compound words and rules for decomposing them to enhance the exhaustivity of indexing, demonstrating that the system, based on mutual information, enhances both the exhaustivity of indexing and the specificity of terms. Suggests that the construction conditions and decomposition rules presented may be used in multilingual information retrieval systems to translate the indexing terms of one language into those of the language required.
    Source
    Library and information science. 1995, no.34, S.29-38
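Result 8 above decides whether to decompose a compound word by the mutual information of its parts. The abstract gives no formulas, so the following is only an illustrative sketch using standard pointwise mutual information over corpus counts; the binary-split strategy, the threshold, and the toy counts are assumptions, not the paper's actual procedure:

```python
import math

def pmi(pair_n: int, x_n: int, y_n: int, n: int) -> float:
    # pointwise mutual information: log2( p(x,y) / (p(x) * p(y)) )
    return math.log2((pair_n / n) / ((x_n / n) * (y_n / n)))

def split_compound(word, unigrams, pairs, n, threshold=3.0):
    """Try every binary split; accept the best one whose parts are
    known words and whose PMI clears the (illustrative) threshold."""
    best = None
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in unigrams and right in unigrams:
            score = pmi(pairs.get((left, right), 1),
                        unigrams[left], unigrams[right], n)
            if score >= threshold and (best is None or score > best[0]):
                best = (score, [left, right])
    return best[1] if best else [word]

# toy corpus statistics (hypothetical): "정보검색" = "information retrieval"
unigrams = {"정보": 120, "검색": 95}
pairs = {("정보", "검색"): 40}
print(split_compound("정보검색", unigrams, pairs, n=10_000))  # ['정보', '검색']
```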
  9. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.01
    0.013715876 = product of:
      0.054863505 = sum of:
        0.054863505 = product of:
          0.082295254 = sum of:
            0.03580512 = weight(_text_:systems in 3581) [ClassicSimilarity], result of:
              0.03580512 = score(doc=3581,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2716328 = fieldWeight in 3581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3581)
            0.046490133 = weight(_text_:22 in 3581) [ClassicSimilarity], result of:
              0.046490133 = score(doc=3581,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.30952093 = fieldWeight in 3581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3581)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Lingo is a freely available, open-source system for the automatic indexing of German-language text. High configurability and flexibility for different deployment scenarios were the main goals in its development. The article demonstrates the value of linguistically based automatic indexing for information retrieval, and presents, with examples, the linguistic functionality lingo provides for retrieval improvement: base-form recognition, compound recognition and decomposition, word relations, lexical and algorithmic multi-word recognition, and OCR error correction. Lingo's open architecture is described, and possible deployment scenarios and limits of application are identified.
    Date
    24. 3.2006 12:22:02
  10. Franke-Maier, M.: Anforderungen an die Qualität der Inhaltserschließung im Spannungsfeld von intellektuell und automatisch erzeugten Metadaten (2018) 0.01
    0.013621165 = product of:
      0.05448466 = sum of:
        0.05448466 = product of:
          0.08172699 = sum of:
            0.041048124 = weight(_text_:29 in 5344) [ClassicSimilarity], result of:
              0.041048124 = score(doc=5344,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.27205724 = fieldWeight in 5344, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5344)
            0.040678866 = weight(_text_:22 in 5344) [ClassicSimilarity], result of:
              0.040678866 = score(doc=5344,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2708308 = fieldWeight in 5344, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5344)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    At the latest since the Deutscher Bibliothekartag 2018, the discussion of the Deutsche Nationalbibliothek's automatic subject indexing methods has shifted from a politically driven debate to a debate about quality. This contribution addresses questions of subject indexing quality in the digital age, in which heterogeneous products of different methods meet, and attempts to define key requirements for quality. It summarizes the ideas the author presented as input at the workshop of the FAG "Erschließung und Informationsvermittlung" of the GBV on 29 August 2018 in Kiel; the workshop took place within the 22nd Verbundkonferenz of the GBV.
  11. Schulz, K.U.; Brunner, L.: Vollautomatische thematische Verschlagwortung großer Textkollektionen mittels semantischer Netze (2017) 0.01
    0.012062935 = product of:
      0.04825174 = sum of:
        0.04825174 = product of:
          0.07237761 = sum of:
            0.031329483 = weight(_text_:systems in 3493) [ClassicSimilarity], result of:
              0.031329483 = score(doc=3493,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 3493, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3493)
            0.041048124 = weight(_text_:29 in 3493) [ClassicSimilarity], result of:
              0.041048124 = score(doc=3493,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.27205724 = fieldWeight in 3493, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3493)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Source
    Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber
  12. Böhm, A.; Seifert, C.; Schlötterer, J.; Granitzer, M.: Identifying tweets from the economic domain (2017) 0.01
    0.012062935 = product of:
      0.04825174 = sum of:
        0.04825174 = product of:
          0.07237761 = sum of:
            0.031329483 = weight(_text_:systems in 3495) [ClassicSimilarity], result of:
              0.031329483 = score(doc=3495,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 3495, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3495)
            0.041048124 = weight(_text_:29 in 3495) [ClassicSimilarity], result of:
              0.041048124 = score(doc=3495,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.27205724 = fieldWeight in 3495, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3495)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Source
    Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber
  13. Kempf, A.O.: Neue Verfahrenswege der Wissensorganisation : eine Evaluation automatischer Indexierung in der sozialwissenschaftlichen Fachinformation (2017) 0.01
    0.012062935 = product of:
      0.04825174 = sum of:
        0.04825174 = product of:
          0.07237761 = sum of:
            0.031329483 = weight(_text_:systems in 3497) [ClassicSimilarity], result of:
              0.031329483 = score(doc=3497,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23767869 = fieldWeight in 3497, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3497)
            0.041048124 = weight(_text_:29 in 3497) [ClassicSimilarity], result of:
              0.041048124 = score(doc=3497,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.27205724 = fieldWeight in 3497, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3497)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Source
    Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber
  14. Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.01
    0.010339659 = product of:
      0.041358635 = sum of:
        0.041358635 = product of:
          0.062037952 = sum of:
            0.026853843 = weight(_text_:systems in 1660) [ClassicSimilarity], result of:
              0.026853843 = score(doc=1660,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2037246 = fieldWeight in 1660, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1660)
            0.035184108 = weight(_text_:29 in 1660) [ClassicSimilarity], result of:
              0.035184108 = score(doc=1660,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23319192 = fieldWeight in 1660, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1660)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    A corpus of 242 abstracts of Arabic documents on computer science and information systems was assembled, using the Proceedings of the Saudi Arabian National Conferences as the source. Reports on the design and building, from scratch, of an automatic information retrieval system to handle Arabic data. Both automatic and manual indexing techniques were implemented. Experiments using measures of recall and precision have demonstrated that automatic indexing is at least as effective as manual indexing, and more effective in some cases; it is also both cheaper and faster. The results suggest that a wider coverage of the literature can be achieved at lower cost, with results as good as those of manual indexing. Compares retrieval results using words as index terms versus stems and roots, and confirms the finding of Al-Kharashi and Abu-Salem, obtained with smaller corpora, that root indexing is more effective than word indexing.
    Date
    29. 7.1998 17:40:01
  15. Ward, M.L.: The future of the human indexer (1996) 0.01
    0.010286908 = product of:
      0.04114763 = sum of:
        0.04114763 = product of:
          0.061721444 = sum of:
            0.026853843 = weight(_text_:systems in 7244) [ClassicSimilarity], result of:
              0.026853843 = score(doc=7244,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2037246 = fieldWeight in 7244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7244)
            0.0348676 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
              0.0348676 = score(doc=7244,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.23214069 = fieldWeight in 7244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7244)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Considers the principles of indexing and the intellectual skills involved, in order to determine what automatic indexing systems would require in order to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and to what depth; reading skills; abstracting skills; and classification skills. Illustrates these features with a detailed description of the abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system, using the criteria described for human indexers. At present it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form it may be a useful productivity tool for dealing with large quantities of low-grade texts (should they be wanted in the database).
    Date
    9. 2.1997 18:44:22
  16. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.01
    0.010161275 = product of:
      0.0406451 = sum of:
        0.0406451 = product of:
          0.060967647 = sum of:
            0.031647556 = weight(_text_:systems in 5400) [ClassicSimilarity], result of:
              0.031647556 = score(doc=5400,freq=4.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.24009174 = fieldWeight in 5400, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5400)
            0.029320091 = weight(_text_:29 in 5400) [ClassicSimilarity], result of:
              0.029320091 = score(doc=5400,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.19432661 = fieldWeight in 5400, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5400)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.
    Date
    29. 9.2019 12:18:42
    Footnote
    Contribution to a special issue: Research Information Systems and Science Classifications; including papers from "Trajectories for Research: Fathoming the Promise of the NARCIS Classification," 27-28 September 2018, The Hague, The Netherlands.
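Result 16 above describes a two-step approach: embed documents and subject headings into one semantic space, then predict subjects by similarity. A minimal cosine-similarity sketch of that idea follows; the vectors and labels are invented, and the paper's actual embedding method and its non-parametric ranking refinement are not reproduced here:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Step 1 (assumed already done): subjects and documents share one space.
subject_vecs = {
    "Automatic indexing":  np.array([0.9, 0.1, 0.2]),
    "Machine translation": np.array([0.1, 0.9, 0.3]),
    "Digital libraries":   np.array([0.5, 0.2, 0.8]),
}

def predict_subjects(doc_vec: np.ndarray, k: int = 2):
    # Step 2: rank subject headings by similarity to the document vector.
    ranked = sorted(subject_vecs.items(),
                    key=lambda kv: cosine(doc_vec, kv[1]),
                    reverse=True)
    return [label for label, _ in ranked[:k]]

print(predict_subjects(np.array([0.8, 0.15, 0.4])))
# -> ['Automatic indexing', 'Digital libraries']
```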
  17. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.01
    0.009841698 = product of:
      0.039366793 = sum of:
        0.039366793 = product of:
          0.059050187 = sum of:
            0.03580512 = weight(_text_:systems in 5499) [ClassicSimilarity], result of:
              0.03580512 = score(doc=5499,freq=8.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.2716328 = fieldWeight in 5499, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
            0.023245066 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
              0.023245066 = score(doc=5499,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.15476047 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - Modern mathematicians and scientists of math-related disciplines often use Document Preparation Systems (DPS) to write and Computer Algebra Systems (CAS) to calculate mathematical expressions. Usually, they translate the expressions manually between DPS and CAS. This process is time-consuming and error-prone. The purpose of this paper is to automate this translation. This paper uses Maple and Mathematica as the CAS, and LaTeX as the DPS. Design/methodology/approach - Bruce Miller at the National Institute of Standards and Technology (NIST) developed a collection of special LaTeX macros that create links from mathematical symbols to their definitions in the NIST Digital Library of Mathematical Functions (DLMF). The authors are using these macros to perform rule-based translations between the formulae in the DLMF and CAS. Moreover, the authors develop software to ease the creation of new rules and to discover inconsistencies. Findings - The authors created 396 mappings and translated 58.8 percent of DLMF formulae (2,405 expressions) successfully between Maple and DLMF. For a significant percentage, the special function definitions in Maple and the DLMF were different. An atomic symbol in one system maps to a composite expression in the other system. The translator was also successfully used for automatic verification of mathematical online compendia and CAS. The evaluation techniques discovered two errors in the DLMF and one defect in Maple. Originality/value - This paper introduces the first translation tool for special functions between LaTeX and CAS. The approach improves error-prone manual translations and can be used to verify mathematical online compendia and CAS.
    Date
    20. 1.2015 18:30:22
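Result 17 above translates between DLMF's semantic LaTeX macros and CAS input via rewrite rules. A toy sketch of one such rule; the macro shape follows the DLMF "@" argument convention, but the rule set, the chosen function, and the regex machinery here are illustrative guesses, not the project's actual implementation:

```python
import re

# One hypothetical rule: semantic LaTeX macro -> Maple call.
#   \BesselJ{\nu}@{z}  ->  BesselJ(nu, z)
RULES = [
    (re.compile(r"\\BesselJ\{([^}]*)\}@\{([^}]*)\}"), r"BesselJ(\1, \2)"),
    (re.compile(r"\\nu\b"), "nu"),  # map the Greek letter to a CAS name
]

def latex_to_maple(expr: str) -> str:
    for pattern, repl in RULES:
        expr = pattern.sub(repl, expr)
    return expr

print(latex_to_maple(r"\BesselJ{\nu}@{z}"))  # -> BesselJ(nu, z)
```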
  18. Kuhlen, R.: Morphologische Relationen durch Reduktionsalgorithmen (1974) 0.01
    0.009675137 = product of:
      0.038700547 = sum of:
        0.038700547 = product of:
          0.11610164 = sum of:
            0.11610164 = weight(_text_:29 in 4251) [ClassicSimilarity], result of:
              0.11610164 = score(doc=4251,freq=4.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.7694941 = fieldWeight in 4251, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4251)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    29. 1.2011 14:56:29
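Result 18 is the one hit on this page with a term frequency other than 2: the token "29" with freq = 4.0, which shows the sub-linear sqrt damping of term frequency. Chaining only the numbers printed in the tree, where 0.15088047 is the queryWeight and 1/3 and 1/4 are the coord factors:

```latex
\begin{aligned}
\mathrm{fieldWeight} &= \sqrt{f}\cdot\mathrm{idf}\cdot\mathrm{fieldNorm}
  = \sqrt{4.0}\times 3.5176873\times 0.109375 = 0.7694941,\\
\mathrm{score} &= 0.7694941\times 0.15088047\times\tfrac{1}{3}\times\tfrac{1}{4}
  = 0.009675137.
\end{aligned}
```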
  19. Milstead, J.L.: Thesauri in a full-text world (1998) 0.01
    0.008572424 = product of:
      0.034289695 = sum of:
        0.034289695 = product of:
          0.05143454 = sum of:
            0.022378203 = weight(_text_:systems in 2337) [ClassicSimilarity], result of:
              0.022378203 = score(doc=2337,freq=2.0), product of:
                0.13181444 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.04289195 = queryNorm
                0.1697705 = fieldWeight in 2337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
            0.029056335 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
              0.029056335 = score(doc=2337,freq=2.0), product of:
                0.15020029 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04289195 = queryNorm
                0.19345059 = fieldWeight in 2337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Despite early claims to the contemporary, thesauri continue to find use as access tools for information in the full-text environment. Their mode of use is changing, but this change actually represents an expansion rather than a contrdiction of their utility. Thesauri and similar vocabulary tools can complement full-text access by aiding users in focusing their searches, by supplementing the linguistic analysis of the text search engine, and even by serving as one of the tools used by the linguistic engine for its analysis. While human indexing contunues to be used for many databases, the trend is to increase the use of machine aids for this purpose. All machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection. In the 21st century, the balance of effort between human and machine will change at both input and output, but thesauri will continue to play an important role for the foreseeable future
    Date
    22. 9.1997 19:16:05
  20. Panyr, J.: STEINADLER: ein Verfahren zur automatischen Deskribierung und zur automatischen thematischen Klassifikation (1978) 0.01
    0.0078186905 = product of:
      0.031274762 = sum of:
        0.031274762 = product of:
          0.09382428 = sum of:
            0.09382428 = weight(_text_:29 in 5169) [ClassicSimilarity], result of:
              0.09382428 = score(doc=5169,freq=2.0), product of:
                0.15088047 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04289195 = queryNorm
                0.6218451 = fieldWeight in 5169, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.125 = fieldNorm(doc=5169)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Source
    Nachrichten für Dokumentation. 29(1978), S.92-96

Types

  • a 107
  • el 9
  • m 4
  • x 3
  • s 2
  • d 1
  • p 1