Search (86 results, page 2 of 5)

  • × theme_ss:"Automatisches Indexieren"
  1. Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: ¬The automatic indexing system AIR/PHYS : from research to application (1988) 0.02
    0.017658776 = product of:
      0.03531755 = sum of:
        0.03531755 = product of:
          0.0706351 = sum of:
            0.0706351 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
              0.0706351 = score(doc=1952,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38690117 = fieldWeight in 1952, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1952)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    16. 8.1998 12:51:22
  2. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02
    0.017658776 = product of:
      0.03531755 = sum of:
        0.03531755 = product of:
          0.0706351 = sum of:
            0.0706351 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.0706351 = score(doc=4157,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
  3. Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.02
    0.017658776 = product of:
      0.03531755 = sum of:
        0.03531755 = product of:
          0.0706351 = sum of:
            0.0706351 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
              0.0706351 = score(doc=374,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38690117 = fieldWeight in 374, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=374)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 4.2002 10:22:41
  4. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.02
    0.017658776 = product of:
      0.03531755 = sum of:
        0.03531755 = product of:
          0.0706351 = sum of:
            0.0706351 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
              0.0706351 = score(doc=2759,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38690117 = fieldWeight in 2759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2759)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  5. Salton, G.; Araya, J.: On the use of clustered file organizations in information search and retrieval (1990) 0.02
    0.01752632 = product of:
      0.03505264 = sum of:
        0.03505264 = product of:
          0.07010528 = sum of:
            0.07010528 = weight(_text_:classification in 2409) [ClassicSimilarity], result of:
              0.07010528 = score(doc=2409,freq=2.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.42223644 = fieldWeight in 2409, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2409)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Library classification and its functions. Int. Conf. on ..., 20.-21.6.1989, Edmonton, Alberta. Ed.: A. Nitecki u. T. Fell
  6. Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.02
    0.015178238 = product of:
      0.030356476 = sum of:
        0.030356476 = product of:
          0.060712952 = sum of:
            0.060712952 = weight(_text_:classification in 720) [ClassicSimilarity], result of:
              0.060712952 = score(doc=720,freq=6.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.3656675 = fieldWeight in 720, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=720)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining operation task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are fundamental especially for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately, we achieved good results with the toolkit BERT (Bidirectional Encoder Representations from Transformers). The project and its workflows represent a preview of what may lie ahead in the future of crafting metadata using text mining techniques to enhance discoverability.
    Source
    Cataloging and classification quarterly. 59(2021) no.8, p.815-834
  7. McKiernan, G.: Automated categorisation of Web resources : a profile of selected projects, research, products, and services (1996) 0.01
    0.014605265 = product of:
      0.02921053 = sum of:
        0.02921053 = product of:
          0.05842106 = sum of:
            0.05842106 = weight(_text_:classification in 2533) [ClassicSimilarity], result of:
              0.05842106 = score(doc=2533,freq=2.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.35186368 = fieldWeight in 2533, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2533)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Profiles several representative current efforts that apply established as well as more innovative methods of automated classification, organization or other method of categorisation of WWW resources
  8. Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.01
    0.014605265 = product of:
      0.02921053 = sum of:
        0.02921053 = product of:
          0.05842106 = sum of:
            0.05842106 = weight(_text_:classification in 3627) [ClassicSimilarity], result of:
              0.05842106 = score(doc=3627,freq=8.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.35186368 = fieldWeight in 3627, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3627)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).
  9. Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.01
    0.014458476 = product of:
      0.028916951 = sum of:
        0.028916951 = product of:
          0.057833903 = sum of:
            0.057833903 = weight(_text_:classification in 7209) [ClassicSimilarity], result of:
              0.057833903 = score(doc=7209,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.34832728 = fieldWeight in 7209, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7209)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The Nordic WAIS/WWW project sponsored by NORDINFO is a joint project between Lund University Library and the National Technological Library of Denmark. It aims to improve the existing networked information discovery and retrieval tools Wide Area Information System (WAIS) and World Wide Web (WWW), and to move towards unifying WWW and WAIS. Details current results focusing on the WAIS side of the project. Describes research into automatic indexing and classification of WAIS sources, development of an orientation tool for WAIS, and development of a WAIS index of WWW resources
  10. Short, M.: Text mining and subject analysis for fiction; or, using machine learning and information extraction to assign subject headings to dime novels (2019) 0.01
    0.014458476 = product of:
      0.028916951 = sum of:
        0.028916951 = product of:
          0.057833903 = sum of:
            0.057833903 = weight(_text_:classification in 5481) [ClassicSimilarity], result of:
              0.057833903 = score(doc=5481,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.34832728 = fieldWeight in 5481, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5481)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows.
    Source
    Cataloging and classification quarterly. 57(2019) no.5, S.315-336
  11. Chou, C.; Chu, T.: ¬An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.01
    0.014458476 = product of:
      0.028916951 = sum of:
        0.028916951 = product of:
          0.057833903 = sum of:
            0.057833903 = weight(_text_:classification in 1139) [ClassicSimilarity], result of:
              0.057833903 = score(doc=1139,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.34832728 = fieldWeight in 1139, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1139)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In light of AI (Artificial Intelligence) and NLP (Natural language processing) technologies, this article examines the feasibility of using AI/NLP models to enhance the subject indexing of digital resources. While BERT (Bidirectional Encoder Representations from Transformers) models are widely used in scholarly communities, the authors assess whether BERT models can be used in machine-assisted indexing in the Project Gutenberg collection, through suggesting Library of Congress subject headings filtered by certain Library of Congress Classification subclass labels. The findings of this study are informative for further research on BERT models to assist with automatic subject indexing for digital library collections.
    Source
    Cataloging and classification quarterly. 60(2022) no.8, p.807-835
  12. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.01
    0.014127021 = product of:
      0.028254041 = sum of:
        0.028254041 = product of:
          0.056508083 = sum of:
            0.056508083 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
              0.056508083 = score(doc=4709,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.30952093 = fieldWeight in 4709, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4709)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  13. Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01
    0.014127021 = product of:
      0.028254041 = sum of:
        0.028254041 = product of:
          0.056508083 = sum of:
            0.056508083 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.056508083 = score(doc=6752,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    6. 3.1997 16:22:15
  14. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.01
    0.014127021 = product of:
      0.028254041 = sum of:
        0.028254041 = product of:
          0.056508083 = sum of:
            0.056508083 = weight(_text_:22 in 3581) [ClassicSimilarity], result of:
              0.056508083 = score(doc=3581,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.30952093 = fieldWeight in 3581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3581)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    24. 3.2006 12:22:02
  15. Probst, M.; Mittelbach, J.: Maschinelle Indexierung in der Sacherschließung wissenschaftlicher Bibliotheken (2006) 0.01
    0.014127021 = product of:
      0.028254041 = sum of:
        0.028254041 = product of:
          0.056508083 = sum of:
            0.056508083 = weight(_text_:22 in 1755) [ClassicSimilarity], result of:
              0.056508083 = score(doc=1755,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.30952093 = fieldWeight in 1755, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1755)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2008 12:35:19
  16. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.01
    0.014127021 = product of:
      0.028254041 = sum of:
        0.028254041 = product of:
          0.056508083 = sum of:
            0.056508083 = weight(_text_:22 in 401) [ClassicSimilarity], result of:
              0.056508083 = score(doc=401,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.30952093 = fieldWeight in 401, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=401)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    11. 9.2012 19:43:22
  17. Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C.: On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages (2015) 0.01
    0.012648531 = product of:
      0.025297062 = sum of:
        0.025297062 = product of:
          0.050594125 = sum of:
            0.050594125 = weight(_text_:classification in 2161) [ClassicSimilarity], result of:
              0.050594125 = score(doc=2161,freq=6.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.3047229 = fieldWeight in 2161, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2161)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focused on determining whether the content of a text is objective or subjective, and in the latter case, if it conveys a positive or a negative opinion. Most polarity detection techniques tend to take into account individual terms in the text and even some degree of linguistic knowledge, but they do not usually consider syntactic relations between words. This article explores how relating lexical, syntactic, and psychometric information can be helpful to perform polarity classification on Spanish tweets. We provide an evaluation for both shallow and deep linguistic perspectives. Empirical results show an improved performance of syntactic approaches over pure lexical models when using large training sets to create a classifier, but this tendency is reversed when small training collections are used.
  18. Kanan, T.; Fox, E.A.: Automated arabic text classification with P-Stemmer, machine learning, and a tailored news article taxonomy (2016) 0.01
    0.012648531 = product of:
      0.025297062 = sum of:
        0.025297062 = product of:
          0.050594125 = sum of:
            0.050594125 = weight(_text_:classification in 3151) [ClassicSimilarity], result of:
              0.050594125 = score(doc=3151,freq=6.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.3047229 = fieldWeight in 3151, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3151)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Arabic news articles in electronic collections are difficult to study. Browsing by category is rarely supported. Although helpful machine-learning methods have been applied successfully to similar situations for English news articles, limited research has been completed to yield suitable solutions for Arabic news. In connection with a Qatar National Research Fund (QNRF)-funded project to build digital library community and infrastructure in Qatar, we developed software for browsing a collection of about 237,000 Arabic news articles, which should be applicable to other Arabic news collections. We designed a simple taxonomy for Arabic news stories that is suitable for the needs of Qatar and other nations, is compatible with the subject codes of the International Press Telecommunications Council, and was enhanced with the aid of a librarian expert as well as five Arabic-speaking volunteers. We developed tailored stemming (i.e., a new Arabic light stemmer called P-Stemmer) and automatic classification methods (the best being binary Support Vector Machines classifiers) to work with the taxonomy. Using evaluation techniques commonly used in the information retrieval community, including 10-fold cross-validation and the Wilcoxon signed-rank test, we showed that our approach to stemming and classification is superior to state-of-the-art techniques.
  19. Golub, K.: Automatic subject indexing of text (2019) 0.01
    0.012648531 = product of:
      0.025297062 = sum of:
        0.025297062 = product of:
          0.050594125 = sum of:
            0.050594125 = weight(_text_:classification in 5268) [ClassicSimilarity], result of:
              0.050594125 = score(doc=5268,freq=6.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.3047229 = fieldWeight in 5268, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5268)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Automatic subject indexing addresses problems of scale and sustainability and can be at the same time used to enrich existing metadata records, establish more connections across and between resources from various metadata and resource collec-tions, and enhance consistency of the metadata. In this work, au-tomatic subject indexing focuses on assigning index terms or classes from established knowledge organization systems (KOSs) for subject indexing like thesauri, subject headings systems and classification systems. The following major approaches are dis-cussed, in terms of their similarities and differences, advantages and disadvantages for automatic assigned indexing from KOSs: "text categorization," "document clustering," and "document classification." Text categorization is perhaps the most wide-spread, machine-learning approach with what seems generally good reported performance. Document clustering automatically both creates groups of related documents and extracts names of subjects depicting the group at hand. Document classification re-uses the intellectual effort invested into creating a KOS for sub-ject indexing and even simple string-matching algorithms have been reported to achieve good results, because one concept can be described using a number of different terms, including equiv-alent, related, narrower and broader terms. Finally, applicability of automatic subject indexing to operative information systems and challenges of evaluation are outlined, suggesting the need for more research.
  20. Zhitomirsky-Geffet, M.; Prebor, G.; Bloch, O.: Improving proverb search and retrieval with a generic multidimensional ontology (2017) 0.01
    0.012392979 = product of:
      0.024785958 = sum of:
        0.024785958 = product of:
          0.049571916 = sum of:
            0.049571916 = weight(_text_:classification in 3320) [ClassicSimilarity], result of:
              0.049571916 = score(doc=3320,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.29856625 = fieldWeight in 3320, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3320)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The goal of this research is to develop a generic ontological model for proverbs that unifies potential classification criteria and various characteristics of proverbs to enable their effective retrieval and large-scale analysis. Because proverbs can be described and indexed by multiple characteristics and criteria, we built a multidimensional ontology suitable for proverb classification. To evaluate the effectiveness of the constructed ontology for improving search and retrieval of proverbs, a large-scale user experiment was arranged with 70 users who were asked to search a proverb repository using ontology-based and free-text search interfaces. The comparative analysis of the results shows that the use of this ontology helped to substantially improve the search recall, precision, user satisfaction, and efficiency and to minimize user effort during the search process. A practical contribution of this work is an automated web-based proverb search and retrieval system which incorporates the proposed ontological scheme and an initial corpus of ontology-based annotated proverbs.

Languages

  • e 67
  • d 18
  • ru 1
  • More… Less…

Types

  • a 78
  • el 7
  • m 2
  • s 2
  • x 2
  • p 1
  • More… Less…