Search (27 results, page 1 of 2)

  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  • × type_ss:"a"
  • × year_i:[2010 TO 2020}
  1. Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.04
    0.035463274 = product of:
      0.12412146 = sum of:
        0.03761707 = weight(_text_:classification in 2299) [ClassicSimilarity], result of:
          0.03761707 = score(doc=2299,freq=10.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.39339557 = fieldWeight in 2299, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
        0.0237488 = product of:
          0.0474976 = sum of:
            0.0474976 = weight(_text_:schemes in 2299) [ClassicSimilarity], result of:
              0.0474976 = score(doc=2299,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.2956176 = fieldWeight in 2299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2299)
          0.5 = coord(1/2)
        0.02513852 = weight(_text_:bibliographic in 2299) [ClassicSimilarity], result of:
          0.02513852 = score(doc=2299,freq=2.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.21506234 = fieldWeight in 2299, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
        0.03761707 = weight(_text_:classification in 2299) [ClassicSimilarity], result of:
          0.03761707 = score(doc=2299,freq=10.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.39339557 = fieldWeight in 2299, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
      0.2857143 = coord(4/14)
    
    Abstract
    Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was identified already by Hugh of Saint Victor (1096?-1141). Still with present-time bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To face these limitations we have developed SciGator, a browsable interface to explore the collections of all scientific libraries at the University of Pavia. Besides showing subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by using a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in such disciplinary classifications as Dewey (e.g. architecture existentially depends on building), or between phenomena in such phenomenon-based classifications as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some details of it.
    Source
    Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
  2. Wang, Z.; Khoo, C.S.G.; Chaudhry, A.S.: Evaluation of the navigation effectiveness of an organizational taxonomy built on a general classification scheme and domain thesauri (2014) 0.02
    0.021092122 = product of:
      0.0984299 = sum of:
        0.03496567 = weight(_text_:classification in 1251) [ClassicSimilarity], result of:
          0.03496567 = score(doc=1251,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 1251, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=1251)
        0.02849856 = product of:
          0.05699712 = sum of:
            0.05699712 = weight(_text_:schemes in 1251) [ClassicSimilarity], result of:
              0.05699712 = score(doc=1251,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.35474116 = fieldWeight in 1251, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1251)
          0.5 = coord(1/2)
        0.03496567 = weight(_text_:classification in 1251) [ClassicSimilarity], result of:
          0.03496567 = score(doc=1251,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 1251, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=1251)
      0.21428572 = coord(3/14)
    
    Abstract
    This paper presents an evaluation study of the navigation effectiveness of a multifaceted organizational taxonomy that was built on the Dewey Decimal Classification and several domain thesauri in the area of library and information science education. The objective of the evaluation was to detect deficiencies in the taxonomy and to infer problems of applied construction steps from users' navigation difficulties. The evaluation approach included scenario-based navigation exercises and postexercise interviews. Navigation exercise errors and underlying reasons were analyzed in relation to specific components of the taxonomy and applied construction steps. Guidelines for the construction of the hierarchical structure and categories of an organizational taxonomy using existing general classification schemes and domain thesauri were derived from the evaluation results.
  3. Gnoli, C.; Pusterla, L.; Bendiscioli, A.; Recinella, C.: Classification for collections mapping and query expansion (2016) 0.02
    0.021092122 = product of:
      0.0984299 = sum of:
        0.03496567 = weight(_text_:classification in 3102) [ClassicSimilarity], result of:
          0.03496567 = score(doc=3102,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 3102, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=3102)
        0.02849856 = product of:
          0.05699712 = sum of:
            0.05699712 = weight(_text_:schemes in 3102) [ClassicSimilarity], result of:
              0.05699712 = score(doc=3102,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.35474116 = fieldWeight in 3102, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3102)
          0.5 = coord(1/2)
        0.03496567 = weight(_text_:classification in 3102) [ClassicSimilarity], result of:
          0.03496567 = score(doc=3102,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 3102, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=3102)
      0.21428572 = coord(3/14)
    
    Abstract
    Dewey Decimal Classification has been used to organize materials owned by the three scientific libraries at the University of Pavia, and to allow integrated browsing in their union catalogue through SciGator, a home built web-based user interface. Classification acts as a bridge between collections located in different places and shelved according to different local schemes. Furthermore, cross-discipline relationships recorded in the system allow for expanded queries that increase recall. Advantages and possible improvements of such a system are discussed.
  4. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.02
    0.018778171 = product of:
      0.087631464 = sum of:
        0.03364573 = weight(_text_:classification in 3280) [ClassicSimilarity], result of:
          0.03364573 = score(doc=3280,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
        0.03364573 = weight(_text_:classification in 3280) [ClassicSimilarity], result of:
          0.03364573 = score(doc=3280,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
        0.020340007 = product of:
          0.040680014 = sum of:
            0.040680014 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
              0.040680014 = score(doc=3280,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.38690117 = fieldWeight in 3280, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3280)
          0.5 = coord(1/2)
      0.21428572 = coord(3/14)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  5. Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.01
    0.009613066 = product of:
      0.06729146 = sum of:
        0.03364573 = weight(_text_:classification in 5055) [ClassicSimilarity], result of:
          0.03364573 = score(doc=5055,freq=8.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 5055, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5055)
        0.03364573 = weight(_text_:classification in 5055) [ClassicSimilarity], result of:
          0.03364573 = score(doc=5055,freq=8.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 5055, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5055)
      0.14285715 = coord(2/14)
    
    Abstract
    Distant supervision (DS) has the advantage of automatically generating large amounts of labelled training data and has been widely used for relation extraction. However, there are usually many wrong labels in the automatically labelled data in distant supervision (Riedel, Yao, & McCallum, 2010). This paper presents a novel method to reduce the wrong labels. The proposed method uses the semantic Jaccard with word embedding to measure the semantic similarity between the relation phrase in the knowledge base and the dependency phrases between two entities in a sentence to filter the wrong labels. In the process of reducing wrong labels, the semantic Jaccard algorithm selects a core dependency phrase to represent the candidate relation in a sentence, which can capture features for relation classification and avoid the negative impact from irrelevant term sequences that previous neural network models of relation extraction often suffer. In the process of relation classification, the core dependency phrases are also used as the input of a convolutional neural network (CNN) for relation classification. The experimental results show that compared with the methods using original DS data, the methods using filtered DS data performed much better in relation extraction. It indicates that the semantic similarity based method is effective in reducing wrong labels. The relation extraction performance of the CNN model using the core dependency phrases as input is the best of all, which indicates that using the core dependency phrases as input of CNN is enough to capture the features for relation classification and could avoid negative impact from irrelevant terms.
  6. Green, R.: See-also relationships in the Dewey Decimal Classification (2011) 0.01
    0.009516451 = product of:
      0.06661515 = sum of:
        0.033307575 = weight(_text_:classification in 4615) [ClassicSimilarity], result of:
          0.033307575 = score(doc=4615,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 4615, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4615)
        0.033307575 = weight(_text_:classification in 4615) [ClassicSimilarity], result of:
          0.033307575 = score(doc=4615,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 4615, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4615)
      0.14285715 = coord(2/14)
    
    Abstract
    This paper investigates the semantics of topical, associative see-also relationships in schedule and table entries of the Dewey Decimal Classification (DDC) system. Based on the see-also relationships in a random sample of 100 classes containing one or more of these relationships, a semi-structured inventory of sources of see-also relationships is generated, of which the most important are lexical similarity, complementarity, facet difference, and relational configuration difference. The premise that see-also relationships based on lexical similarity may be language-specific is briefly examined. The paper concludes with recommendations on the continued use of see-also relationships in the DDC.
  7. Moreira, W.; Martínez-Ávila, D.: Concept relationships in knowledge organization systems : elements for analysis and common research among fields (2018) 0.01
    0.009516451 = product of:
      0.06661515 = sum of:
        0.033307575 = weight(_text_:classification in 5166) [ClassicSimilarity], result of:
          0.033307575 = score(doc=5166,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 5166, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5166)
        0.033307575 = weight(_text_:classification in 5166) [ClassicSimilarity], result of:
          0.033307575 = score(doc=5166,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 5166, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5166)
      0.14285715 = coord(2/14)
    
    Abstract
    Knowledge organization systems have been studied in several fields and for different and complementary aspects. Among the aspects that concentrate common interests, in this article we highlight those related to the terminological and conceptual relationships among the components of any knowledge organization system. This research aims to contribute to the critical analysis of knowledge organization systems, especially ontologies, thesauri, and classification systems, by the comprehension of its similarities and differences when dealing with concepts and their ways of relating to each other as well as to the conceptual design that is adopted.
    Source
    Cataloging and classification quarterly. 56(2018) no.1, S.19-39
  8. Salaba, A.; Zeng, M.L.: Extending the "Explore" user task beyond subject authority data into the linked data sphere (2014) 0.01
    0.009384072 = product of:
      0.065688506 = sum of:
        0.0514505 = weight(_text_:subject in 1465) [ClassicSimilarity], result of:
          0.0514505 = score(doc=1465,freq=6.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.4791082 = fieldWeight in 1465, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1465)
        0.014238005 = product of:
          0.02847601 = sum of:
            0.02847601 = weight(_text_:22 in 1465) [ClassicSimilarity], result of:
              0.02847601 = score(doc=1465,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.2708308 = fieldWeight in 1465, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1465)
          0.5 = coord(1/2)
      0.14285715 = coord(2/14)
    
    Abstract
    "Explore" is a user task introduced in the Functional Requirements for Subject Authority Data (FRSAD) final report. Through various case scenarios, the authors discuss how structured data, presented based on Linked Data principles and using knowledge organisation systems (KOS) as the backbone, extend the explore task within and beyond subject authority data.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  9. Zeng, M.L.; Gracy, K.F.; Zumer, M.: Using a semantic analysis tool to generate subject access points : a study using Panofsky's theory and two research samples (2014) 0.01
    0.009018113 = product of:
      0.06312679 = sum of:
        0.05092278 = weight(_text_:subject in 1464) [ClassicSimilarity], result of:
          0.05092278 = score(doc=1464,freq=8.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.4741941 = fieldWeight in 1464, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.046875 = fieldNorm(doc=1464)
        0.0122040035 = product of:
          0.024408007 = sum of:
            0.024408007 = weight(_text_:22 in 1464) [ClassicSimilarity], result of:
              0.024408007 = score(doc=1464,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.23214069 = fieldWeight in 1464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1464)
          0.5 = coord(1/2)
      0.14285715 = coord(2/14)
    
    Abstract
    This paper attempts to explore an approach of using an automatic semantic analysis tool to enhance the "subject" access to materials that are not included in the usual library subject cataloging process. Using two research samples the authors analyzed the access points supplied by OpenCalais, a semantic analysis tool. As an aid in understanding how computerized subject analysis might be approached, this paper suggests using the three-layer framework that has been accepted and applied in image analysis, developed by Erwin Panofsky.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  10. Huang, L.; Milne, D.; Frank, E.; Witten, I.H.: Learning a concept-based document similarity measure (2012) 0.01
    0.008156957 = product of:
      0.057098698 = sum of:
        0.028549349 = weight(_text_:classification in 372) [ClassicSimilarity], result of:
          0.028549349 = score(doc=372,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 372, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=372)
        0.028549349 = weight(_text_:classification in 372) [ClassicSimilarity], result of:
          0.028549349 = score(doc=372,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 372, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=372)
      0.14285715 = coord(2/14)
    
    Abstract
    Document similarity measures are crucial components of many text-analysis tasks, including information retrieval, document classification, and document clustering. Conventional measures are brittle: They estimate the surface overlap between documents based on the words they mention and ignore deeper semantic connections. We propose a new measure that assesses similarity at both the lexical and semantic levels, and learns from human judgments how to combine them by using machine-learning techniques. Experiments show that the new measure produces values for documents that are more consistent with people's judgments than people are with each other. We also use it to classify and cluster large document sets covering different genres and topics, and find that it improves both classification and clustering performance.
  11. Gábor, K.; Zargayouna, H.; Tellier, I.; Buscaldi, D.; Charnois, T.: ¬A typology of semantic relations dedicated to scientific literature analysis (2016) 0.01
    0.0067291465 = product of:
      0.047104023 = sum of:
        0.023552012 = weight(_text_:classification in 2933) [ClassicSimilarity], result of:
          0.023552012 = score(doc=2933,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.24630459 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
        0.023552012 = weight(_text_:classification in 2933) [ClassicSimilarity], result of:
          0.023552012 = score(doc=2933,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.24630459 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
      0.14285715 = coord(2/14)
    
    Abstract
    We propose a method for improving access to scientific literature by analyzing the content of research papers beyond citation links and topic tracking. Our model relies on a typology of explicit semantic relations. These relations are instantiated in the abstract/introduction part of the papers and can be identified automatically using textual data and external ontologies. Preliminary results show a promising precision in unsupervised relationship classification.
  12. Zhang, W.; Yoshida, T.; Tang, X.: ¬A comparative study of TF*IDF, LSI and multi-words for text classification (2011) 0.01
    0.00576784 = product of:
      0.04037488 = sum of:
        0.02018744 = weight(_text_:classification in 1165) [ClassicSimilarity], result of:
          0.02018744 = score(doc=1165,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 1165, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=1165)
        0.02018744 = weight(_text_:classification in 1165) [ClassicSimilarity], result of:
          0.02018744 = score(doc=1165,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 1165, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=1165)
      0.14285715 = coord(2/14)
    
  13. Blanco, R.; Matthews, M.; Mika, P.: Ranking of daily deals with concept expansion (2015) 0.01
    0.00576784 = product of:
      0.04037488 = sum of:
        0.02018744 = weight(_text_:classification in 2663) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2663,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2663, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2663)
        0.02018744 = weight(_text_:classification in 2663) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2663,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2663, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2663)
      0.14285715 = coord(2/14)
    
    Abstract
    Daily deals have emerged in the last three years as a successful form of online advertising. The downside of this success is that users are increasingly overloaded by the many thousands of deals offered each day by dozens of deal providers and aggregators. The challenge is thus offering the right deals to the right users i.e., the relevance ranking of deals. This is the problem we address in our paper. Exploiting the characteristics of deals data, we propose a combination of a term- and a concept-based retrieval model that closes the semantic gap between queries and documents expanding both of them with category information. The method consistently outperforms state-of-the-art methods based on term-matching alone and existing approaches for ad classification and ranking.
  14. Jindal, V.; Bawa, S.; Batra, S.: ¬A review of ranking approaches for semantic search on Web (2014) 0.01
    0.00576784 = product of:
      0.04037488 = sum of:
        0.02018744 = weight(_text_:classification in 2799) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2799,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2799, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2799)
        0.02018744 = weight(_text_:classification in 2799) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2799,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2799, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2799)
      0.14285715 = coord(2/14)
    
    Abstract
    With ever increasing information being available to the end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, it is very common that even most renowned search engines return result sets with not so useful pages to the user. Research on semantic search aims to improve traditional information search and retrieval methods where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different relevancy ranking approaches based on semantics which are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes have been investigated based on methodologies adopted and their most distinctive characteristics towards ranking. An overview of selected approaches and their comparison by means of the classification criteria has been presented. With the help of this comparison, some common concepts and outstanding features have been identified.
  15. Oh, K.E.; Joo, S.; Jeong, E.-J.: Online consumer health information organization : users' perspectives on faceted navigation (2015) 0.00
    0.004806533 = product of:
      0.03364573 = sum of:
        0.016822865 = weight(_text_:classification in 2197) [ClassicSimilarity], result of:
          0.016822865 = score(doc=2197,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 2197, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2197)
        0.016822865 = weight(_text_:classification in 2197) [ClassicSimilarity], result of:
          0.016822865 = score(doc=2197,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 2197, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2197)
      0.14285715 = coord(2/14)
    
    Abstract
    We investigate facets of online health information that are preferred, easy-to-use and useful in accessing online consumer health information from a user's perspective. In this study, the existing classification structure of 20 top ranked consumer health information websites in South Korea were analyzed, and nine facets that are used in organizing health information in those websites were identified. Based on the identified facets, an online survey, which asked participants' preferences for as well as perceived ease-of-use and usefulness of each facet in accessing online health information, was conducted. The analysis of the survey results showed that among the nine facets, the "diseases & conditions" and "body part" facets were most preferred, and perceived as easy-to-use and useful in accessing online health information. In contrast, "age," "gender," and "alternative medicine" facets were perceived as relatively less preferred, easy-to-use and useful. This research study has direct implications for organization and design of health information websites in that it suggests facets to include and avoid in organizing and providing access points to online health information.
  16. Buccio, E. Di; Melucci, M.; Moro, F.: Detecting verbose queries and improving information retrieval (2014) 0.00
    0.004806533 = product of:
      0.03364573 = sum of:
        0.016822865 = weight(_text_:classification in 2695) [ClassicSimilarity], result of:
          0.016822865 = score(doc=2695,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 2695, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2695)
        0.016822865 = weight(_text_:classification in 2695) [ClassicSimilarity], result of:
          0.016822865 = score(doc=2695,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 2695, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2695)
      0.14285715 = coord(2/14)
    
    Abstract
    Although most of the queries submitted to search engines are composed of a few keywords and have a length that ranges from three to six words, more than 15% of the total volume of the queries are verbose, introduce ambiguity and cause topic drifts. We consider verbosity a different property of queries from length since a verbose query is not necessarily long, it might be succinct and a short query might be verbose. This paper proposes a methodology to automatically detect verbose queries and conditionally modify queries. The methodology proposed in this paper exploits state-of-the-art classification algorithms, combines concepts from a large linguistic database and uses a topic gisting algorithm we designed for verbose query modification purposes. Our experimental results have been obtained using the TREC Robust track collection, thirty topics classified by difficulty degree, four queries per topic classified by verbosity and length, and human assessment of query verbosity. Our results suggest that the methodology for query modification conditioned to query verbosity detection and topic gisting is significantly effective and that query modification should be refined when topic difficulty and query verbosity are considered since these two properties interact and query verbosity is not straightforwardly related to query length.
  17. Athukorala, K.; Glowacka, D.; Jacucci, G.; Oulasvirta, A.; Vreeken, J.: Is exploratory search different? : a comparison of information search behavior for exploratory and lookup tasks (2016) 0.00
    0.004806533 = product of:
      0.03364573 = sum of:
        0.016822865 = weight(_text_:classification in 3150) [ClassicSimilarity], result of:
          0.016822865 = score(doc=3150,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 3150, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3150)
        0.016822865 = weight(_text_:classification in 3150) [ClassicSimilarity], result of:
          0.016822865 = score(doc=3150,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 3150, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3150)
      0.14285715 = coord(2/14)
    
    Abstract
    Exploratory search is an increasingly important activity yet challenging for users. Although there exists an ample amount of research into understanding exploration, most of the major information retrieval (IR) systems do not provide tailored and adaptive support for such tasks. One reason is the lack of empirical knowledge on how to distinguish exploratory and lookup search behaviors in IR systems. The goal of this article is to investigate how to separate the 2 types of tasks in an IR system using easily measurable behaviors. In this article, we first review characteristics of exploratory search behavior. We then report on a controlled study of 6 search tasks with 3 exploratory-comparison, knowledge acquisition, planning-and 3 lookup tasks-fact-finding, navigational, question answering. The results are encouraging, showing that IR systems can distinguish the 2 search categories in the course of a search session. The most distinctive indicators that characterize exploratory search behaviors are query length, maximum scroll depth, and task completion time. However, 2 tasks are borderline and exhibit mixed characteristics. We assess the applicability of this finding by reporting on several classification experiments. Our results have valuable implications for designing tailored and adaptive IR systems.
  18. Olmos, R.; Jorge-Botana, G.; Luzón, J.M.; Martín-Cordero, J.I.; León, J.A.: Transforming LSA space dimensions into a rubric for an automatic assessment and feedback system (2016) 0.00
    0.003560864 = product of:
      0.04985209 = sum of:
        0.04985209 = product of:
          0.09970418 = sum of:
            0.09970418 = weight(_text_:texts in 2878) [ClassicSimilarity], result of:
              0.09970418 = score(doc=2878,freq=8.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.605712 = fieldWeight in 2878, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2878)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    The purpose of this article is to validate, through two empirical studies, a new method for automatic evaluation of written texts, called Inbuilt Rubric, based on the Latent Semantic Analysis (LSA) technique, which constitutes an innovative and distinct turn with respect to LSA application so far. In the first empirical study, evidence of the validity of the method to identify and evaluate the conceptual axes of a text in a sample of 78 summaries by secondary school students is sought. Results show that the proposed method has a significantly higher degree of reliability than classic LSA methods of text evaluation, and displays very high sensitivity to identify which conceptual axes are included or not in each summary. A second study evaluates the method's capacity to interact and provide feedback about quality in a real online system on a sample of 924 discursive texts written by university students. Results show that students improved the quality of their written texts using this system, and also rated the experience very highly. The final conclusion is that this new method opens a very interesting way regarding the role of automatic assessors in the identification of presence/absence and quality of elaboration of relevant conceptual information in texts written by students with lower time costs than the usual LSA-based methods.
  19. Atanassova, I.; Bertin, M.: Semantic facets for scientific information retrieval (2014) 0.00
    0.0024926048 = product of:
      0.034896467 = sum of:
        0.034896467 = product of:
          0.069792934 = sum of:
            0.069792934 = weight(_text_:texts in 4471) [ClassicSimilarity], result of:
              0.069792934 = score(doc=4471,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.42399842 = fieldWeight in 4471, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4471)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    We present an Information Retrieval System for scientific publications that provides the possibility to filter results according to semantic facets. We use sentence-level semantic annotations that identify specific semantic relations in texts, such as methods, definitions, hypotheses, that correspond to common information needs related to scientific literature. The semantic annotations are obtained using a rule-based method that identifies linguistic clues organized into a linguistic ontology. The system is implemented using Solr Search Server and offers efficient search and navigation in scientific papers.
  20. Colace, F.; Santo, M. De; Greco, L.; Napoletano, P.: Weighted word pairs for query expansion (2015) 0.00
    0.00237488 = product of:
      0.03324832 = sum of:
        0.03324832 = product of:
          0.06649664 = sum of:
            0.06649664 = weight(_text_:schemes in 2687) [ClassicSimilarity], result of:
              0.06649664 = score(doc=2687,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.41386467 = fieldWeight in 2687, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2687)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    This paper proposes a novel query expansion method to improve accuracy of text retrieval systems. Our method makes use of a minimal relevance feedback to expand the initial query with a structured representation composed of weighted pairs of words. Such a structure is obtained from the relevance feedback through a method for pairs of words selection based on the Probabilistic Topic Model. We compared our method with other baseline query expansion schemes and methods. Evaluations performed on TREC-8 demonstrated the effectiveness of the proposed method with respect to the baseline.