Search (65 results, page 1 of 4)

  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  • × type_ss:"a"
  • × year_i:[2010 TO 2020}
  1. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.03
    0.0318287 = product of:
      0.0636574 = sum of:
        0.008101207 = weight(_text_:information in 1343) [ClassicSimilarity], result of:
          0.008101207 = score(doc=1343,freq=2.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.09697737 = fieldWeight in 1343, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1343)
        0.055556197 = sum of:
          0.02331961 = weight(_text_:technology in 1343) [ClassicSimilarity], result of:
            0.02331961 = score(doc=1343,freq=2.0), product of:
              0.1417311 = queryWeight, product of:
                2.978387 = idf(docFreq=6114, maxDocs=44218)
                0.047586527 = queryNorm
              0.16453418 = fieldWeight in 1343, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.978387 = idf(docFreq=6114, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1343)
          0.032236587 = weight(_text_:22 in 1343) [ClassicSimilarity], result of:
            0.032236587 = score(doc=1343,freq=2.0), product of:
              0.16663991 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.047586527 = queryNorm
              0.19345059 = fieldWeight in 1343, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1343)
      0.5 = coord(2/4)
    
    Date
    22. 8.2014 17:07:50
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.9, S.1870-1883
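    The indented breakdown under result 1 is Lucene ClassicSimilarity "explain" output: each weight(...) leg multiplies a query-side weight (idf * queryNorm) by a document-side fieldWeight (tf * idf * fieldNorm), the legs are summed, and coord(2/4) scales the sum by the fraction of query terms matched; the two-decimal figure after each entry heading is the rounded composite score. A minimal sketch that reproduces the "information" leg of that breakdown (the function is illustrative, not Lucene's API):

        from math import log, sqrt

        def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
            # ClassicSimilarity idf: log(maxDocs / (docFreq + 1)) + 1
            idf = log(max_docs / (doc_freq + 1)) + 1.0    # 1.7554779
            tf = sqrt(freq)                               # 1.4142135 for freq=2.0
            query_weight = idf * query_norm               # 0.083537094
            field_weight = tf * idf * field_norm          # 0.09697737
            return query_weight * field_weight

        print(term_score(freq=2.0, doc_freq=20772, max_docs=44218,
                         query_norm=0.047586527, field_norm=0.0390625))
        # -> 0.008101207..., the first leg in the breakdown above
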
  2. Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.02
    Series
    Communications in computer and information science; 672
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  3. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.02
    Series
    Communications in computer and information science; 672
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  4. Pontis, S.; Kefalidou, G.; Blandford, A.; Forth, J.; Makri, S.; Sharples, S.; Wiggins, G.; Woods, M.: Academics' responses to encountered information : context matters (2016) 0.02
    Abstract
An increasing number of tools are being developed to help academics interact with information, but little is known about the benefits of those tools for their users. This study evaluated academics' receptiveness to information proposed by a mobile app, the SerenA Notebook: information that is based on their inferred interests but does not relate directly to a prior recognized need. The app aimed to create the experience of serendipitous encounters: generating ideas and inspiring thoughts, and potentially triggering follow-up actions, by providing users with suggestions related to their work and leisure interests. We studied how 20 academics interacted with messages sent by the mobile app (3 per day over 10 consecutive days). The collected data were analyzed using thematic analysis. We found that contextual factors (location, activity, and focus) strongly influenced responses to messages. Academics described some unsolicited information as interesting but irrelevant when they could not make immediate use of it. They highlighted filtering information, rather than finding it, as their major struggle. Some messages that were positively received acted as reminders of activities participants were meant to be doing but were postponing, or were relevant to activities ongoing at the time the information was received.
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.8, S.1883-1903
  5. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.01
    Abstract
A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations, inferred when two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that captures syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval, it demonstrates significant improvements in retrieval effectiveness over a strong baseline system based on a commercial search engine.
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.8, S.1577-1596
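    The abstract above distinguishes two association types from structural linguistics. As a toy illustration (not the authors' corpus-based model of word meaning): syntagmatic associates co-occur within the same contexts, while paradigmatic associates share contexts and are substitutable for one another.

        from collections import Counter

        def cooccurrence_profiles(sentences, window=3):
            # Syntagmatic evidence: count the terms appearing near each term.
            profiles = {}
            for tokens in sentences:
                for i, term in enumerate(tokens):
                    context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                    profiles.setdefault(term, Counter()).update(context)
            return profiles

        def paradigmatic_similarity(profiles, a, b):
            # Paradigmatic evidence: overlap of two terms' co-occurrence profiles.
            pa, pb = profiles.get(a, Counter()), profiles.get(b, Counter())
            shared = sum(min(pa[w], pb[w]) for w in pa.keys() & pb.keys())
            total = sum(pa.values()) + sum(pb.values())
            return 2.0 * shared / total if total else 0.0

    Expansion candidates would then be terms that score highly on either kind of association with the query terms; the article combines both signals.
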
  6. Adhikari, A.; Dutta, B.; Dutta, A.; Mondal, D.; Singh, S.: ¬An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology (2018) 0.01
    Abstract
    Finding similarity between concepts based on semantics has become a new trend in many applications (e.g., biomedical informatics, natural language processing). Measuring the Semantic Similarity (SS) with higher accuracy is a challenging task. In this context, the Information Content (IC)-based SS measure has gained popularity over the others. The notion of IC evolves from the science of information theory. Information theory has very high potential to characterize the semantics of concepts. Designing an IC-based SS framework comprises (i) an IC calculator, and (ii) an SS calculator. In this article, we propose a generic intrinsic IC-based SS calculator. We also introduce here a new structural aspect of an ontology called DCS (Disjoint Common Subsumers) that plays a significant role in deciding the similarity between two concepts. We evaluated our proposed similarity calculator with the existing intrinsic IC-based similarity calculators, as well as corpora-dependent similarity calculators using several benchmark data sets. The experimental results show that the proposed similarity calculator produces a high correlation with human evaluation over the existing state-of-the-art IC-based similarity calculators.
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.8, S.1023-1034
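    A sketch of the intrinsic-IC family this entry extends, using a Seco-style IC (computed from taxonomy structure alone) and a Resnik-style similarity over common subsumers; the toy taxonomy is invented, and plain common subsumers stand in for the authors' disjoint common subsumers (DCS):

        from math import log

        # toy is-a hierarchy: concept -> direct sub-concepts (invented for illustration)
        TAXONOMY = {"entity": {"animal", "artifact"}, "animal": {"dog", "cat"},
                    "artifact": {"car"}, "dog": set(), "cat": set(), "car": set()}

        def hyponyms(c):
            out = set()
            for child in TAXONOMY[c]:
                out |= {child} | hyponyms(child)
            return out

        def intrinsic_ic(c):
            # Seco-style intrinsic IC: leaves are maximally informative, the root is not.
            return 1.0 - log(len(hyponyms(c)) + 1) / log(len(TAXONOMY))

        def ancestors(c):
            return {a for a in TAXONOMY if c in hyponyms(a)}

        def resnik_sim(c1, c2):
            # IC of the most informative common subsumer of the two concepts.
            return max((intrinsic_ic(a) for a in ancestors(c1) & ancestors(c2)), default=0.0)

        print(resnik_sim("dog", "cat"))   # IC("animal") = 1 - log(3)/log(7) = 0.435...
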
  7. Brunetti, J.M.; García, R.: User-centered design and evaluation of overview components for semantic data exploration (2014) 0.01
    Abstract
Purpose - The growing volume of semantic data available on the web creates a need to handle the information overload phenomenon. The potential of this data is enormous, but in most cases it is very difficult for users to visualize, explore and use, especially for lay-users without experience with Semantic Web technologies. The paper aims to discuss these issues. Design/methodology/approach - The Visual Information-Seeking Mantra "Overview first, zoom and filter, then details-on-demand" proposed by Shneiderman describes how data should be presented in different stages to achieve an effective exploration. The overview is the first user task when dealing with a data set; its objective is that the user becomes capable of getting an idea of the overall structure of the data set. Different information architecture (IA) components supporting overview tasks have been developed so that they are generated automatically from semantic data, and they have been evaluated with end-users. Findings - The chosen IA components are well known to web users, as they are present in most web pages: navigation bars, site maps and site indexes. The authors complement them with Treemaps, a visualization technique for displaying hierarchical data. These components have been developed following an iterative User-Centered Design methodology. Evaluations with end-users have shown that users quickly become accustomed to them even though they are generated automatically from structured data, without requiring knowledge of the underlying semantic technologies, and that the different overview components complement each other as they address different information search needs. Originality/value - Overviews of semantic data sets cannot easily be obtained with current semantic web browsers. Overviews become difficult to achieve with large heterogeneous data sets, which are typical in the Semantic Web, because traditional IA techniques do not easily scale to large data sets. There is little or no support for obtaining overview information quickly and easily at the beginning of the exploration of a new data set. This can be a serious limitation when exploring a data set for the first time, especially for lay-users. The proposal is to reuse and adapt existing IA components to provide this overview to users, and to show that they can be generated automatically from the thesauri and ontologies that structure semantic data while providing a user experience comparable to traditional web sites.
    Date
    20. 1.2015 18:30:22
    Source
    Aslib journal of information management. 66(2014) no.5, S.519-536
  8. Huang, L.; Milne, D.; Frank, E.; Witten, I.H.: Learning a concept-based document similarity measure (2012) 0.01
    Abstract
    Document similarity measures are crucial components of many text-analysis tasks, including information retrieval, document classification, and document clustering. Conventional measures are brittle: They estimate the surface overlap between documents based on the words they mention and ignore deeper semantic connections. We propose a new measure that assesses similarity at both the lexical and semantic levels, and learns from human judgments how to combine them by using machine-learning techniques. Experiments show that the new measure produces values for documents that are more consistent with people's judgments than people are with each other. We also use it to classify and cluster large document sets covering different genres and topics, and find that it improves both classification and clustering performance.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.8, S.1593-1608
  9. Narock, T.; Zhou, L.; Yoon, V.: Semantic similarity of ontology instances using polarity mining (2013) 0.01
    Abstract
    Semantic similarity is vital to many areas, such as information retrieval. Various methods have been proposed with a focus on comparing unstructured text documents. Several of these have been enhanced with ontology; however, they have not been applied to ontology instances. With the growth in ontology instance data published online through, for example, Linked Open Data, there is an increasing need to apply semantic similarity to ontology instances. Drawing on ontology-supported polarity mining (OSPM), we propose an algorithm that enhances the computation of semantic similarity with polarity mining techniques. The algorithm is evaluated with online customer review data. The experimental results show that the proposed algorithm outperforms the baseline algorithm in multiple settings.
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.2, S.416-427
  10. Wang, Z.; Khoo, C.S.G.; Chaudhry, A.S.: Evaluation of the navigation effectiveness of an organizational taxonomy built on a general classification scheme and domain thesauri (2014) 0.01
    Abstract
    This paper presents an evaluation study of the navigation effectiveness of a multifaceted organizational taxonomy that was built on the Dewey Decimal Classification and several domain thesauri in the area of library and information science education. The objective of the evaluation was to detect deficiencies in the taxonomy and to infer problems of applied construction steps from users' navigation difficulties. The evaluation approach included scenario-based navigation exercises and postexercise interviews. Navigation exercise errors and underlying reasons were analyzed in relation to specific components of the taxonomy and applied construction steps. Guidelines for the construction of the hierarchical structure and categories of an organizational taxonomy using existing general classification schemes and domain thesauri were derived from the evaluation results.
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.5, S.948-963
  11. Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.01
    Abstract
Purpose - The purpose of this paper is to improve conceptual search by incorporating structural ontological information such as concepts and relations. Generally, semantic information retrieval aims to identify relevant information based on the meanings of the query terms or on their context; its performance is measured with the standard measures of precision and recall, where higher precision means that the documents retrieved are (meaningfully) relevant and lower recall means poorer coverage of the concepts. Design/methodology/approach - The authors enhance the existing ontology-based indexing proposed by Kohler et al. by incorporating sibling information into the index; the index designed by Kohler et al. contains only super- and sub-concepts from the ontology. In addition, the approach focuses on two tasks, query expansion and ranking of the expanded queries, to improve the efficiency of ontology-based search. Both tasks make use of ontological concepts and the relations existing between those concepts so as to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of the concepts populated in the index. A new measure, the index enhancement measure, is introduced to estimate the coverage of the ontological concepts being indexed. The ontology-based search is evaluated for the tourism domain with tourism documents and a tourism-specific ontology, and search results with and without query expansion are compared to estimate the efficiency of the proposed query expansion task. The ranking is compared with the ORank system to evaluate the performance of the ontology-based search. In these analyses the ontology-based search shows better recall than the other concept-based search systems: its mean average precision is 0.79 and its recall 0.65, against a mean average precision of 0.62 and a recall of 0.51 for the ORank system, and a mean average precision of 0.56 and a recall of 0.42 for the concept-based search. Practical implications - When a concept is not present in the domain-specific ontology, it cannot be indexed; when a given query term is not available in the ontology, term-based results are retrieved. Originality/value - In addition to super- and sub-concepts, concepts at the same level (siblings) are incorporated into the ontological index, as sketched after this entry. The structural information from the ontology is used for query expansion. The ranking of documents depends on the type of the query (single-concept queries, multiple-concept queries and concept-with-relation queries) and on the ontological relations that exist in the query and the documents. With this structural information, the search results show better coverage of concepts with respect to the query.
    Date
    20. 1.2015 18:30:22
    Source
    Aslib journal of information management. 66(2014) no.6, S.678-696
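    A minimal sketch of the enhanced index structure described above, recording super-concepts, sub-concepts and (the paper's addition) same-level siblings for each concept; the tourism mini-ontology is invented for illustration:

        # toy ontology: concept -> direct super-concepts (invented)
        SUPERS = {"hotel": ["accommodation"], "hostel": ["accommodation"],
                  "accommodation": ["tourism"], "museum": ["attraction"],
                  "attraction": ["tourism"], "tourism": []}

        def subs(c):
            return [k for k, sups in SUPERS.items() if c in sups]

        def index_entry(concept):
            # Index a concept with its supers, subs, and same-level siblings.
            supers = SUPERS[concept]
            siblings = sorted({s for sup in supers for s in subs(sup)} - {concept})
            return {"concept": concept, "super": supers,
                    "sub": subs(concept), "sibling": siblings}

        print(index_entry("hotel"))
        # {'concept': 'hotel', 'super': ['accommodation'], 'sub': [], 'sibling': ['hostel']}
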
  12. Xu, B.; Lin, H.; Lin, Y.: Assessment of learning to rank methods for query expansion (2016) 0.01
    Abstract
    Pseudo relevance feedback, as an effective query expansion method, can significantly improve information retrieval performance. However, the method may negatively impact the retrieval performance when some irrelevant terms are used in the expanded query. Therefore, it is necessary to refine the expansion terms. Learning to rank methods have proven effective in information retrieval to solve ranking problems by ranking the most relevant documents at the top of the returned list, but few attempts have been made to employ learning to rank methods for term refinement in pseudo relevance feedback. This article proposes a novel framework to explore the feasibility of using learning to rank to optimize pseudo relevance feedback by means of reranking the candidate expansion terms. We investigate some learning approaches to choose the candidate terms and introduce some state-of-the-art learning to rank methods to refine the expansion terms. In addition, we propose two term labeling strategies and examine the usefulness of various term features to optimize the framework. Experimental results with three TREC collections show that our framework can effectively improve retrieval performance.
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.6, S.1345-1357
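    As a pointwise stand-in for the learning-to-rank reranking of candidate expansion terms described above (the features, labels and values are invented; the article assesses full learning-to-rank methods rather than this simple probabilistic classifier):

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # per-term features: [association with query, idf, frequency in feedback docs]
        X_train = np.array([[0.8, 2.1, 5.0], [0.1, 0.5, 1.0],
                            [0.6, 1.7, 4.0], [0.2, 3.0, 1.0]])
        y_train = np.array([1, 0, 1, 0])   # 1 = adding the term improved retrieval

        ranker = LogisticRegression().fit(X_train, y_train)

        candidates = {"semantic": [0.7, 1.9, 3.0], "noise": [0.1, 0.4, 1.0]}
        scores = {t: ranker.predict_proba(np.array([f]))[0, 1]
                  for t, f in candidates.items()}
        expansion_terms = sorted(scores, key=scores.get, reverse=True)  # best first
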
  13. Athukorala, K.; Glowacka, D.; Jacucci, G.; Oulasvirta, A.; Vreeken, J.: Is exploratory search different? : a comparison of information search behavior for exploratory and lookup tasks (2016) 0.01
    Abstract
Exploratory search is an increasingly important activity, yet challenging for users. Although there is ample research into understanding exploration, most major information retrieval (IR) systems do not provide tailored and adaptive support for such tasks. One reason is the lack of empirical knowledge on how to distinguish exploratory and lookup search behaviors in IR systems. The goal of this article is to investigate how to separate the 2 types of tasks in an IR system using easily measurable behaviors. In this article, we first review characteristics of exploratory search behavior. We then report on a controlled study of 6 search tasks: 3 exploratory tasks (comparison, knowledge acquisition, planning) and 3 lookup tasks (fact-finding, navigational, question answering). The results are encouraging, showing that IR systems can distinguish the 2 search categories in the course of a search session. The most distinctive indicators that characterize exploratory search behaviors are query length, maximum scroll depth, and task completion time. However, 2 tasks are borderline and exhibit mixed characteristics. We assess the applicability of this finding by reporting on several classification experiments. Our results have valuable implications for designing tailored and adaptive IR systems.
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.11, S.2635-2651
  14. Colace, F.; Santo, M. de; Greco, L.; Napoletano, P.: Improving relevance feedback-based query expansion by the use of a weighted word pairs approach (2015) 0.01
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.11, S.2223-2234
  15. Sanfilippo, M.; Yang, S.; Fichman, P.: Trolling here, there, and everywhere : perceptions of trolling behaviors in context (2017) 0.01
    Source
    Journal of the Association for Information Science and Technology. 68(2017) no.10, S.2313-2327
  16. Pal, D.; Mitra, M.; Datta, K.: Improving query expansion using WordNet (2014) 0.01
    Abstract
This study proposes a new way of using WordNet for query expansion (QE). We choose candidate expansion terms from a set of pseudo-relevant documents; however, the usefulness of these terms is measured based on their definitions provided in a hand-crafted lexical resource such as WordNet. Experiments with a number of standard TREC collections show that this method outperforms existing WordNet-based methods. It also compares favorably with established QE methods such as KLD and RM3. Leveraging earlier work in which a combination of QE methods was found to outperform each individual method (as well as other well-known QE methods), we next propose a combination-based QE method that takes into account three different aspects of a candidate expansion term's usefulness: (a) its distribution in the pseudo-relevant documents and in the target corpus, (b) its statistical association with query terms, and (c) its semantic relation with the query, as determined by the overlap between the WordNet definitions of the term and query terms. This combination of diverse sources of information appears to work well on a number of test collections, viz., the TREC123, TREC5, TREC678, TREC robust (new), and TREC910 collections, and yields significant improvements over competing methods on most of these collections.
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.12, S.2469-2478
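    A rough sketch of aspect (c) above: scoring a candidate expansion term by the overlap between the WordNet glosses of the term and those of the query terms (a Jaccard proxy, not the paper's exact weighting; requires the NLTK WordNet data):

        # pip install nltk; then: python -m nltk.downloader wordnet
        from nltk.corpus import wordnet as wn

        def gloss_words(term):
            # All words from the definitions of the term's WordNet synsets.
            return {w for s in wn.synsets(term) for w in s.definition().lower().split()}

        def semantic_overlap(candidate, query_terms):
            cand = gloss_words(candidate)
            query = set().union(*(gloss_words(q) for q in query_terms))
            union = cand | query
            return len(cand & query) / len(union) if union else 0.0
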
  17. Chebil, W.; Soualmia, L.F.; Omri, M.N.; Darmoni, S.F.: Indexing biomedical documents with a possibilistic network (2016) 0.01
    Abstract
In this article, we propose a new approach for indexing biomedical documents based on a possibilistic network that carries out partial matching between documents and biomedical vocabulary. The main contribution of our approach is to deal with the imprecision and uncertainty of the indexing task using possibility theory. We enhance estimation of the similarity between a document and a given concept using the two measures of possibility and necessity. Possibility estimates the extent to which a document is not similar to the concept, while necessity can provide confirmation that the document is similar to the concept. Our contribution also reduces the limitation of partial matching: although it allows extracting from the document term variants other than those in dictionaries, it also generates irrelevant information. Our objective is to filter the index using the knowledge provided by the Unified Medical Language System®. Experiments were carried out on different corpora, showing encouraging results (the improvement rate is +26.37% in terms of mean average precision when compared with the baseline).
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.4, S.928-941
  18. Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.01
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.12, S.2928-2946
  19. Kim, H.H.: Toward video semantic search based on a structured folksonomy (2011) 0.01
    Source
    Journal of the American Society for Information Science and Technology. 62(2011) no.3, S.478-492
  20. Xamena, E.; Brignole, N.B.; Maguitman, A.G.: ¬A study of relevance propagation in large topic ontologies (2013) 0.01
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.11, S.2238-2255

Languages

  • e 62
  • d 2