Search (53 results, page 1 of 3)

  • × theme_ss:"Retrievalalgorithmen"
  • × type_ss:"a"
  • × year_i:[2010 TO 2020}
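In Solr range syntax a square bracket marks an inclusive bound and a curly brace an exclusive one, so the filter `year_i:[2010 TO 2020}` keeps the years 2010 through 2019. A minimal Python sketch of that half-open test follows; the function name and sample years are illustrative and not part of Solr.

```python
def matches_year_filter(year, lower=2010, upper=2020):
    """Mimic year_i:[2010 TO 2020}: lower bound inclusive, upper bound exclusive."""
    return lower <= year < upper


# Hypothetical publication years, used only to show the boundary behaviour.
sample_years = [2009, 2010, 2019, 2020]
kept_years = [year for year in sample_years if matches_year_filter(year)]
```

Under this reading, a record dated 2020 is excluded even though 2020 appears in the filter.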
  1. Baloh, P.; Desouza, K.C.; Hackney, R.: Contextualizing organizational interventions of knowledge management systems : a design science perspective (2012) 0.07
    0.06685467 = product of:
      0.100282 = sum of:
        0.0088240495 = weight(_text_:information in 241) [ClassicSimilarity], result of:
          0.0088240495 = score(doc=241,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.09697737 = fieldWeight in 241, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=241)
        0.09145795 = sum of:
          0.05634501 = weight(_text_:management in 241) [ClassicSimilarity], result of:
            0.05634501 = score(doc=241,freq=6.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.32251096 = fieldWeight in 241, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0390625 = fieldNorm(doc=241)
          0.035112944 = weight(_text_:22 in 241) [ClassicSimilarity], result of:
            0.035112944 = score(doc=241,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.19345059 = fieldWeight in 241, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=241)
      0.6666667 = coord(2/3)
    
    Abstract
    We address how individuals' (workers) knowledge needs influence the design of knowledge management systems (KMS), enabling knowledge creation and utilization. It is evident that KMS technologies and activities are indiscriminately deployed in most organizations with little regard to the actual context of their adoption. Moreover, it is apparent that the extant literature pertaining to knowledge management projects is frequently deficient in identifying the variety of factors indicative for successful KMS. This presents an obvious business practice and research gap that requires a critical analysis of the necessary intervention that will actually improve how workers can leverage and form organization-wide knowledge. This research involved an extensive review of the literature, a grounded theory methodological approach and rigorous data collection and synthesis through an empirical case analysis (Parsons Brinckerhoff and Samsung). The contribution of this study is the formulation of a model for designing KMS based upon the design science paradigm, which aspires to create artifacts that are interdependent of people and organizations. The essential proposition is that KMS design and implementation must be contextualized in relation to knowledge needs and that these will differ for various organizational settings. The findings present valuable insights and further understanding of the way in which KMS design efforts should be focused.
    Date
    11. 6.2012 14:22:34
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.5, S.948-966
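The explain tree for result 1 can be checked by hand. The sketch below assumes the ClassicSimilarity definitions that reproduce the listed values, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). It copies queryNorm and fieldNorm from the listing rather than deriving them, and all function and variable names are illustrative.

```python
import math

MAX_DOCS = 44218          # maxDocs as shown in the explain output
QUERY_NORM = 0.0518325    # copied from the listing, not derived here


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency in the form that matches the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """score = queryWeight * fieldWeight for a single term in a single document."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm                     # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Term statistics for doc 241 (result 1), read off the explain tree.
info_241 = term_score(freq=2.0, doc_freq=20772, field_norm=0.0390625)
mgmt_241 = term_score(freq=6.0, doc_freq=4130, field_norm=0.0390625)
t22_241 = term_score(freq=2.0, doc_freq=3622, field_norm=0.0390625)

# Outer sum of the clause scores, scaled by coord(2/3).
score_241 = (info_241 + (mgmt_241 + t22_241)) * (2.0 / 3.0)
```

The three term scores come out close to the listed 0.0088240495, 0.05634501 and 0.035112944, and the final value is close to 0.06685467. Small differences in the last digits are expected because Lucene computes in single precision.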
  2. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.06
    0.063111395 = product of:
      0.094667085 = sum of:
        0.01247909 = weight(_text_:information in 2591) [ClassicSimilarity], result of:
          0.01247909 = score(doc=2591,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.13714671 = fieldWeight in 2591, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.082187995 = sum of:
          0.032530803 = weight(_text_:management in 2591) [ClassicSimilarity], result of:
            0.032530803 = score(doc=2591,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.18620178 = fieldWeight in 2591, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
          0.049657196 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
            0.049657196 = score(doc=2591,freq=4.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.27358043 = fieldWeight in 2591, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
      0.6666667 = coord(2/3)
    
    Abstract
    Purpose: In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment through human assessors is infeasible. Due to the large number of documents that require judgment, there are possible errors introduced by human assessors because of disagreements. The paper aims to discuss these issues. Design/methodology/approach: This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human efforts. These methods overcome problems with large amounts of documents for judgment while avoiding human disagreement errors during the judgment process. This study utilizes two key factors: number of occurrences of each document per topic from all the system runs; and document rankings to generate the alternate methods. Findings: The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value: Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
    Source
    Aslib journal of information management. 67(2015) no.6, S.700-714
  3. Jindal, V.; Bawa, S.; Batra, S.: A review of ranking approaches for semantic search on Web (2014) 0.03
    0.028797261 = product of:
      0.043195892 = sum of:
        0.02367741 = weight(_text_:information in 2799) [ClassicSimilarity], result of:
          0.02367741 = score(doc=2799,freq=10.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.2602176 = fieldWeight in 2799, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2799)
        0.019518482 = product of:
          0.039036963 = sum of:
            0.039036963 = weight(_text_:management in 2799) [ClassicSimilarity], result of:
              0.039036963 = score(doc=2799,freq=2.0), product of:
                0.17470726 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0518325 = queryNorm
                0.22344214 = fieldWeight in 2799, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2799)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    With ever increasing information being available to the end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, it is very common that even most renowned search engines return result sets with not so useful pages to the user. Research on semantic search aims to improve traditional information search and retrieval methods where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different relevancy ranking approaches based on semantics which are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes have been investigated based on methodologies adopted and their most distinctive characteristics towards ranking. An overview of selected approaches and their comparison by means of the classification criteria has been presented. With the help of this comparison, some common concepts and outstanding features have been identified.
    Source
    Information processing and management. 50(2014) no.2, S.416-425
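The explain tree for result 3 nests a `product of` inside a `sum of`, with a separate coord factor at each level. One way to check such a tree is a small recursive evaluator. The sketch below encodes the tree as nested tuples, which is an assumed representation chosen for illustration and not a Lucene data structure, and uses the leaf values printed in the listing.

```python
from math import prod


def evaluate(node):
    """Evaluate a node: either a number (leaf) or an (operator, children) pair."""
    if isinstance(node, (int, float)):
        return float(node)
    operator, children = node
    values = [evaluate(child) for child in children]
    if operator == "sum":
        return sum(values)
    if operator == "product":
        return prod(values)
    raise ValueError(f"unknown operator: {operator}")


# Result 3 (doc 2799): leaf scores and coord factors as printed in the listing.
result_3 = ("product", [
    ("sum", [
        0.02367741,                                     # weight(_text_:information)
        ("product", [("sum", [0.039036963]), 0.5]),     # management clause * coord(1/2)
    ]),
    0.6666667,                                          # coord(2/3)
])

score_3 = evaluate(result_3)
```

The evaluated value agrees with the listed 0.028797261 to within rounding of the printed factors.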
  4. Bornmann, L.; Mutz, R.: From P100 to P100' : a new citation-rank approach (2014) 0.03
    0.028139224 = product of:
      0.042208835 = sum of:
        0.01411848 = weight(_text_:information in 1431) [ClassicSimilarity], result of:
          0.01411848 = score(doc=1431,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.1551638 = fieldWeight in 1431, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=1431)
        0.028090354 = product of:
          0.056180708 = sum of:
            0.056180708 = weight(_text_:22 in 1431) [ClassicSimilarity], result of:
              0.056180708 = score(doc=1431,freq=2.0), product of:
                0.18150859 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0518325 = queryNorm
                0.30952093 = fieldWeight in 1431, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1431)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    22. 8.2014 17:05:18
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.9, S.1939-1943
  5. Soulier, L.; Jabeur, L.B.; Tamine, L.; Bahsoun, W.: On ranking relevant entities in heterogeneous networks using a language-based model (2013) 0.02
    0.023469714 = product of:
      0.03520457 = sum of:
        0.017648099 = weight(_text_:information in 664) [ClassicSimilarity], result of:
          0.017648099 = score(doc=664,freq=8.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.19395474 = fieldWeight in 664, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=664)
        0.017556472 = product of:
          0.035112944 = sum of:
            0.035112944 = weight(_text_:22 in 664) [ClassicSimilarity], result of:
              0.035112944 = score(doc=664,freq=2.0), product of:
                0.18150859 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0518325 = queryNorm
                0.19345059 = fieldWeight in 664, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=664)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    A new challenge, accessing multiple relevant entities, arises from the availability of linked heterogeneous data. In this article, we address more specifically the problem of accessing relevant entities, such as publications and authors within a bibliographic network, given an information need. We propose a novel algorithm, called BibRank, that estimates a joint relevance of documents and authors within a bibliographic network. This model ranks each type of entity using a score propagation algorithm with respect to the query topic and the structure of the underlying bi-type information entity network. Evidence sources, namely content-based and network-based scores, are both used to estimate the topical similarity between connected entities. For this purpose, authorship relationships are analyzed through a language model-based score on the one hand and on the other hand, non topically related entities of the same type are detected through marginal citations. The article reports the results of experiments using the Bibrank algorithm for an information retrieval task. The CiteSeerX bibliographic data set forms the basis for the topical query automatic generation and evaluation. We show that a statistically significant improvement over closely related ranking models is achieved.
    Date
    22. 3.2013 19:34:49
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.3, S.500-515
  6. Lee, J.; Min, J.-K.; Oh, A.; Chung, C.-W.: Effective ranking and search techniques for Web resources considering semantic relationships (2014) 0.02
    0.022609001 = product of:
      0.0339135 = sum of:
        0.017648099 = weight(_text_:information in 2670) [ClassicSimilarity], result of:
          0.017648099 = score(doc=2670,freq=8.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.19395474 = fieldWeight in 2670, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2670)
        0.016265402 = product of:
          0.032530803 = sum of:
            0.032530803 = weight(_text_:management in 2670) [ClassicSimilarity], result of:
              0.032530803 = score(doc=2670,freq=2.0), product of:
                0.17470726 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0518325 = queryNorm
                0.18620178 = fieldWeight in 2670, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2670)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    On the Semantic Web, the types of resources and the semantic relationships between resources are defined in an ontology. By using that information, the accuracy of information retrieval can be improved. In this paper, we present effective ranking and search techniques considering the semantic relationships in an ontology. Our technique retrieves top-k resources which are the most relevant to query keywords through the semantic relationships. To do this, we propose a weighting measure for the semantic relationship. Based on this measure, we propose a novel ranking method which considers the number of meaningful semantic relationships between a resource and keywords as well as the coverage and discriminating power of keywords. In order to improve the efficiency of the search, we prune the unnecessary search space using the length and weight thresholds of the semantic relationship path. In addition, we exploit Threshold Algorithm based on an extended inverted index to answer top-k results efficiently. The experimental results using real data sets demonstrate that our retrieval method using the semantic information generates accurate results efficiently compared to the traditional methods.
    Source
    Information processing and management. 50(2014) no.1, S.132-155
  7. Liu, X.; Zheng, W.; Fang, H.: An exploration of ranking models and feedback method for related entity finding (2013) 0.02
    0.019162996 = product of:
      0.028744493 = sum of:
        0.01247909 = weight(_text_:information in 2714) [ClassicSimilarity], result of:
          0.01247909 = score(doc=2714,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.13714671 = fieldWeight in 2714, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2714)
        0.016265402 = product of:
          0.032530803 = sum of:
            0.032530803 = weight(_text_:management in 2714) [ClassicSimilarity], result of:
              0.032530803 = score(doc=2714,freq=2.0), product of:
                0.17470726 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0518325 = queryNorm
                0.18620178 = fieldWeight in 2714, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2714)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Most existing search engines focus on document retrieval. However, information needs are certainly not limited to finding relevant documents. Instead, a user may want to find relevant entities such as persons and organizations. In this paper, we study the problem of related entity finding. Our goal is to rank entities based on their relevance to a structured query, which specifies an input entity, the type of related entities and the relation between the input and related entities. We first discuss a general probabilistic framework, derive six possible retrieval models to rank the related entities, and then compare these models both analytically and empirically. To further improve performance, we study the problem of feedback in the context of related entity finding. Specifically, we propose a mixture model based feedback method that can utilize the pseudo feedback entities to estimate an enriched model for the relation between the input and related entities. Experimental results over two standard TREC collections show that the derived relation generation model combined with a relation feedback method performs better than other models.
    Source
    Information processing and management. 49(2013) no.5, S.995-1007
  8. Karisani, P.; Rahgozar, M.; Oroumchian, F.: A query term re-weighting approach using document similarity (2016) 0.02
    0.019162996 = product of:
      0.028744493 = sum of:
        0.01247909 = weight(_text_:information in 2970) [ClassicSimilarity], result of:
          0.01247909 = score(doc=2970,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.13714671 = fieldWeight in 2970, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2970)
        0.016265402 = product of:
          0.032530803 = sum of:
            0.032530803 = weight(_text_:management in 2970) [ClassicSimilarity], result of:
              0.032530803 = score(doc=2970,freq=2.0), product of:
                0.17470726 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0518325 = queryNorm
                0.18620178 = fieldWeight in 2970, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2970)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Pseudo-relevance feedback is the basis of a category of automatic query modification techniques. Pseudo-relevance feedback methods assume the initial retrieved set of documents to be relevant. Then they use these documents to extract more relevant terms for the query or just re-weigh the user's original query. In this paper, we propose a straightforward, yet effective use of pseudo-relevance feedback method in detecting more informative query terms and re-weighting them. The query-by-query analysis of our results indicates that our method is capable of identifying the most important keywords even in short queries. Our main idea is that some of the top documents may contain a closer context to the user's information need than the others. Therefore, re-examining the similarity of those top documents and weighting this set based on their context could help in identifying and re-weighting informative query terms. Our experimental results in standard English and Persian test collections show that our method improves retrieval performance, in terms of MAP criterion, up to 7% over traditional query term re-weighting methods.
    Source
    Information processing and management. 52(2016) no.3, S.478-489
  9. Hubert, G.; Pitarch, Y.; Pinel-Sauvagnat, K.; Tournier, R.; Laporte, L.: TournaRank : when retrieval becomes document competition (2018) 0.02
    0.019162996 = product of:
      0.028744493 = sum of:
        0.01247909 = weight(_text_:information in 5087) [ClassicSimilarity], result of:
          0.01247909 = score(doc=5087,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.13714671 = fieldWeight in 5087, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5087)
        0.016265402 = product of:
          0.032530803 = sum of:
            0.032530803 = weight(_text_:management in 5087) [ClassicSimilarity], result of:
              0.032530803 = score(doc=5087,freq=2.0), product of:
                0.17470726 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0518325 = queryNorm
                0.18620178 = fieldWeight in 5087, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5087)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Numerous feature-based models have been recently proposed by the information retrieval community. The capability of features to express different relevance facets (query- or document-dependent) can explain such a success story. Such models are most of the time supervised, thus requiring a learning phase. To leverage the advantages of feature-based representations of documents, we propose TournaRank, an unsupervised approach inspired by real-life game and sport competition principles. Documents compete against each other in tournaments using features as evidence of relevance. Tournaments are modeled as a sequence of matches, which involve pairs of documents playing in turn their features. Once a tournament has ended, documents are ranked according to their number of won matches during the tournament. This principle is generic since it can be applied to any collection type. It also provides great flexibility since different alternatives can be considered by changing the tournament type, the match rules, the feature set, or the strategies adopted by documents during matches. TournaRank was tested on several collections to evaluate our model in different contexts and to compare it with related approaches such as Learning To Rank and fusion ones: the TREC Robust2004 collection for homogeneous documents, the TREC Web2014 (ClueWeb12) collection for heterogeneous web documents, and the LETOR3.0 collection for comparison with supervised feature-based models.
    Source
    Information processing and management. 54(2018) no.2, S.252-272
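The tournament idea in the abstract of result 9 can be illustrated with a toy round-robin. This is only a sketch under stated assumptions, not the authors' TournaRank: it assumes every pair of documents plays once, that a match is won by the document with the larger value on more features, and that ties award no win. The feature values and all names are hypothetical.

```python
from itertools import combinations


def play_match(features_a, features_b):
    """Return 0 if A wins, 1 if B wins, None on a tie (assumed match rule)."""
    points_a = sum(a > b for a, b in zip(features_a, features_b))
    points_b = sum(b > a for a, b in zip(features_a, features_b))
    if points_a > points_b:
        return 0
    if points_b > points_a:
        return 1
    return None


def round_robin_rank(docs):
    """Rank document ids by matches won in a single round-robin tournament."""
    wins = {doc_id: 0 for doc_id in docs}
    for doc_a, doc_b in combinations(docs, 2):
        outcome = play_match(docs[doc_a], docs[doc_b])
        if outcome == 0:
            wins[doc_a] += 1
        elif outcome == 1:
            wins[doc_b] += 1
    # Most wins first; ties broken by document id to keep the order deterministic.
    return sorted(docs, key=lambda doc_id: (-wins[doc_id], doc_id))


# Hypothetical per-document feature scores (higher means more evidence of relevance).
toy_docs = {
    "d1": [0.9, 0.8, 0.7],
    "d2": [0.5, 0.9, 0.6],
    "d3": [0.1, 0.2, 0.95],
}
toy_ranking = round_robin_rank(toy_docs)
```

Changing `play_match` or replacing the round-robin loop corresponds to the flexibility the abstract mentions: different match rules and tournament types yield different rankings.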
  10. Dadashkarimi, J.; Shakery, A.; Faili, H.; Zamani, H.: An expectation-maximization algorithm for query translation based on pseudo-relevant documents (2017) 0.02
    0.015330397 = product of:
      0.022995595 = sum of:
        0.009983272 = weight(_text_:information in 3296) [ClassicSimilarity], result of:
          0.009983272 = score(doc=3296,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.10971737 = fieldWeight in 3296, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3296)
        0.013012322 = product of:
          0.026024643 = sum of:
            0.026024643 = weight(_text_:management in 3296) [ClassicSimilarity], result of:
              0.026024643 = score(doc=3296,freq=2.0), product of:
                0.17470726 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0518325 = queryNorm
                0.14896142 = fieldWeight in 3296, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3296)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Query translation in cross-language information retrieval (CLIR) can be done by employing dictionaries, aligned corpora, or machine translators. Scarcity of aligned corpora for various domains in many language pairs intensifies the importance of dictionary-based CLIR which motivates us to use only a bilingual dictionary and two independent collections in source and target languages for query translation. We exploit pseudo-relevant documents for a given query in the source language and pseudo-relevant documents for a translation of the query in the target language with a proposed expectation-maximization algorithm for improving query translation. The proposed method (called EM4QT) assumes that each target term either is translated from the source pseudo-relevant documents or has come from a noisy collection. Since EM4QT does not directly consider term coherency, which is defined as fluency of the target translation, we investigate a crucial question: can EM4QT be improved using either coherency-based methods or token-to-token translation ones? To address this question, we combine different translation models via simple linear interpolation and a proposed divergence minimization method. Evaluations over four CLEF collections in Persian, French, Spanish, and German indicate that EM4QT significantly outperforms competitive baselines in all the collections. Our experiments also reveal that since EM4QT indirectly considers term coherency, combining the method with coherency-based models cannot significantly improve the retrieval performance. On the other hand, investigating the query-by-query results supports the view that EM4QT usually gives a relatively high weight to one translation and its combination with the proposed token-to-token translation model, which is obtained by running EM4QT for each query term separately, soothes the effect and reaches better results for many queries. Comparing the method with a competitive word-embedding baseline reveals the superiority of the proposed model.
    Source
    Information processing and management. 53(2017) no.2, S.371-387
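The abstract of result 10 mentions combining translation models by simple linear interpolation. A minimal sketch of that general technique follows, assuming each model is a dictionary mapping candidate target terms to probabilities. The mixing weight, the example terms and the function name are hypothetical and not taken from the paper.

```python
def interpolate(model_a, model_b, lam=0.5):
    """Mix two translation distributions: lam * p_a(t) + (1 - lam) * p_b(t)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    terms = set(model_a) | set(model_b)
    return {
        term: lam * model_a.get(term, 0.0) + (1.0 - lam) * model_b.get(term, 0.0)
        for term in terms
    }


# Hypothetical candidate translations of one source query term.
model_one = {"maison": 0.8, "foyer": 0.2}
model_two = {"maison": 0.4, "batiment": 0.6}

mixed = interpolate(model_one, model_two, lam=0.5)
```

If both inputs sum to one, so does the mixture, and a model that puts nearly all its weight on one translation is tempered by the other model's spread.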
  11. Karlsson, A.; Hammarfelt, B.; Steinhauer, H.J.; Falkman, G.; Olson, N.; Nelhans, G.; Nolin, J.: Modeling uncertainty in bibliometrics and information retrieval : an information fusion approach (2015) 0.01
    0.0101891365 = product of:
      0.030567408 = sum of:
        0.030567408 = weight(_text_:information in 1696) [ClassicSimilarity], result of:
          0.030567408 = score(doc=1696,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.3359395 = fieldWeight in 1696, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=1696)
      0.33333334 = coord(1/3)
    
    Footnote
    Contribution to a special issue "Combining bibliometrics and information retrieval"
  12. Fuhr, N.: Modelle im Information Retrieval (2013) 0.01
    0.008319394 = product of:
      0.02495818 = sum of:
        0.02495818 = weight(_text_:information in 724) [ClassicSimilarity], result of:
          0.02495818 = score(doc=724,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.27429342 = fieldWeight in 724, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=724)
      0.33333334 = coord(1/3)
    
    Source
    Grundlagen der praktischen Information und Dokumentation. Handbuch zur Einführung in die Informationswissenschaft und -praxis. 6th, completely revised edition. Ed. by R. Kuhlen, W. Semar and D. Strauch. Founded by Klaus Laisiepen, Ernst Lutterbeck, Karl-Heinrich Meyer-Uhlenried
  13. Hoenkamp, E.; Bruza, P.: How everyday language can and will boost effective information retrieval (2015) 0.01
    0.0072048064 = product of:
      0.02161442 = sum of:
        0.02161442 = weight(_text_:information in 2123) [ClassicSimilarity], result of:
          0.02161442 = score(doc=2123,freq=12.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.23754507 = fieldWeight in 2123, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2123)
      0.33333334 = coord(1/3)
    
    Abstract
    Typing 2 or 3 keywords into a browser has become an easy and efficient way to find information. Yet, typing even short queries becomes tedious on ever shrinking (virtual) keyboards. Meanwhile, speech processing is maturing rapidly, facilitating everyday language input. Also, wearable technology can inform users proactively by listening in on their conversations or processing their social media interactions. Given these developments, everyday language may soon become the new input of choice. We present an information retrieval (IR) algorithm specifically designed to accept everyday language. It integrates two paradigms of information retrieval, previously studied in isolation; one directed mainly at the surface structure of language, the other primarily at the underlying meaning. The integration was achieved by a Markov machine that encodes meaning by its transition graph, and surface structure by the language it generates. A rigorous evaluation of the approach showed, first, that it can compete with the quality of existing language models, second, that it is more effective the more verbose the input, and third, as a consequence, that it is promising for an imminent transition from keyword input, where the onus is on the user to formulate concise queries, to a modality where users can express their need for information more freely, more informally, and more naturally in everyday language.
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.8, S.1546-1558
  14. Jacucci, G.; Barral, O.; Daee, P.; Wenzel, M.; Serim, B.; Ruotsalo, T.; Pluchino, P.; Freeman, J.; Gamberini, L.; Kaski, S.; Blankertz, B.: Integrating neurophysiologic relevance feedback in intent modeling for information retrieval (2019) 0.01
    0.0072048064 = product of:
      0.02161442 = sum of:
        0.02161442 = weight(_text_:information in 5356) [ClassicSimilarity], result of:
          0.02161442 = score(doc=5356,freq=12.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.23754507 = fieldWeight in 5356, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5356)
      0.33333334 = coord(1/3)
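
    The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); queryNorm and fieldNorm are copied from the listing rather than derived, and the function name is hypothetical.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs,
                             query_norm, field_norm, coord=1.0):
    """Recompute a single-term ClassicSimilarity score from explain values.

    Hypothetical helper for illustration; query_norm and field_norm are
    taken as given from the explain output, not derived here.
    """
    tf = math.sqrt(freq)                              # 3.4641016 for freq=12
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # 1.7554779 in the listing
    query_weight = idf * query_norm                   # 0.09099081 in the listing
    field_weight = tf * idf * field_norm              # 0.23754507 in the listing
    return coord * query_weight * field_weight


# Values for the term "information" in doc 5356, as shown above.
score = classic_similarity_score(
    freq=12.0,
    doc_freq=20772,
    max_docs=44218,
    query_norm=0.0518325,
    field_norm=0.0390625,
    coord=1 / 3,
)
print(f"{score:.7f}")
```

    The result should agree with the listed 0.0072048064 to about six decimal places; small differences in the last digits are expected because Lucene computes in 32-bit floats.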
    
    Abstract
    The use of implicit relevance feedback from neurophysiology could deliver effortless information retrieval. However, both computing neurophysiologic responses and retrieving documents are characterized by uncertainty because of noisy signals and incomplete or inconsistent representations of the data. We present the first-of-its-kind, fully integrated information retrieval system that makes use of online implicit relevance feedback generated from brain activity as measured through electroencephalography (EEG), and eye movements. The findings of the evaluation experiment (N = 16) show that we are able to compute online neurophysiology-based relevance feedback with performance significantly better than chance in complex data domains and realistic search tasks. We contribute by demonstrating how to integrate in interactive intent modeling this inherently noisy implicit relevance feedback combined with scarce explicit feedback. Although experimental measures of task performance did not allow us to demonstrate how the classification outcomes translated into search task performance, the experiment proved that our approach is able to generate relevance feedback from brain signals and eye movements in a realistic scenario, thus providing promising implications for future work in neuroadaptive information retrieval (IR).
    Footnote
    Contribution to a 'Special issue on neuro-information science'.
    Source
    Journal of the Association for Information Science and Technology. 70(2019) no.9, S.917-930
  15. Efron, M.: Linear time series models for term weighting in information retrieval (2010) 0.01
    0.0070592402 = product of:
      0.02117772 = sum of:
        0.02117772 = weight(_text_:information in 3688) [ClassicSimilarity], result of:
          0.02117772 = score(doc=3688,freq=8.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.23274569 = fieldWeight in 3688, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3688)
      0.33333334 = coord(1/3)
    
    Abstract
    Common measures of term importance in information retrieval (IR) rely on counts of term frequency; rare terms receive higher weight in document ranking than common terms receive. However, realistic scenarios yield additional information about terms in a collection. Of interest in this article is the temporal behavior of terms as a collection changes over time. We propose capturing each term's collection frequency at discrete time intervals over the lifespan of a corpus and analyzing the resulting time series. We hypothesize the collection frequency of a weakly discriminative term x at time t is predictable by a linear model of the term's prior observations. On the other hand, a linear time series model for a strong discriminator's collection frequency will yield a poor fit to the data. Operationalizing this hypothesis, we induce three time-based measures of term importance and test these against state-of-the-art term weighting models.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.7, S.1299-1312
  16. Habernal, I.; Konopík, M.; Rohlík, O.: Question answering (2012) 0.01
    0.0070592402 = product of:
      0.02117772 = sum of:
        0.02117772 = weight(_text_:information in 101) [ClassicSimilarity], result of:
          0.02117772 = score(doc=101,freq=8.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.23274569 = fieldWeight in 101, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=101)
      0.33333334 = coord(1/3)
    
    Abstract
    Question Answering is an area of information retrieval with the added challenge of applying sophisticated techniques to identify the complex syntactic and semantic relationships present in text in order to provide a more sophisticated and satisfactory response to the user's information needs. For this reason, the authors see question answering as the next step beyond standard information retrieval. This chapter covers state-of-the-art question answering, focusing on an overview of systems, techniques, and approaches that are likely to be employed in the next generations of search engines. Special attention is paid to question answering using the World Wide Web as the data source and to question answering exploiting the possibilities of the Semantic Web. Considerations about the current issues and prospects for promising future research are also provided.
    Source
    Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
  17. Van der Veer Martens, B.; Fleet, C. van: Opening the black box of "relevance work" : a domain analysis (2012) 0.01
    0.0070592402 = product of:
      0.02117772 = sum of:
        0.02117772 = weight(_text_:information in 247) [ClassicSimilarity], result of:
          0.02117772 = score(doc=247,freq=8.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.23274569 = fieldWeight in 247, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=247)
      0.33333334 = coord(1/3)
    
    Abstract
    In response to Hjørland's recent call for a reconceptualization of the foundations of relevance, we suggest that the sociocognitive aspects of intermediation by information agencies, such as archives and libraries, are a necessary and unexplored part of the infrastructure of the subject knowledge domains central to his recommended "view of relevance informed by a social paradigm" (2010, p. 217). From a comparative analysis of documents from 39 graduate-level introductory courses in archives, reference, and strategic/competitive intelligence taught in 13 American Library Association-accredited library and information science (LIS) programs, we identify four defining sociocognitive dimensions of "relevance work" in information agencies within Hjørland's proposed framework for relevance: tasks, time, systems, and assessors. This study is intended to supply sociocognitive content from within the relevance work domain to support further domain analytic research, and to emphasize the importance of intermediary relevance work for all subject knowledge domains.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.5, S.936-947
  18. White, H. D.: Co-cited author retrieval and relevance theory : examples from the humanities (2015) 0.01
    0.0070592402 = product of:
      0.02117772 = sum of:
        0.02117772 = weight(_text_:information in 1687) [ClassicSimilarity], result of:
          0.02117772 = score(doc=1687,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.23274569 = fieldWeight in 1687, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=1687)
      0.33333334 = coord(1/3)
    
    Footnote
    Contribution to a special issue, "Combining bibliometrics and information retrieval"
  19. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.01
    0.0065770587 = product of:
      0.019731175 = sum of:
        0.019731175 = weight(_text_:information in 1338) [ClassicSimilarity], result of:
          0.019731175 = score(doc=1338,freq=10.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.21684799 = fieldWeight in 1338, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
      0.33333334 = coord(1/3)
    
    Abstract
    A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations that infer two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.8, S.1577-1596
  20. Efron, M.; Winget, M.: Query polyrepresentation for ranking retrieval systems without relevance judgments (2010) 0.01
    0.0061134817 = product of:
      0.018340444 = sum of:
        0.018340444 = weight(_text_:information in 3469) [ClassicSimilarity], result of:
          0.018340444 = score(doc=3469,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.20156369 = fieldWeight in 3469, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3469)
      0.33333334 = coord(1/3)
    
    Abstract
    Ranking information retrieval (IR) systems with respect to their effectiveness is a crucial operation during IR evaluation, as well as during data fusion. This article offers a novel method of approaching the system-ranking problem, based on the widely studied idea of polyrepresentation. The principle of polyrepresentation suggests that a single information need can be represented by many query articulations, which we call query aspects. By skimming the top k (where k is small) documents retrieved by a single system for multiple query aspects, we collect a set of documents that are likely to be relevant to a given test topic. Labeling these skimmed documents as putatively relevant lets us build pseudorelevance judgments without undue human intervention. We report experiments where using these pseudorelevance judgments delivers a rank ordering of IR systems that correlates highly with rankings based on human relevance judgments.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.6, S.1081-1091

Languages

  • e 49
  • d 4