Search (364 results, page 2 of 19)

  • × theme_ss:"Retrievalalgorithmen"
  1. Li, H.; Wu, H.; Li, D.; Lin, S.; Su, Z.; Luo, X.: PSI: A probabilistic semantic interpretable framework for fine-grained image ranking (2018) 0.02
    0.016518347 = product of:
      0.06607339 = sum of:
        0.06607339 = product of:
          0.09911008 = sum of:
            0.008461362 = weight(_text_:a in 4577) [ClassicSimilarity], result of:
              0.008461362 = score(doc=4577,freq=8.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.15287387 = fieldWeight in 4577, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4577)
            0.09064872 = weight(_text_:z in 4577) [ClassicSimilarity], result of:
              0.09064872 = score(doc=4577,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.35381722 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4577)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Image Ranking is one of the key problems in information science research area. However, most current methods focus on increasing the performance, leaving the semantic gap problem, which refers to the learned ranking models are hard to be understood, remaining intact. Therefore, in this article, we aim at learning an interpretable ranking model to tackle the semantic gap in fine-grained image ranking. We propose to combine attribute-based representation and online passive-aggressive (PA) learning based ranking models to achieve this goal. Besides, considering the highly localized instances in fine-grained image ranking, we introduce a supervised constrained clustering method to gather class-balanced training instances for local PA-based models, and incorporate the learned local models into a unified probabilistic framework. Extensive experiments on the benchmark demonstrate that the proposed framework outperforms state-of-the-art methods in terms of accuracy and speed.
    Type
    a
  2. Chakrabarti, S.; Dom, B.; Kumar, S.R.; Raghavan, P.; Rajagopalan, S.; Tomkins, A.; Kleinberg, J.M.; Gibson, D.: Neue Pfade durch den Internet-Dschungel : Die zweite Generation von Web-Suchmaschinen (1999) 0.02
    0.016429678 = product of:
      0.032859355 = sum of:
        0.030200208 = weight(_text_:von in 3) [ClassicSimilarity], result of:
          0.030200208 = score(doc=3,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.23581557 = fieldWeight in 3, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0625 = fieldNorm(doc=3)
        0.0026591495 = product of:
          0.007977448 = sum of:
            0.007977448 = weight(_text_:a in 3) [ClassicSimilarity], result of:
              0.007977448 = score(doc=3,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.14413087 = fieldWeight in 3, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Type
    a
  3. Fuhr, N.: Zur Überwindung der Diskrepanz zwischen Retrievalforschung und -praxis (1990) 0.02
    0.016040254 = product of:
      0.03208051 = sum of:
        0.030200208 = weight(_text_:von in 6625) [ClassicSimilarity], result of:
          0.030200208 = score(doc=6625,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.23581557 = fieldWeight in 6625, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0625 = fieldNorm(doc=6625)
        0.0018803024 = product of:
          0.005640907 = sum of:
            0.005640907 = weight(_text_:a in 6625) [ClassicSimilarity], result of:
              0.005640907 = score(doc=6625,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.10191591 = fieldWeight in 6625, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6625)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    In diesem Beitrag werden einige Forschungsergebnisse des Information Retrieval vorgestellt, die unmittelbar zur Verbesserung der Retrievalqualität für bereits existierende Datenbanken eingesetzt werden können: Linguistische Algorithmen zur Grund- und Stammformreduktion unterstützen die Suche nach Flexions- und Derivationsformen von Suchtermen. Rankingalgorithmen, die Frage- und Dokumentterme gewichten, führen zu signifikant besseren Retrievalergebnissen als beim Booleschen Retrieval. Durch Relevance Feedback können die Retrievalqualität weiter gesteigert und außerdem der Benutzer bei der sukzessiven Modifikation seiner Frageformulierung unterstützt werden. Es wird eine benutzerfreundliche Bedienungsoberfläche für ein System vorgestellt, das auf diesen Konzepten basiert.
    Type
    a
  4. Chen, Z.; Meng, X.; Fowler, R.H.; Zhu, B.: Real-time adaptive feature and document learning for Web search (2001) 0.01
    0.014448237 = product of:
      0.057792947 = sum of:
        0.057792947 = product of:
          0.08668942 = sum of:
            0.0111488225 = weight(_text_:a in 5209) [ClassicSimilarity], result of:
              0.0111488225 = score(doc=5209,freq=20.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.20142901 = fieldWeight in 5209, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5209)
            0.075540595 = weight(_text_:z in 5209) [ClassicSimilarity], result of:
              0.075540595 = score(doc=5209,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.29484767 = fieldWeight in 5209, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5209)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Chen et alia report on the design of FEATURES, a web search engine with adaptive features based on minimal relevance feedback. Rather than developing user profiles from previous searcher activity either at the server or client location, or updating indexes after search completion, FEATURES allows for index and user characterization files to be updated during query modification on retrieval from a general purpose search engine. Indexing terms relevant to a query are defined as the union of all terms assigned to documents retrieved by the initial search run and are used to build a vector space model on this retrieved set. The top ten weighted terms are presented to the user for a relevant non-relevant choice which is used to modify the term weights. Documents are chosen if their summed term weights are greater than some threshold. A user evaluation of the top ten ranked documents as non-relevant will decrease these term weights and a positive judgement will increase them. A new ordering of the retrieved set will generate new display lists of terms and documents. Precision is improved in a test on Alta Vista searches.
    Type
    a
  5. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.01
    0.014417464 = product of:
      0.057669856 = sum of:
        0.057669856 = product of:
          0.08650478 = sum of:
            0.008461362 = weight(_text_:a in 58) [ClassicSimilarity], result of:
              0.008461362 = score(doc=58,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.15287387 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
            0.078043416 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.078043416 = score(doc=58,freq=2.0), product of:
                0.16809508 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04800207 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Date
    14. 6.2015 22:12:44
    Type
    a
  6. Jiang, X.; Sun, X.; Yang, Z.; Zhuge, H.; Lapshinova-Koltunski, E.; Yao, J.: Exploiting heterogeneous scientific literature networks to combat ranking bias : evidence from the computational linguistics area (2016) 0.01
    0.0142520685 = product of:
      0.057008274 = sum of:
        0.057008274 = product of:
          0.08551241 = sum of:
            0.0099718105 = weight(_text_:a in 3017) [ClassicSimilarity], result of:
              0.0099718105 = score(doc=3017,freq=16.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.18016359 = fieldWeight in 3017, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3017)
            0.075540595 = weight(_text_:z in 3017) [ClassicSimilarity], result of:
              0.075540595 = score(doc=3017,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.29484767 = fieldWeight in 3017, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3017)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    It is important to help researchers find valuable papers from a large literature collection. To this end, many graph-based ranking algorithms have been proposed. However, most of these algorithms suffer from the problem of ranking bias. Ranking bias hurts the usefulness of a ranking algorithm because it returns a ranking list with an undesirable time distribution. This paper is a focused study on how to alleviate ranking bias by leveraging the heterogeneous network structure of the literature collection. We propose a new graph-based ranking algorithm, MutualRank, that integrates mutual reinforcement relationships among networks of papers, researchers, and venues to achieve a more synthetic, accurate, and less-biased ranking than previous methods. MutualRank provides a unified model that involves both intra- and inter-network information for ranking papers, researchers, and venues simultaneously. We use the ACL Anthology Network as the benchmark data set and construct the gold standard from computer linguistics course websites of well-known universities and two well-known textbooks. The experimental results show that MutualRank greatly outperforms the state-of-the-art competitors, including PageRank, HITS, CoRank, Future Rank, and P-Rank, in ranking papers in both improving ranking effectiveness and alleviating ranking bias. Rankings of researchers and venues by MutualRank are also quite reasonable.
    Type
    a
  7. Weiß, B.: Verwandte Seiten finden : "Ähnliche Seiten" oder "What's Related" (2005) 0.01
    0.014156346 = product of:
      0.056625385 = sum of:
        0.056625385 = weight(_text_:von in 868) [ClassicSimilarity], result of:
          0.056625385 = score(doc=868,freq=18.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.44215417 = fieldWeight in 868, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0390625 = fieldNorm(doc=868)
      0.25 = coord(1/4)
    
    Abstract
    Die Link-Struktur-Analyse (LSA) ist nicht nur beim Crawling, dem Webseitenranking, der Abgrenzung geographischer Bereiche, der Vorhersage von Linkverwendungen, dem Auffinden von "Mirror"-Seiten, dem Kategorisieren von Webseiten und beim Generieren von Webseitenstatistiken eines der wichtigsten Analyseverfahren, sondern auch bei der Suche nach verwandten Seiten. Um qualitativ hochwertige verwandte Seiten zu finden, bildet sie nach herrschender Meinung den Hauptbestandteil bei der Identifizierung von ähnlichen Seiten innerhalb themenspezifischer Graphen vernetzter Dokumente. Dabei wird stets von zwei Annahmen ausgegangen: Links zwischen zwei Dokumenten implizieren einen verwandten Inhalt beider Dokumente und wenn die Dokumente aus unterschiedlichen Quellen (von unterschiedlichen Autoren, Hosts, Domänen, .) stammen, so bedeutet dies das eine Quelle die andere über einen Link empfiehlt. Aufbauend auf dieser Idee entwickelte Kleinberg 1998 den HITS Algorithmus um verwandte Seiten über die Link-Struktur-Analyse zu bestimmen. Dieser Ansatz wurde von Bharat und Henzinger weiterentwickelt und später auch in Algorithmen wie dem Companion und Cocitation Algorithmus zur Suche von verwandten Seiten basierend auf nur einer Anfrage-URL weiter verfolgt. In der vorliegenden Seminararbeit sollen dabei die Algorithmen, die hinter diesen Überlegungen stehen, näher erläutert werden und im Anschluss jeweils neuere Forschungsansätze auf diesem Themengebiet aufgezeigt werden.
  8. Li, M.; Li, H.; Zhou, Z.-H.: Semi-supervised document retrieval (2009) 0.01
    0.014144729 = product of:
      0.056578916 = sum of:
        0.056578916 = product of:
          0.08486837 = sum of:
            0.009327774 = weight(_text_:a in 4218) [ClassicSimilarity], result of:
              0.009327774 = score(doc=4218,freq=14.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.1685276 = fieldWeight in 4218, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4218)
            0.075540595 = weight(_text_:z in 4218) [ClassicSimilarity], result of:
              0.075540595 = score(doc=4218,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.29484767 = fieldWeight in 4218, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4218)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    This paper proposes a new machine learning method for constructing ranking models in document retrieval. The method, which is referred to as SSRank, aims to use the advantages of both the traditional Information Retrieval (IR) methods and the supervised learning methods for IR proposed recently. The advantages include the use of limited amount of labeled data and rich model representation. To do so, the method adopts a semi-supervised learning framework in ranking model construction. Specifically, given a small number of labeled documents with respect to some queries, the method effectively labels the unlabeled documents for the queries. It then uses all the labeled data to train a machine learning model (in our case, Neural Network). In the data labeling, the method also makes use of a traditional IR model (in our case, BM25). A stopping criterion based on machine learning theory is given for the data labeling process. Experimental results on three benchmark datasets and one web search dataset indicate that SSRank consistently and almost always significantly outperforms the baseline methods (unsupervised and supervised learning methods), given the same amount of labeled data. This is because SSRank can effectively leverage the use of unlabeled data in learning.
    Type
    a
  9. Zhu, J.; Han, L.; Gou, Z.; Yuan, X.: ¬A fuzzy clustering-based denoising model for evaluating uncertainty in collaborative filtering recommender systems (2018) 0.01
    0.014144729 = product of:
      0.056578916 = sum of:
        0.056578916 = product of:
          0.08486837 = sum of:
            0.009327774 = weight(_text_:a in 4460) [ClassicSimilarity], result of:
              0.009327774 = score(doc=4460,freq=14.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.1685276 = fieldWeight in 4460, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4460)
            0.075540595 = weight(_text_:z in 4460) [ClassicSimilarity], result of:
              0.075540595 = score(doc=4460,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.29484767 = fieldWeight in 4460, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4460)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Recommender systems are effective in predicting the most suitable products for users, such as movies and books. To facilitate personalized recommendations, the quality of item ratings should be guaranteed. However, a few ratings might not be accurate enough due to the uncertainty of user behavior and are referred to as natural noise. In this article, we present a novel fuzzy clustering-based method for detecting noisy ratings. The entropy of a subset of the original ratings dataset is used to indicate the data-driven uncertainty, and evaluation metrics are adopted to represent the prediction-driven uncertainty. After the repetition of resampling and the execution of a recommendation algorithm, the entropy and evaluation metrics vectors are obtained and are empirically categorized to identify the proportion of the potential noise. Then, the fuzzy C-means-based denoising (FCMD) algorithm is performed to verify the natural noise under the assumption that natural noise is primarily the result of the exceptional behavior of users. Finally, a case study is performed using two real-world datasets. The experimental results show that our proposal outperforms previous proposals and has an advantage in dealing with natural noise.
    Type
    a
  10. Brenner, E.H.: Beyond Boolean : new approaches in information retrieval; the quest for intuitive online search systems past, present & future (1995) 0.01
    0.014035223 = product of:
      0.028070446 = sum of:
        0.026425181 = weight(_text_:von in 2547) [ClassicSimilarity], result of:
          0.026425181 = score(doc=2547,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.20633863 = fieldWeight in 2547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
        0.0016452647 = product of:
          0.004935794 = sum of:
            0.004935794 = weight(_text_:a in 2547) [ClassicSimilarity], result of:
              0.004935794 = score(doc=2547,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.089176424 = fieldWeight in 2547, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2547)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Content
    (1) The Boolean world; (2) The Non-Boolean picture; (3) The commercial search engines: Personal Librarian, CLARIT, ConQuest, DR-LINK, InQuizit, InTEXT, TOPIC, WIN, TARGET, FREESTYLE, InfoSeek; (4) Wiedergabe von 8 Aufsätzen aus 'Monitor'
    Issue
    A collection of writings.
  11. Oberhauser, O.; Labner, J.: Relevance Ranking in Online-Katalogen : Informationsstand und Perspektiven (2003) 0.01
    0.014035223 = product of:
      0.028070446 = sum of:
        0.026425181 = weight(_text_:von in 2188) [ClassicSimilarity], result of:
          0.026425181 = score(doc=2188,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.20633863 = fieldWeight in 2188, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2188)
        0.0016452647 = product of:
          0.004935794 = sum of:
            0.004935794 = weight(_text_:a in 2188) [ClassicSimilarity], result of:
              0.004935794 = score(doc=2188,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.089176424 = fieldWeight in 2188, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2188)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Bekanntlich führen Suchmaschinen wie Google &Co. beider Auflistung der Suchergebnisse ein "Ranking" nach "Relevanz" durch, d.h. die Dokumente werden in absteigender Reihenfolge entsprechend ihrer Erfüllung von Relevanzkriterien ausgeben. In Online-Katalogen (OPACs) ist derlei noch nicht allgemein übliche Praxis, doch bietet etwa das im Österreichischen Bibliothekenverbund eingesetzte System Aleph 500 tatsächlich eine solche Ranking-Option an (die im Verbundkatalog auch implementiert ist). Bislang liegen allerdings kaum Informationen zur Funktionsweise dieses Features, insbesondere auch im Hinblick auf eine Hilfestellung für Benutzer, vor. Daher möchten wir mit diesem Beitrag versuchen, den in unserem Verbund bestehenden Informationsstand zum Thema "Relevance Ranking" zu erweitern. Sowohl die Verwendung einer Ranking-Option in OPACs generell als auch die sich unter Aleph 500 konkret bietenden Möglichkeiten sollen im folgenden näher betrachtet werden.
    Type
    a
  12. Weller, K.; Stock, W.G.: Transitive meronymy : automatic concept-based query expansion using weighted transitive part-whole relations (2008) 0.01
    0.014035223 = product of:
      0.028070446 = sum of:
        0.026425181 = weight(_text_:von in 1835) [ClassicSimilarity], result of:
          0.026425181 = score(doc=1835,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.20633863 = fieldWeight in 1835, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1835)
        0.0016452647 = product of:
          0.004935794 = sum of:
            0.004935794 = weight(_text_:a in 1835) [ClassicSimilarity], result of:
              0.004935794 = score(doc=1835,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.089176424 = fieldWeight in 1835, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1835)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Transitive Meronymie. Automatische begriffsbasierte Suchanfrageerweiterung unter Nutzung gewichteter transitiver Teil-Ganzes-Relationen. Unsere theoretisch orientierte Arbeit isoliert transitive Teil-Ganzes-Beziehungen. Wir diskutieren den Einsatz der Meronymie bei der automatischen begriffsbasierten Suchanfrageerweiterung im Information Retrieval. Aus praktischen Gründen schlagen wir vor, die Bestandsrelationen zu spezifizieren und die einzelnen Arten mit unterschiedlichen Gewichtungswerten zu versehen, die im Retrieval genutzt werden. Für das Design von Wissensordnungen ist bedeutsam, dass innerhalb der Begriffsleiter einer Abstraktionsrelation ein Begriff alle seine Teile (sowie alle transitiven Teile der Teile) an seine Unterbegriffe vererbt.
    Type
    a
  13. Reimer, U.: Empfehlungssysteme (2023) 0.01
    0.014035223 = product of:
      0.028070446 = sum of:
        0.026425181 = weight(_text_:von in 519) [ClassicSimilarity], result of:
          0.026425181 = score(doc=519,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.20633863 = fieldWeight in 519, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0546875 = fieldNorm(doc=519)
        0.0016452647 = product of:
          0.004935794 = sum of:
            0.004935794 = weight(_text_:a in 519) [ClassicSimilarity], result of:
              0.004935794 = score(doc=519,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.089176424 = fieldWeight in 519, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=519)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Mit der wachsenden Informationsflut steigen die Anforderungen an Informationssysteme, aus der Menge potenziell relevanter Information die in einem bestimmten Kontext relevanteste zu selektieren. Empfehlungssysteme spielen hier eine besondere Rolle, da sie personalisiert - d. h. kontextspezifisch und benutzerindividuell - relevante Information herausfiltern können. Definition: Ein Empfehlungssystem empfiehlt einem Benutzer bzw. einer Benutzerin in einem definierten Kontext aus einer gegebenen Menge von Empfehlungsobjekten eine Teilmenge als relevant. Empfehlungssysteme machen Benutzer auf Objekte aufmerksam, die sie möglicherweise nie gefunden hätten, weil sie nicht danach gesucht hätten oder sie in der schieren Menge an insgesamt relevanter Information untergegangen wären.
    Type
    a
  14. Ye, Z.; Huang, J.X.: ¬A learning to rank approach for quality-aware pseudo-relevance feedback (2016) 0.01
    0.013904001 = product of:
      0.055616003 = sum of:
        0.055616003 = product of:
          0.083424 = sum of:
            0.007883408 = weight(_text_:a in 2855) [ClassicSimilarity], result of:
              0.007883408 = score(doc=2855,freq=10.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.14243183 = fieldWeight in 2855, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2855)
            0.075540595 = weight(_text_:z in 2855) [ClassicSimilarity], result of:
              0.075540595 = score(doc=2855,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.29484767 = fieldWeight in 2855, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2855)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Pseudo relevance feedback (PRF) has shown to be effective in ad hoc information retrieval. In traditional PRF methods, top-ranked documents are all assumed to be relevant and therefore treated equally in the feedback process. However, the performance gain brought by each document is different as showed in our preliminary experiments. Thus, it is more reasonable to predict the performance gain brought by each candidate feedback document in the process of PRF. We define the quality level (QL) and then use this information to adjust the weights of feedback terms in these documents. Unlike previous work, we do not make any explicit relevance assumption and we go beyond just selecting "good" documents for PRF. We propose a quality-based PRF framework, in which two quality-based assumptions are introduced. Particularly, two different strategies, relevance-based QL (RelPRF) and improvement-based QL (ImpPRF) are presented to estimate the QL of each feedback document. Based on this, we select a set of heterogeneous document-level features and apply a learning approach to evaluate the QL of each feedback document. Extensive experiments on standard TREC (Text REtrieval Conference) test collections show that our proposed model performs robustly and outperforms strong baselines significantly.
    Type
    a
  15. Chen, Z.; Fu, B.: On the complexity of Rocchio's similarity-based relevance feedback algorithm (2007) 0.01
    0.0137652885 = product of:
      0.055061154 = sum of:
        0.055061154 = product of:
          0.08259173 = sum of:
            0.007051134 = weight(_text_:a in 578) [ClassicSimilarity], result of:
              0.007051134 = score(doc=578,freq=8.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.12739488 = fieldWeight in 578, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=578)
            0.075540595 = weight(_text_:z in 578) [ClassicSimilarity], result of:
              0.075540595 = score(doc=578,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.29484767 = fieldWeight in 578, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=578)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive learning algorithm from examples in searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d**2(log d + log n)) over the discretized vector space {0, ... , n - 1 }**d when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier (q, 0) over {0, ... , n - 1 }d can be improved to, at most, 1 + 2k (n - 1) (log d + log(n - 1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound Omega((d über 2)log n) on its learning complexity over the Boolean vector space {0,1}**d.
    Type
    a
  16. Lanvent, A.: Licht im Daten Chaos (2004) 0.01
    0.013741862 = product of:
      0.027483724 = sum of:
        0.02615415 = weight(_text_:von in 2806) [ClassicSimilarity], result of:
          0.02615415 = score(doc=2806,freq=6.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.20422229 = fieldWeight in 2806, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.03125 = fieldNorm(doc=2806)
        0.0013295747 = product of:
          0.003988724 = sum of:
            0.003988724 = weight(_text_:a in 2806) [ClassicSimilarity], result of:
              0.003988724 = score(doc=2806,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.072065435 = fieldWeight in 2806, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2806)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Spätestens bei der Suche nach E-Mails, PDF-Dokumenten oder Bildern mit Texten kapituliert die Windows-Suche. Vier von neun Desktop-Suchtools finden dagegen beinahe jede verborgene Datei.
    Content
    "Bitte suchen Sie alle Unterlagen, die im PC zum Ibelshäuser-Vertrag in Sprockhövel gespeichert sind. Finden Sie alles, was wir haben - Dokumente, Tabellen, Präsentationen, Scans, E-Mails. Und erledigen Sie das gleich! « Wer diese Aufgabe an das Windows-eigene Suchmodul vergibt, wird zwangsläufig enttäuscht. Denn das Betriebssystem beherrscht weder die formatübergreifende Recherche noch die Kontextsuche, die für solche komplexen Aufträge nötig sind. Professionelle Desktop-Suchmaschinen erledigen Aufgaben dieser Art jedoch im Handumdrehen - genauer gesagt in einer einzigen Sekunde. Spitzenprogramme wie Global Brain benötigen dafür nicht einmal umfangreiche Abfrageformulare. Es genügt, einen Satz im Eingabefeld zu formulieren, der das Thema der gewünschten Dokumente eingrenzt. Dabei suchen die Programme über alle Laufwerke, die sich auf dem System einbinden lassen - also auch im Netzwerk-Ordner (Shared Folder), sofern dieser freigegeben wurde. Allen Testkandidaten - mit Ausnahme von Search 32 - gemeinsam ist, dass sie weitaus bessere Rechercheergebnisse abliefern als Windows, deutlich schneller arbeiten und meist auch in den Online-Postfächern stöbern. Wer schon öfter vergeblich über die Windows-Suche nach wichtigen Dokumenten gefahndet hat, kommt angesichts der Qualität der Search-Engines kaum mehr um die Anschaffung eines Desktop-Suchtools herum. Aber Microsoft will nachbessern. Für den Windows-XP-Nachfolger Longhorn wirbt der Hersteller vor allem mit dem Hinweis auf das neue Dateisystem WinFS, das sämtliche Files auf der Festplatte über Meta-Tags indiziert und dem Anwender damit lange Suchläufe erspart. So sollen sich anders als bei Windows XP alle Dateien zu bestimmten Themen in wenigen Sekunden auflisten lassen - unabhängig vom Format und vom physikalischen Speicherort der Files. Für die Recherche selbst ist dann weder der Dateiname noch das Erstelldatum ausschlaggebend. Anhand der kontextsensitiven Suche von WinFS kann der Anwender einfach einen Suchbefehl wie »Vertragsabschluss mit Firma XYZ, Neunkirchen/Saar« eingeben, der dann ohne Umwege zum Ziel führt."
    Type
    a
  17. Tsai, C.-F.; Hu, Y.-H.; Chen, Z.-Y.: Factors affecting rocchio-based pseudorelevance feedback in image retrieval (2015) 0.01
    0.013421084 = product of:
      0.053684335 = sum of:
        0.053684335 = product of:
          0.0805265 = sum of:
            0.0049859053 = weight(_text_:a in 1607) [ClassicSimilarity], result of:
              0.0049859053 = score(doc=1607,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.090081796 = fieldWeight in 1607, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1607)
            0.075540595 = weight(_text_:z in 1607) [ClassicSimilarity], result of:
              0.075540595 = score(doc=1607,freq=2.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.29484767 = fieldWeight in 1607, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1607)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Pseudorelevance feedback (PRF) was proposed to solve the limitation of relevance feedback (RF), which is based on the user-in-the-loop process. In PRF, the top-k retrieved images are regarded as PRF. Although the PRF set contains noise, PRF has proven effective for automatically improving the overall retrieval result. To implement PRF, the Rocchio algorithm has been considered as a reasonable and well-established baseline. However, the performance of Rocchio-based PRF is subject to various representation choices (or factors). In this article, we examine these factors that affect the performance of Rocchio-based PRF, including image-feature representation, the number of top-ranked images, the weighting parameters of Rocchio, and similarity measure. We offer practical insights on how to optimize the performance of Rocchio-based PRF by choosing appropriate representation choices. Our extensive experiments on NUS-WIDE-LITE and Caltech 101 + Corel 5000 data sets show that the optimal feature representation is color moment + wavelet texture in terms of retrieval efficiency and effectiveness. Other representation choices are that using top-20 ranked images as pseudopositive and pseudonegative feedback sets with the equal weight (i.e., 0.5) by the correlation and cosine distance functions can produce the optimal retrieval result.
    Type
    a
  18. Mayr, P.: Bradfordizing als Re-Ranking-Ansatz in Literaturinformationssystemen (2011) 0.01
    0.012030191 = product of:
      0.024060382 = sum of:
        0.022650154 = weight(_text_:von in 4292) [ClassicSimilarity], result of:
          0.022650154 = score(doc=4292,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.17686167 = fieldWeight in 4292, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.046875 = fieldNorm(doc=4292)
        0.001410227 = product of:
          0.004230681 = sum of:
            0.004230681 = weight(_text_:a in 4292) [ClassicSimilarity], result of:
              0.004230681 = score(doc=4292,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.07643694 = fieldWeight in 4292, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4292)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    In diesem Artikel wird ein Re-Ranking-Ansatz für Suchsysteme vorgestellt, der die Recherche nach wissenschaftlicher Literatur messbar verbessern kann. Das nichttextorientierte Rankingverfahren Bradfordizing wird eingeführt und anschließend im empirischen Teil des Artikels bzgl. der Effektivität für typische fachbezogene Recherche-Topics evaluiert. Dem Bradford Law of Scattering (BLS), auf dem Bradfordizing basiert, liegt zugrunde, dass sich die Literatur zu einem beliebigen Fachgebiet bzw. -thema in Zonen unterschiedlicher Dokumentenkonzentration verteilt. Dem Kernbereich mit hoher Konzentration der Literatur folgen Bereiche mit mittlerer und geringer Konzentration. Bradfordizing sortiert bzw. rankt eine Dokumentmenge damit nach den sogenannten Kernzeitschriften. Der Retrievaltest mit 164 intellektuell bewerteten Fragestellungen in Fachdatenbanken aus den Bereichen Sozial- und Politikwissenschaften, Wirtschaftswissenschaften, Psychologie und Medizin zeigt, dass die Dokumente der Kernzeitschriften signifikant häufiger relevant bewertet werden als Dokumente der zweiten Dokumentzone bzw. den Peripherie-Zeitschriften. Die Implementierung von Bradfordizing und weiteren Re-Rankingverfahren liefert unmittelbare Mehrwerte für den Nutzer.
    Type
    a
  19. Mutschke, P.: Autorennetzwerke : Verfahren zur Netzwerkanalyse als Mehrwertdienste für Informationssysteme (2004) 0.01
    0.011325077 = product of:
      0.04530031 = sum of:
        0.04530031 = weight(_text_:von in 4050) [ClassicSimilarity], result of:
          0.04530031 = score(doc=4050,freq=8.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.35372335 = fieldWeight in 4050, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.046875 = fieldNorm(doc=4050)
      0.25 = coord(1/4)
    
    Abstract
    Virtuelle Bibliotheken enthalten eine Fülle an Informationen, die in ihrer Vielfalt und Tiefe von Standardsuchmaschinen nicht erschöpfend erfasst wird. Der Arbeitsbericht informiert über Entwicklungen am IZ, die darauf abzielen, Wissen über das Interaktionsgeschehen in wissenschaftlichen Communities und den sozialen Status ihrer Akteure für das Retrieval auszunutzen. Grundlage hierfür sind soziale Netzwerke, die sich durch Kooperation der wissenschaftlichen Akteure konstituieren und in den Dokumenten der Datenbasis z.B. als Koautorbeziehungen repräsentiert sind (Autorennetzwerke). Die in dem Bericht beschriebenen Studien zur Small-World-Topologie von Autorennetzwerken zeigen, dass diese Netzwerke ein erhebliches Potential für Informationssysteme haben. Der Bericht diskutiert Szenarios, die beschreiben, wie Autorennetzwerke und hier insbesondere das Konzept der Akteurszentralität für die Informationssuche in Datenbanken sinnvoll genutzt werden können. Kernansatz dieser Retrievalmodelle ist die Suche nach Experten und das Ranking von Dokumenten auf der Basis der Zentralität von Autoren in Autorennetzwerken.
  20. Losada, D.E.; Barreiro, A.: Emebedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.01
    0.010773733 = product of:
      0.043094933 = sum of:
        0.043094933 = product of:
          0.0646424 = sum of:
            0.012613453 = weight(_text_:a in 1422) [ClassicSimilarity], result of:
              0.012613453 = score(doc=1422,freq=10.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.22789092 = fieldWeight in 1422, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
            0.052028947 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
              0.052028947 = score(doc=1422,freq=2.0), product of:
                0.16809508 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04800207 = queryNorm
                0.30952093 = fieldWeight in 1422, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
    Date
    22. 3.2003 19:27:23
    Type
    a

Years

Languages

Types

  • a 337
  • m 9
  • el 8
  • x 7
  • s 4
  • r 3
  • p 2
  • More… Less…