Search (356 results, page 1 of 18)

  • Filter: theme_ss:"Retrievalalgorithmen"
  1. Moura, E.S. de; Fernandes, D.; Ribeiro-Neto, B.; Silva, A.S. da; Gonçalves, M.A.: Using structural information to improve search in Web collections (2010) 0.11
    0.109603465 = product of:
      0.14613795 = sum of:
        0.065153345 = weight(_text_:da in 4119) [ClassicSimilarity], result of:
          0.065153345 = score(doc=4119,freq=2.0), product of:
            0.20483522 = queryWeight, product of:
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.04269026 = queryNorm
            0.31807688 = fieldWeight in 4119, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.046875 = fieldNorm(doc=4119)
        0.07677797 = product of:
          0.15355594 = sum of:
            0.15355594 = weight(_text_:silva in 4119) [ClassicSimilarity], result of:
              0.15355594 = score(doc=4119,freq=2.0), product of:
                0.31446302 = queryWeight, product of:
                  7.3661537 = idf(docFreq=75, maxDocs=44218)
                  0.04269026 = queryNorm
                0.48831162 = fieldWeight in 4119, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.3661537 = idf(docFreq=75, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4119)
          0.5 = coord(1/2)
        0.004206628 = product of:
          0.008413256 = sum of:
            0.008413256 = weight(_text_:a in 4119) [ClassicSimilarity], result of:
              0.008413256 = score(doc=4119,freq=10.0), product of:
                0.049223874 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04269026 = queryNorm
                0.1709182 = fieldWeight in 4119, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4119)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
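    Code sketch
    The tree above is Lucene ClassicSimilarity "explain" output; the same arithmetic recurs in every score breakdown on this page. Below is a minimal sketch that reproduces the top-level score of result 1 from the values shown. The function names are ours; only the constants come from the tree, and idf follows Lucene's classic 1 + ln(maxDocs/(docFreq+1)).

      import math

      def idf(doc_freq, max_docs=44218):
          # Lucene ClassicSimilarity: 1 + ln(maxDocs / (docFreq + 1))
          return 1 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, query_norm, field_norm):
          tf = math.sqrt(freq)                       # 1.4142135 for freq=2.0
          query_weight = idf(doc_freq) * query_norm
          field_weight = tf * idf(doc_freq) * field_norm
          return query_weight * field_weight

      qn = 0.04269026                                # queryNorm from the tree
      w_da    = term_score(2.0,  990,   qn, 0.046875)        # 0.065153345
      w_silva = term_score(2.0,  75,    qn, 0.046875) * 0.5  # coord(1/2)
      w_a     = term_score(10.0, 37942, qn, 0.046875) * 0.5  # coord(1/2)
      print((w_da + w_silva + w_a) * 0.75)           # coord(3/4), ~0.109603465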
    
    Abstract
    In this work, we investigate the problem of using the block structure of Web pages to improve ranking results. Starting with basic intuitions provided by the concepts of term frequency (TF) and inverse document frequency (IDF), we propose nine block-weight functions to distinguish the impact of term occurrences inside page blocks, instead of inside whole pages. These are then used to compute a modified BM25 ranking function. Using four distinct Web collections, we ran extensive experiments to compare our block-weight ranking formulas with two baselines: (a) a BM25 ranking applied to full pages, and (b) a BM25 ranking that takes only the best blocks into account. Our results suggest that our block-weighting ranking method is superior to all baselines across all collections we used, with average gains in precision ranging from 5% to 20%.
    Type
    a
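    Code sketch
    The abstract above modifies BM25 with block-level term weights. A minimal sketch of that idea follows, assuming hand-picked weights for three page blocks; the paper's nine block-weight functions and its parameter settings are not reproduced here.

      import math

      def bm25_term(tf, df, n_docs, doc_len, avg_len, k1=1.2, b=0.75):
          idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
          return idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))

      def block_weighted_tf(occurrences, block_weight):
          # occurrences: {block_id: term frequency inside that block}
          return sum(block_weight(blk) * f for blk, f in occurrences.items())

      # Illustrative assumption: title/body/footer blocks with made-up weights.
      weights = {"title": 2.0, "body": 1.0, "footer": 0.2}
      tf = block_weighted_tf({"title": 1, "body": 3, "footer": 2}, weights.get)
      print(bm25_term(tf, df=990, n_docs=44218, doc_len=350, avg_len=300))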
  2. Costa Carvalho, A. da; Rossi, C.; Moura, E.S. de; Silva, A.S. da; Fernandes, D.: LePrEF: Learn to precompute evidence fusion for efficient query evaluation (2012) 0.11
    0.10929237 = product of:
      0.14572316 = sum of:
        0.07678396 = weight(_text_:da in 278) [ClassicSimilarity], result of:
          0.07678396 = score(doc=278,freq=4.0), product of:
            0.20483522 = queryWeight, product of:
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.04269026 = queryNorm
            0.37485722 = fieldWeight in 278, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.0390625 = fieldNorm(doc=278)
        0.063981645 = product of:
          0.12796329 = sum of:
            0.12796329 = weight(_text_:silva in 278) [ClassicSimilarity], result of:
              0.12796329 = score(doc=278,freq=2.0), product of:
                0.31446302 = queryWeight, product of:
                  7.3661537 = idf(docFreq=75, maxDocs=44218)
                  0.04269026 = queryNorm
                0.40692633 = fieldWeight in 278, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.3661537 = idf(docFreq=75, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=278)
          0.5 = coord(1/2)
        0.004957558 = product of:
          0.009915116 = sum of:
            0.009915116 = weight(_text_:a in 278) [ClassicSimilarity], result of:
              0.009915116 = score(doc=278,freq=20.0), product of:
                0.049223874 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04269026 = queryNorm
                0.20142901 = fieldWeight in 278, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=278)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    State-of-the-art search engine ranking methods combine several distinct sources of relevance evidence to produce a high-quality ranking of results for each query. The fusion of information is currently done at query-processing time, which has a direct effect on the response time of search systems. Previous research also shows that an alternative way to improve search efficiency in textual databases is to precompute term impacts at indexing time. In this article, we propose a novel alternative to precompute term impacts, providing a generic framework for combining any distinct set of sources of evidence by using a machine-learning technique. This method retains the advantage of producing high-quality results but avoids the cost of combining evidence at query-processing time. Our method, called Learn to Precompute Evidence Fusion (LePrEF), uses genetic programming to compute a unified precomputed impact value for each term found in each document prior to query processing, at indexing time. Compared with previous research on precomputing term impacts, our method offers the advantage of providing a generic framework to precompute impacts using any set of relevance evidence on any text collection, whereas previous approaches do not. The precomputed impact values are indexed and used later to compute the document ranking at query-processing time. By doing so, our method effectively reduces query processing to simple additions of these impacts. We show that this approach, while leading to results comparable to state-of-the-art ranking methods, can also lead to a significant decrease in computational costs during query processing.
    Type
    a
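    Code sketch
    As the abstract describes, LePrEF moves evidence fusion to indexing time so that query processing reduces to adding one precomputed impact per (term, document) pair. Below is a sketch of the query-time side only, with made-up index contents; learning the impacts (genetic programming in the paper) is not shown.

      from collections import defaultdict

      index = {  # term -> {doc_id: precomputed unified impact}
          "fusion":  {"d1": 2.4, "d2": 0.7},
          "ranking": {"d1": 1.1, "d3": 1.9},
      }

      def rank(query_terms):
          scores = defaultdict(float)
          for t in query_terms:
              for doc, impact in index.get(t, {}).items():
                  scores[doc] += impact       # just additions at query time
          return sorted(scores.items(), key=lambda kv: -kv[1])

      print(rank(["fusion", "ranking"]))      # d1 first (~3.5), then d3, d2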
  3. Archuby, C.G.: Interfaces de recuperación para catálogos en línea con salidas ordenadas por probable relevancia (2000) 0.06
    0.055862173 = product of:
      0.11172435 = sum of:
        0.10858891 = weight(_text_:da in 5727) [ClassicSimilarity], result of:
          0.10858891 = score(doc=5727,freq=2.0), product of:
            0.20483522 = queryWeight, product of:
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.04269026 = queryNorm
            0.5301281 = fieldWeight in 5727, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.078125 = fieldNorm(doc=5727)
        0.0031354348 = product of:
          0.0062708696 = sum of:
            0.0062708696 = weight(_text_:a in 5727) [ClassicSimilarity], result of:
              0.0062708696 = score(doc=5727,freq=2.0), product of:
                0.049223874 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04269026 = queryNorm
                0.12739488 = fieldWeight in 5727, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5727)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    Ciencia da informacao. 29(2000) no.3, S.5-13
    Type
    a
  4. Reimer, U.: Empfehlungssysteme (2023) 0.04
    0.039103523 = product of:
      0.078207046 = sum of:
        0.07601224 = weight(_text_:da in 519) [ClassicSimilarity], result of:
          0.07601224 = score(doc=519,freq=2.0), product of:
            0.20483522 = queryWeight, product of:
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.04269026 = queryNorm
            0.3710897 = fieldWeight in 519, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.0546875 = fieldNorm(doc=519)
        0.0021948046 = product of:
          0.004389609 = sum of:
            0.004389609 = weight(_text_:a in 519) [ClassicSimilarity], result of:
              0.004389609 = score(doc=519,freq=2.0), product of:
                0.049223874 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04269026 = queryNorm
                0.089176424 = fieldWeight in 519, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=519)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    With the growing flood of information, information systems face increasing demands to select, from the mass of potentially relevant information, the information that is most relevant in a given context. Recommender systems play a special role here, since they can filter out relevant information in a personalized way, i.e. context-specifically and individually per user. Definition: A recommender system recommends to a user, in a defined context, a subset of a given set of recommendation objects as relevant. Recommender systems draw users' attention to objects they might never have found, because they would not have searched for them or because the objects would have been lost in the sheer mass of altogether relevant information.
    Type
    a
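    Code sketch
    A toy content-based instance of the definition above: from a given set of objects, select the subset most relevant to one user in one context. The bag-of-words profiles and item vectors are our assumption; the chapter surveys recommender systems more broadly.

      import math

      def cosine(u, v):
          dot = sum(u[k] * v.get(k, 0.0) for k in u)
          nu = math.sqrt(sum(x * x for x in u.values()))
          nv = math.sqrt(sum(x * x for x in v.values()))
          return dot / (nu * nv) if nu and nv else 0.0

      items = {
          "a": {"retrieval": 1.0, "ranking": 1.0},
          "b": {"cooking": 1.0},
          "c": {"ranking": 1.0, "learning": 1.0},
      }
      user_profile = {"ranking": 1.0, "retrieval": 0.5}  # from past interactions

      def recommend(profile, k=2):
          ranked = sorted(items, key=lambda i: cosine(profile, items[i]),
                          reverse=True)
          return ranked[:k]

      print(recommend(user_profile))  # ['a', 'c']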
  5. Silva, R.M.; Gonçalves, M.A.; Veloso, A.: ¬A Two-stage active learning method for learning to rank (2014) 0.03
    0.034817066 = product of:
      0.06963413 = sum of:
        0.063981645 = product of:
          0.12796329 = sum of:
            0.12796329 = weight(_text_:silva in 1184) [ClassicSimilarity], result of:
              0.12796329 = score(doc=1184,freq=2.0), product of:
                0.31446302 = queryWeight, product of:
                  7.3661537 = idf(docFreq=75, maxDocs=44218)
                  0.04269026 = queryNorm
                0.40692633 = fieldWeight in 1184, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.3661537 = idf(docFreq=75, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1184)
          0.5 = coord(1/2)
        0.005652486 = product of:
          0.011304972 = sum of:
            0.011304972 = weight(_text_:a in 1184) [ClassicSimilarity], result of:
              0.011304972 = score(doc=1184,freq=26.0), product of:
                0.049223874 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04269026 = queryNorm
                0.22966442 = fieldWeight in 1184, product of:
                  5.0990195 = tf(freq=26.0), with freq of:
                    26.0 = termFreq=26.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1184)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Learning to rank (L2R) algorithms use a labeled training set to generate a ranking model that can later be used to rank new query results. These training sets are costly and laborious to produce, requiring human annotators to assess the relevance or order of the documents in relation to a query. Active learning algorithms are able to reduce the labeling effort by selectively sampling an unlabeled set and choosing data instances that maximize a learning function's effectiveness. In this article, we propose a novel two-stage active learning method for L2R that combines and exploits interesting properties of its constituent parts, thus being effective and practical. In the first stage, an association rule active sampling algorithm is used to select a very small but effective initial training set. In the second stage, a query-by-committee strategy trained with the first-stage set is used to iteratively select more examples until a preset labeling budget is met or a target effectiveness is achieved. We test our method with various LETOR benchmarking data sets and compare it with several baselines to show that it achieves good results using only a small portion of the original training sets.
    Type
    a
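    Code sketch
    The second stage described in the abstract uses a query-by-committee strategy. Below is a sketch of such sampling, assuming vote entropy as the disagreement measure; the committee members are toy stand-in models, and the paper's first-stage association-rule sampler is not shown.

      import math
      from collections import Counter

      def vote_entropy(votes):
          n = len(votes)
          return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())

      def select_batch(committee, unlabeled, batch_size):
          # Rank unlabeled instances by committee disagreement, highest first.
          scored = [(vote_entropy([m(x) for m in committee]), x) for x in unlabeled]
          scored.sort(reverse=True)
          return [x for _, x in scored[:batch_size]]

      committee = [lambda x: x > 2, lambda x: x > 5, lambda x: x > 4]  # toy models
      print(select_batch(committee, unlabeled=[1, 3, 6, 4.5], batch_size=2))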
  6. Wilhelmy, A.: Phonetische Ähnlichkeitssuche in Datenbanken (1991) 0.03
    0.033906925 = product of:
      0.06781385 = sum of:
        0.065153345 = weight(_text_:da in 5684) [ClassicSimilarity], result of:
          0.065153345 = score(doc=5684,freq=2.0), product of:
            0.20483522 = queryWeight, product of:
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.04269026 = queryNorm
            0.31807688 = fieldWeight in 5684, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.046875 = fieldNorm(doc=5684)
        0.0026605048 = product of:
          0.0053210096 = sum of:
            0.0053210096 = weight(_text_:a in 5684) [ClassicSimilarity], result of:
              0.0053210096 = score(doc=5684,freq=4.0), product of:
                0.049223874 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04269026 = queryNorm
                0.10809815 = fieldWeight in 5684, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5684)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In dialogue-driven information retrieval systems (IRS), the interplay between human and computer can be understood, in rough terms, as an iterative process of increasing the precision of the results on the one hand and their recall on the other. This article presents a machine-applicable method that goes back to phonological studies by the linguist Nikolaj S. Trubetzkoy (1890-1938). In its essentials, it can contribute considerably to improving recall. Because it draws the 'similarity neighborhoods' of search terms into the query, it proves advantageous above all for systems with coordinate machine indexing. For alphabetic terms, introducing such a method, which at first is oriented only towards the user, also turns out to be favorable from a technical point of view, since it keeps the number of accesses during search operations low even for large data volumes.
    Type
    a
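    Code sketch
    A rough illustration of searching with a 'similarity neighborhood' of a query term, using a Soundex-style code; this is a much cruder stand-in for the phonological feature system based on Trubetzkoy that the abstract describes.

      def soundex(word):
          codes = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3",
                   "l": "4", "mn": "5", "r": "6"}
          def code(ch):
              return next((d for k, d in codes.items() if ch in k), "")
          word = word.lower()
          out, prev = word[0].upper(), code(word[0])
          for ch in word[1:]:
              c = code(ch)
              if c and c != prev:
                  out += c
              if ch not in "hw":          # h/w do not reset the previous code
                  prev = c
          return (out + "000")[:4]

      names = ["Meyer", "Maier", "Mayr", "Miller", "Meier"]
      query = "Mayer"
      print([n for n in names if soundex(n) == soundex(query)])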
  7. Lanvent, A.: Praxis - Windows-Suche und Indexdienst : Auch Windows kann bei der Suche den Turbo einlegen: mit dem Indexdienst (2004) 0.03
    0.028255772 = product of:
      0.056511544 = sum of:
        0.054294456 = weight(_text_:da in 3316) [ClassicSimilarity], result of:
          0.054294456 = score(doc=3316,freq=2.0), product of:
            0.20483522 = queryWeight, product of:
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.04269026 = queryNorm
            0.26506406 = fieldWeight in 3316, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7981725 = idf(docFreq=990, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3316)
        0.0022170874 = product of:
          0.004434175 = sum of:
            0.004434175 = weight(_text_:a in 3316) [ClassicSimilarity], result of:
              0.004434175 = score(doc=3316,freq=4.0), product of:
                0.049223874 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04269026 = queryNorm
                0.090081796 = fieldWeight in 3316, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3316)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Content
    "Für eine 4-GByte-Festplatte mit mehreren Partitionen sucht Windows XP im Volltextmodus weit über zwei Stunden. Der Indexdienst verkürzt diese Recherchedauer drastisch um mehr als eine Stunde. Im Gegensatz zu den Indizes der kommerziellen Suchwerkzeuge erfasst der Windows-Indexdienst nur Text-, HTML- und OfficeDateien über entsprechend integrierte Dokumentfilter. Da er weder ZIP-Files noch PDFs erkennt und auch keine E-Mails scannt, ist er mit komplexen Anfragen schnell überfordert. Standardmäßig ist der Indexdienst zwar installiert, aber nicht aktiviert. Das erledigt der Anwender über Start/Arbeitsplatz und den Befehl Verwalten aus dem Kontextmenü. In der Computerverwaltung aktiviert der Benutzer den Eintrag Indexdienst und wählt Starten aus dem Kontextmenü. Die zu indizierenden Elemente verwaltet Windows über so genannte Kataloge, mit deren Hilfe der User bestimmt, welche Dateitypen aus welchen Ordnern indiziert werden sollen. Zwar kann der Anwender neben dem Katalog System weitere Kataloge einrichten. Ausreichend ist es aber in den meisten Fällen, dem Katalog System weitere Indizierungsordner über die Befehle Neu/Verzeichnis hinzuzufügen. Klickt der Benutzer dann einen der Indizierungsordner mit der rechten Maustaste an und wählt Alle Tasks/Erneut prüfen (Vollständig), beginnt der mitunter langwierige Indizierungsprozess. Über den Eigenschaften-Dialog lässt sich allerdings der Leistungsverbrauch drosseln. Eine inkrementelle Indizierung, bei der Windows nur neue Elemente im jeweiligen Verzeichnis unter die Lupe nimmt, erreicht der Nutzer über Alle Tasks/Erneut prüfen (inkrementell). Einschalten lässt sich der Indexdienst auch über die Eigenschaften eines Ordners und den Befehl Erweitert/ln-halt für schnelle Dateisuche indizieren. Auskunft über die dem Indexdienst zugeordneten Ordner und Laufwerke erhalten Sie, wenn Sie die WindowsSuche starten und Weitere Optionen/ Andere Suchoptionen/Bevorzugte Einstellungen ändern/Indexdienst verwenden anklicken."
    Type
    a
  8. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03
    0.02564411 = product of:
      0.10257644 = sum of:
        0.10257644 = sum of:
          0.010033391 = weight(_text_:a in 402) [ClassicSimilarity], result of:
            0.010033391 = score(doc=402,freq=2.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.20383182 = fieldWeight in 402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.125 = fieldNorm(doc=402)
          0.09254305 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
            0.09254305 = score(doc=402,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.61904186 = fieldWeight in 402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.125 = fieldNorm(doc=402)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
    Type
    a
  9. Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.02
    0.023347715 = product of:
      0.09339086 = sum of:
        0.09339086 = sum of:
          0.01241569 = weight(_text_:a in 2134) [ClassicSimilarity], result of:
            0.01241569 = score(doc=2134,freq=4.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.25222903 = fieldWeight in 2134, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.109375 = fieldNorm(doc=2134)
          0.08097517 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
            0.08097517 = score(doc=2134,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.5416616 = fieldWeight in 2134, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.109375 = fieldNorm(doc=2134)
      0.25 = coord(1/4)
    
    Date
    30. 3.2001 13:32:22
    Type
    a
  10. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02
    0.022438597 = product of:
      0.08975439 = sum of:
        0.08975439 = sum of:
          0.008779218 = weight(_text_:a in 3445) [ClassicSimilarity], result of:
            0.008779218 = score(doc=3445,freq=2.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.17835285 = fieldWeight in 3445, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.109375 = fieldNorm(doc=3445)
          0.08097517 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
            0.08097517 = score(doc=3445,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.5416616 = fieldWeight in 3445, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.109375 = fieldNorm(doc=3445)
      0.25 = coord(1/4)
    
    Date
    25. 8.2005 17:42:22
    Type
    a
  11. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.02
    0.019233081 = product of:
      0.076932326 = sum of:
        0.076932326 = sum of:
          0.0075250445 = weight(_text_:a in 58) [ClassicSimilarity], result of:
            0.0075250445 = score(doc=58,freq=2.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.15287387 = fieldWeight in 58, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.09375 = fieldNorm(doc=58)
          0.069407284 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
            0.069407284 = score(doc=58,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.46428138 = fieldWeight in 58, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.09375 = fieldNorm(doc=58)
      0.25 = coord(1/4)
    
    Date
    14. 6.2015 22:12:44
    Type
    a
  12. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.02
    0.019233081 = product of:
      0.076932326 = sum of:
        0.076932326 = sum of:
          0.0075250445 = weight(_text_:a in 2051) [ClassicSimilarity], result of:
            0.0075250445 = score(doc=2051,freq=2.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.15287387 = fieldWeight in 2051, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.09375 = fieldNorm(doc=2051)
          0.069407284 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
            0.069407284 = score(doc=2051,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.46428138 = fieldWeight in 2051, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.09375 = fieldNorm(doc=2051)
      0.25 = coord(1/4)
    
    Date
    14. 6.2015 22:12:56
    Type
    a
  13. Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.01
    0.0143723 = product of:
      0.0574892 = sum of:
        0.0574892 = sum of:
          0.011217674 = weight(_text_:a in 1422) [ClassicSimilarity], result of:
            0.011217674 = score(doc=1422,freq=10.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.22789092 = fieldWeight in 1422, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0625 = fieldNorm(doc=1422)
          0.046271525 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
            0.046271525 = score(doc=1422,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.30952093 = fieldWeight in 1422, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=1422)
      0.25 = coord(1/4)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations, along with the use of such classical notions, is a promising characteristic for IR systems. The approach proposed here has been efficiently implemented, and experiments against test collections are presented.
    Date
    22. 3.2003 19:27:23
    Type
    a
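    Code sketch
    A generic sketch of combining the two classical notions named in the abstract: each query term contributes its IDF, scaled by its best similarity to any document term. The similarity table is a toy stand-in for whatever term-similarity source the model uses; the logic-based representation itself is not reproduced.

      sim = {("auto", "car"): 0.8}    # symmetric toy term-similarity table

      def term_sim(a, b):
          if a == b:
              return 1.0
          return max(sim.get((a, b), 0.0), sim.get((b, a), 0.0))

      def score(query, doc, idf):
          # Each query term contributes idf * best similarity to a doc term.
          return sum(idf[q] * max(term_sim(q, d) for d in doc) for q in query)

      idf = {"car": 2.1, "hire": 3.0}
      print(score(["car", "hire"], ["auto", "hire", "berlin"], idf))  # ~4.68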
  14. Faloutsos, C.: Signature files (1992) 0.01
    0.014076229 = product of:
      0.056304917 = sum of:
        0.056304917 = sum of:
          0.010033391 = weight(_text_:a in 3499) [ClassicSimilarity], result of:
            0.010033391 = score(doc=3499,freq=8.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.20383182 = fieldWeight in 3499, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0625 = fieldNorm(doc=3499)
          0.046271525 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
            0.046271525 = score(doc=3499,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.30952093 = fieldWeight in 3499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=3499)
      0.25 = coord(1/4)
    
    Abstract
    Presents a survey and discussion of signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods, provides a classification of the signature methods that have appeared in the literature, describes the main representatives of each class together with their relative advantages and drawbacks, and gives a list of applications as well as commercial and university prototypes that use the signature approach.
    Date
    7. 5.1999 15:22:48
    Type
    a
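    Code sketch
    A minimal superimposed-coding signature file, the main idea the survey classifies: each word sets a few bits in a fixed-width signature, a document's signature is the OR of its words' signatures, and a query matches by bitwise containment. Parameters are illustrative; matches can be false drops that need verification against the text.

      import hashlib

      WIDTH, BITS_PER_WORD = 64, 3

      def word_sig(word):
          sig = 0
          for i in range(BITS_PER_WORD):
              h = hashlib.md5(f"{word}:{i}".encode()).digest()
              sig |= 1 << (int.from_bytes(h[:4], "big") % WIDTH)
          return sig

      def doc_sig(words):
          s = 0
          for w in words:
              s |= word_sig(w)
          return s

      docs = {"d1": ["signature", "file", "retrieval"], "d2": ["ranking", "model"]}
      sigs = {d: doc_sig(ws) for d, ws in docs.items()}
      q = doc_sig(["signature", "retrieval"])
      print([d for d, s in sigs.items() if q & s == q])  # candidate documents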
  15. Bornmann, L.; Mutz, R.: From P100 to P100' : a new citation-rank approach (2014) 0.01
    0.014076229 = product of:
      0.056304917 = sum of:
        0.056304917 = sum of:
          0.010033391 = weight(_text_:a in 1431) [ClassicSimilarity], result of:
            0.010033391 = score(doc=1431,freq=8.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.20383182 = fieldWeight in 1431, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0625 = fieldNorm(doc=1431)
          0.046271525 = weight(_text_:22 in 1431) [ClassicSimilarity], result of:
            0.046271525 = score(doc=1431,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.30952093 = fieldWeight in 1431, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=1431)
      0.25 = coord(1/4)
    
    Abstract
    Properties of a percentile-based rating scale needed in bibliometrics are formulated. Based on these properties, P100 was recently introduced as a new citation-rank approach (Bornmann, Leydesdorff, & Wang, 2013). In this paper, we conceptualize P100 and propose an improvement which we call P100'. Advantages and disadvantages of citation-rank indicators are noted.
    Date
    22. 8.2014 17:05:18
    Type
    a
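    Code sketch
    A sketch of a plain percentile-based citation rank, for illustration only; the P100 and P100' indicators discussed in the paper define the rank distribution differently.

      def percentile_ranks(citations):
          # Percentile of each paper's citation count within its reference set.
          ranked = sorted(citations)
          n = len(citations)
          return [100.0 * ranked.index(c) / (n - 1) for c in citations]

      papers = [3, 0, 12, 7, 3, 50]       # citation counts in one field/year
      print(percentile_ranks(papers))     # [20.0, 0.0, 80.0, 60.0, 20.0, 100.0]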
  16. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01
    0.013341552 = product of:
      0.053366207 = sum of:
        0.053366207 = sum of:
          0.00709468 = weight(_text_:a in 5108) [ClassicSimilarity], result of:
            0.00709468 = score(doc=5108,freq=4.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.14413087 = fieldWeight in 5108, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0625 = fieldNorm(doc=5108)
          0.046271525 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
            0.046271525 = score(doc=5108,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.30952093 = fieldWeight in 5108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=5108)
      0.25 = coord(1/4)
    
    Date
    20. 1.2007 18:30:22
    Type
    a
  17. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.01
    0.012022653 = product of:
      0.04809061 = sum of:
        0.04809061 = sum of:
          0.0076030265 = weight(_text_:a in 1319) [ClassicSimilarity], result of:
            0.0076030265 = score(doc=1319,freq=6.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.1544581 = fieldWeight in 1319, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1319)
          0.040487584 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
            0.040487584 = score(doc=1319,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.2708308 = fieldWeight in 1319, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1319)
      0.25 = coord(1/4)
    
    Abstract
    Keyword-based querying is an immediate and efficient way to specify and retrieve the information a user seeks. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes integrating two existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web.
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia
    Type
    a
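    Code sketch
    The abstract integrates query expansion with relevance feedback. Below is a sketch of the classic Rocchio update, one standard way to do that; the paper's concept-based variant is more elaborate, and the parameter values here are just conventional defaults.

      from collections import defaultdict

      def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
          new_q = defaultdict(float)
          for t, w in query.items():
              new_q[t] += alpha * w
          for doc in relevant:                  # pull towards relevant docs
              for t, w in doc.items():
                  new_q[t] += beta * w / len(relevant)
          for doc in nonrelevant:               # push away from nonrelevant docs
              for t, w in doc.items():
                  new_q[t] -= gamma * w / len(nonrelevant)
          return {t: w for t, w in new_q.items() if w > 0}

      q = {"web": 1.0, "search": 1.0}
      rel = [{"web": 0.8, "ranking": 0.6}]
      nonrel = [{"cooking": 0.9}]
      print(rocchio(q, rel, nonrel))            # query expanded with "ranking"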
  18. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.01
    0.01197742 = product of:
      0.04790968 = sum of:
        0.04790968 = sum of:
          0.007011046 = weight(_text_:a in 2591) [ClassicSimilarity], result of:
            0.007011046 = score(doc=2591,freq=10.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.14243183 = fieldWeight in 2591, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
          0.040898636 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
            0.040898636 = score(doc=2591,freq=4.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.27358043 = fieldWeight in 2591, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
      0.25 = coord(1/4)
    
    Abstract
    Purpose: In a system-based approach, replicating the web would require large test collections, and judging the relevance of all documents per topic when creating relevance judgments through human assessors is infeasible. Given the large number of documents that require judgment, human assessors may also introduce errors because of disagreements. The paper aims to discuss these issues.
    Design/methodology/approach: This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods overcome the problem of judging large numbers of documents while avoiding human disagreement errors during the judgment process. The study uses two key factors: the number of occurrences of each document per topic across all system runs, and document rankings.
    Findings: The effectiveness of the proposed method is evaluated using the correlation coefficient of systems ranked by mean average precision under the original Text REtrieval Conference (TREC) relevance judgments and under the pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative that reduces the human effort and disagreement errors involved in generating TREC-like relevance judgments.
    Originality/value: The simple methods proposed in this study improve the correlation coefficient when generating alternate relevance judgments without human assessors, contributing to information retrieval evaluation.
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
    Type
    a
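    Code sketch
    A sketch of the occurrence-counting idea in the abstract: a document retrieved by many system runs for a topic is taken as pseudo-relevant, removing the human assessor. The vote threshold and run data are illustrative assumptions; the paper's ranking-based variants are not shown.

      from collections import Counter

      def pseudo_judgments(runs, pool_depth=100, min_votes=2):
          votes = Counter()
          for run in runs:              # each run: ranked doc ids for one topic
              for doc in run[:pool_depth]:
                  votes[doc] += 1
          return {doc for doc, v in votes.items() if v >= min_votes}

      runs = [["d1", "d2", "d3"], ["d2", "d1", "d4"], ["d2", "d5", "d1"]]
      print(pseudo_judgments(runs))     # {'d1', 'd2'} judged pseudo-relevant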
  19. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.01
    0.0112192985 = product of:
      0.044877194 = sum of:
        0.044877194 = sum of:
          0.004389609 = weight(_text_:a in 3276) [ClassicSimilarity], result of:
            0.004389609 = score(doc=3276,freq=2.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.089176424 = fieldWeight in 3276, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3276)
          0.040487584 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
            0.040487584 = score(doc=3276,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.2708308 = fieldWeight in 3276, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3276)
      0.25 = coord(1/4)
    
    Date
    20. 3.2005 16:23:22
    Type
    a
  20. Witschel, H.F.: Global term weights in distributed environments (2008) 0.01
    0.011164585 = product of:
      0.04465834 = sum of:
        0.04465834 = sum of:
          0.009954698 = weight(_text_:a in 2096) [ClassicSimilarity], result of:
            0.009954698 = score(doc=2096,freq=14.0), product of:
              0.049223874 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.04269026 = queryNorm
              0.20223314 = fieldWeight in 2096, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046875 = fieldNorm(doc=2096)
          0.034703642 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
            0.034703642 = score(doc=2096,freq=2.0), product of:
              0.149494 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04269026 = queryNorm
              0.23214069 = fieldWeight in 2096, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2096)
      0.25 = coord(1/4)
    
    Abstract
    This paper examines the estimation of global term weights (such as IDF) in information retrieval scenarios where a global view on the collection is not available. In particular, the two options of either sampling documents or of using a reference corpus independent of the target retrieval collection are compared using standard IR test collections. In addition, the possibility of pruning term lists based on frequency is evaluated. The results show that very good retrieval performance can be reached when just the most frequent terms of a collection - an "extended stop word list" - are known and all terms which are not in that list are treated equally. However, the list cannot always be fully estimated from a general-purpose reference corpus, but some "domain-specific stop words" need to be added. A good solution for achieving this is to mix estimates from small samples of the target retrieval collection with ones derived from a reference corpus.
    Date
    1. 8.2008 9:44:22
    Type
    a
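    Code sketch
    A sketch of the "extended stop word list" idea from the abstract: estimate document frequencies from a small sample or reference corpus, keep only the most frequent terms, and treat every other term equally. Sample data, list size, and the default weight are made-up assumptions.

      import math
      from collections import Counter

      def build_global_weights(sample_docs, list_size=1000):
          df = Counter(t for doc in sample_docs for t in set(doc))
          n = len(sample_docs)
          top = dict(df.most_common(list_size))
          idf = {t: math.log(n / d) for t, d in top.items()}
          default_idf = math.log(n)     # treat unseen terms as df = 1
          return lambda term: idf.get(term, default_idf)

      sample = [["the", "ranking", "model"], ["the", "model"], ["the", "query"]]
      idf = build_global_weights(sample, list_size=2)
      print(idf("the"), idf("model"), idf("zipf"))  # frequent terms known,
                                                    # all others weighted equally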

Types

  • a 337
  • m 9
  • el 8
  • s 4
  • p 2
  • r 2