Search (74 results, page 1 of 4)

  • theme_ss:"Retrievalalgorithmen"
  1. Efron, M.; Winget, M.: Query polyrepresentation for ranking retrieval systems without relevance judgments (2010) 0.04
    0.038944405 = product of:
      0.19472201 = sum of:
        0.19472201 = weight(_text_:1091 in 3469) [ClassicSimilarity], result of:
          0.19472201 = score(doc=3469,freq=2.0), product of:
            0.35686025 = queryWeight, product of:
              8.231152 = idf(docFreq=31, maxDocs=44218)
              0.04335484 = queryNorm
            0.5456534 = fieldWeight in 3469, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.231152 = idf(docFreq=31, maxDocs=44218)
              0.046875 = fieldNorm(doc=3469)
      0.2 = coord(1/5)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.6, S.1081-1091
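    Editorial note: the scoring trees shown with each hit are Lucene "explain" output for its classic TF-IDF similarity. As a reading aid, here is a minimal Python sketch (ours, not part of the search engine) that recomputes the score of hit no. 1 from the factors shown: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), and score = coord * queryWeight * fieldWeight.

      import math

      def classic_similarity_score(freq, doc_freq, max_docs,
                                   field_norm, query_norm, coord):
          # idf and tf as defined by Lucene's ClassicSimilarity
          idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 8.231152 for docFreq=31
          tf = math.sqrt(freq)                             # 1.4142135 for freq=2.0
          query_weight = idf * query_norm                  # 0.35686025
          field_weight = tf * idf * field_norm             # 0.5456534
          return coord * query_weight * field_weight

      # Reproduces the value reported for hit no. 1:
      print(classic_similarity_score(2.0, 31, 44218, 0.046875, 0.04335484, 1 / 5))
      # -> 0.0389444...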
  2. Koumenides, C.L.; Shadbolt, N.R.: Ranking methods for entity-oriented semantic web search (2014) 0.04
    0.038944405 = product of:
      0.19472201 = sum of:
        0.19472201 = weight(_text_:1091 in 1280) [ClassicSimilarity], result of:
          0.19472201 = score(doc=1280,freq=2.0), product of:
            0.35686025 = queryWeight, product of:
              8.231152 = idf(docFreq=31, maxDocs=44218)
              0.04335484 = queryNorm
            0.5456534 = fieldWeight in 1280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.231152 = idf(docFreq=31, maxDocs=44218)
              0.046875 = fieldNorm(doc=1280)
      0.2 = coord(1/5)
    
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.6, S.1091-1106
  3. Gonnet, G.H.; Snider, T.; Baeza-Yates, R.A.: New indices for text : PAT trees and PAT arrays (1992) 0.04
    0.038422342 = product of:
      0.09605585 = sum of:
        0.05946955 = weight(_text_:t in 3500) [ClassicSimilarity], result of:
          0.05946955 = score(doc=3500,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.34819782 = fieldWeight in 3500, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.0625 = fieldNorm(doc=3500)
        0.036586303 = product of:
          0.07317261 = sum of:
            0.07317261 = weight(_text_:index in 3500) [ClassicSimilarity], result of:
              0.07317261 = score(doc=3500,freq=2.0), product of:
                0.18945041 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.04335484 = queryNorm
                0.3862362 = fieldWeight in 3500, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3500)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    We survey new indices for text, with emphasis on PAT arrays (also called suffix arrays). A PAT array is an index based on a new model of text that does not use the concept of a word and does not need to know the structure of the text.
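    As an illustration of the structure (a sketch under our own naming, not code from the paper): a suffix array is the lexicographically sorted list of suffix start positions, and pattern lookup is binary search over that order.

      import bisect

      def suffix_array(text: str) -> list[int]:
          # Naive O(n^2 log n) construction; production systems use
          # linear-time algorithms such as SA-IS.
          return sorted(range(len(text)), key=lambda i: text[i:])

      def occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
          # All suffixes starting with `pattern` form a contiguous run in
          # the sorted array (key= on bisect requires Python 3.10+).
          m = len(pattern)
          lo = bisect.bisect_left(sa, pattern, key=lambda i: text[i:i + m])
          hi = bisect.bisect_right(sa, pattern, key=lambda i: text[i:i + m])
          return sorted(sa[lo:hi])

      sa = suffix_array("abracadabra")
      print(occurrences("abracadabra", sa, "abra"))  # [0, 7]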
  4. Liddy, E.D.; Diamond, T.; McKenna, M.: DR-LINK in TIPSTER (2000) 0.02
    0.02378782 = product of:
      0.1189391 = sum of:
        0.1189391 = weight(_text_:t in 3907) [ClassicSimilarity], result of:
          0.1189391 = score(doc=3907,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.69639564 = fieldWeight in 3907, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.125 = fieldNorm(doc=3907)
      0.2 = coord(1/5)
    
  5. Niemi, T.; Junkkari, M.; Järvelin, K.; Viita, S.: Advanced query language for manipulating complex entities (2004) 0.02
    0.020814342 = product of:
      0.104071714 = sum of:
        0.104071714 = weight(_text_:t in 4218) [ClassicSimilarity], result of:
          0.104071714 = score(doc=4218,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.60934615 = fieldWeight in 4218, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.109375 = fieldNorm(doc=4218)
      0.2 = coord(1/5)
    
  6. Chang, M.; Poon, C.K.: Efficient phrase querying with common phrase index (2008) 0.02
    0.016463837 = product of:
      0.082319185 = sum of:
        0.082319185 = product of:
          0.16463837 = sum of:
            0.16463837 = weight(_text_:index in 2061) [ClassicSimilarity], result of:
              0.16463837 = score(doc=2061,freq=18.0), product of:
                0.18945041 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.04335484 = queryNorm
                0.8690314 = fieldWeight in 2061, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2061)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    In this paper, we propose a common phrase index as an efficient index structure to support phrase queries in a very large text database. Our structure is an extension of previous index structures for phrases and achieves better query efficiency with modest extra storage cost. Further improvement in efficiency can be attained by implementing our index according to our observation of the dynamic nature of the common word set. In experimental evaluation, a common phrase index using 255 common words has an improvement of about 11% and 62% in query time for the overall and large queries (queries of long phrases), respectively, over an auxiliary nextword index. Moreover, it has only about 19% extra storage cost. Compared with an inverted index, our improvement is about 72% and 87% for the overall and large queries, respectively. We also propose to implement a common phrase index with a dynamic update feature. Our experiments show that more improvement in time efficiency can be achieved.
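    For orientation, a hedged sketch of the baseline the abstract compares against, a nextword index: for every word pair, the positions at which the pair starts, so that a phrase query becomes an aligned intersection of pair postings. All names here are illustrative, not taken from the paper.

      from collections import defaultdict

      def build_nextword_index(docs):
          # docs: {doc_id: [word, ...]}; postings hold (doc_id, start position).
          index = defaultdict(lambda: defaultdict(list))
          for doc_id, words in docs.items():
              for i in range(len(words) - 1):
                  index[words[i]][words[i + 1]].append((doc_id, i))
          return index

      def phrase_query(index, phrase):
          # Intersect consecutive pair postings, shifted to the phrase start.
          hits = set(index[phrase[0]].get(phrase[1], []))
          for k in range(1, len(phrase) - 1):
              hits &= {(d, p - k) for d, p in index[phrase[k]].get(phrase[k + 1], [])}
          return sorted(hits)

      idx = build_nextword_index({1: "to be or not to be".split()})
      print(phrase_query(idx, ["to", "be", "or"]))  # [(1, 0)]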
  7. Ruthven, I.; Lalmas, M.: Selective relevance feedback using term characteristics (1999) 0.01
    0.014867388 = product of:
      0.07433694 = sum of:
        0.07433694 = weight(_text_:t in 3824) [ClassicSimilarity], result of:
          0.07433694 = score(doc=3824,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.43524727 = fieldWeight in 3824, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.078125 = fieldNorm(doc=3824)
      0.2 = coord(1/5)
    
    Source
    Vocabulary as a central concept in digital libraries: interdisciplinary concepts, challenges, and opportunities : proceedings of the Third International Conference on Conceptions of Library and Information Science (COLIS3), Dubrovnik, Croatia, 23-26 May 1999. Ed. by T. Arpanac et al.
  8. Dang, E.K.F.; Luk, R.W.P.; Allan, J.; Ho, K.S.; Chung, K.F.L.; Lee, D.L.: ¬A new context-dependent term weight computed by boost and discount using relevance information (2010) 0.01
    0.012875535 = product of:
      0.06437767 = sum of:
        0.06437767 = weight(_text_:t in 4120) [ClassicSimilarity], result of:
          0.06437767 = score(doc=4120,freq=6.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.37693518 = fieldWeight in 4120, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4120)
      0.2 = coord(1/5)
    
    Abstract
    We studied the effectiveness of a new class of context-dependent term weights for information retrieval. Unlike the traditional term frequency-inverse document frequency (TF-IDF), the new weighting of a term t in a document d depends not only on the occurrence statistics of t alone but also on the terms found within a text window (or "document-context") centered on t. We introduce a Boost and Discount (B&D) procedure which utilizes partial relevance information to compute the context-dependent term weights of query terms according to a logistic regression model. We investigate the effectiveness of the new term weights compared with the context-independent BM25 weights in the setting of relevance feedback. We performed experiments with title queries of the TREC-6, -7, -8, and 2005 collections, comparing the residual Mean Average Precision (MAP) measures obtained using B&D term weights and those obtained by a baseline using BM25 weights. Given either 10 or 20 relevance judgments of the top retrieved documents, using the new term weights yields improvement over the baseline for all collections tested. The MAP obtained with the new weights has relative improvement over the baseline by 3.3 to 15.2%, with statistical significance at the 95% confidence level across all four collections.
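    For orientation only (the B&D computation itself is not reproduced here): the "document-context" of a query term t is a fixed-width window of terms centered on each occurrence of t, as in this hedged sketch, where half_width is an illustrative parameter rather than a value from the paper.

      def context_windows(doc_terms, t, half_width=5):
          # All windows of +/- half_width terms around occurrences of t.
          return [doc_terms[max(0, i - half_width): i + half_width + 1]
                  for i, w in enumerate(doc_terms) if w == t]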
  9. Jacso, P.: Testing the calculation of a realistic h-index in Google Scholar, Scopus, and Web of Science for F. W. Lancaster (2008) 0.01
    0.012099784 = product of:
      0.06049892 = sum of:
        0.06049892 = product of:
          0.12099784 = sum of:
            0.12099784 = weight(_text_:index in 5586) [ClassicSimilarity], result of:
              0.12099784 = score(doc=5586,freq=14.0), product of:
                0.18945041 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.04335484 = queryNorm
                0.63867813 = fieldWeight in 5586, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5586)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    This paper focuses on the practical limitations in the content and software of the databases that are used to calculate the h-index for assessing the publishing productivity and impact of researchers. To celebrate F. W. Lancaster's biological age of seventy-five, and "scientific age" of forty-five, this paper discusses the related features of Google Scholar, Scopus, and Web of Science (WoS), and demonstrates in the latter how a much more realistic and fair h-index can be computed for F. W. Lancaster than the one produced automatically. The cited reference index of the 1945-2007 edition of WoS has, in my estimate, over a hundred million "orphan references" that have no counterpart master records to be attached to, and "stray references" that cite papers which do have master records but cannot be identified by the matching algorithm because of errors of omission and commission in the references of the citing works. Browsing and searching this index can bring up hundreds of additional cited references to the works of an accomplished author that are ignored in the automatic process of calculating the h-index. The partially manual process doubled the h-index value for F. W. Lancaster from 13 to 26, a much more realistic value for an information scientist and professor of his stature.
    Object
    h-index
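    The h-index in question has a one-line definition: the largest h such that the author has at least h publications cited at least h times each. A minimal sketch:

      def h_index(citations):
          # Sort descending; h is the number of ranks r with count >= r.
          cited = sorted(citations, reverse=True)
          return sum(1 for rank, c in enumerate(cited, start=1) if c >= rank)

      print(h_index([48, 33, 26, 26, 14, 8, 2]))  # 6 (illustrative data)

    This also makes the paper's point concrete: citations lost to "orphan" and "stray" references keep papers below the threshold, depressing the computed h.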
  10. Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003) 0.01
    0.01189391 = product of:
      0.05946955 = sum of:
        0.05946955 = weight(_text_:t in 1734) [ClassicSimilarity], result of:
          0.05946955 = score(doc=1734,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.34819782 = fieldWeight in 1734, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.0625 = fieldNorm(doc=1734)
      0.2 = coord(1/5)
    
  11. Behnert, C.; Borst, T.: Neue Formen der Relevanz-Sortierung in bibliothekarischen Informationssystemen : das DFG-Projekt LibRank (2015) 0.01
    0.01189391 = product of:
      0.05946955 = sum of:
        0.05946955 = weight(_text_:t in 5392) [ClassicSimilarity], result of:
          0.05946955 = score(doc=5392,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.34819782 = fieldWeight in 5392, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.0625 = fieldNorm(doc=5392)
      0.2 = coord(1/5)
    
  12. Ruthven, I.; Lalmas, M.; Rijsbergen, K.van: Incorporating user search behavior into relevance feedback (2003) 0.01
    0.010512831 = product of:
      0.05256415 = sum of:
        0.05256415 = weight(_text_:t in 5169) [ClassicSimilarity], result of:
          0.05256415 = score(doc=5169,freq=4.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.3077663 = fieldWeight in 5169, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5169)
      0.2 = coord(1/5)
    
    Abstract
    Ruthven, Lalmas, and van Rijsbergen rank and select terms for query expansion using information gathered on searcher evaluation behavior. Using the TREC Financial Times and Los Angeles Times collections and search topics from TREC-6 placed in simulated work situations, six student subjects each performed three searches on an experimental system and three on a control system, with instructions to search by natural language expression in any way they found comfortable. Searching was analyzed for behavior differences between experimental and control situations, and for effectiveness and perceptions. In three experiments, paired t-tests were the analysis tool; the controls were a no-relevance-feedback system, a standard ranking for automatic expansion, and a standard ranking for interactive expansion, while the experimental systems based ranking upon user information on temporal relevance and partial relevance. Two further experiments compared using user behavior (number assessed relevant and similarity of relevant documents) to choose a query expansion technique against a non-selective technique, and finally the effect of providing the user with knowledge of the process. When partial relevance data and time-of-assessment data were incorporated in term ranking, more relevant documents were recovered in fewer iterations; however, overall retrieval effectiveness was not improved. The subjects nonetheless rated the suggested terms as more useful and used them more heavily. Explanations of what the feedback techniques were doing led to higher use of the techniques.
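    The paired t-test used as the analysis tool compares matched per-topic scores from the experimental and control systems; with SciPy this is a single call. The numbers below are hypothetical, for shape only.

      from scipy import stats

      experimental = [0.42, 0.31, 0.55, 0.47, 0.39, 0.50]  # per-topic scores
      control      = [0.40, 0.28, 0.52, 0.49, 0.35, 0.44]
      t_stat, p_value = stats.ttest_rel(experimental, control)
      print(t_stat, p_value)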
  13. Behnert, C.; Plassmeier, K.; Borst, T.; Lewandowski, D.: Evaluierung von Rankingverfahren für bibliothekarische Informationssysteme (2019) 0.01
    0.010407171 = product of:
      0.052035857 = sum of:
        0.052035857 = weight(_text_:t in 5023) [ClassicSimilarity], result of:
          0.052035857 = score(doc=5023,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.30467308 = fieldWeight in 5023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5023)
      0.2 = coord(1/5)
    
  14. Abu-Salem, H.; Al-Omari, M.; Evens, M.W.: Stemming methodologies over individual query words for an Arabic information retrieval system (1999) 0.01
    0.010226184 = product of:
      0.051130917 = sum of:
        0.051130917 = product of:
          0.102261834 = sum of:
            0.102261834 = weight(_text_:index in 3672) [ClassicSimilarity], result of:
              0.102261834 = score(doc=3672,freq=10.0), product of:
                0.18945041 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.04335484 = queryNorm
                0.5397815 = fieldWeight in 3672, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3672)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Stemming is one of the most important factors that affect the performance of information retrieval systems. This article investigates how to improve the performance of an Arabic information retrieval system by imposing the retrieval method over individual words of a query, depending on the importance of the WORD, the STEM, or the ROOT of the query terms in the database. This method, called Mixed Stemming, computes term importance using a weighting scheme that uses the Term Frequency (TF) and the Inverse Document Frequency (IDF), called TFxIDF. An extended version of the Arabic IRS is designed, implemented, and evaluated to reduce the number of irrelevant documents retrieved. The results of the experiment suggest that the proposed method outperforms the Word index method using the TFxIDF weighting scheme. It also outperforms the Stem index method using the Binary weighting scheme but not the Stem index method using the TFxIDF weighting scheme, and likewise outperforms the Root index method using the Binary weighting scheme but not the Root index method using the TFxIDF weighting scheme.
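    The TFxIDF scheme referred to is the standard one; the WORD, STEM, and ROOT index variants differ only in which token form the frequencies are counted over. A generic sketch (one common formulation, not necessarily the paper's exact variant):

      import math

      def tf_idf(term_freq, doc_freq, n_docs):
          # Weight grows with in-document frequency and with rarity
          # across the collection of n_docs documents.
          return term_freq * math.log(n_docs / doc_freq)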
  15. Moffat, A.; Bell, T.A.H.: In situ generation of compressed inverted files (1995) 0.01
    0.009505401 = product of:
      0.047527004 = sum of:
        0.047527004 = product of:
          0.09505401 = sum of:
            0.09505401 = weight(_text_:index in 2648) [ClassicSimilarity], result of:
              0.09505401 = score(doc=2648,freq=6.0), product of:
                0.18945041 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.04335484 = queryNorm
                0.50173557 = fieldWeight in 2648, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2648)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    An inverted index stores, for each term that appears in a collection of documents, a list of the document numbers containing that term. Such an index is indispensable when Boolean or informal ranked queries are to be answered. Construction of the index is, however, a non-trivial task. Simple methods using in-memory data structures cannot be used for large collections because they require too much random-access storage, and traditional disc-based methods require large amounts of temporary file space. Describes a new indexing algorithm designed to create large compressed inverted indexes in situ. It makes use of simple compression codes for the positive integers and an in-place external multi-way merge sort. The new technique has been used to invert a 2-gigabyte text collection in under 4 hours, using less than 40 megabytes of temporary disc space and less than 20 megabytes of main memory.
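    A toy version of the two ingredients the abstract names (in-memory, which is exactly what the paper avoids at scale): the term-to-document-numbers mapping, and one simple compression code for positive integers (Elias gamma) applied to the gaps between document numbers.

      from collections import defaultdict

      def build_inverted_index(docs):
          # docs: {doc_id: text}; for each term, the sorted doc numbers.
          index = defaultdict(list)
          for doc_id in sorted(docs):
              for term in sorted(set(docs[doc_id].lower().split())):
                  index[term].append(doc_id)
          return dict(index)

      def elias_gamma(n):
          # len(bin(n))-1 zeros, then the binary digits of n (n >= 1).
          b = bin(n)[2:]
          return "0" * (len(b) - 1) + b

      def compress_postings(doc_ids):
          # Code the gaps between successive doc numbers, not the numbers.
          gaps = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
          return "".join(elias_gamma(g) for g in gaps)

      idx = build_inverted_index({1: "new indices for text", 3: "text indices"})
      print(compress_postings(idx["indices"]))  # [1, 3] -> gaps [1, 2] -> "1010"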
  16. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.01
    0.009398371 = product of:
      0.046991855 = sum of:
        0.046991855 = product of:
          0.09398371 = sum of:
            0.09398371 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.09398371 = score(doc=402,freq=2.0), product of:
                0.15182126 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04335484 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  17. Bar-Ilan, J.; Levene, M.: ¬The hw-rank : an h-index variant for ranking web pages (2015) 0.01
    0.009146576 = product of:
      0.04573288 = sum of:
        0.04573288 = product of:
          0.09146576 = sum of:
            0.09146576 = weight(_text_:index in 1694) [ClassicSimilarity], result of:
              0.09146576 = score(doc=1694,freq=2.0), product of:
                0.18945041 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.04335484 = queryNorm
                0.48279524 = fieldWeight in 1694, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1694)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
  18. Rajashekar, T.B.; Croft, W.B.: Combining automatic and manual index representations in probabilistic retrieval (1995) 0.01
    0.009054649 = product of:
      0.04527324 = sum of:
        0.04527324 = product of:
          0.09054648 = sum of:
            0.09054648 = weight(_text_:index in 2418) [ClassicSimilarity], result of:
              0.09054648 = score(doc=2418,freq=4.0), product of:
                0.18945041 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.04335484 = queryNorm
                0.4779429 = fieldWeight in 2418, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2418)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Results from research in information retrieval have suggested that significant improvements in retrieval effectiveness can be obtained by combining results from multiple index representations, query formulations, and search strategies. The inference net model of retrieval, which was designed from this point of view, treats information retrieval as an evidential reasoning process in which multiple sources of evidence about document and query content are combined to estimate relevance probabilities. Uses a system based on this model to study the retrieval effectiveness benefits of combining the types of document and query information that are found in typical commercial databases and information services. The results indicate that substantial real benefits are possible.
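    A deliberately simplified sketch of the combination idea (the paper uses an inference network; plain averaging here merely stands in for it): each representation scores documents independently, and the combined estimate aggregates the per-representation scores.

      def combine_evidence(scores_per_representation):
          # scores_per_representation: list of {doc_id: score}; the combined
          # belief in a document is the mean of its per-representation scores.
          docs = set().union(*scores_per_representation)
          n = len(scores_per_representation)
          return {d: sum(rep.get(d, 0.0) for rep in scores_per_representation) / n
                  for d in docs}

      manual = {101: 0.8, 102: 0.2}     # e.g. manual index terms
      automatic = {101: 0.6, 103: 0.5}  # e.g. automatic full-text terms
      print(combine_evidence([manual, automatic]))  # doc 101 ranks highest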
  19. Sakai, T.: On the reliability of information retrieval metrics based on graded relevance (2007) 0.01
    0.008920433 = product of:
      0.044602163 = sum of:
        0.044602163 = weight(_text_:t in 910) [ClassicSimilarity], result of:
          0.044602163 = score(doc=910,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.26114836 = fieldWeight in 910, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.046875 = fieldNorm(doc=910)
      0.2 = coord(1/5)
    
  20. Deerwester, S.; Dumais, S.; Landauer, T.; Furnas, G.; Beck, L.: Improving information retrieval with latent semantic indexing (1988) 0.01
    0.008920433 = product of:
      0.044602163 = sum of:
        0.044602163 = weight(_text_:t in 2396) [ClassicSimilarity], result of:
          0.044602163 = score(doc=2396,freq=2.0), product of:
            0.17079243 = queryWeight, product of:
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.04335484 = queryNorm
            0.26114836 = fieldWeight in 2396, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9394085 = idf(docFreq=2338, maxDocs=44218)
              0.046875 = fieldNorm(doc=2396)
      0.2 = coord(1/5)
    

Languages

  • e 64
  • d 9
  • m 1

Types

  • a 63
  • m 6
  • el 2
  • r 2
  • s 2
  • x 1