Search (5 results, page 1 of 1)

  • theme_ss:"Retrievalalgorithmen"
  • theme_ss:"Retrievalstudien"
  • type_ss:"a"
  1. Kwok, K.L.: A network approach to probabilistic information retrieval (1995) 0.01
    0.008113983 = product of:
      0.020284958 = sum of:
        0.010812371 = weight(_text_:a in 5696) [ClassicSimilarity], result of:
          0.010812371 = score(doc=5696,freq=14.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.20223314 = fieldWeight in 5696, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=5696)
        0.009472587 = product of:
          0.018945174 = sum of:
            0.018945174 = weight(_text_:information in 5696) [ClassicSimilarity], result of:
              0.018945174 = score(doc=5696,freq=8.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.23274569 = fieldWeight in 5696, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5696)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
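    The explain tree above is Lucene's ClassicSimilarity (TF-IDF) breakdown: each term contributes queryWeight x fieldWeight, where queryWeight = idf x queryNorm, fieldWeight = tf x idf x fieldNorm, tf(freq) = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1)), and coord scales each sum by the fraction of query clauses matched. A minimal Python sketch that reproduces this entry's score from the values shown (variable names are illustrative):

      import math

      def idf(doc_freq, max_docs):
          # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          # queryWeight * fieldWeight = (idf*queryNorm) * (sqrt(freq)*idf*fieldNorm)
          query_weight = idf(doc_freq, max_docs) * query_norm
          field_weight = math.sqrt(freq) * idf(doc_freq, max_docs) * field_norm
          return query_weight * field_weight

      # Values taken from the explain tree for doc 5696 above.
      MAX_DOCS, QUERY_NORM = 44218, 0.046368346
      w_a = term_score(14.0, 37942, MAX_DOCS, QUERY_NORM, 0.046875)    # ~0.010812
      w_info = term_score(8.0, 20772, MAX_DOCS, QUERY_NORM, 0.046875)  # ~0.018945
      # coord factors from the tree: 1/2 on the inner sum, 2/5 overall.
      print(round((w_a + 0.5 * w_info) * 2.0 / 5.0, 9))  # ~0.008113983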
    
    Abstract
    Shows how probabilistic information retrieval based on document components may be implemented as a feedforward (feedbackward) artificial neural network. The network supports the adaptation of connection weights, as well as the growing of new edges between queries and terms, based on user relevance feedback data for training, and it reflects query modification and expansion in information retrieval. A learning rule is applied that can also be viewed as supporting sequential learning with a harmonic-sequence learning rate. Experimental results with four standard small collections and a large Wall Street Journal collection show that small query expansion levels of about 30 terms can achieve most of the gains in the low-recall, high-precision region, while larger expansion levels continue to provide gains in the high-recall, low-precision region of a precision-recall curve.
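    The "harmonic-sequence learning rate" means the step size for the k-th feedback update decays as 1/k, so early relevance judgments move the query-term edge weights strongly while later ones fine-tune. A minimal sketch of such an update, assuming a simple additive weight per query-term edge (names and update rule are illustrative, not Kwok's exact formulation):

      def feedback_update(weights, terms, relevant, step):
          # Reinforce edges to terms of a judged-relevant document,
          # weaken them for a judged-non-relevant one.
          direction = 1.0 if relevant else -1.0
          for t in terms:
              weights[t] = weights.get(t, 0.0) + direction * step

      weights = {}
      judgments = [({"network", "retrieval"}, True),
                   ({"market", "stocks"}, False),
                   ({"probabilistic", "retrieval"}, True)]
      # Harmonic sequence: the k-th judged document is applied with rate 1/k.
      for k, (terms, rel) in enumerate(judgments, start=1):
          feedback_update(weights, terms, rel, step=1.0 / k)
      print(weights)  # "retrieval" reinforced twice; "market"/"stocks" negative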
    Source
    ACM transactions on information systems. 13(1995) no.3, S.324-353
    Type
    a
  2. Rokaya, M.; Atlam, E.; Fuketa, M.; Dorji, T.C.; Aoe, J.-i.: Ranking of field association terms using Co-word analysis (2008) 0.01
    0.008113983 = product of:
      0.020284958 = sum of:
        0.010812371 = weight(_text_:a in 2060) [ClassicSimilarity], result of:
          0.010812371 = score(doc=2060,freq=14.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.20223314 = fieldWeight in 2060, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2060)
        0.009472587 = product of:
          0.018945174 = sum of:
            0.018945174 = weight(_text_:information in 2060) [ClassicSimilarity], result of:
              0.018945174 = score(doc=2060,freq=8.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.23274569 = fieldWeight in 2060, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2060)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Information retrieval involves finding desired information in a store of information or a database. In this paper, co-word analysis is used to rank a selected sample of field association (FA) terms; based on this ranking, a better arrangement of search results can be achieved. Experimental results were obtained using 41 MB of data (7,660 documents) in the field of sports. The corpus was collected from CNN's sports coverage and was chosen to be distributed over 11 sub-fields of sports. In the experiments, the average precision increased by 18.3% after applying the proposed arrangement scheme with absolute frequency as the term weight, and by 17.2% after applying the scheme with a weight based on a "TF*IDF" formula.
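    The two weighting variants compared here are standard: absolute frequency counts raw occurrences across the corpus, while TF*IDF discounts terms that occur in many documents. A minimal sketch of both weights over a toy corpus (the paper's exact TF*IDF-based formula is not reproduced; this is the textbook form):

      import math

      docs = [["goal", "match", "goal", "referee"],
              ["match", "serve", "ace"],
              ["goal", "league", "match"]]

      def absolute_frequency(term):
          # Weight = total occurrences of the term across the corpus.
          return sum(doc.count(term) for doc in docs)

      def tf_idf(term, doc):
          # Term frequency in one document times log(N / document frequency).
          df = sum(1 for d in docs if term in d)
          return doc.count(term) * math.log(len(docs) / df)

      print(absolute_frequency("goal"))          # 3
      print(round(tf_idf("serve", docs[1]), 3))  # 1 * ln(3/1) = 1.099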
    Source
    Information processing and management. 44(2008) no.2, S.738-755
    Type
    a
  3. Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003) 0.01
    0.007827929 = product of:
      0.019569822 = sum of:
        0.005448922 = weight(_text_:a in 1734) [ClassicSimilarity], result of:
          0.005448922 = score(doc=1734,freq=2.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.10191591 = fieldWeight in 1734, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=1734)
        0.014120899 = product of:
          0.028241798 = sum of:
            0.028241798 = weight(_text_:information in 1734) [ClassicSimilarity], result of:
              0.028241798 = score(doc=1734,freq=10.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.3469568 = fieldWeight in 1734, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1734)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The amount of data on the Internet continues to grow rapidly. With it grows the need for high-quality information retrieval services for orientation and problem-oriented search. Deciding whether to use or procure information retrieval software requires meaningful evaluation results. This article presents recent developments in the evaluation of information retrieval systems and shows the trend toward specialization and diversification of evaluation studies, which increase the realism of the results. The focus is on the retrieval of domain-specific texts, web pages, and multimedia objects.
    Source
    Information - Wissenschaft und Praxis. 54(2003) H.4, S.203-210
    Type
    a
  4. Kantor, P.; Kim, M.H.; Ibraev, U.; Atasoy, K.: Estimating the number of relevant documents in enormous collections (1999) 0.01
    0.006203569 = product of:
      0.015508923 = sum of:
        0.0076151006 = weight(_text_:a in 6690) [ClassicSimilarity], result of:
          0.0076151006 = score(doc=6690,freq=10.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.14243183 = fieldWeight in 6690, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.007893822 = product of:
          0.015787644 = sum of:
            0.015787644 = weight(_text_:information in 6690) [ClassicSimilarity], result of:
              0.015787644 = score(doc=6690,freq=8.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.19395474 = fieldWeight in 6690, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6690)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    In assessing information retrieval systems, it is important not only to know the precision of the retrieved set, but also to compare the number of retrieved relevant items to the total number of relevant items. For large collections, such as the TREC test collections or the World Wide Web, it is not possible to enumerate the entire set of relevant documents. If the retrieved documents are evaluated, a variant of the statistical "capture-recapture" method can be used to estimate the total number of relevant documents, provided that the several retrieval systems used are sufficiently independent. We show that the underlying signal detection model supporting such an analysis can be extended in two ways. First, assuming that there are two distinct performance characteristics (corresponding to the chance of retrieving a relevant, and of retrieving a given non-relevant, document), we show that if there are three or more independent systems available, it is possible to estimate the number of relevant documents without actually having to decide whether each individual document is relevant. We report applications of this three-system method to the TREC data, leading to the conclusion that the independence assumptions are not satisfied. We then extend the model to a multi-system, multi-problem model and show that it is possible to include statistical dependencies of all orders in the model and to determine the number of relevant documents for each of the problems in the set. An application to the TREC setting is presented.
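    The two-system version of the capture-recapture idea is the Lincoln-Petersen estimator: if one system retrieves n1 relevant documents, a second retrieves n2, and m relevant documents are retrieved by both, then under independence the total number of relevant documents is roughly n1*n2/m. A minimal sketch with made-up counts (not figures from the paper):

      def lincoln_petersen(n1, n2, m):
          # Estimate total relevant documents from two independent systems:
          # n1, n2 = relevant documents each system retrieved, m = overlap.
          if m == 0:
              raise ValueError("no overlap: estimator undefined")
          return n1 * n2 / m

      # Hypothetical counts: system A finds 80 relevant documents, system B
      # finds 60, and 24 relevant documents are found by both.
      print(lincoln_petersen(80, 60, 24))  # 200.0 estimated relevant documents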
    Imprint
    Medford, NJ : Information Today
    Series
    Proceedings of the American Society for Information Science; vol.36
    Source
    Knowledge: creation, organization and use. Proceedings of the 62nd Annual Meeting of the American Society for Information Science, 31.10.-4.11.1999. Ed.: L. Woods
    Type
    a
  5. López-Pujalte, C.; Guerrero-Bote, V.P.; Moya-Anegón, F. de: Order-based fitness functions for genetic algorithms applied to relevance feedback (2003) 0.01
    0.0056654564 = product of:
      0.014163641 = sum of:
        0.01021673 = weight(_text_:a in 5154) [ClassicSimilarity], result of:
          0.01021673 = score(doc=5154,freq=18.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.19109234 = fieldWeight in 5154, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5154)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 5154) [ClassicSimilarity], result of:
              0.007893822 = score(doc=5154,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 5154, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5154)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    López-Pujalte and Guerrero-Bote test a relevance feedback genetic algorithm while varying its order-based fitness functions, generating a function based upon the Ide dec-hi method as a baseline. Using the non-zero weighted term types assigned to the query, and to the initially retrieved set of documents, as genes, a chromosome of equal length is created for each. The algorithm is provided with the chromosomes for judged relevant documents, for judged irrelevant documents, and for the irrelevant documents with their terms negated. The algorithm uses random selection over all possible genes, but gives greater likelihood to those with higher fitness values. When the fittest chromosome of a previous population is eliminated, it is restored, while the least fit of the new population is eliminated in its stead. A crossover probability of 0.8 and a mutation probability of 0.2 were used, with 20 generations. Three fitness functions were utilized: the Horng and Yeh function, which takes into account the position of relevant documents, and two new functions, one based on accumulating the cosine similarity for retrieved documents, the other on stored fixed-recall-interval precisions. The Cranfield collection was used, with the first 15 documents retrieved from 33 queries chosen to have at least 3 relevant documents in the first 15 and at least 5 relevant documents not initially retrieved. Precision was calculated at fixed recall levels using the residual collection method, which removes viewed documents. One of the three functions improved the original retrieval by 127 percent, while the Ide dec-hi method provided a 120 percent improvement.
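    As described, the chromosomes are query-term weight vectors, parents are selected with likelihood proportional to fitness, crossover is applied with probability 0.8 and mutation with probability 0.2, the best chromosome is preserved across generations, and the loop runs for 20 generations. A minimal sketch of that loop, with simplified elitism and a placeholder fitness function (not one of the paper's three order-based functions):

      import random

      P_CROSS, P_MUT, GENERATIONS = 0.8, 0.2, 20

      def fitness(chrom):
          # Placeholder: the paper's functions score how well the weights
          # rank the judged-relevant documents; here we just sum the genes.
          return sum(chrom)

      def select(pop):
          # Roulette-wheel selection: higher fitness, higher likelihood.
          w = [max(fitness(c), 1e-9) for c in pop]
          return random.choices(pop, weights=w, k=2)

      def crossover(a, b):
          if random.random() < P_CROSS:
              cut = random.randrange(1, len(a))
              return a[:cut] + b[cut:]
          return a[:]

      def mutate(c):
          return [g + random.uniform(-0.1, 0.1) if random.random() < P_MUT else g
                  for g in c]

      random.seed(0)
      pop = [[random.random() for _ in range(8)] for _ in range(10)]
      for _ in range(GENERATIONS):
          best = max(pop, key=fitness)  # elitism: the fittest always survives
          pop = [mutate(crossover(*select(pop)))
                 for _ in range(len(pop) - 1)] + [best]
      print(round(fitness(max(pop, key=fitness)), 3))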
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.2, S.152-160
    Type
    a