Search (126 results, page 1 of 7)

  • theme_ss:"Retrievalalgorithmen"
  1. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.44
    0.4414408 = product of:
      0.7945934 = sum of:
        0.06423235 = weight(_text_:wide in 5777) [ClassicSimilarity], result of:
          0.06423235 = score(doc=5777,freq=4.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.4153836 = fieldWeight in 5777, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.049281355 = weight(_text_:web in 5777) [ClassicSimilarity], result of:
          0.049281355 = score(doc=5777,freq=8.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.43268442 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.18117946 = weight(_text_:suchmaschine in 5777) [ClassicSimilarity], result of:
          0.18117946 = score(doc=5777,freq=12.0), product of:
            0.19733392 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.034900077 = queryNorm
            0.9181365 = fieldWeight in 5777, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.33258343 = weight(_text_:mathematisches in 5777) [ClassicSimilarity], result of:
          0.33258343 = score(doc=5777,freq=8.0), product of:
            0.29588324 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.034900077 = queryNorm
            1.1240361 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.16731678 = weight(_text_:modell in 5777) [ClassicSimilarity], result of:
          0.16731678 = score(doc=5777,freq=8.0), product of:
            0.2098649 = queryWeight, product of:
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.034900077 = queryNorm
            0.79725945 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.5555556 = coord(5/9)
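    The arithmetic in this breakdown can be reproduced directly. A minimal sketch in Python, assuming Lucene's ClassicSimilarity formulas (tf = sqrt(freq); idf = ln(maxDocs/(docFreq+1)) + 1; fieldWeight = tf * idf * fieldNorm; each term contributes queryWeight * fieldWeight; the sum is scaled by coord). The constants below are copied from the explain tree for doc 5777; only the helper names are mine:

      import math

      MAX_DOCS = 44218
      QUERY_NORM = 0.034900077               # copied from the explain tree

      def idf(doc_freq):
          # e.g. idf(1430) = 4.4307585, matching the output above
          return math.log(MAX_DOCS / (doc_freq + 1)) + 1

      def term_score(freq, doc_freq, field_norm):
          query_weight = idf(doc_freq) * QUERY_NORM
          field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
          return query_weight * field_weight

      # (freq, docFreq, fieldNorm) of the five matching terms in doc 5777:
      # wide, web, suchmaschine, mathematisches, modell
      terms = [(4, 1430, 0.046875), (8, 4597, 0.046875), (12, 420, 0.046875),
               (8, 24, 0.046875), (8, 293, 0.046875)]

      score = sum(term_score(*t) for t in terms) * 5 / 9   # coord(5/9)
      print(round(score, 7))                 # ~0.4414408, the score shown above

    The same arithmetic reproduces every score breakdown in this result list; only the per-term constants change.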
    
    LCSH
    Web search engines
    RSWK
    Suchmaschine / Information Retrieval
    World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
  2. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.16
    0.15889259 = product of:
      0.3575083 = sum of:
        0.023231456 = weight(_text_:web in 7) [ClassicSimilarity], result of:
          0.023231456 = score(doc=7,freq=4.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.2039694 = fieldWeight in 7, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.09862161 = weight(_text_:suchmaschine in 7) [ClassicSimilarity], result of:
          0.09862161 = score(doc=7,freq=8.0), product of:
            0.19733392 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.034900077 = queryNorm
            0.4997702 = fieldWeight in 7, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.15678133 = weight(_text_:mathematisches in 7) [ClassicSimilarity], result of:
          0.15678133 = score(doc=7,freq=4.0), product of:
            0.29588324 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.034900077 = queryNorm
            0.5298757 = fieldWeight in 7, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.07887389 = weight(_text_:modell in 7) [ClassicSimilarity], result of:
          0.07887389 = score(doc=7,freq=4.0), product of:
            0.2098649 = queryWeight, product of:
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.034900077 = queryNorm
            0.37583172 = fieldWeight in 7, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.44444445 = coord(4/9)
    
    LCSH
    Web search engines
    RSWK
    Suchmaschine / Information Retrieval
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
  3. Stock, M.; Stock, W.G.: Internet-Suchwerkzeuge im Vergleich (IV) : Relevance Ranking nach "Popularität" von Webseiten: Google (2001) 0.05
    0.054021418 = product of:
      0.16206425 = sum of:
        0.045419127 = weight(_text_:wide in 5771) [ClassicSimilarity], result of:
          0.045419127 = score(doc=5771,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.29372054 = fieldWeight in 5771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
        0.04267891 = weight(_text_:web in 5771) [ClassicSimilarity], result of:
          0.04267891 = score(doc=5771,freq=6.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.37471575 = fieldWeight in 5771, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
        0.07396621 = weight(_text_:suchmaschine in 5771) [ClassicSimilarity], result of:
          0.07396621 = score(doc=5771,freq=2.0), product of:
            0.19733392 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.034900077 = queryNorm
            0.37482765 = fieldWeight in 5771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
      0.33333334 = coord(3/9)
    
    Abstract
    In our retrieval test of search tools on the World Wide Web (Password 11/2000), the search engine Google came out on top. Compared with other search engines, Google relies hardly at all on computational linguistics but instead on algorithms derived from the particular characteristics of Web documents. The core of its information-statistical technique is the "PageRank" method (named after its developer, Larry Page), which computes the "popularity" of pages from the hypertext structure of the Web on the basis of their incoming and outgoing links. Google impresses with intuitively understandable search screens and a number of very useful extras, such as display of a page's rank, highlighting, search within a page, search within a result set, and so on, all housed in its own toolbar inside the browser. Much like RealNames, Google sells search terms through its "AdWords" product. After what is now a series of four Password articles comparing Internet search tools, we close with an assessment. How is the state of the art of directories and search engines to be judged from an information science perspective? Are "typical" Internet users, who as a rule are not information professionals, adequately served? And can information professionals, too, profit from these search tools?
  4. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.04
    0.037938055 = product of:
      0.11381416 = sum of:
        0.052988984 = weight(_text_:wide in 1319) [ClassicSimilarity], result of:
          0.052988984 = score(doc=1319,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.342674 = fieldWeight in 1319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.049792062 = weight(_text_:web in 1319) [ClassicSimilarity], result of:
          0.049792062 = score(doc=1319,freq=6.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.43716836 = fieldWeight in 1319, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.011033117 = product of:
          0.03309935 = sum of:
            0.03309935 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
              0.03309935 = score(doc=1319,freq=2.0), product of:
                0.12221412 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.034900077 = queryNorm
                0.2708308 = fieldWeight in 1319, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1319)
          0.33333334 = coord(1/3)
      0.33333334 = coord(3/9)
    
    Abstract
    Keyword-based querying has been an immediate and efficient way to specify and retrieve related information that the user inquires about. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes an idea to integrate two existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web.
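    The abstract names the two classical building blocks without giving the authors' formulas. Below is a minimal sketch of how the two techniques interlock, assuming the standard Rocchio feedback formula rather than the paper's own method; the weights and toy term vectors are illustrative:

      from collections import Counter

      ALPHA, BETA, GAMMA = 1.0, 0.75, 0.15     # conventional Rocchio weights

      def rocchio(query, relevant, nonrelevant):
          # move the query toward judged-relevant documents and away
          # from judged-non-relevant ones
          new_q = Counter({t: ALPHA * w for t, w in query.items()})
          for doc in relevant:
              for t, w in doc.items():
                  new_q[t] += BETA * w / len(relevant)
          for doc in nonrelevant:
              for t, w in doc.items():
                  new_q[t] -= GAMMA * w / len(nonrelevant)
          return {t: w for t, w in new_q.items() if w > 0}

      q = {"search": 1.0, "engine": 1.0}
      rel = [{"search": 0.8, "engine": 0.5, "ranking": 0.9}]
      nonrel = [{"search": 0.2, "market": 0.7}]
      print(rocchio(q, rel, nonrel))
      # {'search': 1.57, 'engine': 1.375, 'ranking': 0.675}

    Terms pulled in from the relevant documents (here "ranking") act as expansion terms for the reformulated query.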
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia
  5. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.03
    0.029571362 = product of:
      0.13307112 = sum of:
        0.122038014 = weight(_text_:suchmaschine in 3276) [ClassicSimilarity], result of:
          0.122038014 = score(doc=3276,freq=4.0), product of:
            0.19733392 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.034900077 = queryNorm
            0.6184341 = fieldWeight in 3276, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
        0.011033117 = product of:
          0.03309935 = sum of:
            0.03309935 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.03309935 = score(doc=3276,freq=2.0), product of:
                0.12221412 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.034900077 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.33333334 = coord(1/3)
      0.22222222 = coord(2/9)
    
    Abstract
    In classical information retrieval, various methods were developed for ranking and for searching a homogeneous, unstructured document collection. The success of the Google search engine has shown that searching an inhomogeneous but interlinked document collection such as the Internet can be very effective when the links between documents are taken into account. Among the concepts realized by the Google search engine is a method for ranking search results (PageRank), which is briefly explained in this article. The article also covers the concepts behind a system called CiteSeer, which automatically indexes bibliographic references (Autonomous Citation Indexing, ACI). The latter turns a set of unconnected scientific documents into an interlinked collection and enables the use of ranking methods based on those employed by Google.
    Date
    20. 3.2005 16:23:22
  6. Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.02
    0.019836226 = product of:
      0.08926302 = sum of:
        0.036961015 = weight(_text_:web in 6) [ClassicSimilarity], result of:
          0.036961015 = score(doc=6,freq=18.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.32451332 = fieldWeight in 6, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
        0.052302007 = weight(_text_:suchmaschine in 6) [ClassicSimilarity], result of:
          0.052302007 = score(doc=6,freq=4.0), product of:
            0.19733392 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.034900077 = queryNorm
            0.26504317 = fieldWeight in 6, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.22222222 = coord(2/9)
    
    Abstract
    Why doesn't your home page appear on the first page of search results, even when you query your own name? How do other Web pages always appear at the top? What creates these powerful rankings? And how? The first book ever about the science of Web page rankings, "Google's PageRank and Beyond" supplies the answers to these and other questions and more. The book serves two very different audiences: the curious science reader and the technical computational reader. The chapters build in mathematical sophistication, so that the first five are accessible to the general academic reader. While other chapters are much more mathematical in nature, each one contains something for both audiences. For example, the authors include entertaining asides such as how search engines make money and how the Great Firewall of China influences research. The book includes an extensive background chapter designed to help readers learn more about the mathematics of search engines, and it contains several MATLAB codes and links to sample Web data sets. The philosophy throughout is to encourage readers to experiment with the ideas and algorithms in the text. Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided. It includes: many illustrative examples and entertaining asides; MATLAB code; an accessible and informal style; and a complete and self-contained section for mathematics review.
    Content
    Contents: Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix Chapter 5. Parameters in the PageRank Model: 5.1 The alpha Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E Chapter 6. The Sensitivity of PageRank: 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling
    Chapter 9. Accelerating the Computation of PageRank: 9.1 An Adaptive Power Method - 9.2 Extrapolation - 9.3 Aggregation - 9.4 Other Numerical Methods Chapter 10. Updating the PageRank Vector: 10.1 The Two Updating Problems and their History - 10.2 Restarting the Power Method - 10.3 Approximate Updating Using Approximate Aggregation - 10.4 Exact Aggregation - 10.5 Exact vs. Approximate Aggregation - 10.6 Updating with Iterative Aggregation - 10.7 Determining the Partition - 10.8 Conclusions Chapter 11. The HITS Method for Ranking Webpages: 11.1 The HITS Algorithm - 11.2 HITS Implementation - 11.3 HITS Convergence - 11.4 HITS Example - 11.5 Strengths and Weaknesses of HITS - 11.6 HITS's Relationship to Bibliometrics - 11.7 Query-Independent HITS - 11.8 Accelerating HITS - 11.9 HITS Sensitivity Chapter 12. Other Link Methods for Ranking Webpages: 12.1 SALSA - 12.2 Hybrid Ranking Methods - 12.3 Rankings based on Traffic Flow Chapter 13. The Future of Web Information Retrieval: 13.1 Spam - 13.2 Personalization - 13.3 Clustering - 13.4 Intelligent Agents - 13.5 Trends and Time-Sensitive Search - 13.6 Privacy and Censorship - 13.7 Library Classification Schemes - 13.8 Data Fusion Chapter 14. Resources for Web Information Retrieval: 14.1 Resources for Getting Started - 14.2 Resources for Serious Study Chapter 15. The Mathematics Guide: 15.1 Linear Algebra - 15.2 Perron-Frobenius Theory - 15.3 Markov Chains - 15.4 Perron Complementation - 15.5 Stochastic Complementation - 15.6 Censoring - 15.7 Aggregation - 15.8 Disaggregation
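    Chapter 7's reformulation is compact enough to sketch. Assuming the book's notation (H the row-stochastic hyperlink matrix, S its dangling-node correction, alpha the damping factor, v the teleportation vector), the PageRank vector solves pi^T (I - alpha*S) = (1 - alpha) v^T, so a direct linear solve can replace the power method. The three-page toy graph is mine, not from the book:

      import numpy as np

      alpha = 0.85
      H = np.array([[0, 0.5, 0.5],      # page 0 links to pages 1 and 2
                    [0, 0,   1  ],      # page 1 links to page 2
                    [1, 0,   0  ]])     # page 2 links back to page 0
      S = H                             # no dangling nodes here, so S = H
      v = np.full(3, 1 / 3)             # uniform teleportation vector

      # transpose the stationarity equation and solve it directly
      pi = np.linalg.solve(np.eye(3) - alpha * S.T, (1 - alpha) * v)
      print(pi.round(4))                # [0.3878 0.2148 0.3974], sums to 1

    Page 2, which both other pages link to, collects the largest share; the power method of Chapter 4 converges to the same vector.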
    RSWK
    Google / Web-Seite / Rangstatistik (HEBIS)
    Google / Suchmaschine / Ranking (BVB)
  7. Ding, Y.; Chowdhury, G.; Foo, S.: Organsising keywords in a Web search environment : a methodology based on co-word analysis (2000) 0.02
    0.017836958 = product of:
      0.08026631 = sum of:
        0.045419127 = weight(_text_:wide in 105) [ClassicSimilarity], result of:
          0.045419127 = score(doc=105,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.29372054 = fieldWeight in 105, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=105)
        0.03484718 = weight(_text_:web in 105) [ClassicSimilarity], result of:
          0.03484718 = score(doc=105,freq=4.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.3059541 = fieldWeight in 105, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=105)
      0.22222222 = coord(2/9)
    
    Abstract
    The rapid development of the Internet and World Wide Web has caused some critical problems for information retrieval. Researchers have made several attempts to solve these problems. Thesauri and subject heading lists, as traditional information retrieval tools, have been criticised as inefficient for tackling these newly emerging problems. This paper proposes an information retrieval tool generated by co-word analysis, comprising keyword clusters with relationships based on the co-occurrences of keywords in the literature. Such a tool can play the role of an associative thesaurus, providing information about the keywords in a domain that may be useful for information searching and query expansion.
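    A minimal sketch of the co-word idea, assuming nothing about the authors' actual association measure or clustering procedure: count keyword co-occurrences across documents, then merge keywords whose co-occurrence count clears a threshold (here 2) into clusters with union-find. The documents and threshold are toy values:

      from collections import Counter
      from itertools import combinations

      docs = [{"retrieval", "ranking", "web"},
              {"retrieval", "ranking"},
              {"thesaurus", "indexing"},
              {"thesaurus", "indexing", "retrieval"}]

      cooc = Counter()                        # co-occurrence counts per pair
      for kws in docs:
          for a, b in combinations(sorted(kws), 2):
              cooc[(a, b)] += 1

      parent = {}                             # union-find over keywords
      def find(x):
          parent.setdefault(x, x)
          while parent[x] != x:
              parent[x] = parent[parent[x]]   # path halving
              x = parent[x]
          return x

      for (a, b), n in cooc.items():
          if n >= 2:                          # strong association: merge
              parent[find(a)] = find(b)

      clusters = {}
      for kw in {k for pair in cooc for k in pair}:
          clusters.setdefault(find(kw), set()).add(kw)
      print(list(clusters.values()))
      # e.g. [{'ranking', 'retrieval'}, {'indexing', 'thesaurus'}, {'web'}]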
  8. Habernal, I.; Konopík, M.; Rohlík, O.: Question answering (2012) 0.02
    0.017836958 = product of:
      0.08026631 = sum of:
        0.045419127 = weight(_text_:wide in 101) [ClassicSimilarity], result of:
          0.045419127 = score(doc=101,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.29372054 = fieldWeight in 101, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=101)
        0.03484718 = weight(_text_:web in 101) [ClassicSimilarity], result of:
          0.03484718 = score(doc=101,freq=4.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.3059541 = fieldWeight in 101, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=101)
      0.22222222 = coord(2/9)
    
    Abstract
    Question answering is an area of information retrieval with the added challenge of applying sophisticated techniques to identify the complex syntactic and semantic relationships present in text, in order to provide a more sophisticated and satisfactory response to the user's information needs. For this reason, the authors see question answering as the next step beyond standard information retrieval. This chapter covers state-of-the-art question answering, focusing on an overview of the systems, techniques and approaches likely to be employed in the next generations of search engines. Special attention is paid to question answering that uses the World Wide Web as its data source and to question answering that exploits the possibilities of the Semantic Web. Considerations about current issues and prospects for promising future research are also provided.
  9. Marcus, S.: Textvergleich mit mehreren Mustern (2005) 0.02
    0.016910639 = product of:
      0.076097876 = sum of:
        0.06973601 = weight(_text_:suchmaschine in 862) [ClassicSimilarity], result of:
          0.06973601 = score(doc=862,freq=4.0), product of:
            0.19733392 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.034900077 = queryNorm
            0.3533909 = fieldWeight in 862, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03125 = fieldNorm(doc=862)
        0.0063618678 = product of:
          0.019085603 = sum of:
            0.019085603 = weight(_text_:29 in 862) [ClassicSimilarity], result of:
              0.019085603 = score(doc=862,freq=2.0), product of:
                0.12276756 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.034900077 = queryNorm
                0.15546128 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03125 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.22222222 = coord(2/9)
    
    Abstract
    Pattern matching is highly relevant in many scientific fields, and its implementations and uses vary with its areas of application. The task inherent in every application of pattern matching is to recognize particular patterns in a large volume of input data, as the German term Mustererkennung (pattern recognition) itself conveys. In medicine, for example, pattern matching is used to examine chromosome strands for particular chromosome sequences. In image processing, it can compare entire images or inspect individual pixels that a pattern identifies. A further field of application is information retrieval, where stored data are searched for relevant information; here too, the relevance of the data sought is judged against a pattern, for example a particular keyword. A comparable procedure is used on the Internet: users who search for significant information by means of a search engine obtain it through a pattern-matching automaton. The demands placed on such an automaton vary with the query submitted to the search engine. In the simplest case the query consists of exactly one keyword; in the more complex case it contains several keywords, and a successful search then requires a concatenation of the words contained in the query. Chapter 2 of this thesis gives a comprehensive introduction to text comparison and defines some fundamental terms. Chapter 3 then presents methods for comparing a text against several patterns, starting with a simple procedure to ease entry into the topic, followed by a complex method of text comparison illustrated with examples.
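    The pattern-matching automaton discussed in the abstract can be sketched compactly. The thesis text above does not name a specific construction, so the sketch below assumes the classic Aho-Corasick automaton, which finds all occurrences of any number of patterns in a single pass over the text:

      from collections import deque

      def build(patterns):
          goto, fail, out = [{}], [0], [set()]           # state 0 = root
          for p in patterns:                             # build the trie
              state = 0
              for ch in p:
                  if ch not in goto[state]:
                      goto.append({}); fail.append(0); out.append(set())
                      goto[state][ch] = len(goto) - 1
                  state = goto[state][ch]
              out[state].add(p)
          queue = deque(goto[0].values())                # BFS for fail links
          while queue:
              s = queue.popleft()
              for ch, t in goto[s].items():
                  queue.append(t)
                  f = fail[s]
                  while f and ch not in goto[f]:
                      f = fail[f]
                  fail[t] = goto[f].get(ch, 0)
                  out[t] |= out[fail[t]]                 # inherit matches
          return goto, fail, out

      def search(text, patterns):
          goto, fail, out = build(patterns)
          state, hits = 0, []
          for i, ch in enumerate(text):
              while state and ch not in goto[state]:
                  state = fail[state]
              state = goto[state].get(ch, 0)
              for p in out[state]:
                  hits.append((i - len(p) + 1, p))       # (start index, pattern)
          return hits

      print(search("suchmaschine und suchanfrage",
                   ["such", "maschine", "anfrage"]))
      # [(0, 'such'), (4, 'maschine'), (17, 'such'), (21, 'anfrage')]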
    Date
    13. 2.2007 20:56:29
  10. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment (1998) 0.02
    0.015568846 = product of:
      0.070059806 = sum of:
        0.045419127 = weight(_text_:wide in 5) [ClassicSimilarity], result of:
          0.045419127 = score(doc=5,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.29372054 = fieldWeight in 5, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5)
        0.024640677 = weight(_text_:web in 5) [ClassicSimilarity], result of:
          0.024640677 = score(doc=5,freq=2.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.21634221 = fieldWeight in 5, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5)
      0.22222222 = coord(2/9)
    
    Abstract
    The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of "authoritative" information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
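    The iteration the abstract describes is short. A minimal sketch of the hub/authority computation (the query-dependent construction of the focused subgraph is omitted; the toy adjacency matrix is illustrative):

      import numpy as np

      def hits(A, iterations=50):
          hubs = np.ones(A.shape[0])
          auths = np.ones(A.shape[0])
          for _ in range(iterations):
              auths = A.T @ hubs          # pointed to by good hubs
              hubs = A @ auths            # point to good authorities
              auths /= np.linalg.norm(auths)
              hubs /= np.linalg.norm(hubs)
          return hubs, auths

      # toy subgraph: pages 0 and 1 both link to pages 2 and 3
      A = np.array([[0., 0., 1., 1.],
                    [0., 0., 1., 1.],
                    [0., 0., 0., 0.],
                    [0., 0., 0., 0.]])
      hubs, auths = hits(A)
      print(auths.round(3))               # pages 2 and 3 emerge as authorities
      print(hubs.round(3))                # pages 0 and 1 emerge as hubs

    Normalized, the two vectors converge to principal eigenvectors of A^T A and A A^T, which is the eigenvector connection the abstract mentions.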
  11. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.01
    0.014922119 = product of:
      0.067149535 = sum of:
        0.026494492 = weight(_text_:wide in 93) [ClassicSimilarity], result of:
          0.026494492 = score(doc=93,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.171337 = fieldWeight in 93, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
        0.040655047 = weight(_text_:web in 93) [ClassicSimilarity], result of:
          0.040655047 = score(doc=93,freq=16.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.35694647 = fieldWeight in 93, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
      0.22222222 = coord(2/9)
    
    Abstract
    Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it? Posed in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly disorganized collection of documents in many different formats. Of course, we're all familiar with search engines (perhaps you found this article using one), so we know that there is a solution. This article will describe Google's PageRank algorithm and how it returns pages from the web's collection of 25 billion documents that match search criteria so well that "google" has become a widely used verb. Most search engines, including Google, continually run an army of computer programs that retrieve pages from the web, index the words in each document, and store this information in an efficient format. Each time a user asks for a web search using a search phrase, such as "search engine," the search engine determines all the pages on the web that contain the words in the search phrase. (Perhaps additional information such as the distance between the words "search" and "engine" will be noted as well.) Here is the problem: Google now claims to index 25 billion pages. Roughly 95% of the text in web pages is composed from a mere 10,000 words. This means that, for most searches, there will be a huge number of pages containing the words in the search phrase. What is needed is a means of ranking the importance of the pages that fit the search criteria so that the pages can be sorted with the most important pages at the top of the list. One way to determine the importance of pages is to use a human-generated ranking. For instance, you may have seen pages that consist mainly of a large number of links to other resources in a particular area of interest. Assuming the person maintaining this page is reliable, the pages referenced are likely to be useful. Of course, the list may quickly fall out of date, and the person maintaining the list may miss some important pages, either unintentionally or as a result of an unstated bias. Google's PageRank algorithm assesses the importance of web pages without human evaluation of the content. In fact, Google feels that the value of its service is largely in its ability to provide unbiased results to search queries; Google claims, "the heart of our software is PageRank." As we'll see, the trick is to ask the web itself to rank the importance of pages.
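    A minimal sketch of the computation the article builds toward, assuming the standard PageRank recurrence with damping factor d = 0.85 (the toy link structure is mine): each page receives a baseline (1-d)/n plus d times the rank its in-neighbors pass along their outlinks, iterated to a fixed point:

      import numpy as np

      def pagerank(links, d=0.85, tol=1e-10):
          n = len(links)
          rank = np.full(n, 1 / n)
          while True:
              new = np.full(n, (1 - d) / n)
              for i, outs in enumerate(links):
                  if outs:
                      for j in outs:                # spread rank over outlinks
                          new[j] += d * rank[i] / len(outs)
                  else:                             # dangling page: spread evenly
                      new += d * rank[i] / n
              if np.abs(new - rank).sum() < tol:
                  return new
              rank = new

      # page 0 links to 1 and 2, page 1 links to 2, page 2 links back to 0
      print(pagerank([[1, 2], [2], [0]]).round(4))  # [0.3878 0.2148 0.3974]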
  12. Picard, J.; Savoy, J.: Enhancing retrieval with hyperlinks : a general model based on propositional argumentation systems (2003) 0.01
    0.014864132 = product of:
      0.06688859 = sum of:
        0.037849274 = weight(_text_:wide in 1427) [ClassicSimilarity], result of:
          0.037849274 = score(doc=1427,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.24476713 = fieldWeight in 1427, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1427)
        0.02903932 = weight(_text_:web in 1427) [ClassicSimilarity], result of:
          0.02903932 = score(doc=1427,freq=4.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.25496176 = fieldWeight in 1427, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1427)
      0.22222222 = coord(2/9)
    
    Abstract
    Fast, effective, and adaptable techniques are needed to automatically organize and retrieve information on the ever-increasing World Wide Web. In that respect, different strategies have been suggested to take hypertext links into account. For example, hyperlinks have been used to (1) enhance document representation, (2) improve document ranking by propagating document score, (3) provide an indicator of popularity, and (4) find hubs and authorities for a given topic. Although the TREC experiments have not demonstrated the usefulness of hyperlinks for retrieval, the hypertext structure is nevertheless an essential aspect of the Web, and as such, should not be ignored. The development of abstract models of the IR task was a key factor in the improvement of search engines. However, at this time conceptual tools for modeling the hypertext retrieval task are lacking, making it difficult to compare, improve, and reason on the existing techniques. This article proposes a general model for using hyperlinks based on Probabilistic Argumentation Systems, in which each of the above-mentioned techniques can be stated. This model makes it possible to discover some inconsistencies in the mentioned techniques, and to take a higher-level and systematic approach to using hyperlinks for retrieval.
  13. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.01
    0.013052959 = product of:
      0.058738314 = sum of:
        0.049281355 = weight(_text_:web in 2239) [ClassicSimilarity], result of:
          0.049281355 = score(doc=2239,freq=8.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.43268442 = fieldWeight in 2239, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2239)
        0.009456957 = product of:
          0.02837087 = sum of:
            0.02837087 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
              0.02837087 = score(doc=2239,freq=2.0), product of:
                0.12221412 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.034900077 = queryNorm
                0.23214069 = fieldWeight in 2239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.33333334 = coord(1/3)
      0.22222222 = coord(2/9)
    
    Abstract
    Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR task - discovery of ranking functions for Web search - and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is well known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs on GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations on the design of fitness functions for genetic-based information retrieval experiments.
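    Not the authors' experimental setup (their corpus, GP operators and fitness designs are in the paper); a minimal sketch of what "fitness function" means here: each GP individual is a candidate ranking function, and its fitness is a retrieval-effectiveness measure, such as average precision, computed against relevance judgments. The candidates and toy data below are illustrative:

      def avg_precision(ranked, relevant):
          hits, total = 0, 0.0
          for rank, doc in enumerate(ranked, 1):
              if doc in relevant:
                  hits += 1
                  total += hits / rank
          return total / max(len(relevant), 1)

      # two hand-written "individuals": rank by raw tf vs. by tf * idf
      candidates = {"tf": lambda tf, idf: tf,
                    "tf_idf": lambda tf, idf: tf * idf}

      docs = {"d1": (5.0, 0.2), "d2": (2.0, 2.0), "d3": (1.0, 1.5)}  # (tf, idf)
      relevant = {"d2", "d3"}                                        # judgments

      for name, f in candidates.items():
          ranked = sorted(docs, key=lambda d: f(*docs[d]), reverse=True)
          print(name, round(avg_precision(ranked, relevant), 3))
      # tf 0.583 (non-relevant d1 ranked first), tf_idf 1.0

    A GP run would breed such scoring expressions and keep those with the highest fitness; the article's point is that the choice of this measure steers what the evolution discovers.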
    Date
    31. 5.2004 19:22:06
  14. Kantor, P.; Kim, M.H.; Ibraev, U.; Atasoy, K.: Estimating the number of relevant documents in enormous collections (1999) 0.01
    0.012974039 = product of:
      0.058383174 = sum of:
        0.037849274 = weight(_text_:wide in 6690) [ClassicSimilarity], result of:
          0.037849274 = score(doc=6690,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.24476713 = fieldWeight in 6690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.020533899 = weight(_text_:web in 6690) [ClassicSimilarity], result of:
          0.020533899 = score(doc=6690,freq=2.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.18028519 = fieldWeight in 6690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
      0.22222222 = coord(2/9)
    
    Abstract
    In assessing information retrieval systems, it is important to know not only the precision of the retrieved set, but also to compare the number of retrieved relevant items to the total number of relevant items. For large collections, such as the TREC test collections, or the World Wide Web, it is not possible to enumerate the entire set of relevant documents. If the retrieved documents are evaluated, a variant of the statistical "capture-recapture" method can be used to estimate the total number of relevant documents, providing the several retrieval systems used are sufficiently independent. We show that the underlying signal detection model supporting such an analysis can be extended in two ways. First, assuming that there are two distinct performance characteristics (corresponding to the chance of retrieving a relevant, and retrieving a given non-relevant document), we show that if there are three or more independent systems available it is possible to estimate the number of relevant documents without actually having to decide whether each individual document is relevant. We report applications of this 3-system method to the TREC data, leading to the conclusion that the independence assumptions are not satisfied. We then extend the model to a multi-system, multi-problem model, and show that it is possible to include statistical dependencies of all orders in the model, and determine the number of relevant documents for each of the problems in the set. Application to the TREC setting will be presented
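    The two-system estimate the abstract builds on is the classical Lincoln-Petersen capture-recapture formula. A minimal sketch (the figures are illustrative; the paper's three-system extension, which avoids judging each document, is not reproduced here):

      def capture_recapture(n1, n2, m):
          # Lincoln-Petersen: R is estimated by n1 * n2 / m when two
          # sufficiently independent systems overlap in m relevant documents
          if m == 0:
              raise ValueError("no overlap: estimate undefined")
          return n1 * n2 / m

      # system A retrieves 40 relevant documents, system B 50, 20 in common
      print(capture_recapture(40, 50, 20))   # estimated 100.0 relevant in total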
  15. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.01
    0.012974039 = product of:
      0.058383174 = sum of:
        0.037849274 = weight(_text_:wide in 1338) [ClassicSimilarity], result of:
          0.037849274 = score(doc=1338,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.24476713 = fieldWeight in 1338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
        0.020533899 = weight(_text_:web in 1338) [ClassicSimilarity], result of:
          0.020533899 = score(doc=1338,freq=2.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.18028519 = fieldWeight in 1338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
      0.22222222 = coord(2/9)
    
    Abstract
    A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations that infer two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
  16. Effektive Information Retrieval Verfahren in Theorie und Praxis : ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005 (2006) 0.01
    0.011345038 = product of:
      0.05105267 = sum of:
        0.011615728 = weight(_text_:web in 5973) [ClassicSimilarity], result of:
          0.011615728 = score(doc=5973,freq=4.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.1019847 = fieldWeight in 5973, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.015625 = fieldNorm(doc=5973)
        0.039436944 = weight(_text_:modell in 5973) [ClassicSimilarity], result of:
          0.039436944 = score(doc=5973,freq=4.0), product of:
            0.2098649 = queryWeight, product of:
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.034900077 = queryNorm
            0.18791586 = fieldWeight in 5973, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.015625 = fieldNorm(doc=5973)
      0.22222222 = coord(2/9)
    
    Content
    Contents: Jan-Hendrik Scheufen: RECOIN: Modell offener Schnittstellen für Information-Retrieval-Systeme und -Komponenten - Markus Nick, Klaus-Dieter Althoff: Designing Maintainable Experience-based Information Systems - Gesine Quint, Steffen Weichert: Die benutzerzentrierte Entwicklung des Produkt-Retrieval-Systems EIKON der Blaupunkt GmbH - Claus-Peter Klas, Sascha Kriewel, André Schaefer, Gudrun Fischer: Das DAFFODIL System - Strategische Literaturrecherche in Digitalen Bibliotheken - Matthias Meiert: Entwicklung eines Modells zur Integration digitaler Dokumente in die Universitätsbibliothek Hildesheim - Daniel Harbig, René Schneider: Ontology Learning im Rahmen von MyShelf - Michael Kluck, Marco Winter: Topic-Entwicklung und Relevanzbewertung bei GIRT: ein Werkstattbericht - Thomas Mandl: Neue Entwicklungen bei den Evaluierungsinitiativen im Information Retrieval - Joachim Pfister: Clustering von Patent-Dokumenten am Beispiel der Datenbanken des Fachinformationszentrums Karlsruhe - Ralph Kölle, Glenn Langemeier, Wolfgang Semar: Programmieren lernen in kollaborativen Lernumgebungen - Olga Tartakovski, Margaryta Shramko: Implementierung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten - Nina Kummer: Indexierungstechniken für das japanische Retrieval - Suriya Na Nhongkai, Hans-Joachim Bentz: Bilinguale Suche mittels Konzeptnetzen - Robert Strötgen, Thomas Mandl, René Schneider: Entwicklung und Evaluierung eines Question Answering Systems im Rahmen des Cross Language Evaluation Forum (CLEF) - Niels Jensen: Evaluierung von mehrsprachigem Web-Retrieval: Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF)
    Footnote
    In the first chapter, "Retrieval Systems", various information retrieval systems are presented and approaches to their design are discussed. Jan-Hendrik Scheufen presents RECOIN, a meta-framework for information retrieval research that is distinguished by its flexible handling of the most varied applications and thereby enables centralized logging and control of retrieval processes. This concept of an open, component-based system was realized as a plug-in for the Java-based open-source platform Eclipse. Markus Nick and Klaus-Dieter Althoff explain in their contribution, incidentally the only English-language text in the book, the DILLEBIS procedure for the maintenance of experience-based information systems. They call such a system a Maintainable Experience-based Information System and argue for designing experience-based systems according to this model. Gesine Quint and Steffen Weichert, in turn, present the user-centered development of the product retrieval system EIKON, realized in cooperation with Blaupunkt GmbH. In an iterative design cycle, group-specific interaction facilities for a car multimedia accessory system were designed. In the second chapter, several authors deal more specifically with the application area of digital libraries. Claus-Peter Klas, Sascha Kriewel, André Schaefer and Gudrun Fischer of the University of Duisburg-Essen present the DAFFODIL system, which supports strategic literature searches in digital libraries through a multitude of tools. In addition, the logging of all events makes it possible to use the system as an evaluation platform. Matthias Meiert's article explains the implementation of electronic publication processes at universities, using the example of final theses in the International Information Management programme at the University of Hildesheim. Besides the general conditions, both the current and the target state of scholarly electronic publishing are presented in the form of group-specific recommendations. Daniel Harbig and René Schneider describe in their article two procedures for the machine learning of ontologies, applied to the virtual library shelf MyShelf. After evaluating these two approaches, the authors argue for a semi-automated procedure for creating ontologies.
    "Evaluierung", das Thema des dritten Kapitels, ist in seiner Breite nicht auf das Information Retrieval beschränkt sondern beinhaltet ebenso einzelne Aspekte der Bereiche Mensch-Maschine-Interaktion sowie des E-Learning. Michael Muck und Marco Winter von der Stiftung Wissenschaft und Politik sowie dem Informationszentrum Sozialwissenschaften thematisieren in ihrem Beitrag den Einfluss der Fragestellung (Topic) auf die Bewertung von Relevanz und zeigen Verfahrensweisen für die Topic-Erstellung auf, die beim Cross Language Evaluation Forum (CLEF) Anwendung finden. Im darauf folgenden Aufsatz stellt Thomas Mandl verschiedene Evaluierungsinitiativen im Information Retrieval und aktuelle Entwicklungen dar. Joachim Pfister erläutert in seinem Beitrag das automatisierte Gruppieren, das sogenannte Clustering, von Patent-Dokumenten in den Datenbanken des Fachinformationszentrums Karlsruhe und evaluiert unterschiedliche Clusterverfahren auf Basis von Nutzerbewertungen. Ralph Kölle, Glenn Langemeier und Wolfgang Semar widmen sich dem kollaborativen Lernen unter den speziellen Bedingungen des Programmierens. Dabei werden das System VitaminL zur synchronen Bearbeitung von Programmieraufgaben und das Kennzahlensystem K-3 für die Bewertung kollaborativer Zusammenarbeit in einer Lehrveranstaltung angewendet. Der aktuelle Forschungsschwerpunkt der Hildesheimer Informationswissenschaft zeichnet sich im vierten Kapitel unter dem Thema "Multilinguale Systeme" ab. Hier finden sich die meisten Beiträge des Tagungsbandes wieder. Olga Tartakovski und Margaryta Shramko beschreiben und prüfen das System Langldent, das die Sprache von mono- und multilingualen Texten identifiziert. Die Eigenheiten der japanischen Schriftzeichen stellt Nina Kummer dar und vergleicht experimentell die unterschiedlichen Techniken der Indexierung. Suriya Na Nhongkai und Hans-Joachim Bentz präsentieren und prüfen eine bilinguale Suche auf Basis von Konzeptnetzen, wobei die Konzeptstruktur das verbindende Elemente der beiden Textsammlungen darstellt. Das Entwickeln und Evaluieren eines mehrsprachigen Question-Answering-Systems im Rahmen des Cross Language Evaluation Forum (CLEF), das die alltagssprachliche Formulierung von konkreten Fragestellungen ermöglicht, wird im Beitrag von Robert Strötgen, Thomas Mandl und Rene Schneider thematisiert. Den Schluss bildet der Aufsatz von Niels Jensen, der ein mehrsprachiges Web-Retrieval-System ebenfalls im Zusammenhang mit dem CLEF anhand des multilingualen EuroGOVKorpus evaluiert.
  17. Fuhr, N.: Theorie des Information Retrieval I : Modelle (2004) 0.01
    0.010954706 = product of:
      0.098592356 = sum of:
        0.098592356 = weight(_text_:modell in 2912) [ClassicSimilarity], result of:
          0.098592356 = score(doc=2912,freq=4.0), product of:
            0.2098649 = queryWeight, product of:
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.034900077 = queryNorm
            0.46978965 = fieldWeight in 2912, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.0133076 = idf(docFreq=293, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2912)
      0.11111111 = coord(1/9)
    
    Abstract
    Information retrieval (IR) models specify how, for a given query, the answer documents are determined from a document collection. Each model makes certain assumptions about the structure of documents and queries and then defines the so-called retrieval function, which determines the retrieval weight of a document with respect to a query - in the case of Boolean retrieval, for example, one of the weights 0 or 1. The documents are then sorted by descending weight and presented to the user. Before the individual models are treated in more detail, some fundamental characteristics of retrieval models are described first. As mentioned at the outset, every model makes assumptions about the structure of documents and queries. A document can be understood either as a set or as a multiset of so-called terms, where in the second case multiple occurrences are taken into account. Here 'term' subsumes any search unit, which may be a single word, a multi-word expression, or a complex free-text pattern. This document representation is in turn mapped onto a so-called document description, in which the individual terms may be weighted; this is the task of the indexing models described in chapter B 5. In what follows we distinguish only between unweighted indexing (the weight of a term is either 0 or 1) and weighted indexing (the weight is a non-negative real number). Just as with documents, the terms in a query can be either unweighted or weighted. In addition, one distinguishes between linear queries (the query as a set of terms, unweighted or weighted) and Boolean queries.
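    A minimal sketch of the retrieval-function notion just described, assuming the simplest linear case: documents and queries are weighted term vectors, the retrieval weight is their dot product, and Boolean conjunction appears as the special case that yields only the weights 0 or 1. The toy vectors are mine:

      def linear_rsv(query, doc):
          # linear retrieval function: dot product of term weights
          return sum(w * doc.get(t, 0.0) for t, w in query.items())

      def boolean_and(query, doc):
          # Boolean conjunction: weight 1 iff every query term occurs
          return 1.0 if all(doc.get(t, 0) > 0 for t in query) else 0.0

      q = {"retrieval": 1.0, "modell": 0.5}
      d1 = {"retrieval": 0.8, "modell": 0.6}
      d2 = {"retrieval": 0.9}
      for doc in (d1, d2):
          print(linear_rsv(q, doc), boolean_and(q, doc))
      # d1: 1.1 / 1.0, d2: 0.9 / 0.0 - the documents are then presented
      # in order of descending retrieval weight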
  18. Meghabghab, G.: Google's Web page ranking applied to different topological Web graph structures (2001) 0.01
    0.010941902 = product of:
      0.09847712 = sum of:
        0.09847712 = weight(_text_:web in 6028) [ClassicSimilarity], result of:
          0.09847712 = score(doc=6028,freq=46.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.86461735 = fieldWeight in 6028, product of:
              6.78233 = tf(freq=46.0), with freq of:
                46.0 = termFreq=46.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6028)
      0.11111111 = coord(1/9)
    
    Abstract
    This research is part of an ongoing study to better understand web page ranking on the web. It looks at a web page as a graph structure, or web graph, and tries to classify different web graphs in the new coordinate space (out-degree, in-degree). The out-degree coordinate od is defined as the number of outgoing links of a given web page; the in-degree coordinate id is the number of web pages that point to a given web page. In this new coordinate space a metric is built to classify how close or far apart different web graphs are. Google's algorithm for ranking web pages (Brin & Page, 1998) is applied in this new coordinate space, and its results have been modified to fit different topological web graph structures. The algorithm was not successful in the case of general web graphs, however, and new web ranking algorithms have to be considered. This study does not look at enhancing web ranking by adding any contextual information; it only considers web links as a source for web page ranking. The author believes that understanding the underlying web page as a graph will help design better web ranking algorithms, enhance retrieval and web performance, and recommends using graphs as part of the visual aids for browsing-engine designers.
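    The coordinate space itself is easy to compute. A sketch assuming a plain edge-list representation; summarizing each graph by its mean (od, id) point and comparing with Euclidean distance is my simplification, not the paper's metric:

      def degree_coords(edges, n):
          out_deg, in_deg = [0] * n, [0] * n
          for i, j in edges:                    # directed edge i -> j
              out_deg[i] += 1
              in_deg[j] += 1
          return list(zip(out_deg, in_deg))     # one (od, id) point per page

      def mean_coord(coords):
          return (sum(od for od, _ in coords) / len(coords),
                  sum(i for _, i in coords) / len(coords))

      def graph_distance(g1, g2):
          (a, b), (c, d) = mean_coord(g1), mean_coord(g2)
          return ((a - c) ** 2 + (b - d) ** 2) ** 0.5

      star = degree_coords([(0, 1), (0, 2), (0, 3)], 4)       # one hub fans out
      triangle = degree_coords([(0, 1), (1, 0), (1, 2),
                                (2, 1), (0, 2), (2, 0)], 3)   # fully interlinked
      print(round(graph_distance(star, triangle), 3))         # 1.768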
  19. Agosti, M.; Pretto, L.: A theoretical study of a generalized version of Kleinberg's HITS algorithm (2005) 0.01
    0.0108933635 = product of:
      0.049020134 = sum of:
        0.041067798 = weight(_text_:web in 4) [ClassicSimilarity], result of:
          0.041067798 = score(doc=4,freq=8.0), product of:
            0.113896765 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.034900077 = queryNorm
            0.36057037 = fieldWeight in 4, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4)
        0.007952334 = product of:
          0.023857003 = sum of:
            0.023857003 = weight(_text_:29 in 4) [ClassicSimilarity], result of:
              0.023857003 = score(doc=4,freq=2.0), product of:
                0.12276756 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.034900077 = queryNorm
                0.19432661 = fieldWeight in 4, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4)
          0.33333334 = coord(1/3)
      0.22222222 = coord(2/9)
    
    Abstract
    Kleinberg's HITS (Hyperlink-Induced Topic Search) algorithm (Kleinberg 1999), which was originally developed in a Web context, tries to infer the authoritativeness of a Web page in relation to a specific query using the structure of a subgraph of the Web graph, which is obtained considering this specific query. Recent applications of this algorithm in contexts far removed from that of Web searching (Bacchin, Ferro and Melucci 2002, Ng et al. 2001) inspired us to study the algorithm in the abstract, independently of its particular applications, trying to mathematically illuminate its behaviour. In the present paper we detail this theoretical analysis. The original work starts from the definition of a revised and more general version of the algorithm, which includes the classic one as a particular case. We perform an analysis of the structure of two particular matrices, essential to studying the behaviour of the algorithm, and we prove the convergence of the algorithm in the most general case, finding the analytic expression of the vectors to which it converges. Then we study the symmetry of the algorithm and prove the equivalence between the existence of symmetry and the independence from the order of execution of some basic operations on initial vectors. Finally, we expound some interesting consequences of our theoretical results.
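    The convergence statement can be checked numerically in a few lines for the classic algorithm. A sketch assuming a random 0/1 link matrix (the paper's generalized variant is not reproduced): the normalized authority iterate should coincide with a principal eigenvector of A^T A:

      import numpy as np

      rng = np.random.default_rng(0)
      A = (rng.random((6, 6)) < 0.4).astype(float)    # random link matrix

      auth = np.ones(6)
      for _ in range(200):                            # plain HITS iteration
          auth = A.T @ (A @ auth)
          auth /= np.linalg.norm(auth)

      vals, vecs = np.linalg.eigh(A.T @ A)            # A^T A is symmetric
      principal = np.abs(vecs[:, np.argmax(vals)])    # principal eigenvector

      print(np.allclose(auth, principal, atol=1e-6))  # True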
    Date
    31.12.1996 19:29:41
  20. Baloh, P.; Desouza, K.C.; Hackney, R.: Contextualizing organizational interventions of knowledge management systems : a design science perspective (2012) 0.01
    0.010162238 = product of:
      0.04573007 = sum of:
        0.037849274 = weight(_text_:wide in 241) [ClassicSimilarity], result of:
          0.037849274 = score(doc=241,freq=2.0), product of:
            0.1546338 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.034900077 = queryNorm
            0.24476713 = fieldWeight in 241, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=241)
        0.007880798 = product of:
          0.023642393 = sum of:
            0.023642393 = weight(_text_:22 in 241) [ClassicSimilarity], result of:
              0.023642393 = score(doc=241,freq=2.0), product of:
                0.12221412 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.034900077 = queryNorm
                0.19345059 = fieldWeight in 241, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=241)
          0.33333334 = coord(1/3)
      0.22222222 = coord(2/9)
    
    Abstract
    We address how individuals' (workers) knowledge needs influence the design of knowledge management systems (KMS), enabling knowledge creation and utilization. It is evident that KMS technologies and activities are indiscriminately deployed in most organizations with little regard to the actual context of their adoption. Moreover, it is apparent that the extant literature pertaining to knowledge management projects is frequently deficient in identifying the variety of factors indicative for successful KMS. This presents an obvious business practice and research gap that requires a critical analysis of the necessary intervention that will actually improve how workers can leverage and form organization-wide knowledge. This research involved an extensive review of the literature, a grounded theory methodological approach and rigorous data collection and synthesis through an empirical case analysis (Parsons Brinckerhoff and Samsung). The contribution of this study is the formulation of a model for designing KMS based upon the design science paradigm, which aspires to create artifacts that are interdependent of people and organizations. The essential proposition is that KMS design and implementation must be contextualized in relation to knowledge needs and that these will differ for various organizational settings. The findings present valuable insights and further understanding of the way in which KMS design efforts should be focused.
    Date
    11. 6.2012 14:22:34

Languages

  • e 107
  • d 17
  • m 1
  • pt 1

Types

  • a 113
  • m 8
  • el 3
  • s 2
  • x 2
  • r 1