Search (116 results, page 1 of 6)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.25

0.2485744 = product of:
  0.4350052 = sum of:
    0.060665026 = product of:
      0.18199508 = sum of:
        0.18199508 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.18199508 = score(doc=562,freq=2.0), product of:
            0.32382426 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03819578 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.18199508 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.18199508 = score(doc=562,freq=2.0), product of:
        0.32382426 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03819578 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.18199508 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.18199508 = score(doc=562,freq=2.0), product of:
        0.32382426 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03819578 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.010350002 = product of:
      0.031050006 = sum of:
        0.031050006 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.031050006 = score(doc=562,freq=2.0), product of:
            0.13375512 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03819578 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
  0.5714286 = coord(4/7)
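
The leaf values in the explanation above can be reproduced by hand. The following is a minimal sketch, not the search engine's own code: the function name `classic_term_score` is hypothetical, and the formulas (tf as the square root of the frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are assumptions chosen because they match the numbers printed in the tree. `queryNorm` and `fieldNorm` are taken from the output as given, not derived.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term in one document, mirroring the explain tree."""
    tf = math.sqrt(freq)                              # tf(freq=2.0) = 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # idf(docFreq=24, maxDocs=44218) = 8.478011
    query_weight = idf * query_norm                   # "queryWeight, product of"
    field_weight = tf * idf * field_norm              # "fieldWeight in 562, product of"
    return query_weight * field_weight                # "score(doc=562,freq=2.0), product of"

# Values for the term "3a" in doc 562, copied from the explanation.
score_3a = classic_term_score(freq=2.0, doc_freq=24, max_docs=44218,
                              query_norm=0.03819578, field_norm=0.046875)
print(score_3a)  # close to the 0.18199508 shown in the tree
```

Small differences in the last digits are expected, since the listing appears to use single-precision floats while Python uses double precision.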

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
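
The Content link of this record is a Google redirect that carries the actual target, percent-encoded, in its `url` query parameter. A minimal sketch for recovering the target with the Python standard library follows; the trailing tracking parameters of the link are left out of the string for brevity. The hexadecimal parts of `%3A` and `%2F` resemble the terms `3a` and `2f` scored in the explanation for this record, but whether the index tokenizer actually produced those terms this way is an assumption.

```python
from urllib.parse import urlsplit, parse_qs

# Redirect link from the Content field, shortened after the "ei" parameter.
redirect = ("http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja"
            "&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload"
            "%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf"
            "&ei=dOXrUMeIDYHDtQahsIGACg")

# parse_qs splits the query string on "&" and percent-decodes each value.
params = parse_qs(urlsplit(redirect).query)
target = params["url"][0]
print(target)
```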

Wätjen, H.-J.: GERHARD : Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web (1998) 0.05

0.048528418 = product of:
  0.11323297 = sum of:
    0.027029924 = weight(_text_:retrieval in 3064) [ClassicSimilarity], result of:
      0.027029924 = score(doc=3064,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.23394634 = fieldWeight in 3064, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
    0.049791705 = weight(_text_:bibliothek in 3064) [ClassicSimilarity], result of:
      0.049791705 = score(doc=3064,freq=2.0), product of:
        0.15681393 = queryWeight, product of:
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.03819578 = queryNorm
        0.31752092 = fieldWeight in 3064, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
    0.036411345 = weight(_text_:internet in 3064) [ClassicSimilarity], result of:
      0.036411345 = score(doc=3064,freq=4.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.32290122 = fieldWeight in 3064, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
  0.42857143 = coord(3/7)
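
The outer levels of each explanation combine the per-term scores: matching clauses are summed and the sum is multiplied by a coordination factor, the share of query clauses that matched. The sketch below recomputes two record scores from the term scores printed in the listing; the helper name `combine` is hypothetical and the term scores are copied from the output, not recalculated.

```python
def combine(clause_scores, matched, total):
    """coord(matched/total) multiplied by the sum of the matching clause scores."""
    return (matched / total) * sum(clause_scores)

# doc=3064: three of seven clauses match (retrieval, bibliothek, internet).
doc_3064 = combine([0.027029924, 0.049791705, 0.036411345], matched=3, total=7)

# doc=562: two clauses are nested products with their own coord(1/3).
inner_3a = combine([0.18199508], matched=1, total=3)
inner_22 = combine([0.031050006], matched=1, total=3)
doc_562 = combine([inner_3a, 0.18199508, 0.18199508, inner_22], matched=4, total=7)

print(doc_3064, doc_562)  # close to 0.048528418 and 0.2485744
```

The coordination factor explains why doc=562 ranks first despite matching only rare tokens: four of seven clauses match, and each match carries a high idf.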

Abstract: The intellectual indexing of the Internet is in crisis. Yahoo and other services cannot keep pace with the growth of the Web. GERHARD is currently the only search and navigation service worldwide that also classifies, fully automatically, the Internet resources collected by a robot, using computational-linguistic and statistical methods. Well over a million HTML documents from academically relevant servers in Germany can be searched in the database as with other search engines, but can also be explored by navigating the trilingual Universal Decimal Classification (ETH-Bibliothek Zürich).
Theme: Internet
Klassifikationssysteme im Online-Retrieval

Koch, T.; Vizine-Goetz, D.: DDC and knowledge organization in the digital library : Research and development. Demonstration pages (1999) 0.04

0.04460188 = product of:
  0.10407105 = sum of:
    0.023168506 = weight(_text_:retrieval in 942) [ClassicSimilarity], result of:
      0.023168506 = score(doc=942,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.20052543 = fieldWeight in 942, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=942)
    0.042678602 = weight(_text_:bibliothek in 942) [ClassicSimilarity], result of:
      0.042678602 = score(doc=942,freq=2.0), product of:
        0.15681393 = queryWeight, product of:
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.03819578 = queryNorm
        0.27216077 = fieldWeight in 942, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.046875 = fieldNorm(doc=942)
    0.03822395 = weight(_text_:internet in 942) [ClassicSimilarity], result of:
      0.03822395 = score(doc=942,freq=6.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.33897567 = fieldWeight in 942, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.046875 = fieldNorm(doc=942)
  0.42857143 = coord(3/7)

Abstract: The workshop gives an insight into current research and development on knowledge organization in digital libraries. Diane Vizine-Goetz of the OCLC Office of Research in Dublin, Ohio, presents OCLC's research projects on adapting and further developing the Dewey Decimal Classification as a knowledge organization instrument for large digital document collections. Traugott Koch, NetLab, Lund University, Sweden, demonstrates the approaches and solutions of the EU project DESIRE for using intellectual and, above all, automatic classification in subject information services on the Internet.
Content: 1. Increased Importance of Knowledge Organization in Internet Services - 2. Quality Subject Service and the role of classification - 3. Developing the DDC into a knowledge organization instrument for the digital library. OCLC site - 4. DESIRE's Barefoot Solutions of Automatic Classification - 5. Advanced Classification Solutions in DESIRE and CORC - 6. Future directions of research and development - 7. General references
Footnote: Paper presented at the workshop on 21.10.1999, Deutsche Bibliothek, Frankfurt/M.
Theme: Klassifikationssysteme im Online-Retrieval
Internet

Wätjen, H.-J.; Diekmann, B.; Möller, G.; Carstensen, K.-U.: Bericht zum DFG-Projekt: GERHARD : German Harvest Automated Retrieval and Directory (1998) 0.03

0.026111344 = product of:
  0.0913897 = sum of:
    0.05460869 = weight(_text_:retrieval in 3065) [ClassicSimilarity], result of:
      0.05460869 = score(doc=3065,freq=4.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.47264296 = fieldWeight in 3065, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=3065)
    0.036781013 = weight(_text_:internet in 3065) [ClassicSimilarity], result of:
      0.036781013 = score(doc=3065,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.3261795 = fieldWeight in 3065, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.078125 = fieldNorm(doc=3065)
  0.2857143 = coord(2/7)

Theme: Internet
Klassifikationssysteme im Online-Retrieval

Vizine-Goetz, D.: NetLab / OCLC collaboration seeks to improve Web searching (1999) 0.03

0.025894396 = product of:
  0.09063038 = sum of:
    0.038614176 = weight(_text_:retrieval in 4180) [ClassicSimilarity], result of:
      0.038614176 = score(doc=4180,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.33420905 = fieldWeight in 4180, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=4180)
    0.05201621 = weight(_text_:internet in 4180) [ClassicSimilarity], result of:
      0.05201621 = score(doc=4180,freq=4.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.46128747 = fieldWeight in 4180, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.078125 = fieldNorm(doc=4180)
  0.2857143 = coord(2/7)

Abstract: Presentation of various projects for improving Internet indexing with the help of the DDC
Theme: Internet
Klassifikationssysteme im Online-Retrieval

GERHARD : eine Spezialsuchmaschine für die Wissenschaft (1998) 0.03

0.02584978 = product of:
  0.090474226 = sum of:
    0.046337012 = weight(_text_:retrieval in 381) [ClassicSimilarity], result of:
      0.046337012 = score(doc=381,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.40105087 = fieldWeight in 381, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=381)
    0.044137213 = weight(_text_:internet in 381) [ClassicSimilarity], result of:
      0.044137213 = score(doc=381,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.3914154 = fieldWeight in 381, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.09375 = fieldNorm(doc=381)
  0.2857143 = coord(2/7)

Theme: Internet
Klassifikationssysteme im Online-Retrieval

Koch, T.; Ardö, A.; Brümmer, A.: The building and maintenance of robot based internet search services : A review of current indexing and data collection methods. Prepared to meet the requirements of Work Package 3 of EU Telematics for Research, project DESIRE. Version D3.11v0.3 (Draft version 3) (1996) 0.02
```
0.021972368 = product of:
  0.07690328 = sum of:
    0.040865403 = weight(_text_:retrieval in 1669) [ClassicSimilarity], result of:
      0.040865403 = score(doc=1669,freq=14.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.3536936 = fieldWeight in 1669, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=1669)
    0.036037885 = weight(_text_:internet in 1669) [ClassicSimilarity], result of:
      0.036037885 = score(doc=1669,freq=12.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.31958932 = fieldWeight in 1669, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03125 = fieldNorm(doc=1669)
  0.2857143 = coord(2/7)
```
Abstract: After a short outline of problems, possibilities and difficulties of systematic information retrieval on the Internet and a description of efforts for development in this area, a specification of the terminology for this report is required. Although the process of retrieval is generally seen as an iterative process of browsing and information retrieval and several important services on the net have taken this fact into consideration, the emphasis of this report lies on the general retrieval tools for the whole of Internet. In order to be able to evaluate the differences, possibilities and restrictions of the different services it is necessary to begin with organizing the existing varieties in a typological/taxonomical survey. The possibilities and weaknesses will be briefly compared and described for the most important services in the categories robot-based WWW-catalogues of different types, list- or form-based catalogues and simultaneous or collected search services respectively. It will however for different reasons not be possible to rank them in order of "best" services. Still more important are the weaknesses and problems common to all attempts at indexing the Internet. The problems of the quality of the input, the technical performance and the general problem of indexing virtual hypertext are shown to be at least as difficult as the different aspects of harvesting, indexing and information retrieval. Some of the attempts made in the area of further development of retrieval services will be mentioned in relation to descriptions of the contents of documents and standardization efforts. Internet harvesting and indexing technology and retrieval software are thoroughly reviewed. Details about all services and software are listed in analytical forms in Annex 1-3.
Theme: Internet

Oberhauser, O.: Automatisches Klassifizieren und Bibliothekskataloge (2005) 0.02

0.021949036 = product of:
  0.076821625 = sum of:
    0.027029924 = weight(_text_:retrieval in 4099) [ClassicSimilarity], result of:
      0.027029924 = score(doc=4099,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.23394634 = fieldWeight in 4099, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4099)
    0.049791705 = weight(_text_:bibliothek in 4099) [ClassicSimilarity], result of:
      0.049791705 = score(doc=4099,freq=2.0), product of:
        0.15681393 = queryWeight, product of:
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.03819578 = queryNorm
        0.31752092 = fieldWeight in 4099, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4099)
  0.2857143 = coord(2/7)

Abstract: In the library world, the advantages of classification-based subject indexing have always been well known. Even in the age of online catalogues there is no real substitute for it, since, put briefly, keyword-based retrieval alone cannot cope with problems such as ambiguity and multilingualism. Many online catalogues therefore carry notations from various classification systems; the query options based on them, however, are mostly still badly underdeveloped. Many records in OPACs have no subject indexing at all, either because they come from retrospectively converted author catalogues or because a lack of staff resources prevented their subject indexing. Given large numbers of such records, an interest in automatic methods of subject indexing is quite natural.
Source: Bibliothek Technik Recht. Festschrift für Peter Kubalek zum 60. Geburtstag. Ed.: H. Hrusa

Wätjen, H.-J.: Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web : das DFG-Projekt GERHARD (1998) 0.02

0.021541484 = product of:
  0.07539519 = sum of:
    0.038614176 = weight(_text_:retrieval in 3066) [ClassicSimilarity], result of:
      0.038614176 = score(doc=3066,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.33420905 = fieldWeight in 3066, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=3066)
    0.036781013 = weight(_text_:internet in 3066) [ClassicSimilarity], result of:
      0.036781013 = score(doc=3066,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.3261795 = fieldWeight in 3066, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.078125 = fieldNorm(doc=3066)
  0.2857143 = coord(2/7)

Theme: Internet
Klassifikationssysteme im Online-Retrieval

Möller, G.: Automatic classification of the World Wide Web using Universal Decimal Classification (1999) 0.02

0.021541484 = product of:
  0.07539519 = sum of:
    0.038614176 = weight(_text_:retrieval in 494) [ClassicSimilarity], result of:
      0.038614176 = score(doc=494,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.33420905 = fieldWeight in 494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=494)
    0.036781013 = weight(_text_:internet in 494) [ClassicSimilarity], result of:
      0.036781013 = score(doc=494,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.3261795 = fieldWeight in 494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.078125 = fieldNorm(doc=494)
  0.2857143 = coord(2/7)

Theme: Internet
Klassifikationssysteme im Online-Retrieval

Shafer, K.E.: Evaluating Scorpion results (1998) 0.02

0.021541484 = product of:
  0.07539519 = sum of:
    0.038614176 = weight(_text_:retrieval in 1569) [ClassicSimilarity], result of:
      0.038614176 = score(doc=1569,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.33420905 = fieldWeight in 1569, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=1569)
    0.036781013 = weight(_text_:internet in 1569) [ClassicSimilarity], result of:
      0.036781013 = score(doc=1569,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.3261795 = fieldWeight in 1569, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.078125 = fieldNorm(doc=1569)
  0.2857143 = coord(2/7)

Abstract: Scorpion is a research project at OCLC that builds tools for automatic subject assignment by combining library science and information retrieval techniques. A thesis of Scorpion is that the Dewey Decimal Classification (Dewey) can be used to perform automatic subject assignment for electronic items.
Theme: Internet

Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.02

0.01852492 = product of:
  0.06483722 = sum of:
    0.044137213 = weight(_text_:internet in 1046) [ClassicSimilarity], result of:
      0.044137213 = score(doc=1046,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.3914154 = fieldWeight in 1046, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.09375 = fieldNorm(doc=1046)
    0.020700004 = product of:
      0.06210001 = sum of:
        0.06210001 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
          0.06210001 = score(doc=1046,freq=2.0), product of:
            0.13375512 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03819578 = queryNorm
            0.46428138 = fieldWeight in 1046, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
      0.33333334 = coord(1/3)
  0.2857143 = coord(2/7)

Date: 5. 5.2003 14:17:22
Footnote: Teil eines Themenheftes: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part II

Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.02

0.018126078 = product of:
  0.06344127 = sum of:
    0.027029924 = weight(_text_:retrieval in 7209) [ClassicSimilarity], result of:
      0.027029924 = score(doc=7209,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.23394634 = fieldWeight in 7209, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
    0.036411345 = weight(_text_:internet in 7209) [ClassicSimilarity], result of:
      0.036411345 = score(doc=7209,freq=4.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.32290122 = fieldWeight in 7209, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
  0.2857143 = coord(2/7)

Abstract: The Nordic WAIS/WWW project sponsored by NORDINFO is a joint project between Lund University Library and the National Technological Library of Denmark. It aims to improve the existing networked information discovery and retrieval tools Wide Area Information System (WAIS) and World Wide Web (WWW), and to move towards unifying WWW and WAIS. Details current results focusing on the WAIS side of the project. Describes research into automatic indexing and classification of WAIS sources, development of an orientation tool for WAIS, and development of a WAIS index of WWW resources
Source: Internet world and document delivery world international 94: Proceedings of the 2nd Annual Conference, London, May 1994
Theme: Internet

Chan, L.M.; Lin, X.; Zeng, M.: Structural and multilingual approaches to subject access on the Web (1999) 0.02

0.017233187 = product of:
  0.060316153 = sum of:
    0.030891342 = weight(_text_:retrieval in 162) [ClassicSimilarity], result of:
      0.030891342 = score(doc=162,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.26736724 = fieldWeight in 162, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=162)
    0.02942481 = weight(_text_:internet in 162) [ClassicSimilarity], result of:
      0.02942481 = score(doc=162,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.2609436 = fieldWeight in 162, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0625 = fieldNorm(doc=162)
  0.2857143 = coord(2/7)

Abstract: Among the great challenges of meaningful searching on the WWW are the sheer volume of what is available and the language barriers. Methods that structure Web resources by content with a view to more efficient retrieval are therefore needed just as urgently as programs that can handle linguistic diversity. In the following paper we discuss some approaches currently being pursued to tackle both problems.
Theme: Internet

Koch, T.: Nutzung von Klassifikationssystemen zur verbesserten Beschreibung, Organisation und Suche von Internetressourcen (1998) 0.02

0.017233187 = product of:
  0.060316153 = sum of:
    0.030891342 = weight(_text_:retrieval in 1030) [ClassicSimilarity], result of:
      0.030891342 = score(doc=1030,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.26736724 = fieldWeight in 1030, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=1030)
    0.02942481 = weight(_text_:internet in 1030) [ClassicSimilarity], result of:
      0.02942481 = score(doc=1030,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.2609436 = fieldWeight in 1030, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0625 = fieldNorm(doc=1030)
  0.2857143 = coord(2/7)

Theme: Internet
Klassifikationssysteme im Online-Retrieval

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02

0.015961196 = product of:
  0.05586418 = sum of:
    0.038614176 = weight(_text_:retrieval in 611) [ClassicSimilarity], result of:
      0.038614176 = score(doc=611,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.33420905 = fieldWeight in 611, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=611)
    0.017250005 = product of:
      0.05175001 = sum of:
        0.05175001 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.05175001 = score(doc=611,freq=2.0), product of:
            0.13375512 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03819578 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.33333334 = coord(1/3)
  0.2857143 = coord(2/7)

Date: 22. 8.2009 12:54:24
Theme: Klassifikationssysteme im Online-Retrieval

Oberhauser, O.: Automatisches Klassifizieren : Entwicklungsstand - Methodik - Anwendungsbereiche (2005) 0.02
```
0.015699234 = product of:
  0.036631547 = sum of:
    0.009653544 = weight(_text_:retrieval in 38) [ClassicSimilarity], result of:
      0.009653544 = score(doc=38,freq=2.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.08355226 = fieldWeight in 38, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.01953125 = fieldNorm(doc=38)
    0.017782751 = weight(_text_:bibliothek in 38) [ClassicSimilarity], result of:
      0.017782751 = score(doc=38,freq=2.0), product of:
        0.15681393 = queryWeight, product of:
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.03819578 = queryNorm
        0.113400325 = fieldWeight in 38, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.01953125 = fieldNorm(doc=38)
    0.009195253 = weight(_text_:internet in 38) [ClassicSimilarity], result of:
      0.009195253 = score(doc=38,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.081544876 = fieldWeight in 38, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.01953125 = fieldNorm(doc=38)
  0.42857143 = coord(3/7)
```
Footnote

Reviewed in: VÖB-Mitteilungen 58(2005) no. 3, pp. 102-104 (R.F. Müller); ZfBB 53(2006) no. 5, pp. 282-283 (L. Svensson): "Collecting and cataloguing electronic resources has long been part of everyday work in academic libraries. At the same time, a paradigm shift in finding aids is on the horizon: to offer efficient, user-oriented access to mixed collections, some library service providers, such as the hbz (http://suchen.hbz-nrw.de/dreilaender/), the library of North Carolina State University (www.lib.ncsu.edu/) and, soon, vascoda (www.vascoda.de/) and the Librarians' Internet Index (www.lii.org/), are increasingly experimenting with search engine technology. The aim is to offer not only a fully inverted search index but also browsing through a hierarchically ordered classification. Of the data in the German union catalogue databases, however, only a small part has been indexed by classification so far. External data from the Anglo-American sphere are often indexed with LCC and/or DDC, although the Library of Congress concentrates its DDC indexing on titles that are mainly of interest to public libraries. From 2007 the Deutsche Nationalbibliothek will index print media and university theses comprehensively with DDC. It is already evident, however, that the volume of documents, especially of electronic publications, cannot be indexed intellectually with ever scarcer staff resources, and that new methods must be developed. Oberhauser's book comes at just the right time. Since the early 1990s, several projects on automatic classification have been carried out.
Anyone who wanted to become familiar with this topic, or who was interested in the results of the larger projects, has so far had no survey to draw on and had to rely on a multitude of individual studies and project documentation. Oberhauser's account, which is based on a wealth of published and grey literature, closes this gap. The author fulfils with flying colours his self-set goal of conveying, in an understandable way, a good overview of the current state of knowledge and the results of the relevant projects. It should be noted that he presupposes basic library knowledge and at least a basic familiarity with the fundamental concepts and questions of information science; some pointers to introductory works would have been desirable here for the newcomer.
The question posed at the beginning of the work, whether »the techniques of automatic classification [are] already advanced enough today that large quantities of electronic documents [-] can be indexed satisfactorily« (p. 13), is answered by the author with an unambiguous »no«, which strongly qualifies Salton and McGill's statement of 1983 »that simple automatic indexing methods work quickly and cheaply, and that they achieve recall and precision values at least as good as those of manual indexing with a controlled vocabulary« (Gerard Salton and Michael J. McGill: Information Retrieval. Hamburg et al. 1987, pp. 64 f.). Oberhauser does not want to speculate on why three of the large projects are no longer being pursued, but names lack of success, a shift of work within the participating institutions, and funding problems as possible causes. The author sees the greatest development potential for the automatic indexing of large document collections today in patent and media documentation. Libraries should follow developments there closely, since these are »surely aiming, in the medium term, at a qualitatively satisfactory full automation« (p. 146). Oberhauser's account is a thoroughly successful work that belongs in the reference collection of anyone interested in automatic indexing."
Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.02
```
0.015666807 = product of:
  0.054833822 = sum of:
    0.032765217 = weight(_text_:retrieval in 316) [ClassicSimilarity], result of:
      0.032765217 = score(doc=316,freq=4.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.2835858 = fieldWeight in 316, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=316)
    0.022068607 = weight(_text_:internet in 316) [ClassicSimilarity], result of:
      0.022068607 = score(doc=316,freq=2.0), product of:
        0.11276311 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03819578 = queryNorm
        0.1957077 = fieldWeight in 316, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.046875 = fieldNorm(doc=316)
  0.2857143 = coord(2/7)
```
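The arithmetic in the explain tree above can be checked by hand. The following is a minimal Python sketch, assuming the classic Lucene TF-IDF formulas `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`, which are consistent with the values printed in the tree. The helper names `idf` and `term_score` are illustrative and not part of any Lucene API; the constants are copied from the tree. Lucene works in 32-bit floats, so the results should agree only to roughly six significant digits.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as printed in the tree (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq=...)
    term_idf = idf(doc_freq, max_docs)          # idf(docFreq=..., maxDocs=...)
    query_weight = term_idf * query_norm        # queryWeight
    field_weight = tf * term_idf * field_norm   # fieldWeight
    return query_weight * field_weight


# Constants copied from the explain tree for doc 316.
QUERY_NORM = 0.03819578
MAX_DOCS = 44218
FIELD_NORM = 0.046875

retrieval = term_score(4.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)
internet = term_score(2.0, 6276, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/7): two of the seven query clauses matched this document.
total = (retrieval + internet) * (2 / 7)

print(f"retrieval={retrieval:.6f} internet={internet:.6f} total={total:.6f}")
```

Under these assumptions the two term scores and the final product should land close to the 0.032765217, 0.022068607 and 0.015666807 shown in the tree.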
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increase, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC) [10], within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface through which the user can apply the scheme to assist in information retrieval (IR).
Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.02
```
0.015151031 = product of:
  0.053028606 = sum of:
    0.042678602 = weight(_text_:bibliothek in 3051) [ClassicSimilarity], result of:
      0.042678602 = score(doc=3051,freq=2.0), product of:
        0.15681393 = queryWeight, product of:
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.03819578 = queryNorm
        0.27216077 = fieldWeight in 3051, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.046875 = fieldNorm(doc=3051)
    0.010350002 = product of:
      0.031050006 = sum of:
        0.031050006 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
          0.031050006 = score(doc=3051,freq=2.0), product of:
            0.13375512 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03819578 = queryNorm
            0.23214069 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
      0.33333334 = coord(1/3)
  0.2857143 = coord(2/7)
```
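The tree above nests a sub-clause with its own `coord(1/3)` inside the outer sum. As a rough illustration of how such a tree composes, the Python sketch below evaluates a hand-built copy of it. The tuple representation and the `evaluate` helper are assumptions made for this example, not a Lucene data structure; the leaf values are the `queryWeight` and `fieldWeight` numbers printed in the tree for doc 3051.

```python
from math import prod


def evaluate(node):
    """Recursively evaluate a ("sum" | "product", [children]) tree of floats."""
    if isinstance(node, (int, float)):
        return float(node)
    op, children = node
    values = [evaluate(child) for child in children]
    if op == "sum":
        return sum(values)
    if op == "product":
        return prod(values)
    raise ValueError(f"unknown node type: {op!r}")


# Hand-copied structure of the explain tree for doc 3051.
tree = ("product", [
    ("sum", [
        # _text_:bibliothek -> queryWeight * fieldWeight
        ("product", [0.15681393, 0.27216077]),
        # nested clause: (_text_:22 score) * coord(1/3)
        ("product", [
            ("sum", [("product", [0.13375512, 0.23214069])]),
            1 / 3,
        ]),
    ]),
    2 / 7,  # coord(2/7)
])

score = evaluate(tree)
print(f"{score:.6f}")
```

The inner `coord(1/3)` scales only the `22` clause, while the outer `coord(2/7)` scales the whole sum, which is why the date match contributes far less than the `bibliothek` match.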
Abstract

Classification of bibliographic units is indispensable for systematic access to a library's holdings and for their shelving. Until now this task has been carried out manually by subject experts, either individually according to a self-developed scheme or cooperatively according to a shared scheme. This work presents a method for automating the classification process. It uses case-based reasoning, a method developed in the context of artificial intelligence research. For every work for which bibliographic data are available, the method delivers one or more possible classifications. In experiments, the results of the automatic classification are compared with those of classification by subject experts. These experiments demonstrate the high quality of the automatic classification and show that the method is suited to relieving subject experts significantly in their classification work. Even the almost complete reclassification of a library catalogue is possible, with certain reservations.

Date

22. 8.2009 19:51:28
Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01
```
0.014799134 = product of:
  0.051796965 = sum of:
    0.04317196 = weight(_text_:retrieval in 2765) [ClassicSimilarity], result of:
      0.04317196 = score(doc=2765,freq=10.0), product of:
        0.11553899 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03819578 = queryNorm
        0.37365708 = fieldWeight in 2765, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2765)
    0.008625003 = product of:
      0.025875006 = sum of:
        0.025875006 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
          0.025875006 = score(doc=2765,freq=2.0), product of:
            0.13375512 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03819578 = queryNorm
            0.19345059 = fieldWeight in 2765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2765)
      0.33333334 = coord(1/3)
  0.2857143 = coord(2/7)
```
Abstract

Passages can be hidden within a text to circumvent a ban on their transfer. Such release of compartmentalized information is of concern to all corporate and governmental organizations. Passage retrieval is well studied; we posit, however, that passage detection is not. Passage retrieval is the determination of the degree of relevance of blocks of text, namely passages, comprising a document. Rather than determining the relevance of a document in its entirety, passage retrieval determines the relevance of the individual passages. As such, modified traditional information-retrieval techniques compare terms found in user queries with the individual passages to determine a similarity score for passages of interest. In passage detection, passages are classified into predetermined categories. More often than not, passage detection techniques are deployed to detect hidden paragraphs in documents. That is, to hide information, hidden text is injected into passages of documents. Rather than matching query terms against passages to determine their relevance, the passages are classified using text-mining techniques. Those documents with hidden passages are defined as infected. Thus, simply stated, passage retrieval is the search for passages relevant to a user query, while passage detection is the classification of passages. That is, in passage detection, passages are labeled with one or more categories from a set of predetermined categories. We present a keyword-based dynamic passage approach (KDP) and demonstrate that KDP outperforms the other document-splitting approaches by 12% to 18%, with statistical significance (99% confidence), in the passage detection and passage category-prediction tasks. Furthermore, we evaluate the effects of feature selection, passage length, ambiguous passages, and training-data category distribution on passage-detection accuracy.

Date

22. 3.2009 19:14:43
