Search (182 results, page 1 of 10)

  • × language_ss:"e"
  • × theme_ss:"Automatisches Indexieren"
  1. 7e Dag van het Document : 19 & 20 mei 1998, Congrescentrum De Reehorst, Ede ; proceedings (1998) 0.09
    0.0850351 = product of:
      0.3401404 = sum of:
        0.14644803 = weight(_text_:allgemeines in 2427) [ClassicSimilarity], result of:
          0.14644803 = score(doc=2427,freq=4.0), product of:
            0.16427658 = queryWeight, product of:
              5.705423 = idf(docFreq=399, maxDocs=44218)
              0.02879306 = queryNorm
            0.89147234 = fieldWeight in 2427, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.705423 = idf(docFreq=399, maxDocs=44218)
              0.078125 = fieldNorm(doc=2427)
        0.099661656 = weight(_text_:medien in 2427) [ClassicSimilarity], result of:
          0.099661656 = score(doc=2427,freq=4.0), product of:
            0.1355183 = queryWeight, product of:
              4.7066307 = idf(docFreq=1085, maxDocs=44218)
              0.02879306 = queryNorm
            0.73541105 = fieldWeight in 2427, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.7066307 = idf(docFreq=1085, maxDocs=44218)
              0.078125 = fieldNorm(doc=2427)
        0.022099946 = weight(_text_:und in 2427) [ClassicSimilarity], result of:
          0.022099946 = score(doc=2427,freq=4.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.34630734 = fieldWeight in 2427, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.078125 = fieldNorm(doc=2427)
        0.022099946 = weight(_text_:und in 2427) [ClassicSimilarity], result of:
          0.022099946 = score(doc=2427,freq=4.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.34630734 = fieldWeight in 2427, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.078125 = fieldNorm(doc=2427)
        0.049830828 = product of:
          0.099661656 = sum of:
            0.099661656 = weight(_text_:medien in 2427) [ClassicSimilarity], result of:
              0.099661656 = score(doc=2427,freq=4.0), product of:
                0.1355183 = queryWeight, product of:
                  4.7066307 = idf(docFreq=1085, maxDocs=44218)
                  0.02879306 = queryNorm
                0.73541105 = fieldWeight in 2427, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.7066307 = idf(docFreq=1085, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2427)
          0.5 = coord(1/2)
      0.25 = coord(5/20)
    
    BK
    06.00 (Information und Dokumentation: Allgemeines)
    05.38 (Neue elektronische Medien, Kommunikationswissenschaft)
    Classification
    06.00 (Information und Dokumentation: Allgemeines)
    05.38 (Neue elektronische Medien, Kommunikationswissenschaft)
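The score tree for result 1 can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); these formulas are an assumption here, chosen because they reproduce the figures listed above. The function names are illustrative and are not part of any Lucene API.

```python
import math

def classic_tf(freq):
    # tf(freq=4.0) is listed as 2.0, i.e. the square root of the term frequency
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # assumed formula; it reproduces idf(docFreq=399, maxDocs=44218) = 5.705423
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # "queryWeight" line
    field_weight = classic_tf(freq) * idf * field_norm   # "fieldWeight" line
    return query_weight * field_weight                   # "score(doc=..., freq=...)" line

# Values copied from the "allgemeines" clause of doc 2427
weight = term_weight(freq=4.0, doc_freq=399, max_docs=44218,
                     query_norm=0.02879306, field_norm=0.078125)
print(round(weight, 5))
```

The same function applied to the "medien" clause (docFreq=1085) should land close to the listed 0.099661656; small differences in the last digits are expected because the listed values were computed in single precision.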
  2. Munkelt, J.: Erstellung einer DNB-Retrieval-Testkollektion (2018) 0.04
    0.043746933 = product of:
      0.1458231 = sum of:
        0.026794761 = weight(_text_:und in 4310) [ClassicSimilarity], result of:
          0.026794761 = score(doc=4310,freq=12.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.41987535 = fieldWeight in 4310, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4310)
        0.044483 = product of:
          0.088966 = sum of:
            0.088966 = weight(_text_:kommunikationswissenschaften in 4310) [ClassicSimilarity], result of:
              0.088966 = score(doc=4310,freq=4.0), product of:
                0.15303716 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.02879306 = queryNorm
                0.5813359 = fieldWeight in 4310, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4310)
          0.5 = coord(1/2)
        0.02484572 = weight(_text_:der in 4310) [ClassicSimilarity], result of:
          0.02484572 = score(doc=4310,freq=10.0), product of:
            0.06431698 = queryWeight, product of:
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.02879306 = queryNorm
            0.38630107 = fieldWeight in 4310, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4310)
        0.026794761 = weight(_text_:und in 4310) [ClassicSimilarity], result of:
          0.026794761 = score(doc=4310,freq=12.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.41987535 = fieldWeight in 4310, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4310)
        0.017077852 = weight(_text_:des in 4310) [ClassicSimilarity], result of:
          0.017077852 = score(doc=4310,freq=2.0), product of:
            0.079736836 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.02879306 = queryNorm
            0.2141777 = fieldWeight in 4310, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4310)
        0.005827016 = weight(_text_:in in 4310) [ClassicSimilarity], result of:
          0.005827016 = score(doc=4310,freq=4.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.14877784 = fieldWeight in 4310, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4310)
      0.3 = coord(6/20)
    
    Abstract
    Since autumn 2017, subject indexing of certain media works at the German National Library (Deutsche Nationalbibliothek) has been carried out purely by machine. The quality of this procedure, which may significantly shape how libraries organize their processes, is a matter of controversy among experts. Their positions are first explained in sufficient detail, before the need for a quality assessment of the procedure and the foundations of such an assessment are set out. The central component of a future assessment is a test collection. Its creation and documentation are the focus of this thesis. In this context, the history of test collections and the requirements for successful ones are also covered. Finally, a retrieval test is carried out that demonstrates the usability of the test collection developed. Its results serve solely to verify functionality. A quality assessment of machine subject indexing, whether in particular or in general, is not carried out and is not the aim of this work.
    Content
    Bachelor's thesis, Library Science, Fakultät für Informations- und Kommunikationswissenschaften, Technische Hochschule Köln
    Imprint
    Köln : Technische Hochschule, Fakultät für Informations- und Kommunikationswissenschaften
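The top-level figure for result 2 follows from the clause weights by a sum and a coordination factor. This is a small sketch, assuming coord(m/n) is simply the fraction of query clauses that matched, which is consistent with the lines "0.5 = coord(1/2)" and "0.3 = coord(6/20)" above. The variable names are illustrative.

```python
def coord(matched, total):
    # fraction of query clauses that matched the document
    return matched / total

# Clause weights copied from the explain tree of doc 4310
clause_weights = [
    0.026794761,            # und
    0.088966 * coord(1, 2), # kommunikationswissenschaften, nested clause with its own coord
    0.02484572,             # der
    0.026794761,            # und (second occurrence in the query)
    0.017077852,            # des
    0.005827016,            # in
]

clause_sum = sum(clause_weights)
final_score = clause_sum * coord(6, 20)
print(round(clause_sum, 5), round(final_score, 5))
```

The result should agree with the listed sum 0.1458231 and product 0.043746933 up to rounding, which is why a document matching only a few of the 20 clauses ends up with a low displayed score.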
  3. Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.02
    0.020685997 = product of:
      0.08274399 = sum of:
        0.019766793 = weight(_text_:und in 3081) [ClassicSimilarity], result of:
          0.019766793 = score(doc=3081,freq=20.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.3097467 = fieldWeight in 3081, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.020078372 = weight(_text_:der in 3081) [ClassicSimilarity], result of:
          0.020078372 = score(doc=3081,freq=20.0), product of:
            0.06431698 = queryWeight, product of:
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.02879306 = queryNorm
            0.3121784 = fieldWeight in 3081, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.019766793 = weight(_text_:und in 3081) [ClassicSimilarity], result of:
          0.019766793 = score(doc=3081,freq=20.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.3097467 = fieldWeight in 3081, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.016902693 = weight(_text_:des in 3081) [ClassicSimilarity], result of:
          0.016902693 = score(doc=3081,freq=6.0), product of:
            0.079736836 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.02879306 = queryNorm
            0.21198097 = fieldWeight in 3081, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
        0.006229343 = weight(_text_:in in 3081) [ClassicSimilarity], result of:
          0.006229343 = score(doc=3081,freq=14.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.15905021 = fieldWeight in 3081, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
      0.25 = coord(5/20)
    
    Abstract
    The thesis analyzes the dynamic development and use of thesaurus terms. It also focuses on the factors that influence the number of index terms per document or journal. MeSH and the corresponding database "MEDLINE" served as the object of study. The most important findings are: 1. The MeSH thesaurus developed logarithmically through three distinct phases. Such a thesaurus should follow the equation "T = 3,076.6 ln(d) - 22,695 + 0.0039d" (T = terms, ln = natural logarithm, d = documents). To construct such a thesaurus, one therefore needs about 1,600 documents on different topics within the thesaurus's domain. The dynamic development of thesauri such as MeSH requires the introduction of one new term for every 256 newly indexed documents. 2. The distribution of thesaurus terms yielded three categories: heavily, normally, and rarely used headings. The last group is in a test phase, while in the first and second categories newly added descriptors lead to thesaurus growth. 3. There is a logarithmic relationship between the number of index terms per article and its page count for articles of between one and twenty-one pages. 4. Journal articles that appear in MEDLINE with abstracts receive almost two more descriptors. 5. The findability of non-English documents in MEDLINE is lower than that of English documents. 6. Articles from journals with an impact factor of 0 to fifteen do not receive more index terms than those from the other journals covered by MEDLINE. 7. In an indexing system, different journals carry more or less weight in their findability. The distribution of index terms per page showed that MEDLINE has three categories of publications. In addition, there are a few strongly favored journals.
    Footnote
    Dissertation, Humboldt-Universität zu Berlin - Institut für Bibliotheks- und Informationswissenschaft.
    Imprint
    Berlin : Humboldt-Universität zu Berlin / Institut für Bibliotheks- und Informationswissenschaft
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
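The growth equation quoted in the abstract of result 3 can be evaluated directly. The sketch below assumes the coefficients use German number formatting (3.076,6 read as 3076.6 and 22.695 read as 22695); that reading is an assumption, though it is consistent with the abstract's remarks about roughly 1,600 starting documents and one new term per 256 documents. The function names are illustrative.

```python
import math

def thesaurus_terms(d):
    # T = 3076.6 * ln(d) - 22695 + 0.0039 * d, with d = number of documents
    return 3076.6 * math.log(d) - 22695 + 0.0039 * d

def growth_rate(d):
    # derivative dT/dd: new terms per additional document
    return 3076.6 / d + 0.0039

# The curve crosses zero between 1,500 and 1,600 documents
print(thesaurus_terms(1500) < 0 < thesaurus_terms(1600))

# For very large d the rate approaches 0.0039, i.e. about one term per 256 documents
print(round(1 / 0.0039))
```

Under this reading, the logarithmic part dominates early growth, while the linear part sets the long-run rate.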
  4. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.01
    0.013326541 = product of:
      0.0666327 = sum of:
        0.015627023 = weight(_text_:und in 4157) [ClassicSimilarity], result of:
          0.015627023 = score(doc=4157,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.24487628 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
        0.015873346 = weight(_text_:der in 4157) [ClassicSimilarity], result of:
          0.015873346 = score(doc=4157,freq=2.0), product of:
            0.06431698 = queryWeight, product of:
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.02879306 = queryNorm
            0.2467987 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
        0.015627023 = weight(_text_:und in 4157) [ClassicSimilarity], result of:
          0.015627023 = score(doc=4157,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.24487628 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
        0.01950531 = product of:
          0.03901062 = sum of:
            0.03901062 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.03901062 = score(doc=4157,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.5 = coord(1/2)
      0.2 = coord(4/20)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
  5. Milstead, J.L.: Thesauri in a full-text world (1998) 0.01
    0.0108660795 = product of:
      0.043464318 = sum of:
        0.007813511 = weight(_text_:und in 2337) [ClassicSimilarity], result of:
          0.007813511 = score(doc=2337,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.12243814 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.007813511 = weight(_text_:und in 2337) [ClassicSimilarity], result of:
          0.007813511 = score(doc=2337,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.12243814 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.012198467 = weight(_text_:des in 2337) [ClassicSimilarity], result of:
          0.012198467 = score(doc=2337,freq=2.0), product of:
            0.079736836 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.02879306 = queryNorm
            0.15298408 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.005886175 = weight(_text_:in in 2337) [ClassicSimilarity], result of:
          0.005886175 = score(doc=2337,freq=8.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.15028831 = fieldWeight in 2337, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.009752655 = product of:
          0.01950531 = sum of:
            0.01950531 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
              0.01950531 = score(doc=2337,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.19345059 = fieldWeight in 2337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
          0.5 = coord(1/2)
      0.25 = coord(5/20)
    
    Abstract
    Despite early claims to the contrary, thesauri continue to find use as access tools for information in the full-text environment. Their mode of use is changing, but this change actually represents an expansion rather than a contradiction of their utility. Thesauri and similar vocabulary tools can complement full-text access by aiding users in focusing their searches, by supplementing the linguistic analysis of the text search engine, and even by serving as one of the tools used by the linguistic engine for its analysis. While human indexing continues to be used for many databases, the trend is to increase the use of machine aids for this purpose. All machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection. In the 21st century, the balance of effort between human and machine will change at both input and output, but thesauri will continue to play an important role for the foreseeable future.
    Date
    22. 9.1997 19:16:05
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  6. Salton, G.: Future prospects for text-based information retrieval (1990) 0.01
    0.010017176 = product of:
      0.06678117 = sum of:
        0.018752426 = weight(_text_:und in 2327) [ClassicSimilarity], result of:
          0.018752426 = score(doc=2327,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.29385152 = fieldWeight in 2327, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.09375 = fieldNorm(doc=2327)
        0.018752426 = weight(_text_:und in 2327) [ClassicSimilarity], result of:
          0.018752426 = score(doc=2327,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.29385152 = fieldWeight in 2327, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.09375 = fieldNorm(doc=2327)
        0.029276319 = weight(_text_:des in 2327) [ClassicSimilarity], result of:
          0.029276319 = score(doc=2327,freq=2.0), product of:
            0.079736836 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.02879306 = queryNorm
            0.36716178 = fieldWeight in 2327, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.09375 = fieldNorm(doc=2327)
      0.15 = coord(3/20)
    
    Source
    Pragmatische Aspekte beim Entwurf und Betrieb von Informationssystemen: Proc. des 1. Int. Symposiums für Informationswissenschaft, Universität Konstanz, 17.-19.10.1990. Hrsg.: J. Herget u. R. Kuhlen
  7. Siebenkäs, A.; Markscheffel, B.: Conception of a workflow for the semi-automatic construction of a thesaurus for the German printing industry (2015) 0.01
    0.008956539 = product of:
      0.0447827 = sum of:
        0.010938915 = weight(_text_:und in 2091) [ClassicSimilarity], result of:
          0.010938915 = score(doc=2091,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.17141339 = fieldWeight in 2091, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
        0.010938915 = weight(_text_:und in 2091) [ClassicSimilarity], result of:
          0.010938915 = score(doc=2091,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.17141339 = fieldWeight in 2091, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
        0.017077852 = weight(_text_:des in 2091) [ClassicSimilarity], result of:
          0.017077852 = score(doc=2091,freq=2.0), product of:
            0.079736836 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.02879306 = queryNorm
            0.2141777 = fieldWeight in 2091, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
        0.005827016 = weight(_text_:in in 2091) [ClassicSimilarity], result of:
          0.005827016 = score(doc=2091,freq=4.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.14877784 = fieldWeight in 2091, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2091)
      0.2 = coord(4/20)
    
    Abstract
    During the BMWi-funded project "Print-IT", the need for a thesaurus-based, uniform, and consistent language for the German printing industry became evident. In this paper we introduce a semi-automatic construction approach for such a thesaurus and present a workflow that helps users generate typical thesaurus information structures from relevant digitized resources with the help of common IT tools.
    Source
    Re:inventing information science in the networked society: Proceedings of the 14th International Symposium on Information Science, Zadar/Croatia, 19th-21st May 2015. Eds.: F. Pehar, C. Schloegl u. C. Wolff
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  8. Willis, C.; Losee, R.M.: A random walk on an ontology : using thesaurus structure for automatic subject indexing (2013) 0.01
    0.005941176 = product of:
      0.029705878 = sum of:
        0.0062508085 = weight(_text_:und in 1016) [ClassicSimilarity], result of:
          0.0062508085 = score(doc=1016,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.09795051 = fieldWeight in 1016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
        0.0062508085 = weight(_text_:und in 1016) [ClassicSimilarity], result of:
          0.0062508085 = score(doc=1016,freq=2.0), product of:
            0.06381599 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.02879306 = queryNorm
            0.09795051 = fieldWeight in 1016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
        0.009758773 = weight(_text_:des in 1016) [ClassicSimilarity], result of:
          0.009758773 = score(doc=1016,freq=2.0), product of:
            0.079736836 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.02879306 = queryNorm
            0.12238726 = fieldWeight in 1016, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
        0.0074454886 = weight(_text_:in in 1016) [ClassicSimilarity], result of:
          0.0074454886 = score(doc=1016,freq=20.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.19010136 = fieldWeight in 1016, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=1016)
      0.2 = coord(4/20)
    
    Abstract
    Relationships between terms and features are an essential component of thesauri, ontologies, and a range of controlled vocabularies. In this article, we describe ways to identify important concepts in documents using the relationships in a thesaurus or other vocabulary structures. We introduce a methodology for the analysis and modeling of the indexing process based on a weighted random walk algorithm. The primary goal of this research is the analysis of the contribution of thesaurus structure to the indexing process. The resulting models are evaluated in the context of automatic subject indexing using four collections of documents pre-indexed with 4 different thesauri (AGROVOC [UN Food and Agriculture Organization], high-energy physics taxonomy [HEP], National Agricultural Library Thesaurus [NALT], and medical subject headings [MeSH]). We also introduce a thesaurus-centric matching algorithm intended to improve the quality of candidate concepts. In all cases, the weighted random walk improves automatic indexing performance over matching alone with an increase in average precision (AP) of 9% for HEP, 11% for MeSH, 35% for NALT, and 37% for AGROVOC. The results of the analysis support our hypothesis that subject indexing is in part a browsing process, and that using the vocabulary and its structure in a thesaurus contributes to the indexing process. The amount that the vocabulary structure contributes was found to differ among the 4 thesauri, possibly due to the vocabulary used in the corresponding thesauri and the structural relationships between the terms. Each of the thesauri and the manual indexing associated with it is characterized using the methods developed here.
    Content
    Correction of a reference in: JASIST 64(2013) no.8, p.1757.
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  9. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.00
    0.004373513 = product of:
      0.029156754 = sum of:
        0.009524008 = weight(_text_:der in 5480) [ClassicSimilarity], result of:
          0.009524008 = score(doc=5480,freq=2.0), product of:
            0.06431698 = queryWeight, product of:
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.02879306 = queryNorm
            0.14807922 = fieldWeight in 5480, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
        0.014638159 = weight(_text_:des in 5480) [ClassicSimilarity], result of:
          0.014638159 = score(doc=5480,freq=2.0), product of:
            0.079736836 = queryWeight, product of:
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.02879306 = queryNorm
            0.18358089 = fieldWeight in 5480, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.7693076 = idf(docFreq=7536, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
        0.0049945856 = weight(_text_:in in 5480) [ClassicSimilarity], result of:
          0.0049945856 = score(doc=5480,freq=4.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.12752387 = fieldWeight in 5480, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
      0.15 = coord(3/20)
    
    Abstract
    (Automatic) document classification is generally defined as content-based assignment of one or more predefined categories to documents. Usually, machine learning, statistical pattern recognition, or neural network approaches are used to construct classifiers automatically. In this paper we thoroughly evaluate a wide variety of these methods on a document classification task for German text. We evaluate different feature construction and selection methods and various classifiers. Our main results are: (1) feature selection is necessary not only to reduce learning and classification time, but also to avoid overfitting (even for Support Vector Machines); (2) surprisingly, our morphological analysis does not improve classification quality compared to a letter 5-gram approach; (3) Support Vector Machines are significantly better than all other classification methods.
    Source
    Informationskompetenz - Basiskompetenz in der Informationsgesellschaft: Proceedings des 7. Internationalen Symposiums für Informationswissenschaft (ISI 2000), Ed.: G. Knorz and R. Kuhlen
  10. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.00
    0.0040626377 = product of:
      0.040626377 = sum of:
        0.00941788 = weight(_text_:in in 402) [ClassicSimilarity], result of:
          0.00941788 = score(doc=402,freq=2.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.24046129 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.031208497 = product of:
          0.062416993 = sum of:
            0.062416993 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.062416993 = score(doc=402,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  11. Salton, G.: Fast document classification in automatic information retrieval (1978) 0.00
    0.0033553482 = product of:
      0.03355348 = sum of:
        0.025397355 = weight(_text_:der in 2331) [ClassicSimilarity], result of:
          0.025397355 = score(doc=2331,freq=8.0), product of:
            0.06431698 = queryWeight, product of:
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.02879306 = queryNorm
            0.3948779 = fieldWeight in 2331, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.0625 = fieldNorm(doc=2331)
        0.0081561245 = weight(_text_:in in 2331) [ClassicSimilarity], result of:
          0.0081561245 = score(doc=2331,freq=6.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.2082456 = fieldWeight in 2331, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=2331)
      0.1 = coord(2/20)
    
    Abstract
    A classified or clustered file is one where related or similar records are grouped into classes or clusters of items in such a way that all items within a cluster are jointly retrievable. Clustered files are easily adapted to broad and narrow search strategies, and simple file updating methods are available. An inexpensive file clustering method applicable to large files is given together with appropriate file search methods.
    Source
    Kooperation in der Klassifikation I. Proc. der Sekt.1-3 der 2. Fachtagung der Gesellschaft für Klassifikation, Frankfurt-Hoechst, 6.-7.4.1978. Bearb.: W. Dahlberg
  12. Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.00
    0.0029700466 = product of:
      0.029700466 = sum of:
        0.010195156 = weight(_text_:in in 1952) [ClassicSimilarity], result of:
          0.010195156 = score(doc=1952,freq=6.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.260307 = fieldWeight in 1952, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
        0.01950531 = product of:
          0.03901062 = sum of:
            0.03901062 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
              0.03901062 = score(doc=1952,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.38690117 = fieldWeight in 1952, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1952)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Date
    16. 8.1998 12:51:22
    Footnote
    Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.513-517.
    Source
    Proceedings of the 11th annual conference on research and development in information retrieval. Ed.: Y. Chiaramella
  13. Lepsky, K.: Automatische Indexierung in der Inhaltserschließung (1998) 0.00
    0.0026111426 = product of:
      0.026111426 = sum of:
        0.019048017 = weight(_text_:der in 1283) [ClassicSimilarity], result of:
          0.019048017 = score(doc=1283,freq=2.0), product of:
            0.06431698 = queryWeight, product of:
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.02879306 = queryNorm
            0.29615843 = fieldWeight in 1283, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.09375 = fieldNorm(doc=1283)
        0.00706341 = weight(_text_:in in 1283) [ClassicSimilarity], result of:
          0.00706341 = score(doc=1283,freq=2.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.18034597 = fieldWeight in 1283, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.09375 = fieldNorm(doc=1283)
      0.1 = coord(2/20)
    
  14. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.00
    0.0025391486 = product of:
      0.025391486 = sum of:
        0.005886175 = weight(_text_:in in 2759) [ClassicSimilarity], result of:
          0.005886175 = score(doc=2759,freq=2.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.15028831 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
        0.01950531 = product of:
          0.03901062 = sum of:
            0.03901062 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
              0.03901062 = score(doc=2759,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.38690117 = fieldWeight in 2759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2759)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Date
    1. 2.2016 18:25:22
    Series
    Lecture notes in computer science ; 9398
  15. Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.00
    0.0025307748 = product of:
      0.025307748 = sum of:
        0.011654032 = weight(_text_:in in 5001) [ClassicSimilarity], result of:
          0.011654032 = score(doc=5001,freq=16.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.29755569 = fieldWeight in 5001, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
        0.013653717 = product of:
          0.027307434 = sum of:
            0.027307434 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
              0.027307434 = score(doc=5001,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.2708308 = fieldWeight in 5001, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5001)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Abstract
    A study was done to test the effectiveness of retrieval using title word searching. It was based on actual search profiles used in the Mechanized Information Center at Ohio State University, in order to replicate actual searching conditions as closely as possible. Fewer than 50% of the relevant titles were retrieved by keywords in titles. The low rate of retrieval can be attributed to three sources: the titles themselves, user and information specialist ignorance of the subject vocabulary in use, and general language problems. Across fields it was found that the social sciences had the best retrieval rate, with science having the next best, and arts and humanities the lowest. Ways to enhance and supplement keyword in title searching on the computer and in printed indexes are discussed.
    Date
    14. 3.1996 13:22:21
  16. Riloff, E.: An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.00
    0.0022263695 = product of:
      0.022263695 = sum of:
        0.006659447 = weight(_text_:in in 6752) [ClassicSimilarity], result of:
          0.006659447 = score(doc=6752,freq=4.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.17003182 = fieldWeight in 6752, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
        0.015604248 = product of:
          0.031208497 = sum of:
            0.031208497 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.031208497 = score(doc=6752,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Abstract
    AutoSlog is a system that addresses the knowledge engineering bottleneck for information extraction. AutoSlog automatically creates domain specific dictionaries for information extraction, given an appropriate training corpus. Describes experiments with AutoSlog in terrorism, joint ventures and microelectronics domains. Compares the performance of AutoSlog across the 3 domains, discusses the lessons learned and presents results from 2 experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog
    Date
    6. 3.1997 16:22:15
  17. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.00
    0.0021894362 = product of:
      0.021894362 = sum of:
        0.008240645 = weight(_text_:in in 530) [ClassicSimilarity], result of:
          0.008240645 = score(doc=530,freq=8.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.21040362 = fieldWeight in 530, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=530)
        0.013653717 = product of:
          0.027307434 = sum of:
            0.027307434 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
              0.027307434 = score(doc=530,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.2708308 = fieldWeight in 530, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=530)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques, in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO), to the problem of document indexing by referring to a system which incorporates natural language processing techniques to determine the subject of the text of documents and to associate them with relevant semantic indexes. Describes briefly the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
    Source
    International forum on information and documentation. 22(1997) no.1, S.17-28
  18. Williams, R.V.: Hans Peter Luhn and Herbert M. Ohlman : their roles in the origins of keyword-in-context/permutation automatic indexing (2010) 0.00
    0.0020854801 = product of:
      0.020854801 = sum of:
        0.012698677 = weight(_text_:der in 3440) [ClassicSimilarity], result of:
          0.012698677 = score(doc=3440,freq=2.0), product of:
            0.06431698 = queryWeight, product of:
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.02879306 = queryNorm
            0.19743896 = fieldWeight in 3440, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.2337668 = idf(docFreq=12875, maxDocs=44218)
              0.0625 = fieldNorm(doc=3440)
        0.0081561245 = weight(_text_:in in 3440) [ClassicSimilarity], result of:
          0.0081561245 = score(doc=3440,freq=6.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.2082456 = fieldWeight in 3440, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=3440)
      0.1 = coord(2/20)
    
    Abstract
    The invention of automatic indexing using a keyword-in-context approach has generally been attributed solely to Hans Peter Luhn of IBM. This article shows that credit for this invention belongs equally to Luhn and Herbert Ohlman of the System Development Corporation. It also traces the origins of title derivative automatic indexing, its development and implementation, and current status.
    Theme
    Geschichte der Sacherschließung
  19. Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.00
    0.0020790326 = product of:
      0.020790325 = sum of:
        0.0071366085 = weight(_text_:in in 2673) [ClassicSimilarity], result of:
          0.0071366085 = score(doc=2673,freq=6.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.1822149 = fieldWeight in 2673, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
        0.013653717 = product of:
          0.027307434 = sum of:
            0.027307434 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
              0.027307434 = score(doc=2673,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.2708308 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Abstract
    Examines techniques that discover features in sets of pre-categorized documents, such that similar documents can be found on the WWW. Examines techniques which will classify training examples with high accuracy, then explains why this is not necessarily useful. Describes a method for extracting word clusters from the raw document features. Results show that the clustering technique is successful in discovering word groups in personal Web pages which can be used to find similar information on the WWW.
    Date
    1. 8.1996 22:08:06
  20. Ward, M.L.: The future of the human indexer (1996) 0.00
    0.001960032 = product of:
      0.019600319 = sum of:
        0.007897133 = weight(_text_:in in 7244) [ClassicSimilarity], result of:
          0.007897133 = score(doc=7244,freq=10.0), product of:
            0.039165888 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.02879306 = queryNorm
            0.20163295 = fieldWeight in 7244, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
        0.011703186 = product of:
          0.023406371 = sum of:
            0.023406371 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
              0.023406371 = score(doc=7244,freq=2.0), product of:
                0.10082839 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02879306 = queryNorm
                0.23214069 = fieldWeight in 7244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7244)
          0.5 = coord(1/2)
      0.1 = coord(2/20)
    
    Abstract
    Considers the principles of indexing and the intellectual skills involved in order to determine what automatic indexing systems would be required in order to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and what depth to index; reading skills; abstracting skills; and classification skills. Illustrates these features with a detailed description of abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system and using the criteria described for human indexers. At present, it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form, it may be a useful productivity tool for dealing with large quantities of low grade texts (should they be wanted in the database).
    Date
    9. 2.1997 18:44:22

Types

  • a 170
  • el 19
  • m 4
  • s 4
  • x 1