Search (58 results, page 1 of 3)

  • type_ss:"a"
  • theme_ss:"Automatisches Indexieren"
  1. Griffiths, A.; Robinson, L.A.; Willett, P.: Hierarchic agglomerative clustering methods for automatic document classification (1984) 0.02
    0.019129448 = product of:
      0.22955337 = sum of:
        0.22955337 = weight(_text_:205 in 2414) [ClassicSimilarity], result of:
          0.22955337 = score(doc=2414,freq=2.0), product of:
            0.2057144 = queryWeight, product of:
              6.312392 = idf(docFreq=217, maxDocs=44218)
              0.032588977 = queryNorm
            1.1158838 = fieldWeight in 2414, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.312392 = idf(docFreq=217, maxDocs=44218)
              0.125 = fieldNorm(doc=2414)
      0.083333336 = coord(1/12)
    
    Source
    Journal of documentation. 40(1984) no.3, S.175-205
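    The relevance values above follow Lucene's ClassicSimilarity (TF-IDF) explain format: each final score is the product of queryWeight, fieldWeight and the coord factor. As a minimal sketch, assuming standard ClassicSimilarity semantics (the variable names are ours, not part of the search interface), the factors displayed for result 1 recombine as follows:

        # Recompute the explain tree for result 1 (term "205" in doc 2414),
        # assuming Lucene ClassicSimilarity: score = queryWeight * fieldWeight * coord
        import math

        idf = 6.312392            # idf(docFreq=217, maxDocs=44218)
        query_norm = 0.032588977  # queryNorm
        tf = math.sqrt(2.0)       # tf(freq=2.0) = sqrt(freq) = 1.4142135
        field_norm = 0.125        # fieldNorm(doc=2414)
        coord = 1.0 / 12.0        # coord(1/12): 1 of 12 query clauses matched

        query_weight = idf * query_norm       # 0.2057144
        field_weight = tf * idf * field_norm  # 1.1158838
        score = query_weight * field_weight * coord

        assert math.isclose(score, 0.019129448, rel_tol=1e-6)
        print(score)  # ≈ 0.019129448, the displayed score for result 1

    The same pattern (tf = sqrt(freq), idf, fieldNorm, queryNorm, coord) underlies every explain tree in this result list; only the matched term, field norm and coord change.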
  2. Pfeifer, U.: Entwicklung linear-iterativer und logistischer Indexierungsfunktionen (1991) 0.01
    0.010936644 = product of:
      0.13123973 = sum of:
        0.13123973 = weight(_text_:informatik in 794) [ClassicSimilarity], result of:
          0.13123973 = score(doc=794,freq=2.0), product of:
            0.1662844 = queryWeight, product of:
              5.1024737 = idf(docFreq=730, maxDocs=44218)
              0.032588977 = queryNorm
            0.7892486 = fieldWeight in 794, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.1024737 = idf(docFreq=730, maxDocs=44218)
              0.109375 = fieldNorm(doc=794)
      0.083333336 = coord(1/12)
    
    Series
    Informatik-Fachberichte; 289
  3. Volk, M.; Mittermaier, H.; Schurig, A.; Biedassek, T.: Halbautomatische Volltextanalyse, Datenbankaufbau und Document Retrieval (1992) 0.01
    0.008369134 = product of:
      0.1004296 = sum of:
        0.1004296 = weight(_text_:205 in 2571) [ClassicSimilarity], result of:
          0.1004296 = score(doc=2571,freq=2.0), product of:
            0.2057144 = queryWeight, product of:
              6.312392 = idf(docFreq=217, maxDocs=44218)
              0.032588977 = queryNorm
            0.48819917 = fieldWeight in 2571, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.312392 = idf(docFreq=217, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2571)
      0.083333336 = coord(1/12)
    
    Pages
    S.205-214
  4. Zimmermann, H.: Automatische Indexierung: Entwicklung und Perspektiven (1983) 0.01
    0.0068737715 = product of:
      0.08248526 = sum of:
        0.08248526 = weight(_text_:systeme in 2318) [ClassicSimilarity], result of:
          0.08248526 = score(doc=2318,freq=2.0), product of:
            0.17439179 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.032588977 = queryNorm
            0.4729882 = fieldWeight in 2318, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0625 = fieldNorm(doc=2318)
      0.083333336 = coord(1/12)
    
    Abstract
    Automatic indexing, as a subfield of subject indexing, is now used in practice in a number of areas, above all in specialist information and communication. Extremely simple systems dominate, which (still) require considerable adaptation by the user to the respective system strategy. Taking into account the concept of the unity of information indexing and retrieval, higher-level ("more intelligent") methods are presented that are intended both to relieve the information seeker and to improve search results
  5. Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.01
    0.0062368447 = product of:
      0.037421066 = sum of:
        0.021967318 = weight(_text_:internet in 2673) [ClassicSimilarity], result of:
          0.021967318 = score(doc=2673,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.22832564 = fieldWeight in 2673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
        0.015453748 = product of:
          0.030907497 = sum of:
            0.030907497 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
              0.030907497 = score(doc=2673,freq=2.0), product of:
                0.11412105 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032588977 = queryNorm
                0.2708308 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.5 = coord(1/2)
      0.16666667 = coord(2/12)
    
    Date
    1. 8.1996 22:08:06
    Theme
    Internet
  6. Tzeras, K.: Zur Aufwandsabschätzung bei der Entwicklung eines Indexierungswörterbuches (1991) 0.01
    0.005468322 = product of:
      0.06561986 = sum of:
        0.06561986 = weight(_text_:informatik in 792) [ClassicSimilarity], result of:
          0.06561986 = score(doc=792,freq=2.0), product of:
            0.1662844 = queryWeight, product of:
              5.1024737 = idf(docFreq=730, maxDocs=44218)
              0.032588977 = queryNorm
            0.3946243 = fieldWeight in 792, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.1024737 = idf(docFreq=730, maxDocs=44218)
              0.0546875 = fieldNorm(doc=792)
      0.083333336 = coord(1/12)
    
    Series
    Informatik-Fachberichte; 289
  7. Daudaravicius, V.: ¬A framework for keyphrase extraction from scientific journals (2016) 0.01
    0.005468322 = product of:
      0.06561986 = sum of:
        0.06561986 = weight(_text_:informatik in 2930) [ClassicSimilarity], result of:
          0.06561986 = score(doc=2930,freq=2.0), product of:
            0.1662844 = queryWeight, product of:
              5.1024737 = idf(docFreq=730, maxDocs=44218)
              0.032588977 = queryNorm
            0.3946243 = fieldWeight in 2930, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.1024737 = idf(docFreq=730, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2930)
      0.083333336 = coord(1/12)
    
    Field
    Informatik
  8. Rapke, K.: Automatische Indexierung von Volltexten für die Gruner+Jahr Pressedatenbank (2001) 0.01
    0.0051553287 = product of:
      0.061863944 = sum of:
        0.061863944 = weight(_text_:systeme in 6386) [ClassicSimilarity], result of:
          0.061863944 = score(doc=6386,freq=2.0), product of:
            0.17439179 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.032588977 = queryNorm
            0.35474116 = fieldWeight in 6386, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.046875 = fieldNorm(doc=6386)
      0.083333336 = coord(1/12)
    
    Abstract
    Retrieval tests are the most widely accepted method for justifying new subject indexing procedures against traditional ones. As part of a diploma thesis, two fundamentally different systems for automatic subject indexing were tested and evaluated on the press database of the publishing house Gruner + Jahr (G+J). The study compared natural-language retrieval with Boolean retrieval. The two systems are Autonomy, by Autonomy Inc., and DocCat, which IBM adapted to the database structure of the G+J press database. The former is a probabilistic system based on natural-language retrieval; DocCat, in contrast, is based on Boolean retrieval and is a learning system that indexes on the basis of an intellectually created training template. Methodologically, the evaluation starts from the real application context of the G+J text documentation department. The tests are assessed from both statistical and qualitative points of view. One result of the tests is that DocCat shows some shortcomings compared with intellectual subject indexing that still need to be remedied, while Autonomy's natural-language retrieval is not usable in this setting and for the specific requirements of the G+J text documentation
  9. Lepsky, K.: Vom OPAC zum Hyperkatalog : Daten und Indexierung (1996) 0.00
    0.0049703075 = product of:
      0.059643686 = sum of:
        0.059643686 = product of:
          0.11928737 = sum of:
            0.11928737 = weight(_text_:allgemein in 7726) [ClassicSimilarity], result of:
              0.11928737 = score(doc=7726,freq=2.0), product of:
                0.17123379 = queryWeight, product of:
                  5.254347 = idf(docFreq=627, maxDocs=44218)
                  0.032588977 = queryNorm
                0.69663453 = fieldWeight in 7726, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.254347 = idf(docFreq=627, maxDocs=44218)
                  0.09375 = fieldNorm(doc=7726)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Theme
    Katalogfragen allgemein
  10. Rapke, K.: Automatische Indexierung von Volltexten für die Gruner+Jahr Pressedatenbank (2001) 0.00
    0.004296107 = product of:
      0.051553283 = sum of:
        0.051553283 = weight(_text_:systeme in 5863) [ClassicSimilarity], result of:
          0.051553283 = score(doc=5863,freq=2.0), product of:
            0.17439179 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.032588977 = queryNorm
            0.2956176 = fieldWeight in 5863, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5863)
      0.083333336 = coord(1/12)
    
    Abstract
    Retrieval tests are the most widely accepted method for justifying new subject indexing procedures against traditional ones. As part of a diploma thesis, two fundamentally different systems for automatic subject indexing were tested and evaluated on the press database of the publishing house Gruner + Jahr (G+J). The study compared natural-language retrieval with Boolean retrieval. The two systems are Autonomy, by Autonomy Inc., and DocCat, which IBM adapted to the database structure of the G+J press database. The former is a probabilistic system based on natural-language retrieval; DocCat, in contrast, is based on Boolean retrieval and is a learning system that indexes on the basis of an intellectually created training template. Methodologically, the evaluation starts from the real application context of the G+J text documentation department. The tests are assessed from both statistical and qualitative points of view. One result of the tests is that DocCat shows some shortcomings compared with intellectual subject indexing that still need to be remedied, while Autonomy's natural-language retrieval is not usable in this setting and for the specific requirements of the G+J text documentation
  11. Nohr, H.: Theorie des Information Retrieval II : Automatische Indexierung (2004) 0.00
    0.004296107 = product of:
      0.051553283 = sum of:
        0.051553283 = weight(_text_:systeme in 8) [ClassicSimilarity], result of:
          0.051553283 = score(doc=8,freq=2.0), product of:
            0.17439179 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.032588977 = queryNorm
            0.2956176 = fieldWeight in 8, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0390625 = fieldNorm(doc=8)
      0.083333336 = coord(1/12)
    
    Abstract
    A large share of the information held in organizations - by some estimates up to 80% - exists in unstructured documents. In the past, solutions were developed for managing structured information; the same now needs to be achieved for unstructured information. Alongside data mining methods for data analysis, there are attempts to apply text mining (Lit. 06) to text analysis. To be able to search a repository for specific documents, effective content recognition and labelling is required, i.e. assigning documents to subject areas or storing suitable index terms as metadata. For this purpose the document contents must be represented, i.e. indexed or classified. Document analysis also serves to control the flow of information and documents, the goal being a "workflow on receipt of mail": based on recognized features, document analysis can automatically route incoming mail to the responsible clerk or organizational unit within the company (invoices to accounting, orders to sales). Document analyses are also needed when staff are to receive relevant documents automatically via a personal information filter. Because of this system integration, indexing solutions are built into the feature set of DMS and workflow products. Fig. 1 shows an architecture of such systems, with the indexing or classification function at the centre of the application, where it serves both the representation of documents (metadata) and later retrieval.
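    As a purely illustrative sketch of the feature-based routing described in this abstract (the categories, keywords, and function below are invented for the example, not taken from the article), incoming documents could be assigned to an organizational unit like this:

        # Toy router: assign an incoming document to a department by matching
        # recognized keywords (illustrative vocabulary only).
        ROUTES = {
            "accounting": {"invoice", "payment", "rechnung"},
            "sales": {"order", "quote", "auftrag"},
        }

        def route_document(text: str, default: str = "inbox") -> str:
            tokens = set(text.lower().split())
            for department, keywords in ROUTES.items():
                if tokens & keywords:
                    return department
            return default

        print(route_document("Rechnung Nr. 4711, Zahlung bis 30.06."))  # -> accounting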
  12. MacDougall, S.: Rethinking indexing : the impact of the Internet (1996) 0.00
    0.004151433 = product of:
      0.049817193 = sum of:
        0.049817193 = weight(_text_:internet in 704) [ClassicSimilarity], result of:
          0.049817193 = score(doc=704,freq=14.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.5177939 = fieldWeight in 704, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=704)
      0.083333336 = coord(1/12)
    
    Abstract
    Considers the challenge to professional indexers posed by the Internet. Indexing and searching on the Internet appear to have taken a retrograde step, as well-developed and efficient information retrieval techniques have been replaced by cruder ones involving automatic keyword indexing and frequency ranking, leading to large retrieval sets and low precision. This is made worse by the apparent acceptance of this poor performance by Internet users and the feeling, on the part of indexers, that they are being bypassed by the producers of these hyperlinked menus and search engines. Key issues are: how far 'human' indexing will still be required in the Internet environment; how indexing techniques will have to change to stay relevant; and the future role of indexers. The challenge facing indexers is to adapt their skills to the online environment and to convince publishers of the need for efficient indexes on the Internet
    Theme
    Internet
  13. Thirion, B.; Leroy, J.P.; Baudic, F.; Douyère, M.; Piot, J.; Darmoni, S.J.: SDI selecting, describing, and indexing : did you mean automatically? (2001) 0.00
    0.0031381883 = product of:
      0.03765826 = sum of:
        0.03765826 = weight(_text_:internet in 6198) [ClassicSimilarity], result of:
          0.03765826 = score(doc=6198,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.3914154 = fieldWeight in 6198, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.09375 = fieldNorm(doc=6198)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  14. Bloomfield, M.: Indexing : neglected and poorly understood (2001) 0.00
    0.0031381883 = product of:
      0.03765826 = sum of:
        0.03765826 = weight(_text_:internet in 5439) [ClassicSimilarity], result of:
          0.03765826 = score(doc=5439,freq=8.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.3914154 = fieldWeight in 5439, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=5439)
      0.083333336 = coord(1/12)
    
    Abstract
    The growth of the Internet has highlighted the use of machine indexing. The difficulties in using the Internet as a searching device can be frustrating. The use of the term "Python" is given as an example. Machine indexing is noted as "rotten" and human indexing as "capricious." The problem seems to be a lack of a theoretical foundation for the art of indexing. What librarians have learned over the last hundred years has yet to yield a consistent approach to what really works best in preparing index terms and in the ability of our customers to search the various indexes. An attempt is made to consider the elements of indexing, their pros and cons. The argument is made that machine indexing is far too prolific in its production of index terms. Neither librarians nor computer programmers have made much progress to improve Internet indexing. Human indexing has had the same problems for over fifty years.
    Theme
    Internet
  15. Hirawa, M.: Role of keywords in the network searching era (1998) 0.00
    0.0029587122 = product of:
      0.035504546 = sum of:
        0.035504546 = weight(_text_:internet in 3446) [ClassicSimilarity], result of:
          0.035504546 = score(doc=3446,freq=4.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.36902997 = fieldWeight in 3446, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0625 = fieldNorm(doc=3446)
      0.083333336 = coord(1/12)
    
    Abstract
    A survey of Japanese OPACs available on the Internet was conducted on the use of keywords for subject access. The findings suggest that present OPACs are not capable of storing subject-oriented information. Currently available keyword access derives from a merely title-based retrieval system. Contents data should be added to bibliographic records as an efficient way of providing subject access, and costings for this process should be estimated. Word standardisation issues must also be addressed
    Theme
    Internet
  16. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.00
    0.0029435712 = product of:
      0.035322852 = sum of:
        0.035322852 = product of:
          0.070645705 = sum of:
            0.070645705 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.070645705 = score(doc=402,freq=2.0), product of:
                0.11412105 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032588977 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  17. Gödert, W.; Lepsky, K.: Semantische Umfeldsuche im Information Retrieval (1998) 0.00
    0.002899346 = product of:
      0.03479215 = sum of:
        0.03479215 = product of:
          0.0695843 = sum of:
            0.0695843 = weight(_text_:allgemein in 606) [ClassicSimilarity], result of:
              0.0695843 = score(doc=606,freq=2.0), product of:
                0.17123379 = queryWeight, product of:
                  5.254347 = idf(docFreq=627, maxDocs=44218)
                  0.032588977 = queryNorm
                0.40637016 = fieldWeight in 606, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.254347 = idf(docFreq=627, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=606)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Theme
    Katalogfragen allgemein
  18. McKiernan, G.: Automated categorisation of Web resources : a profile of selected projects, research, products, and services (1996) 0.00
    0.002615157 = product of:
      0.031381883 = sum of:
        0.031381883 = weight(_text_:internet in 2533) [ClassicSimilarity], result of:
          0.031381883 = score(doc=2533,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.3261795 = fieldWeight in 2533, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.078125 = fieldNorm(doc=2533)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  19. Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.00
    0.0025888733 = product of:
      0.03106648 = sum of:
        0.03106648 = weight(_text_:internet in 7209) [ClassicSimilarity], result of:
          0.03106648 = score(doc=7209,freq=4.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.32290122 = fieldWeight in 7209, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7209)
      0.083333336 = coord(1/12)
    
    Source
    Internet world and document delivery world international 94: Proceedings of the 2nd Annual Conference, London, May 1994
    Theme
    Internet
  20. Cheng, K.-H.: Automatic identification for topics of electronic documents (1997) 0.00
    0.0025888733 = product of:
      0.03106648 = sum of:
        0.03106648 = weight(_text_:internet in 1811) [ClassicSimilarity], result of:
          0.03106648 = score(doc=1811,freq=4.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.32290122 = fieldWeight in 1811, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1811)
      0.083333336 = coord(1/12)
    
    Abstract
    With the rapid rise in the number of electronic documents on the Internet, how to effectively assign topics to documents becomes an important issue. Current research in this area focuses on the behaviour of nouns in documents. Proposes, however, that nouns and verbs together contribute to the process of topic identification. Constructs a mathematical model taking into account the following factors: word importance, word frequency, word co-occurrence, and word distance. Preliminary experiments show that the performance of the proposed model is equivalent to that of a human being
    Theme
    Internet
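    The abstract names the model's factors but not how they are combined; the following is a rough, hypothetical sketch (the weighting is invented for illustration and is not the paper's actual model) of scoring a candidate topic term from those four factors:

        # Hypothetical combination of the four factors named in the abstract:
        # word importance, frequency, co-occurrence, and distance.
        from dataclasses import dataclass

        @dataclass
        class TermStats:
            importance: float    # e.g. weighted higher for nouns and verbs
            frequency: int       # occurrences of the term in the document
            cooccurrence: float  # strength of co-occurrence with other candidates
            avg_distance: float  # mean token distance to those co-occurring terms

        def topic_score(t: TermStats) -> float:
            # Frequent, important terms that co-occur closely score highest.
            return t.importance * t.frequency * t.cooccurrence / (1.0 + t.avg_distance)

        print(round(topic_score(TermStats(1.5, 4, 0.8, 2.0)), 2))  # 1.6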
