Search (57 results, page 2 of 3)

  • theme_ss:"Automatisches Klassifizieren"
  1. Wätjen, H.-J.: Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web : das DFG-Projekt GERHARD (1998) 0.00
    0.002615157 = product of:
      0.031381883 = sum of:
        0.031381883 = weight(_text_:internet in 3066) [ClassicSimilarity], result of:
          0.031381883 = score(doc=3066,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.3261795 = fieldWeight in 3066, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.078125 = fieldNorm(doc=3066)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  2. Möller, G.: Automatic classification of the World Wide Web using Universal Decimal Classification (1999) 0.00
    0.002615157 = product of:
      0.031381883 = sum of:
        0.031381883 = weight(_text_:internet in 494) [ClassicSimilarity], result of:
          0.031381883 = score(doc=494,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.3261795 = fieldWeight in 494, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.078125 = fieldNorm(doc=494)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  3. Subramanian, S.; Shafer, K.E.: Clustering (1998) 0.00
    0.002615157 = product of:
      0.031381883 = sum of:
        0.031381883 = weight(_text_:internet in 1103) [ClassicSimilarity], result of:
          0.031381883 = score(doc=1103,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.3261795 = fieldWeight in 1103, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.078125 = fieldNorm(doc=1103)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  4. Shafer, K.E.: Evaluating Scorpion results (1998) 0.00
    0.002615157 = product of:
      0.031381883 = sum of:
        0.031381883 = weight(_text_:internet in 1569) [ClassicSimilarity], result of:
          0.031381883 = score(doc=1569,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.3261795 = fieldWeight in 1569, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.078125 = fieldNorm(doc=1569)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  5. Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.00
    0.0025888733 = product of:
      0.03106648 = sum of:
        0.03106648 = weight(_text_:internet in 7209) [ClassicSimilarity], result of:
          0.03106648 = score(doc=7209,freq=4.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.32290122 = fieldWeight in 7209, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7209)
      0.083333336 = coord(1/12)
    
    Source
    Internet world and document delivery world international 94: Proceedings of the 2nd Annual Conference, London, May 1994
    Theme
    Internet
  6. Wätjen, H.-J.: GERHARD : Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web (1998) 0.00
    0.0025888733 = product of:
      0.03106648 = sum of:
        0.03106648 = weight(_text_:internet in 3064) [ClassicSimilarity], result of:
          0.03106648 = score(doc=3064,freq=4.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.32290122 = fieldWeight in 3064, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3064)
      0.083333336 = coord(1/12)
    
    Abstract
    Intellectual subject indexing of the Internet is in crisis. Yahoo and other services cannot keep pace with the growth of the Web. GERHARD is currently the only search and navigation service worldwide that also classifies the Internet resources gathered by a robot fully automatically, using computational-linguistic and statistical methods. Well over a million HTML documents from academically relevant servers in Germany can be searched in the database as with other search engines, but can also be explored by navigating the trilingual Universal Decimal Classification (ETH-Bibliothek Zürich).
    Theme
    Internet
  7. Walther, R.: Möglichkeiten und Grenzen automatischer Klassifikationen von Web-Dokumenten (2001) 0.00
    0.0025888733 = product of:
      0.03106648 = sum of:
        0.03106648 = weight(_text_:internet in 1562) [ClassicSimilarity], result of:
          0.03106648 = score(doc=1562,freq=4.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.32290122 = fieldWeight in 1562, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1562)
      0.083333336 = coord(1/12)
    
    Abstract
    Automatic classification of Web and other text documents makes it possible to provide ordered access to internal and external company information. Research on automatic classification has intensified in recent years. The result is a variety of methods that are used in practice today, individually or in combination. Alongside general principles, this licentiate thesis examines several methods of automatic classification in more detail and discusses their possibilities and limits. It also presents the results of a survey of vendors of software solutions for the automatic classification of text documents. The discussion serves myax internet AG as a basis for developing its own classification product.
    Theme
    Internet
  8. Koch, T.; Ardö, A.; Brümmer, A.: ¬The building and maintenance of robot based internet search services : A review of current indexing and data collection methods. Prepared to meet the requirements of Work Package 3 of EU Telematics for Research, project DESIRE. Version D3.11v0.3 (Draft version 3) (1996) 0.00
    0.0025623199 = product of:
      0.030747838 = sum of:
        0.030747838 = weight(_text_:internet in 1669) [ClassicSimilarity], result of:
          0.030747838 = score(doc=1669,freq=12.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.31958932 = fieldWeight in 1669, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.03125 = fieldNorm(doc=1669)
      0.083333336 = coord(1/12)
    
    Abstract
    After a short outline of problems, possibilities and difficulties of systematic information retrieval on the Internet and a description of efforts for development in this area, a specification of the terminology for this report is required. Although the process of retrieval is generally seen as an iterative process of browsing and information retrieval and several important services on the net have taken this fact into consideration, the emphasis of this report lies on the general retrieval tools for the whole of the Internet. In order to be able to evaluate the differences, possibilities and restrictions of the different services it is necessary to begin with organizing the existing varieties in a typological/taxonomical survey. The possibilities and weaknesses will be briefly compared and described for the most important services in the categories robot-based WWW-catalogues of different types, list- or form-based catalogues and simultaneous or collected search services respectively. It will however for different reasons not be possible to rank them in order of "best" services. Still more important are the weaknesses and problems common to all attempts at indexing the Internet. The problems of the quality of the input, the technical performance and the general problem of indexing virtual hypertext are shown to be at least as difficult as the different aspects of harvesting, indexing and information retrieval. Some of the attempts made in the area of further development of retrieval services will be mentioned in relation to descriptions of the contents of documents and standardization efforts. Internet harvesting and indexing technology and retrieval software are thoroughly reviewed. Details about all services and software are listed in analytical forms in Annex 1-3.
    Theme
    Internet
  9. Choi, B.; Peng, X.: Dynamic and hierarchical classification of Web pages (2004) 0.00
    0.0022190344 = product of:
      0.02662841 = sum of:
        0.02662841 = weight(_text_:internet in 2555) [ClassicSimilarity], result of:
          0.02662841 = score(doc=2555,freq=4.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.27677247 = fieldWeight in 2555, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=2555)
      0.083333336 = coord(1/12)
    
    Abstract
    Automatic classification of Web pages is an effective way to organise the vast amount of information and to assist in retrieving relevant information from the Internet. Although many automatic classification systems have been proposed, most of them ignore the conflict between the fixed number of categories and the growing number of Web pages being added into the systems. They also require searching through all existing categories to make any classification. This article proposes a dynamic and hierarchical classification system that is capable of adding new categories as required, organising the Web pages into a tree structure, and classifying Web pages by searching through only one path of the tree. The proposed single-path search technique reduces the search complexity from (n) to (log(n)). Test results show that the system improves the accuracy of classification by 6 percent in comparison to related systems. The dynamic-category expansion technique also achieves satisfying results for adding new categories into the system as required.
    Theme
    Internet
  10. Chan, L.M.; Lin, X.; Zeng, M.: Structural and multilingual approaches to subject access on the Web (1999) 0.00
    0.0020921256 = product of:
      0.025105506 = sum of:
        0.025105506 = weight(_text_:internet in 162) [ClassicSimilarity], result of:
          0.025105506 = score(doc=162,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.2609436 = fieldWeight in 162, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0625 = fieldNorm(doc=162)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  11. Koch, T.: Nutzung von Klassifikationssystemen zur verbesserten Beschreibung, Organisation und Suche von Internetressourcen (1998) 0.00
    0.0020921256 = product of:
      0.025105506 = sum of:
        0.025105506 = weight(_text_:internet in 1030) [ClassicSimilarity], result of:
          0.025105506 = score(doc=1030,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.2609436 = fieldWeight in 1030, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0625 = fieldNorm(doc=1030)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  12. Koch, T.; Ardö, A.: Automatic classification of full-text HTML-documents from one specific subject area : DESIRE II D3.6a, Working Paper 2 (2000) 0.00
    0.0020921256 = product of:
      0.025105506 = sum of:
        0.025105506 = weight(_text_:internet in 1667) [ClassicSimilarity], result of:
          0.025105506 = score(doc=1667,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.2609436 = fieldWeight in 1667, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0625 = fieldNorm(doc=1667)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  13. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.00
    0.001839732 = product of:
      0.022076784 = sum of:
        0.022076784 = product of:
          0.044153567 = sum of:
            0.044153567 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.044153567 = score(doc=611,freq=2.0), product of:
                0.11412105 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032588977 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Date
    22. 8.2009 12:54:24
  14. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.00
    0.001839732 = product of:
      0.022076784 = sum of:
        0.022076784 = product of:
          0.044153567 = sum of:
            0.044153567 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.044153567 = score(doc=2748,freq=2.0), product of:
                0.11412105 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032588977 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.5 = coord(1/2)
      0.083333336 = coord(1/12)
    
    Date
    1. 2.2016 18:25:22
  15. Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00
    0.0015690941 = product of:
      0.01882913 = sum of:
        0.01882913 = weight(_text_:internet in 316) [ClassicSimilarity], result of:
          0.01882913 = score(doc=316,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.1957077 = fieldWeight in 316, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=316)
      0.083333336 = coord(1/12)
    
    Abstract
    Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increases, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC) [10], within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR).
  16. Chung, Y.-M.; Noh, Y.-H.: Developing a specialized directory system by automatically classifying Web documents (2003) 0.00
    0.0015690941 = product of:
      0.01882913 = sum of:
        0.01882913 = weight(_text_:internet in 1566) [ClassicSimilarity], result of:
          0.01882913 = score(doc=1566,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.1957077 = fieldWeight in 1566, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=1566)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  17. Koch, T.; Ardö, A.; Noodén, L.: ¬The construction of a robot-generated subject index : DESIRE II D3.6a, Working Paper 1 (1999) 0.00
    0.0015690941 = product of:
      0.01882913 = sum of:
        0.01882913 = weight(_text_:internet in 1668) [ClassicSimilarity], result of:
          0.01882913 = score(doc=1668,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.1957077 = fieldWeight in 1668, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=1668)
      0.083333336 = coord(1/12)
    
    Theme
    Internet
  18. Wu, K.J.; Chen, M.-C.; Sun, Y.: Automatic topics discovery from hyperlinked documents (2004) 0.00
    0.0015690941 = product of:
      0.01882913 = sum of:
        0.01882913 = weight(_text_:internet in 2563) [ClassicSimilarity], result of:
          0.01882913 = score(doc=2563,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.1957077 = fieldWeight in 2563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=2563)
      0.083333336 = coord(1/12)
    
    Abstract
    Topic discovery is an important means for marketing, e-Business and social science studies. As well, it can be applied to various purposes, such as identifying a group with certain properties and observing the emergence and diminishment of a certain cyber community. Previous topic discovery work (J.M. Kleinberg, Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, p. 668) requires manual judgment of usefulness of outcomes and is thus incapable of handling the explosive growth of the Internet. In this paper, we propose the Automatic Topic Discovery (ATD) method, which combines a method of base set construction, a clustering algorithm and an iterative principal eigenvector computation method to discover the topics relevant to a given query without using manual examination. Given a query, ATD returns with topics associated with the query and top representative pages for each topic. Our experiments show that the ATD method performs better than the traditional eigenvector method in terms of computation time and topic discovery quality.
  19. Montesi, M.; Navarrete, T.: Classifying web genres in context : A case study documenting the web genres used by a software engineer (2008) 0.00
    0.0015690941 = product of:
      0.01882913 = sum of:
        0.01882913 = weight(_text_:internet in 2100) [ClassicSimilarity], result of:
          0.01882913 = score(doc=2100,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.1957077 = fieldWeight in 2100, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.046875 = fieldNorm(doc=2100)
      0.083333336 = coord(1/12)
    
    Abstract
    This case study analyzes the Internet-based resources that a software engineer uses in his daily work. Methodologically, we studied the web browser history of the participant, classifying all the web pages he had seen over a period of 12 days into web genres. We interviewed him before and after the analysis of the web browser history. In the first interview, he spoke about his general information behavior; in the second, he commented on each web genre, explaining why and how he used them. As a result, three approaches allow us to describe the set of 23 web genres obtained: (a) the purposes they serve for the participant; (b) the role they play in the various work and search phases; (c) and the way they are used in combination with each other. Further observations concern the way the participant assesses quality of web-based resources, and his information behavior as a software engineer.
  20. Classification, automation, and new media : Proceedings of the 24th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Passau, March 15 - 17, 2000 (2002) 0.00
    0.0013075785 = product of:
      0.015690941 = sum of:
        0.015690941 = weight(_text_:internet in 5997) [ClassicSimilarity], result of:
          0.015690941 = score(doc=5997,freq=2.0), product of:
            0.09621047 = queryWeight, product of:
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.032588977 = queryNorm
            0.16308975 = fieldWeight in 5997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9522398 = idf(docFreq=6276, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5997)
      0.083333336 = coord(1/12)
    
    Abstract
    Given the huge amount of information in the internet and in practically every domain of knowledge that we are facing today, knowledge discovery calls for automation. The book deals with methods from classification and data analysis that respond effectively to this rapidly growing challenge. The interested reader will find new methodological insights as well as applications in economics, management science, finance, and marketing, and in pattern recognition, biology, health, and archaeology.
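
The score breakdowns printed under each result follow the TF-IDF formulas of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and the term score is queryWeight × fieldWeight, scaled by the coord factors. As a minimal sketch, the Python below recomputes results 1 and 13 from the printed inputs. The function names are hypothetical helpers, not Lucene API; queryNorm and fieldNorm are copied from the output rather than derived; and Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as used by ClassicSimilarity: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """tf as used by ClassicSimilarity: sqrt(termFreq)."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight, as in the explain trees above."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                    # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def coord(matched: int, total: int) -> float:
    """Fraction of query clauses that matched, e.g. coord(1/12)."""
    return matched / total


# Values copied from the listing (assumed inputs, not derived here).
QUERY_NORM = 0.032588977
MAX_DOCS = 44218

# Result 1: term "internet", freq=2, docFreq=6276, fieldNorm=0.078125, coord(1/12)
score_1 = term_score(2.0, 6276, MAX_DOCS, QUERY_NORM, 0.078125) * coord(1, 12)

# Result 13: term "22", freq=2, docFreq=3622, fieldNorm=0.078125,
# inner coord(1/2) followed by outer coord(1/12)
score_13 = (term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.078125)
            * coord(1, 2) * coord(1, 12))

print(f"result 1:  {score_1:.9f}")
print(f"result 13: {score_13:.9f}")
```

Because fieldNorm shrinks as the indexed field gets longer, a result with more term occurrences (for example result 8, freq=12.0, fieldNorm=0.03125) can still rank below one with fewer occurrences in a shorter field.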

Languages

  • e 39
  • d 18

Types

  • a 38
  • el 16
  • x 4
  • r 3
  • m 2
  • s 1