Search (52 results, page 1 of 3)

  • theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.06
    0.06452808 = product of:
      0.12905616 = sum of:
        0.12905616 = sum of:
          0.05842106 = weight(_text_:classification in 3280) [ClassicSimilarity], result of:
            0.05842106 = score(doc=3280,freq=2.0), product of:
              0.16603322 = queryWeight, product of:
                3.1847067 = idf(docFreq=4974, maxDocs=44218)
                0.05213454 = queryNorm
              0.35186368 = fieldWeight in 3280, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1847067 = idf(docFreq=4974, maxDocs=44218)
                0.078125 = fieldNorm(doc=3280)
          0.0706351 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
            0.0706351 = score(doc=3280,freq=2.0), product of:
              0.18256627 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05213454 = queryNorm
              0.38690117 = fieldWeight in 3280, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=3280)
      0.5 = coord(1/2)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  2. Boyack, K.W.; Wylie, B.N.; Davidson, G.S.: Information Visualization, Human-Computer Interaction, and Cognitive Psychology : Domain Visualizations (2002) 0.02
    0.024973279 = product of:
      0.049946558 = sum of:
        0.049946558 = product of:
          0.099893115 = sum of:
            0.099893115 = weight(_text_:22 in 1352) [ClassicSimilarity], result of:
              0.099893115 = score(doc=1352,freq=4.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.54716086 = fieldWeight in 1352, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1352)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 2.2003 17:25:39
    22. 2.2003 18:17:40
  3. Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.02
    0.024722286 = product of:
      0.04944457 = sum of:
        0.04944457 = product of:
          0.09888914 = sum of:
            0.09888914 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.09888914 = score(doc=2134,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    30. 3.2001 13:32:22
  4. Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.02
    0.017658776 = product of:
      0.03531755 = sum of:
        0.03531755 = product of:
          0.0706351 = sum of:
            0.0706351 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
              0.0706351 = score(doc=2751,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38690117 = fieldWeight in 2751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2751)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  5. Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.02
    0.017658776 = product of:
      0.03531755 = sum of:
        0.03531755 = product of:
          0.0706351 = sum of:
            0.0706351 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
              0.0706351 = score(doc=2754,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38690117 = fieldWeight in 2754, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2754)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  6. Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.02
    0.017658776 = product of:
      0.03531755 = sum of:
        0.03531755 = product of:
          0.0706351 = sum of:
            0.0706351 = weight(_text_:22 in 3279) [ClassicSimilarity], result of:
              0.0706351 = score(doc=3279,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38690117 = fieldWeight in 3279, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3279)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  7. Sacco, G.M.: Dynamic taxonomies and guided searches (2006) 0.02
    0.017481297 = product of:
      0.034962595 = sum of:
        0.034962595 = product of:
          0.06992519 = sum of:
            0.06992519 = weight(_text_:22 in 5295) [ClassicSimilarity], result of:
              0.06992519 = score(doc=5295,freq=4.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.38301262 = fieldWeight in 5295, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5295)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 17:56:22
  8. Morato, J.; Llorens, J.; Genova, G.; Moreiro, J.A.: Experiments in discourse analysis impact on information classification and retrieval algorithms (2003) 0.02
    0.016329184 = product of:
      0.03265837 = sum of:
        0.03265837 = product of:
          0.06531674 = sum of:
            0.06531674 = weight(_text_:classification in 1083) [ClassicSimilarity], result of:
              0.06531674 = score(doc=1083,freq=10.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.39339557 = fieldWeight in 1083, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1083)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Researchers in indexing and retrieval systems have been advocating the inclusion of more contextual information to improve results. The proliferation of full-text databases and advances in computer storage capacity have made it possible to carry out text analysis by means of linguistic and extra-linguistic knowledge. Since the mid-1980s, research has tended to pay more attention to context, giving discourse analysis a more central role. The research presented in this paper aims to check whether discourse variables have an impact on modern information retrieval and classification algorithms. In order to evaluate this hypothesis, a functional framework for information analysis in an automated environment has been proposed, where the n-grams (filtering) and the k-means and Chen's classification algorithms have been tested against sub-collections of documents based on the following discourse variables: "Genre", "Register", "Domain terminology", and "Document structure". The results obtained with the algorithms for the different sub-collections were compared to the MeSH information structure. These comparisons indicate that n-gram filtering does not appear to depend clearly on discourse variables; the k-means classification algorithm does, but only on domain terminology and document structure; and Chen's algorithm depends clearly on all of the discourse variables. This information could be used to design better classification algorithms, where discourse variables should be taken into account. Other minor conclusions drawn from these results are also presented.
  9. Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.02
    0.016329184 = product of:
      0.03265837 = sum of:
        0.03265837 = product of:
          0.06531674 = sum of:
            0.06531674 = weight(_text_:classification in 2299) [ClassicSimilarity], result of:
              0.06531674 = score(doc=2299,freq=10.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.39339557 = fieldWeight in 2299, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2299)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was already identified by Hugh of Saint Victor (1096?-1141). Even with present-day bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To address these limitations we have developed SciGator, a browsable interface to explore the collections of all scientific libraries at the University of Pavia. Besides showing subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by using a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in such disciplinary classifications as Dewey (e.g. architecture existentially depends on building), or between phenomena in such phenomenon-based classifications as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some details of it.
    Source
    Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
  10. Jun, W.: ¬A knowledge network constructed by integrating classification, thesaurus and metadata in a digital library (2003) 0.02
    0.015456761 = product of:
      0.030913522 = sum of:
        0.030913522 = product of:
          0.061827045 = sum of:
            0.061827045 = weight(_text_:classification in 1254) [ClassicSimilarity], result of:
              0.061827045 = score(doc=1254,freq=14.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.37237754 = fieldWeight in 1254, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1254)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Knowledge management in digital libraries is a universal problem. Keyword-based searching is applied everywhere no matter whether the resources are indexed databases or full-text Web pages. In keyword matching, the valuable content description and indexing of the metadata, such as the subject descriptors and the classification notations, are merely treated as common keywords to be matched with the user query. Without the support of vocabulary control tools, such as classification systems and thesauri, the intelligent labor of content analysis, description and indexing in metadata production is seriously wasted. New retrieval paradigms are needed to exploit the potential of the metadata resources. Could classification and thesauri, which contain the condensed intelligence of generations of librarians, be used in a digital library to organize the networked information, especially metadata, to facilitate their usability and change the digital library into a knowledge management environment? To examine that question, we designed and implemented a new paradigm that incorporates a classification system, a thesaurus and metadata. The classification and the thesaurus are merged into a concept network, and the metadata are distributed into the nodes of the concept network according to their subjects. The abstract concept node instantiated with the related metadata records becomes a knowledge node. A coherent and consistent knowledge network is thus formed. It is not only a framework for resource organization but also a structure for knowledge navigation, retrieval and learning. We have built an experimental system based on the Chinese Classification and Thesaurus, which is the most comprehensive and authoritative in China, and we have incorporated more than 5000 bibliographic records in the computing domain from the Peking University Library. The result is encouraging. In this article, we review the tools, the architecture and the implementation of our experimental system, which is called Vision.
  11. Wang, Z.; Khoo, C.S.G.; Chaudhry, A.S.: Evaluation of the navigation effectiveness of an organizational taxonomy built on a general classification scheme and domain thesauri (2014) 0.02
    0.015178238 = product of:
      0.030356476 = sum of:
        0.030356476 = product of:
          0.060712952 = sum of:
            0.060712952 = weight(_text_:classification in 1251) [ClassicSimilarity], result of:
              0.060712952 = score(doc=1251,freq=6.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.3656675 = fieldWeight in 1251, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1251)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper presents an evaluation study of the navigation effectiveness of a multifaceted organizational taxonomy that was built on the Dewey Decimal Classification and several domain thesauri in the area of library and information science education. The objective of the evaluation was to detect deficiencies in the taxonomy and to infer problems of applied construction steps from users' navigation difficulties. The evaluation approach included scenario-based navigation exercises and postexercise interviews. Navigation exercise errors and underlying reasons were analyzed in relation to specific components of the taxonomy and applied construction steps. Guidelines for the construction of the hierarchical structure and categories of an organizational taxonomy using existing general classification schemes and domain thesauri were derived from the evaluation results.
  12. Gnoli, C.; Pusterla, L.; Bendiscioli, A.; Recinella, C.: Classification for collections mapping and query expansion (2016) 0.02
    0.015178238 = product of:
      0.030356476 = sum of:
        0.030356476 = product of:
          0.060712952 = sum of:
            0.060712952 = weight(_text_:classification in 3102) [ClassicSimilarity], result of:
              0.060712952 = score(doc=3102,freq=6.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.3656675 = fieldWeight in 3102, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3102)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Dewey Decimal Classification has been used to organize materials owned by the three scientific libraries at the University of Pavia, and to allow integrated browsing in their union catalogue through SciGator, a home built web-based user interface. Classification acts as a bridge between collections located in different places and shelved according to different local schemes. Furthermore, cross-discipline relationships recorded in the system allow for expanded queries that increase recall. Advantages and possible improvements of such a system are discussed.
  13. Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.01
    0.014605265 = product of:
      0.02921053 = sum of:
        0.02921053 = product of:
          0.05842106 = sum of:
            0.05842106 = weight(_text_:classification in 5055) [ClassicSimilarity], result of:
              0.05842106 = score(doc=5055,freq=8.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.35186368 = fieldWeight in 5055, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5055)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Distant supervision (DS) has the advantage of automatically generating large amounts of labelled training data and has been widely used for relation extraction. However, there are usually many wrong labels in the automatically labelled data in distant supervision (Riedel, Yao, & McCallum, 2010). This paper presents a novel method to reduce the wrong labels. The proposed method uses the semantic Jaccard with word embedding to measure the semantic similarity between the relation phrase in the knowledge base and the dependency phrases between two entities in a sentence to filter the wrong labels. In the process of reducing wrong labels, the semantic Jaccard algorithm selects a core dependency phrase to represent the candidate relation in a sentence, which can capture features for relation classification and avoid the negative impact from irrelevant term sequences that previous neural network models of relation extraction often suffer. In the process of relation classification, the core dependency phrases are also used as the input of a convolutional neural network (CNN) for relation classification. The experimental results show that compared with the methods using original DS data, the methods using filtered DS data performed much better in relation extraction. It indicates that the semantic similarity based method is effective in reducing wrong labels. The relation extraction performance of the CNN model using the core dependency phrases as input is the best of all, which indicates that using the core dependency phrases as input of CNN is enough to capture the features for relation classification and could avoid negative impact from irrelevant terms.
  14. Green, R.: See-also relationships in the Dewey Decimal Classification (2011) 0.01
    0.014458476 = product of:
      0.028916951 = sum of:
        0.028916951 = product of:
          0.057833903 = sum of:
            0.057833903 = weight(_text_:classification in 4615) [ClassicSimilarity], result of:
              0.057833903 = score(doc=4615,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.34832728 = fieldWeight in 4615, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4615)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper investigates the semantics of topical, associative see-also relationships in schedule and table entries of the Dewey Decimal Classification (DDC) system. Based on the see-also relationships in a random sample of 100 classes containing one or more of these relationships, a semi-structured inventory of sources of see-also relationships is generated, of which the most important are lexical similarity, complementarity, facet difference, and relational configuration difference. The premise that see-also relationships based on lexical similarity may be language-specific is briefly examined. The paper concludes with recommendations on the continued use of see-also relationships in the DDC.
  15. Moreira, W.; Martínez-Ávila, D.: Concept relationships in knowledge organization systems : elements for analysis and common research among fields (2018) 0.01
    0.014458476 = product of:
      0.028916951 = sum of:
        0.028916951 = product of:
          0.057833903 = sum of:
            0.057833903 = weight(_text_:classification in 5166) [ClassicSimilarity], result of:
              0.057833903 = score(doc=5166,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.34832728 = fieldWeight in 5166, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5166)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Knowledge organization systems have been studied in several fields and for different and complementary aspects. Among the aspects that concentrate common interests, in this article we highlight those related to the terminological and conceptual relationships among the components of any knowledge organization system. This research aims to contribute to the critical analysis of knowledge organization systems, especially ontologies, thesauri, and classification systems, by the comprehension of its similarities and differences when dealing with concepts and their ways of relating to each other as well as to the conceptual design that is adopted.
    Source
    Cataloging and classification quarterly. 56(2018) no.1, S.19-39
  16. Efthimiadis, E.N.: End-users' understanding of thesaural knowledge structures in interactive query expansion (1994) 0.01
    0.014127021 = product of:
      0.028254041 = sum of:
        0.028254041 = product of:
          0.056508083 = sum of:
            0.056508083 = weight(_text_:22 in 5693) [ClassicSimilarity], result of:
              0.056508083 = score(doc=5693,freq=2.0), product of:
                0.18256627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05213454 = queryNorm
                0.30952093 = fieldWeight in 5693, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5693)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    30. 3.2001 13:35:22
  17. Fidel, R.; Efthimiadis, E.N.: Terminological knowledge structure for intermediary expert systems (1995) 0.01
    0.012392979 = product of:
      0.024785958 = sum of:
        0.024785958 = product of:
          0.049571916 = sum of:
            0.049571916 = weight(_text_:classification in 5695) [ClassicSimilarity], result of:
              0.049571916 = score(doc=5695,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.29856625 = fieldWeight in 5695, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5695)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    To provide advice for online searching about term selection and query expansion, an intermediary expert system should indicate a terminological knowledge structure. Terminological attributes could provide the foundation of a knowledge base, and knowledge acquisition could rely on knowledge base techniques coupled with statistical techniques. The strategies of expert searchers would provide one source of knowledge. The knowledge structure would include three constructs for each term: frequency data, a hedge, and a position in a classification scheme. Switching vocabularies could provide a meta-scheme and facilitate the interoperability of databases in similar subjects. To develop such a knowledge structure, research should focus on terminological attributes, word and phrase disambiguation, automated text processing, and the role of thesauri and classification schemes in indexing and retrieval. It should develop techniques that combine knowledge base and statistical methods and that consider user preferences.
  18. Poynder, R.: Web research engines? (1996) 0.01
    0.012392979 = product of:
      0.024785958 = sum of:
        0.024785958 = product of:
          0.049571916 = sum of:
            0.049571916 = weight(_text_:classification in 5698) [ClassicSimilarity], result of:
              0.049571916 = score(doc=5698,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.29856625 = fieldWeight in 5698, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5698)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes the shortcomings of search engines for the WWW, comparing their current capabilities to those of the first generation CD-ROM products. Some allow phrase searching and most are improving their Boolean searching. Few allow truncation, wild cards or nested logic. They are stateless, losing previous search criteria. Unlike the indexing and classification systems for today's CD-ROMs, those for Web pages are random, unstructured and of variable quality. Considers that at best Web search engines can only offer free text searching. Discusses whether automatic data classification systems such as Infoseek Ultra can overcome the haphazard nature of the Web with neural network technology, and whether Boolean search techniques may be redundant when replaced by technology such as the Euroferret search engine. However, artificial intelligence is rarely successful on huge, varied databases. Relevance ranking and automatic query expansion still use the same simple inverted indexes. Most Web search engines do nothing more than word counting. Further complications arise with foreign languages.
  19. Spiteri, L.F.: ¬The essential elements of faceted thesauri (1999) 0.01
    0.012392979 = product of:
      0.024785958 = sum of:
        0.024785958 = product of:
          0.049571916 = sum of:
            0.049571916 = weight(_text_:classification in 5362) [ClassicSimilarity], result of:
              0.049571916 = score(doc=5362,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.29856625 = fieldWeight in 5362, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5362)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The goal of this study is to evaluate, compare, and contrast how facet analysis is used to construct the systematic or faceted displays of a selection of information retrieval thesauri. More specifically, the study seeks to examine which principles of facet analysis are used in the thesauri, and the extent to which different thesauri apply these principles in the same way. A measuring instrument was designed for the purpose of evaluating the structure of faceted thesauri. This instrument was applied to fourteen faceted information retrieval thesauri. The study reveals that the thesauri do not share a common definition of what constitutes a facet. In some cases, the thesauri apply both enumerative-style classification and facet analysis to arrange their indexing terms. A number of the facets used in the thesauri are not homogeneous or mutually exclusive. The principle of synthesis is used in only 50% of the thesauri, and no one citation order is used consistently by the thesauri.
    Source
    Cataloging and classification quarterly. 28(1999) no.4, S.31-52
  20. Huang, L.; Milne, D.; Frank, E.; Witten, I.H.: Learning a concept-based document similarity measure (2012) 0.01
    0.012392979 = product of:
      0.024785958 = sum of:
        0.024785958 = product of:
          0.049571916 = sum of:
            0.049571916 = weight(_text_:classification in 372) [ClassicSimilarity], result of:
              0.049571916 = score(doc=372,freq=4.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.29856625 = fieldWeight in 372, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=372)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Document similarity measures are crucial components of many text-analysis tasks, including information retrieval, document classification, and document clustering. Conventional measures are brittle: They estimate the surface overlap between documents based on the words they mention and ignore deeper semantic connections. We propose a new measure that assesses similarity at both the lexical and semantic levels, and learns from human judgments how to combine them by using machine-learning techniques. Experiments show that the new measure produces values for documents that are more consistent with people's judgments than people are with each other. We also use it to classify and cluster large document sets covering different genres and topics, and find that it improves both classification and clustering performance.
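
The score breakdowns listed under each hit follow the usual ClassicSimilarity (TF-IDF) pattern: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is idf times queryNorm, fieldWeight is tf times idf times fieldNorm, and the term score is queryWeight times fieldWeight. As a rough sketch, the following Python reproduces the figures shown for hit 1 (doc 3280). The function and constant names are illustrative assumptions, not part of the search system; queryNorm, fieldNorm and the coord factor are copied from the listing rather than derived.

```python
import math

# Values copied from the explain output above (not derived here).
MAX_DOCS = 44218
QUERY_NORM = 0.05213454  # depends on the whole query, so taken as given


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Hit 1 (doc 3280): both terms occur twice, fieldNorm = 0.078125.
classification = term_score(freq=2.0, doc_freq=4974, field_norm=0.078125)
twenty_two = term_score(freq=2.0, doc_freq=3622, field_norm=0.078125)

# The outer factor 0.5 is the coord(1/2) value from the listing.
total = 0.5 * (classification + twenty_two)

print(f"classification: {classification:.8f}")
print(f"22:             {twenty_two:.8f}")
print(f"total:          {total:.8f}")
```

The listing's values come from single-precision arithmetic, so the last digits printed by this double-precision sketch may differ slightly. Hits 2 onward apply a second coord(1/2) factor inside the sum, which is why their totals are a quarter of the single term score.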
