Search (35 results, page 1 of 2)

  • theme_ss:"Literaturübersicht"
  1. Efthimiadis, E.N.: Query expansion (1996) 0.08
    0.08309829 = product of:
      0.24929488 = sum of:
        0.24929488 = weight(_text_:query in 4847) [ClassicSimilarity], result of:
          0.24929488 = score(doc=4847,freq=14.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            1.0868655 = fieldWeight in 4847, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0625 = fieldNorm(doc=4847)
      0.33333334 = coord(1/3)
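
    Every number in the explain tree above follows from Lucene's ClassicSimilarity (TF-IDF) formulas: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. A minimal sketch in Python that reproduces this entry's score; queryNorm is taken directly from the output, since it depends on the full query and cannot be recomputed from this entry alone:

      import math

      # Inputs read off the explain tree for doc 4847 and the term "query".
      freq, doc_freq, max_docs = 14.0, 1151, 44218
      field_norm = 0.0625         # length normalization, stored at index time
      query_norm = 0.049352113    # 1/sqrt(sum of squared clause weights)
      coord = 1.0 / 3.0           # 1 of 3 query clauses matched this document

      tf = math.sqrt(freq)                                # 3.7416575
      idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))   # 4.6476326
      query_weight = idf * query_norm                     # 0.22937049
      field_weight = tf * idf * field_norm                # 1.0868655
      print(query_weight * field_weight * coord)          # 0.08309829...

    The same arithmetic, with different freq, idf, and fieldNorm values, accounts for every score shown in this result list.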
    
    Abstract
    State-of-the-art review of query expansion (or term expansion), the process of supplementing the original query with additional terms in order to improve retrieval performance. Research on the subject is presented in a highly structured way, organized according to three types of query expansion: manual, automatic, and interactive.
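
    Of the three types, the automatic variant is the easiest to make concrete. A minimal sketch of one common automatic strategy, pseudo-relevance feedback (add the most frequent unseen terms from the top-ranked documents); the function name and the raw-frequency weighting are illustrative assumptions, not a specific method from the review:

      from collections import Counter

      def expand_query(query_terms, top_docs, k=5):
          # Automatic query expansion via pseudo-relevance feedback:
          # append the k most frequent terms from the top-ranked
          # documents that are not already in the query.
          counts = Counter(term for doc in top_docs for term in doc.lower().split()
                           if term not in query_terms)
          return list(query_terms) + [term for term, _ in counts.most_common(k)]

      expand_query(["query", "expansion"],
                   ["terms from feedback documents improve retrieval",
                    "feedback terms reweight the query"])
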
  2. Wacholder, N.: Interactive query formulation (2011) 0.05
    0.054964356 = product of:
      0.16489306 = sum of:
        0.16489306 = weight(_text_:query in 4196) [ClassicSimilarity], result of:
          0.16489306 = score(doc=4196,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.71889395 = fieldWeight in 4196, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.109375 = fieldNorm(doc=4196)
      0.33333334 = coord(1/3)
    
  3. Sabourin, C.F. (Comp.): Computational linguistics in information science : bibliography (1994) 0.04
    0.039260253 = product of:
      0.11778076 = sum of:
        0.11778076 = weight(_text_:query in 8280) [ClassicSimilarity], result of:
          0.11778076 = score(doc=8280,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.5134957 = fieldWeight in 8280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.078125 = fieldNorm(doc=8280)
      0.33333334 = coord(1/3)
    
    Abstract
    The bibliography covers information retrieval (2100 refs.), full-text (890) or conceptual (60), automatic indexing (930), information extraction (520), query languages (1090), etc.; altogether 6390 references, fully indexed.
  4. Kantor, P.B.: Information retrieval techniques (1994) 0.03
    0.03331343 = product of:
      0.09994029 = sum of:
        0.09994029 = weight(_text_:query in 1056) [ClassicSimilarity], result of:
          0.09994029 = score(doc=1056,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.43571556 = fieldWeight in 1056, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.046875 = fieldNorm(doc=1056)
      0.33333334 = coord(1/3)
    
    Abstract
    State-of-the-art review of information retrieval techniques viewed in terms of the growing effort to implement concept-based retrieval in content-based algorithms. Identifies trends in the automation of indexing, retrieval, and the interaction between systems and users. Identifies 3 central issues: ways in which systems describe documents for purposes of information retrieval; ways in which systems compute the degree of match between a given document and the current state of the query; and what the systems do with the information that they obtain from the users. Looks at information retrieval techniques in terms of: location; navigation; indexing; documents; queries; structures; concepts; matching documents to queries; restoring query structure; algorithms and content versus concepts; formulation of concepts in terms of contents; formulation of concepts with the assistance of the users; complex system codes versus underlying principles; and system evaluation.
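
    The second central issue, computing the degree of match between a document and the current state of the query, is classically implemented as vector-space cosine similarity. A minimal sketch assuming raw term-frequency vectors; this is one illustrative matching function, not the only one such a review covers:

      import math
      from collections import Counter

      def degree_of_match(query, document):
          # Cosine of the angle between term-frequency vectors: 1.0 for
          # identical term distributions, 0.0 when no terms are shared.
          q, d = Counter(query.lower().split()), Counter(document.lower().split())
          dot = sum(q[t] * d[t] for t in q)
          norms = math.sqrt(sum(v * v for v in q.values())) * \
                  math.sqrt(sum(v * v for v in d.values()))
          return dot / norms if norms else 0.0
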
  5. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.03
    0.027761191 = product of:
      0.08328357 = sum of:
        0.08328357 = weight(_text_:query in 4277) [ClassicSimilarity], result of:
          0.08328357 = score(doc=4277,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.3630963 = fieldWeight in 4277, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4277)
      0.33333334 = coord(1/3)
    
    Abstract
    This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century, when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
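
    The retrieval computation described above, scoring a document by the likelihood that its language model generated the query, fits in a few lines. A minimal sketch assuming unigram models with Jelinek-Mercer smoothing (one standard choice among the smoothing methods such work covers) and assuming every query term occurs somewhere in the collection:

      import math
      from collections import Counter

      def query_log_likelihood(query_terms, doc_terms, collection_terms, lam=0.5):
          # log P(query | document) under a unigram model, mixing the
          # document model with the collection model (Jelinek-Mercer).
          d, c = Counter(doc_terms), Counter(collection_terms)
          d_len, c_len = sum(d.values()), sum(c.values())
          return sum(math.log(lam * d[t] / d_len + (1.0 - lam) * c[t] / c_len)
                     for t in query_terms)

      # Documents are ranked by this value; a term absent from a document
      # still contributes through the collection model - that is the
      # smoothing at work.
      coll = "query model retrieval language model query data".split()
      query_log_likelihood(["query", "model"], "language model query".split(), coll)
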
  6. Dumais, S.T.: Latent semantic analysis (2003) 0.03
    0.026336579 = product of:
      0.079009734 = sum of:
        0.079009734 = weight(_text_:query in 2462) [ClassicSimilarity], result of:
          0.079009734 = score(doc=2462,freq=10.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.34446338 = fieldWeight in 2462, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0234375 = fieldNorm(doc=2462)
      0.33333334 = coord(1/3)
    
    Abstract
    Latent Semantic Analysis (LSA) was first introduced in Dumais, Furnas, Landauer, and Deerwester (1988) and Deerwester, Dumais, Furnas, Landauer, and Harshman (1990) as a technique for improving information retrieval. The key insight in LSA was to reduce the dimensionality of the information retrieval problem. Most approaches to retrieving information depend on a lexical match between words in the user's query and those in documents. Indeed, this lexical matching is the way that the popular Web and enterprise search engines work. Such systems are, however, far from ideal. We are all aware of the tremendous amount of irrelevant information that is retrieved when searching. We also fail to find much of the existing relevant material. LSA was designed to address these retrieval problems, using dimension reduction techniques. Fundamental characteristics of human word usage underlie these retrieval failures. People use a wide variety of words to describe the same object or concept (synonymy). Furnas, Landauer, Gomez, and Dumais (1987) showed that people generate the same keyword to describe well-known objects only 20 percent of the time. Poor agreement was also observed in studies of inter-indexer consistency (e.g., Chan, 1989; Tarr & Borko, 1974), in the generation of search terms (e.g., Fidel, 1985; Bates, 1986), and in the generation of hypertext links (Furner, Ellis, & Willett, 1999). Because searchers and authors often use different words, relevant materials are missed. Someone looking for documents on "human-computer interaction" will not find articles that use only the phrase "man-machine studies" or "human factors." People also use the same word to refer to different things (polysemy). Words like "saturn," "jaguar," or "chip" have several different meanings. A short query like "saturn" will thus return many irrelevant documents. The query "Saturn car" will return fewer irrelevant items, but it will miss some documents that use only the terms "Saturn automobile." In searching, there is a constant tension between being overly specific and missing relevant information, and being more general and returning irrelevant information.
    A number of approaches have been developed in information retrieval to address the problems caused by the variability in word usage. Stemming is a popular technique used to normalize some kinds of surface-level variability by converting words to their morphological root. For example, the words "retrieve," "retrieval," "retrieved," and "retrieving" would all be converted to their root form, "retrieve." The root form is used for both document and query processing. Stemming sometimes helps retrieval, although not much (Harman, 1991; Hull, 1996). And it does not address cases where related words are not morphologically related (e.g., physician and doctor). Controlled vocabularies have also been used to limit variability by requiring that query and index terms belong to a pre-defined set of terms. Documents are indexed by a specified or authorized list of subject headings or index terms, called the controlled vocabulary. Library of Congress Subject Headings, Medical Subject Headings, Association for Computing Machinery (ACM) keywords, and Yellow Pages headings are examples of controlled vocabularies. If searchers can find the right controlled vocabulary terms, they do not have to think of all the morphologically related or synonymous terms that authors might have used. However, assigning controlled vocabulary terms in a consistent and thorough manner is a time-consuming and usually manual process. A good deal of research has been published about the effectiveness of controlled vocabulary indexing compared to full text indexing (e.g., Bates, 1998; Lancaster, 1986; Svenonius, 1986). The combination of both full text and controlled vocabularies is often better than either alone, although the size of the advantage is variable (Lancaster, 1986; Markey, Atherton, & Newton, 1982; Srinivasan, 1996). Richer thesauri have also been used to provide synonyms, generalizations, and specializations of users' search terms (see Srinivasan, 1992, for a review). Controlled vocabularies and thesaurus entries can be generated either manually or by the automatic analysis of large collections of texts.
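
    The dimension reduction at the heart of LSA is a truncated singular value decomposition of the term-document matrix. A minimal sketch; the 3x3 matrix, the term labels, and k = 2 are invented purely for illustration:

      import numpy as np

      # Rows are terms, columns are documents; the counts are made up.
      X = np.array([[2.0, 0.0, 1.0],   # "physician"
                    [0.0, 2.0, 1.0],   # "doctor"
                    [1.0, 1.0, 0.0]])  # "hospital"
      U, s, Vt = np.linalg.svd(X, full_matrices=False)
      k = 2                                  # keep the k largest dimensions
      docs_k = (np.diag(s[:k]) @ Vt[:k]).T   # documents in the latent space

      def doc_similarity(i, j):
          # Cosine similarity in the reduced space; co-occurring
          # near-synonyms now pull documents together even without
          # a lexical match.
          a, b = docs_k[i], docs_k[j]
          return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
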
  7. Chen, H.; Chau, M.: Web mining : machine learning for Web applications (2003) 0.02
    0.024056775 = product of:
      0.072170325 = sum of:
        0.072170325 = product of:
          0.14434065 = sum of:
            0.14434065 = weight(_text_:page in 4242) [ClassicSimilarity], result of:
              0.14434065 = score(doc=4242,freq=4.0), product of:
                0.27565226 = queryWeight, product of:
                  5.5854197 = idf(docFreq=450, maxDocs=44218)
                  0.049352113 = queryNorm
                0.5236331 = fieldWeight in 4242, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.5854197 = idf(docFreq=450, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4242)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    With more than two billion pages created by millions of Web page authors and organizations, the World Wide Web is a tremendously rich knowledge base. The knowledge comes not only from the content of the pages themselves, but also from the unique characteristics of the Web, such as its hyperlink structure and its diversity of content and languages. Analysis of these characteristics often reveals interesting patterns and new knowledge. Such knowledge can be used to improve users' efficiency and effectiveness in searching for information on the Web, and also for applications unrelated to the Web, such as support for decision making or business management. The Web's size and its unstructured and dynamic content, as well as its multilingual nature, make the extraction of useful knowledge a challenging research problem. Furthermore, the Web generates a large amount of data in other formats that contain valuable information. For example, Web server logs' information about user access patterns can be used for information personalization or improving Web page design.
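
    The hyperlink structure mentioned above is what link-analysis methods exploit, PageRank being the canonical example; the sketch below is a generic illustration, not a method attributed to this chapter. A minimal power iteration over a toy link graph:

      def pagerank(links, d=0.85, iters=50):
          # links: {page: [pages it links to]}. Dangling pages simply
          # leak rank mass in this toy version.
          pages = list(links)
          n = len(pages)
          rank = {p: 1.0 / n for p in pages}
          for _ in range(iters):
              new = {p: (1.0 - d) / n for p in pages}
              for p, outs in links.items():
                  for q in outs:
                      new[q] += d * rank[p] / len(outs)
              rank = new
          return rank

      pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
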
  8. Enser, P.G.B.: Visual image retrieval (2008) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 3281) [ClassicSimilarity], result of:
              0.10698449 = score(doc=3281,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 3281, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=3281)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 1.2012 13:01:26
  9. Morris, S.A.: Mapping research specialties (2008) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 3962) [ClassicSimilarity], result of:
              0.10698449 = score(doc=3962,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 3962, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=3962)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    13. 7.2008 9:30:22
  10. Fallis, D.: Social epistemology and information science (2006) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 4368) [ClassicSimilarity], result of:
              0.10698449 = score(doc=4368,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 4368, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=4368)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    13. 7.2008 19:22:28
  11. Nicolaisen, J.: Citation analysis (2007) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 6091) [ClassicSimilarity], result of:
              0.10698449 = score(doc=6091,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 6091, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=6091)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    13. 7.2008 19:53:22
  12. Metz, A.: Community service : a bibliography (1996) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 5341) [ClassicSimilarity], result of:
              0.10698449 = score(doc=5341,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 5341, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=5341)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    17.10.1996 14:22:33
  13. Belkin, N.J.; Croft, W.B.: Retrieval techniques (1987) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 334) [ClassicSimilarity], result of:
              0.10698449 = score(doc=334,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 334, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=334)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Annual review of information science and technology. 22(1987), pp. 109-145
  14. Smith, L.C.: Artificial intelligence and information retrieval (1987) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 335) [ClassicSimilarity], result of:
              0.10698449 = score(doc=335,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 335, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=335)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Annual review of information science and technology. 22(1987), pp. 41-77
  15. Warner, A.J.: Natural language processing (1987) 0.02
    0.017830748 = product of:
      0.053492244 = sum of:
        0.053492244 = product of:
          0.10698449 = sum of:
            0.10698449 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.10698449 = score(doc=337,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Annual review of information science and technology. 22(1987), pp. 79-108
  16. Yang, K.: Information retrieval on the Web (2004) 0.02
    0.015704103 = product of:
      0.047112305 = sum of:
        0.047112305 = weight(_text_:query in 4278) [ClassicSimilarity], result of:
          0.047112305 = score(doc=4278,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.20539828 = fieldWeight in 4278, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.03125 = fieldNorm(doc=4278)
      0.33333334 = coord(1/3)
    
    Abstract
    How do we find information on the Web? Although information on the Web is distributed and decentralized, the Web can be viewed as a single, virtual document collection. In that regard, the fundamental questions and approaches of traditional information retrieval (IR) research (e.g., term weighting, query expansion) are likely to be relevant in Web document retrieval. Findings from traditional IR research, however, may not always be applicable in a Web setting. The Web document collection - massive in size and diverse in content, format, purpose, and quality - challenges the validity of previous research findings that are based on relatively small and homogeneous test collections. Moreover, some traditional IR approaches, although applicable in theory, may be impossible or impractical to implement in a Web setting. For instance, the size, distribution, and dynamic nature of Web information make it extremely difficult to construct a complete and up-to-date data representation of the kind required for a model IR system. To further complicate matters, information seeking on the Web is diverse in character and unpredictable in nature. Web searchers come from all walks of life and are motivated by many kinds of information needs. The wide range of experience, knowledge, motivation, and purpose means that searchers can express diverse types of information needs in a wide variety of ways with differing criteria for satisfying those needs. Conventional evaluation measures, such as precision and recall, may no longer be appropriate for Web IR, where a representative test collection is all but impossible to construct. Finding information on the Web creates many new challenges for, and exacerbates some old problems in, IR research. At the same time, the Web is rich in new types of information not present in most IR test collections. Hyperlinks, usage statistics, document markup tags, and collections of topic hierarchies such as Yahoo! (http://www.yahoo.com) present an opportunity to leverage Web-specific document characteristics in novel ways that go beyond the term-based retrieval framework of traditional IR. Consequently, researchers in Web IR have reexamined the findings from traditional IR research.
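
    For reference, the two conventional measures said above to be strained by Web IR are simple to compute once a relevant set is known; the Web's difficulty is precisely that it rarely is. A minimal sketch:

      def precision_recall(retrieved, relevant):
          # precision: fraction of retrieved documents that are relevant
          # recall:    fraction of relevant documents that were retrieved
          retrieved, relevant = set(retrieved), set(relevant)
          hits = len(retrieved & relevant)
          precision = hits / len(retrieved) if retrieved else 0.0
          recall = hits / len(relevant) if relevant else 0.0
          return precision, recall
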
  17. Cho, H.; Pham, M.T.N.; Leonard, K.N.; Urban, A.C.: ¬A systematic literature review on image information needs and behaviors (2022) 0.02
    0.015704103 = product of:
      0.047112305 = sum of:
        0.047112305 = weight(_text_:query in 606) [ClassicSimilarity], result of:
          0.047112305 = score(doc=606,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.20539828 = fieldWeight in 606, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.03125 = fieldNorm(doc=606)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose: With ready access to search engines and social media platforms, the way people find image information has evolved and diversified in the past two decades. The purpose of this paper is to provide an overview of the literature on image information needs and behaviors.
    Design/methodology/approach: Following an eight-step procedure for conducting systematic literature reviews, the paper presents an analysis of peer-reviewed work on image information needs and behaviors, with publications ranging from the years 1997 to 2019.
    Findings: Application of the inclusion criteria led to 69 peer-reviewed works. These works were synthesized according to the following categories: research methods, users targeted, image types, identified needs, search behaviors and search obstacles. The reviewed studies show that people seek and use images for multiple reasons, including entertainment, illustration, aesthetic appreciation, knowledge construction, engagement, inspiration and social interactions. The reviewed studies also report that common strategies for image searches include keyword searches with short queries, browsing, specialization and reformulation. Observed trends suggest common deployment of query analysis, survey questionnaires and undergraduate participant pools to research image information needs and behavior.
    Originality/value: At this point, after more than two decades of image information needs research, a holistic systematic review of the literature was long overdue. The way users find image information has evolved and diversified due to technological developments in image retrieval. By synthesizing this burgeoning field into specific foci, this systematic literature review provides a foundation for future empirical investigation. With this foundation set, the paper then pinpoints key research gaps to investigate, particularly the influence of user expertise, a need for more diverse population samples, a dearth of qualitative data, new search features and information and visual literacies instruction.
  18. Grudin, J.: Human-computer interaction (2011) 0.02
    0.015601905 = product of:
      0.046805713 = sum of:
        0.046805713 = product of:
          0.09361143 = sum of:
            0.09361143 = weight(_text_:22 in 1601) [ClassicSimilarity], result of:
              0.09361143 = score(doc=1601,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.5416616 = fieldWeight in 1601, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1601)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    27.12.2014 18:54:22
  19. Rasmussen, E.M.: Indexing and retrieval for the Web (2002) 0.01
    0.013741089 = product of:
      0.041223265 = sum of:
        0.041223265 = weight(_text_:query in 4285) [ClassicSimilarity], result of:
          0.041223265 = score(doc=4285,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.17972349 = fieldWeight in 4285, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
      0.33333334 = coord(1/3)
    
    Abstract
    Techniques for automated indexing and information retrieval (IR) have been developed, tested, and refined over the past 40 years, and are well documented (see, for example, Agosti & Smeaton, 1996; Baeza-Yates & Ribeiro-Neto, 1999a; Frakes & Baeza-Yates, 1992; Korfhage, 1997; Salton, 1989; Witten, Moffat, & Bell, 1999). With the introduction of the Web, and the capability to index and retrieve via search engines, these techniques have been extended to a new environment. They have been adopted, altered, and in some cases extended to include new methods. "In short, search engines are indispensable for searching the Web, they employ a variety of relatively advanced IR techniques, and there are some peculiar aspects of search engines that make searching the Web different than more conventional information retrieval" (Gordon & Pathak, 1999, p. 145). The environment for information retrieval on the World Wide Web differs from that of "conventional" information retrieval in a number of fundamental ways. The collection is very large and changes continuously, with pages being added, deleted, and altered. Wide variability between the size, structure, focus, quality, and usefulness of documents makes Web documents much more heterogeneous than a typical electronic document collection. The wide variety of document types includes images, video, audio, and scripts, as well as many different document languages. Duplication of documents and sites is common. Documents are interconnected through networks of hyperlinks. Because of the size and dynamic nature of the Web, preprocessing all documents requires considerable resources and is often not feasible, certainly not on the frequent basis required to ensure currency. Query length is usually much shorter than in other environments - only a few words - and user behavior differs from that in other environments. These differences make the Web a novel environment for information retrieval (Baeza-Yates & Ribeiro-Neto, 1999b; Bharat & Henzinger, 1998; Huang, 2000).
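
    The indexing techniques being extended to the Web all rest on the same core structure, an inverted index mapping each term to the documents that contain it. A minimal sketch assuming whitespace tokenization with no stemming or stopword handling:

      from collections import defaultdict

      def build_index(docs):
          # docs: {doc_id: text}. Returns term -> sorted list of doc ids.
          index = defaultdict(set)
          for doc_id, text in docs.items():
              for term in text.lower().split():
                  index[term].add(doc_id)
          return {term: sorted(ids) for term, ids in index.items()}

      def and_query(index, terms):
          # Conjunctive query: intersect the posting lists of all terms.
          postings = [set(index.get(t, ())) for t in terms]
          return sorted(set.intersection(*postings)) if postings else []
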
  20. Khoo, S.G.; Na, J.-C.: Semantic relations in information science (2006) 0.01
    0.011778077 = product of:
      0.03533423 = sum of:
        0.03533423 = weight(_text_:query in 1978) [ClassicSimilarity], result of:
          0.03533423 = score(doc=1978,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.15404871 = fieldWeight in 1978, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1978)
      0.33333334 = coord(1/3)
    
    Abstract
    Linguists in the structuralist tradition (e.g., Lyons, 1977; Saussure, 1959) have asserted that concepts cannot be defined on their own but only in relation to other concepts. Semantic relations appear to reflect a logical structure in the fundamental nature of thought (Caplan & Herrmann, 1993). Green, Bean, and Myaeng (2002) noted that semantic relations play a critical role in how we represent knowledge psychologically, linguistically, and computationally, and that many systems of knowledge representation start with a basic distinction between entities and relations. Green (2001, p. 3) said that "relationships are involved as we combine simple entities to form more complex entities, as we compare entities, as we group entities, as one entity performs a process on another entity, and so forth. Indeed, many things that we might initially regard as basic and elemental are revealed upon further examination to involve internal structure, or in other words, internal relationships." Concepts and relations are often expressed in language and text. Language is used not just for communicating concepts and relations, but also for representing, storing, and reasoning with concepts and relations. We shall examine the nature of semantic relations from a linguistic and psychological perspective, with an emphasis on relations expressed in text. The usefulness of semantic relations in information science, especially in ontology construction, information extraction, information retrieval, question-answering, and text summarization is discussed. Research and development in information science have focused on concepts and terms, but the focus will increasingly shift to the identification, processing, and management of relations to achieve greater effectiveness and refinement in information science techniques. Previous chapters in ARIST on natural language processing (Chowdhury, 2003), text mining (Trybula, 1999), information retrieval and the philosophy of language (Blair, 2003), and query expansion (Efthimiadis, 1996) provide a background for this discussion, as semantic relations are an important part of these applications.