Search (260 results, page 1 of 13)

  • Filter: theme_ss:"Computerlinguistik"
  1. Chiba, K.; Kyojima, M.: Document transformation based on syntax-directed tree translation (1995) 0.10
    0.09571206 = product of:
      0.15952009 = sum of:
        0.09129799 = weight(_text_:context in 4069) [ClassicSimilarity], result of:
          0.09129799 = score(doc=4069,freq=4.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.51808125 = fieldWeight in 4069, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0625 = fieldNorm(doc=4069)
        0.05272096 = weight(_text_:system in 4069) [ClassicSimilarity], result of:
          0.05272096 = score(doc=4069,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.3936941 = fieldWeight in 4069, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=4069)
        0.015501143 = product of:
          0.04650343 = sum of:
            0.04650343 = weight(_text_:29 in 4069) [ClassicSimilarity], result of:
              0.04650343 = score(doc=4069,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.31092256 = fieldWeight in 4069, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4069)
          0.33333334 = coord(1/3)
      0.6 = coord(3/5)
    
    Abstract
    Presents a description system for the transformation of structured documents based on Context Free Grammars (CFGs). The system caters to transformations between different document class descriptions and is presented mainly in terms of logical structure transformation. Proposes 2 requirements for transformation: the output document class must be explicitly represented, and inconsistency must be avoided. Introduces a grammar for document class descriptions, the tree-preserving Context Free Grammar, and gives a Syntax-Directed Tree Translation (SDTT) for transformations of a document. The SDTT transformation is formal, concise, and consistent with the above 2 requirements.
    Source
    Electronic publishing. 8(1995) no.1, S.15-29
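    The scoring trees attached to each entry follow Lucene's ClassicSimilarity (TF-IDF) explain format: tf = sqrt(termFreq), idf = 1 + ln(maxDocs/(docFreq+1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, each term contributes queryWeight * fieldWeight, and coord() scales the sum by the fraction of query parts matched. A minimal Python sketch that recomputes entry 1's "context" weight from the numbers printed above (the helper name is ours, not Lucene's):

      import math

      def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          # ClassicSimilarity components, as printed in the explain trees
          tf = math.sqrt(freq)                             # 2.0 for freq=4.0
          idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 4.14465 for docFreq=1904
          query_weight = idf * query_norm                  # 0.17622331
          field_weight = tf * idf * field_norm             # 0.51808125
          return query_weight * field_weight

      score = classic_term_score(freq=4.0, doc_freq=1904, max_docs=44218,
                                 query_norm=0.04251826, field_norm=0.0625)
      print(round(score, 8))  # ~0.09129799, matching weight(_text_:context in 4069)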
  2. SIGIR'92 : Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1992) 0.06
    0.063462004 = product of:
      0.10577001 = sum of:
        0.028243875 = weight(_text_:context in 6671) [ClassicSimilarity], result of:
          0.028243875 = score(doc=6671,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.16027321 = fieldWeight in 6671, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.02734375 = fieldNorm(doc=6671)
        0.03139529 = weight(_text_:index in 6671) [ClassicSimilarity], result of:
          0.03139529 = score(doc=6671,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.16897833 = fieldWeight in 6671, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.02734375 = fieldNorm(doc=6671)
        0.04613084 = weight(_text_:system in 6671) [ClassicSimilarity], result of:
          0.04613084 = score(doc=6671,freq=16.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.34448233 = fieldWeight in 6671, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02734375 = fieldNorm(doc=6671)
      0.6 = coord(3/5)
    
    Content
    HARMAN, D.: Relevance feedback revisited; AALBERSBERG, I.J.: Incremental relevance feedback; TAGUE-SUTCLIFFE, J.: Measuring the informativeness of a retrieval process; LEWIS, D.D.: An evaluation of phrasal and clustered representations on a text categorization task; BLOSSEVILLE, M.J., G. HÉBRAIL, M.G. MONTEIL u. N. PÉNOT: Automatic document classification: natural language processing, statistical analysis, and expert system techniques used together; MASAND, B., G. LINOFF u. D. WALTZ: Classifying news stories using memory based reasoning; KEEN, E.M.: Term position ranking: some new test results; CROUCH, C.J. u. B. YANG: Experiments in automatic statistical thesaurus construction; GREFENSTETTE, G.: Use of syntactic context to produce term association lists for text retrieval; ANICK, P.G. u. R.A. FLYNN: Versioning a full-text information retrieval system; BURKOWSKI, F.J.: Retrieval activities in a database consisting of heterogeneous collections; DEERWESTER, S.C., K. WACLENA u. M. LaMAR: A textual object management system; NIE, J.-Y.: Towards a probabilistic modal logic for semantic-based information retrieval; WANG, A.W., S.K.M. WONG u. Y.Y. YAO: An analysis of vector space models based on computational geometry; BARTELL, B.T., G.W. COTTRELL u. R.K. BELEW: Latent semantic indexing is an optimal special case of multidimensional scaling; GLAVITSCH, U. u. P. SCHÄUBLE: A system for retrieving speech documents; MARGULIS, E.L.: N-Poisson document modelling; HESS, M.: An incrementally extensible document retrieval system based on linguistics and logical principles; COOPER, W.S., F.C. GEY u. D.P. DABNEY: Probabilistic retrieval based on staged logistic regression; FUHR, N.: Integration of probabilistic fact and text retrieval; CROFT, B., L.A. SMITH u. H. TURTLE: A loosely-coupled integration of a text retrieval system and an object-oriented database system; DUMAIS, S.T. u. J. NIELSEN: Automating the assignment of submitted manuscripts to reviewers; GOST, M.A. u. M. MASOTTI: Design of an OPAC database to permit different subject searching accesses; ROBERTSON, A.M. u. P. WILLETT: Searching for historical word forms in a database of 17th century English text using spelling correction methods; FOX, E.A., Q.F. CHEN u. L.S. HEATH: A faster algorithm for constructing minimal perfect hash functions; MOFFAT, A. u. J. ZOBEL: Parameterised compression for sparse bitmaps; GRANDI, F., P. TIBERIO u. P. ZEZULA: Frame-sliced partitioned parallel signature files; ALLEN, B.: Cognitive differences in end user searching of a CD-ROM index; SONNENWALD, D.H.: Developing a theory to guide the process of designing information retrieval systems; CUTTING, D.R., J.O. PEDERSEN, D. KARGER u. J.W. TUKEY: Scatter/Gather: a cluster-based approach to browsing large document collections; CHALMERS, M. u. P. CHITSON: Bead: Explorations in information visualization; WILLIAMSON, C. u. B. SHNEIDERMAN: The dynamic HomeFinder: evaluating dynamic queries in a real-estate information exploring system
  3. Herrera-Viedma, E.: Modeling the retrieval process for an information retrieval system using an ordinal fuzzy linguistic approach (2001) 0.06
    0.056936827 = product of:
      0.09489471 = sum of:
        0.044850416 = weight(_text_:index in 5752) [ClassicSimilarity], result of:
          0.044850416 = score(doc=5752,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.24139762 = fieldWeight in 5752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5752)
        0.04035608 = weight(_text_:system in 5752) [ClassicSimilarity], result of:
          0.04035608 = score(doc=5752,freq=6.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.30135927 = fieldWeight in 5752, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5752)
        0.009688215 = product of:
          0.029064644 = sum of:
            0.029064644 = weight(_text_:29 in 5752) [ClassicSimilarity], result of:
              0.029064644 = score(doc=5752,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.19432661 = fieldWeight in 5752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5752)
          0.33333334 = coord(1/3)
      0.6 = coord(3/5)
    
    Abstract
    A linguistic model for an Information Retrieval System (IRS) defined using an ordinal fuzzy linguistic approach is proposed. The ordinal fuzzy linguistic approach is presented, and its use for modeling the imprecision and subjectivity that appear in the user-IRS interaction is studied. The user queries and IRS responses are modeled linguistically using the concept of fuzzy linguistic variables. The system accepts Boolean queries whose terms can be weighted simultaneously by means of ordinal linguistic values according to three possible semantics: a symmetrical threshold semantics, a quantitative semantics, and an importance semantics. The first is a new threshold semantics used to express qualitative restrictions on the documents retrieved for a given term; it is monotone increasing in the index term weight for threshold values to the right of the mid-value, and monotone decreasing for threshold values to the left of the mid-value. The second is a new proposal introduced to express quantitative restrictions on the documents retrieved for a term, i.e., restrictions on the number of documents that must be retrieved containing that term. The last is the usual semantics of relative importance, which has an effect when the term is in a Boolean expression. A bottom-up evaluation mechanism for queries is presented that coherently integrates the use of the three semantics and satisfies the separability property. The advantage of this IRS over others is that users can simultaneously express different semantic restrictions on the desired documents linguistically, incorporating more flexibility into the user-IRS interaction.
    Date
    29. 9.2001 14:00:25
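    A reading aid for the symmetrical threshold semantics described above: threshold values at or to the right of the scale's mid-value behave as "at least t" (satisfaction monotone increasing in the index term weight), values to the left as "at most t" (monotone decreasing). A minimal sketch under those assumptions; the scale and the degree function are illustrative, not the paper's exact operator:

      SCALE = ["none", "very_low", "low", "medium", "high", "very_high", "perfect"]
      MID, TOP = len(SCALE) // 2, len(SCALE) - 1

      def threshold_degree(w, t):
          # w, t: indices into SCALE for the index term weight and the threshold
          if t >= MID:                    # "at least t": increasing in w
              return TOP - max(0, t - w)
          return TOP - max(0, w - t)      # "at most t": decreasing in w

      # a document term weighted "high" fully satisfies the threshold "medium"
      print(SCALE[threshold_degree(SCALE.index("high"), SCALE.index("medium"))])  # perfect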
  4. Abu-Salem, H.; Al-Omari, M.; Evens, M.W.: Stemming methodologies over individual query words for an Arabic information retrieval system (1999) 0.06
    0.05625787 = product of:
      0.14064467 = sum of:
        0.100288585 = weight(_text_:index in 3672) [ClassicSimilarity], result of:
          0.100288585 = score(doc=3672,freq=10.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.5397815 = fieldWeight in 3672, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3672)
        0.04035608 = weight(_text_:system in 3672) [ClassicSimilarity], result of:
          0.04035608 = score(doc=3672,freq=6.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.30135927 = fieldWeight in 3672, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3672)
      0.4 = coord(2/5)
    
    Abstract
    Stemming is one of the most important factors that affect the performance of information retrieval systems. This article investigates how to improve the performance of an Arabic information retrieval system by imposing the retrieval method over individual words of a query depending on the importance of the WORD, the STEM, or the ROOT of the query terms in the database. This method, called Mixed Stemming, computes term importance using a weighting scheme that uses the Term Frequency (TF) and the Inverse Document Frequency (IDF), called TFxIDF. An extended version of the Arabic IRS is designed, implemented, and evaluated to reduce the number of irrelevant documents retrieved. The results of the experiment suggest that the proposed method outperforms the Word index method using the TFxIDF weighting scheme. It also outperforms the Stem index method using the Binary weighting scheme but does not outperform the Stem index method using the TFxIDF weighting scheme; likewise, it outperforms the Root index method using the Binary weighting scheme but does not outperform the Root index method using the TFxIDF weighting scheme.
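    The comparison above varies only the conflation level (WORD, STEM, or ROOT) under TFxIDF weighting. A minimal sketch of such an index, assuming an analyze callback that supplies the chosen conflation; real Arabic stemming and root extraction are far more involved:

      import math
      from collections import Counter

      def tfxidf_index(docs, analyze):
          # inverted index of TFxIDF weights; `analyze` conflates a token to
          # its WORD (identity), STEM, or ROOT form
          tokenized = [[analyze(w) for w in d.lower().split()] for d in docs]
          n = len(docs)
          df = Counter(t for toks in tokenized for t in set(toks))
          index = {}
          for doc_id, toks in enumerate(tokenized):
              for term, tf in Counter(toks).items():
                  index.setdefault(term, {})[doc_id] = tf * math.log(n / df[term])
          return index

      # word-level indexing uses the identity analyzer
      word_index = tfxidf_index(["the cat sat", "the cats sat down"], lambda w: w)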
  5. Hess, M.: ¬An incrementally extensible document retrieval system based on linguistic and logical principles (1992) 0.05
    0.04981639 = product of:
      0.12454097 = sum of:
        0.07611368 = weight(_text_:index in 2413) [ClassicSimilarity], result of:
          0.07611368 = score(doc=2413,freq=4.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.40966535 = fieldWeight in 2413, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=2413)
        0.048427295 = weight(_text_:system in 2413) [ClassicSimilarity], result of:
          0.048427295 = score(doc=2413,freq=6.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.36163113 = fieldWeight in 2413, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=2413)
      0.4 = coord(2/5)
    
    Abstract
    Most natural language based document retrieval systems use the syntax structures of constituent phrases of documents as index terms. Many of these systems also attempt to reduce the syntactic variability of natural language by some normalisation procedure applied to these syntax structures. However, the retrieval performance of such systems remains fairly disappointing. Some systems therefore use a meaning representation language to index and retrieve documents. In this paper, a system is presented that uses Horn Clause Logic as meaning representation language, employs advanced techniques from Natural Language Processing to achieve incremental extensibility, and uses methods from Logic Programming to achieve robustness in the face of insufficient data.
  6. Mock, K.J.; Vemuri, V.R.: Information filtering via hill climbing, WordNet, and index patterns (1997) 0.05
    0.04856749 = product of:
      0.12141872 = sum of:
        0.08879929 = weight(_text_:index in 1517) [ClassicSimilarity], result of:
          0.08879929 = score(doc=1517,freq=4.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.4779429 = fieldWeight in 1517, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1517)
        0.03261943 = weight(_text_:system in 1517) [ClassicSimilarity], result of:
          0.03261943 = score(doc=1517,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.2435858 = fieldWeight in 1517, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1517)
      0.4 = coord(2/5)
    
    Abstract
    The INFOS (Intelligent News Filtering Organizational System) project is designed to reduce the user's search burden by automatically categorising data as relevant or irrelevant based upon user interests. These predictions are learned automatically based upon features taken from input articles and collaborative features derived from other users. The filtering is performed by a hybrid technique that combines elements of a keyword-based hill climbing method, knowledge-based conceptual representation via WordNet, and partial parsing via index patterns. The hybrid system integrating all these approaches combines the benefits of each while maintaining robustness and scalability.
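    A minimal sketch of the keyword component only: a weighted keyword profile scores incoming articles, and user feedback nudges the matched weights in a hill-climbing fashion. The update rule and step size are illustrative; INFOS additionally uses WordNet concepts and index patterns:

      def score(article_terms, profile):
          # sum of profile weights for the keywords the article contains
          return sum(w for term, w in profile.items() if term in article_terms)

      def update_profile(profile, article_terms, relevant, step=0.1):
          # feedback step: move weights of matched terms toward the judgment
          for term in article_terms & profile.keys():
              profile[term] += step if relevant else -step
          return profile

      profile = {"retrieval": 0.5, "football": 0.2}
      article = {"text", "retrieval", "evaluation"}
      print(score(article, profile))                   # 0.5
      update_profile(profile, article, relevant=True)  # "retrieval" -> 0.6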
  7. Chowdhury, A.; Mccabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.04
    0.041629277 = product of:
      0.10407319 = sum of:
        0.07611368 = weight(_text_:index in 1061) [ClassicSimilarity], result of:
          0.07611368 = score(doc=1061,freq=4.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.40966535 = fieldWeight in 1061, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
        0.027959513 = weight(_text_:system in 1061) [ClassicSimilarity], result of:
          0.027959513 = score(doc=1061,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.20878783 = fieldWeight in 1061, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
      0.4 = coord(2/5)
    
    Abstract
    The object of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of Part of Speech Tagging to improve the index storage overhead and general speed of the system, with only a minimal reduction in precision/recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90% of precision/recall is achieved with 40% of the document collection's terms. We also show that this is an improvement in overhead with only a 1% reduction in precision/recall.
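    A minimal sketch of the idea, assuming pre-tagged input (Penn Treebank tags); indexing only nouns is one plausible reading of "the most relevant part of speech", not necessarily the paper's final selection:

      def pos_filtered_index(tagged_docs, keep_tags=("NN", "NNS", "NNP", "NNPS")):
          # index only terms whose part-of-speech tag is in keep_tags
          index = {}
          for doc_id, doc in enumerate(tagged_docs):
              for word, tag in doc:
                  if tag in keep_tags:
                      index.setdefault(word.lower(), set()).add(doc_id)
          return index

      docs = [[("Taggers", "NNS"), ("reduce", "VBP"), ("index", "NN"), ("size", "NN")]]
      print(sorted(pos_filtered_index(docs)))  # ['index', 'size', 'taggers']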
  8. Kostoff, R.N.; Block, J.A.: Factor matrix text filtering and clustering (2005) 0.04
    0.038195368 = product of:
      0.095488414 = sum of:
        0.08386256 = weight(_text_:context in 3683) [ClassicSimilarity], result of:
          0.08386256 = score(doc=3683,freq=6.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.475888 = fieldWeight in 3683, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=3683)
        0.011625858 = product of:
          0.034877572 = sum of:
            0.034877572 = weight(_text_:29 in 3683) [ClassicSimilarity], result of:
              0.034877572 = score(doc=3683,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23319192 = fieldWeight in 3683, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3683)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    The presence of trivial words in text databases can affect record or concept (word/phrase) clustering adversely. Additionally, the determination of whether a word/phrase is trivial is context-dependent. Our objective in the present article is to demonstrate a context-dependent trivial word filter to improve clustering quality. Factor analysis was used as a context-dependent trivial word filter for subsequent term clustering. Medline records for Raynaud's Phenomenon were used as the database, and words were extracted from the record abstracts. A factor matrix of these words was generated, and the words that had low factor loadings across all factors were identified and eliminated. The remaining words, which had high factor loading values for at least one factor and were therefore influential in determining the theme of that factor, were input to the clustering algorithm. Both quantitative and qualitative analyses were used to show that factor matrix filtering leads to higher quality clusters and subsequent taxonomies.
    Date
    21. 7.2005 16:29:47
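    A minimal sketch of factor-matrix filtering, with scikit-learn's FactorAnalysis as a stand-in for the authors' factor computation; the factor count and loading cutoff are illustrative values, not the paper's:

      import numpy as np
      from sklearn.decomposition import FactorAnalysis
      from sklearn.feature_extraction.text import CountVectorizer

      def factor_filter_terms(abstracts, n_factors=5, min_loading=0.3):
          # keep only terms with a high absolute loading on at least one factor
          vec = CountVectorizer(stop_words="english")
          X = vec.fit_transform(abstracts).toarray()
          loadings = FactorAnalysis(n_components=n_factors).fit(X).components_
          keep = np.abs(loadings).max(axis=0) >= min_loading
          return [t for t, k in zip(vec.get_feature_names_out(), keep) if k]

      # usage (medline_abstracts is a placeholder for the record abstracts):
      # kept_terms = factor_filter_terms(medline_abstracts)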
  9. Schneider, J.W.; Borlund, P.: ¬A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.04
    0.037330892 = product of:
      0.09332723 = sum of:
        0.07988574 = weight(_text_:context in 156) [ClassicSimilarity], result of:
          0.07988574 = score(doc=156,freq=4.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.4533211 = fieldWeight in 156, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.013441487 = product of:
          0.04032446 = sum of:
            0.04032446 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
              0.04032446 = score(doc=156,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.2708308 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    The present study investigates the ability of a bibliometric-based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
    Date
    8. 3.2007 19:55:22
    Source
    Context: nature, impact and role. 5th International Conference on Conceptions of Library and Information Sciences, CoLIS 2005 Glasgow, UK, June 2005. Ed. by F. Crestani u. I. Ruthven
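    A minimal sketch of the noun phrase step: keep maximal adjective/noun runs that contain at least one noun in POS-tagged citation-context sentences. This is a crude stand-in for the authors' parser, not their method:

      def noun_phrases(tagged_tokens):
          # tagged_tokens: list of (word, Penn-Treebank-tag) pairs
          phrases, buf = [], []
          for word, tag in tagged_tokens + [("", ".")]:   # sentinel flushes buf
              if tag.startswith(("JJ", "NN")):
                  buf.append((word, tag))
              else:
                  if any(t.startswith("NN") for _, t in buf):
                      phrases.append(" ".join(w for w, _ in buf))
                  buf = []
          return phrases

      tagged = [("semantic", "JJ"), ("indexing", "NN"), ("improves", "VBZ"),
                ("recall", "NN")]
      print(noun_phrases(tagged))  # ['semantic indexing', 'recall']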
  10. Way, E.C.: Knowledge representation and metaphor (or: meaning) (1994) 0.03
    0.034848943 = product of:
      0.08712236 = sum of:
        0.07176066 = weight(_text_:index in 771) [ClassicSimilarity], result of:
          0.07176066 = score(doc=771,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.3862362 = fieldWeight in 771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0625 = fieldNorm(doc=771)
        0.015361699 = product of:
          0.046085097 = sum of:
            0.046085097 = weight(_text_:22 in 771) [ClassicSimilarity], result of:
              0.046085097 = score(doc=771,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.30952093 = fieldWeight in 771, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=771)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Content
    Contains the following 9 chapters: The literal and the metaphoric; Views of metaphor; Knowledge representation; Representation schemes and conceptual graphs; The dynamic type hierarchy theory of metaphor; Computational approaches to metaphor; The nature and structure of semantic hierarchies; Language games, open texture and family resemblance; Programming the dynamic type hierarchy; Subject index
    Footnote
    Already published by Kluwer in 1991 // Review in: Knowledge organization 22(1995) no.1, S.48-49 (O. Sechser)
  11. Ruge, G.: Sprache und Computer : Wortbedeutung und Termassoziation. Methoden zur automatischen semantischen Klassifikation (1995) 0.03
    0.034848943 = product of:
      0.08712236 = sum of:
        0.07176066 = weight(_text_:index in 1534) [ClassicSimilarity], result of:
          0.07176066 = score(doc=1534,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.3862362 = fieldWeight in 1534, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0625 = fieldNorm(doc=1534)
        0.015361699 = product of:
          0.046085097 = sum of:
            0.046085097 = weight(_text_:22 in 1534) [ClassicSimilarity], result of:
              0.046085097 = score(doc=1534,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.30952093 = fieldWeight in 1534, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1534)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Content
    Contains the following chapters: (1) Motivation; (2) Language philosophical foundations; (3) Structural comparison of extensions; (4) Earlier approaches towards term association; (5) Experiments; (6) Spreading-activation networks or memory models; (7) Perspective. Appendices: Heads and modifiers of 'car'. Glossary. Index. Language and computer. Word semantics and term association. Methods towards an automatic semantic classification
    Footnote
    Review in: Knowledge organization 22(1995) no.3/4, S.182-184 (M.T. Rolland)
  12. Sparck Jones, K.; Galliers, J.R.: Evaluating natural language processing systems : an analysis and review (1996) 0.03
    0.032712005 = product of:
      0.08178001 = sum of:
        0.0538205 = weight(_text_:index in 2934) [ClassicSimilarity], result of:
          0.0538205 = score(doc=2934,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.28967714 = fieldWeight in 2934, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=2934)
        0.027959513 = weight(_text_:system in 2934) [ClassicSimilarity], result of:
          0.027959513 = score(doc=2934,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.20878783 = fieldWeight in 2934, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=2934)
      0.4 = coord(2/5)
    
    Abstract
    This comprehensive state-of-the-art book is the first devoted to the important and timely issue of evaluating NLP systems. It addresses the whole area of NLP system evaluation, including aims and scope, problems and methodology. The authors provide a wide-ranging and careful analysis of evaluation concepts, reinforced with extensive illustrations; they relate systems to their environments and develop a framework for proper evaluation. The discussion of principles is completed by a detailed review of practice and strategies in the field, covering both systems for specific tasks, like translation, and core language processors. The methodology lessons drawn from the analysis and review are applied in a series of example cases. A comprehensive bibliography, a subject index, and a term glossary are included.
  13. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.03
    0.031620618 = product of:
      0.07905155 = sum of:
        0.067530274 = product of:
          0.20259081 = sum of:
            0.20259081 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.20259081 = score(doc=562,freq=2.0), product of:
                0.3604703 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04251826 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.011521274 = product of:
          0.03456382 = sum of:
            0.03456382 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.03456382 = score(doc=562,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  14. Wright, L.W.; Nardini, H.K.G.; Aronson, A.R.; Rindflesch, T.C.: Hierarchical concept indexing of full-text documents in the Unified Medical Language System Information sources Map (1999) 0.03
    0.030551035 = product of:
      0.076377586 = sum of:
        0.04841807 = weight(_text_:context in 2111) [ClassicSimilarity], result of:
          0.04841807 = score(doc=2111,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.27475408 = fieldWeight in 2111, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=2111)
        0.027959513 = weight(_text_:system in 2111) [ClassicSimilarity], result of:
          0.027959513 = score(doc=2111,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.20878783 = fieldWeight in 2111, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=2111)
      0.4 = coord(2/5)
    
    Abstract
    Full-text documents are a vital and rapidly growing part of online biomedical information. A single large document can contain as much information as a small database, but normally lacks the tight structure and consistent indexing of a database. Retrieval systems will often miss highly relevant parts of a document if the document as a whole appears irrelevant. Access to full-text information is further complicated by the need to search separately many disparate information resources. This research explores how these problems can be addressed by the combined use of 2 techniques: 1) natural language processing for automatic concept-based indexing of full text, and 2) methods for exploiting the structure and hierarchy of full-text documents. We describe methods for applying these techniques to a large collection of full-text documents drawn from the Health Services / Technology Assessment Text (HSTAT) database at the NLM and examine how this hierarchical concept indexing can assist both document- and source-level retrieval in the context of NLM's Information Source Map project
  15. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.03
    0.029319597 = product of:
      0.07329899 = sum of:
        0.040348392 = weight(_text_:context in 2861) [ClassicSimilarity], result of:
          0.040348392 = score(doc=2861,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.22896172 = fieldWeight in 2861, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
        0.032950602 = weight(_text_:system in 2861) [ClassicSimilarity], result of:
          0.032950602 = score(doc=2861,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.24605882 = fieldWeight in 2861, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
      0.4 = coord(2/5)
    
    Abstract
    Today's conventional search engines hardly provide the essential content relevant to the user's search query, because the context and semantics of the user's request are not analyzed to the full extent. Hence the need for semantic web search (SWS) arises. SWS is an upcoming area of web search that combines Natural Language Processing and Artificial Intelligence. The objective of the work here is to design, develop and implement a semantic search engine, SIEU (Semantic Information Extraction in University Domain), confined to the university domain. SIEU uses an ontology as a knowledge base for the information retrieval process. It is not a mere keyword search; it works one layer above what Google or any other search engine retrieves by analyzing just the keywords. Here the query is analyzed both syntactically and semantically. The developed system retrieves web results more relevant to the user query through keyword expansion. The results obtained will be accurate enough to satisfy the request made by the user, and the level of accuracy is enhanced since the query is analyzed semantically. The system will be of great use to developers and researchers who work on the web. The Google results are re-ranked and optimized to provide the relevant links. For ranking, an algorithm has been applied which fetches more apt results for the user query.
  16. Salton, G.: Automatic processing of foreign language documents (1985) 0.03
    0.029263873 = product of:
      0.07315968 = sum of:
        0.03588033 = weight(_text_:index in 3650) [ClassicSimilarity], result of:
          0.03588033 = score(doc=3650,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.1931181 = fieldWeight in 3650, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.03125 = fieldNorm(doc=3650)
        0.03727935 = weight(_text_:system in 3650) [ClassicSimilarity], result of:
          0.03727935 = score(doc=3650,freq=8.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.27838376 = fieldWeight in 3650, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03125 = fieldNorm(doc=3650)
      0.4 = coord(2/5)
    
    Abstract
    The attempt to computerize a process, such as indexing, abstracting, classifying, or retrieving information, begins with an analysis of the process into its intellectual and nonintellectual components. That part of the process which is amenable to computerization is mechanical or algorithmic. What is not is intellectual or creative and requires human intervention. Gerard Salton has been an innovator, experimenter, and promoter in the area of mechanized information systems since the early 1960s. He has been particularly ingenious at analyzing the process of information retrieval into its algorithmic components. He received a doctorate in applied mathematics from Harvard University before moving to the computer science department at Cornell, where he developed a prototype automatic retrieval system called SMART. Working with this system he and his students contributed for over a decade to our theoretical understanding of the retrieval process. On a more practical level, they have contributed design criteria for operating retrieval systems. The following selection presents one of the early descriptions of the SMART system; it is valuable as it shows the direction automatic retrieval methods were to take beyond simple word-matching techniques. These include various word normalization techniques to improve recall, for instance, the separation of words into stems and affixes; the correlation and clustering, using statistical association measures, of related terms; and the identification, using a concept thesaurus, of synonymous, broader, narrower, and sibling terms. They include, as well, techniques, both linguistic and statistical, to deal with the thorny problem of how to automatically extract from texts index terms that consist of more than one word. They include weighting techniques and various document-request matching algorithms. Significant among the latter are those which produce a retrieval output of citations ranked in relevance order. During the 1970s, Salton and his students went on to further refine these various techniques, particularly the weighting and statistical association measures. Many of their early innovations seem commonplace today. Some of their later techniques are still ahead of their time and await technological developments for implementation. The particular focus of the selection that follows is on the evaluation of a particular component of the SMART system, a multilingual thesaurus. By mapping English language expressions and their German equivalents to a common concept number, the thesaurus permitted the automatic processing of German language documents against English language queries and vice versa. The results of the evaluation, as it turned out, were somewhat inconclusive. However, this SMART experiment suggested in a bold and optimistic way how one might proceed to answer such complex questions as: What is meant by retrieval language compatibility? How is it to be achieved, and how evaluated?
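    A minimal sketch of the concept-number mapping just described: English and German expressions map to shared concept numbers, so documents in one language can be matched against queries in the other. The entries and the overlap measure are illustrative; SMART used weighted vector matching:

      THESAURUS = {  # illustrative entries, not SMART's actual concept numbers
          "retrieval": 101, "recherche": 101,
          "document": 102, "dokument": 102,
          "language": 103, "sprache": 103,
      }

      def concepts(text):
          return {THESAURUS[w] for w in text.lower().split() if w in THESAURUS}

      def match(query, doc):
          # fraction of query concepts found in the document
          q, d = concepts(query), concepts(doc)
          return len(q & d) / len(q) if q else 0.0

      print(match("document retrieval", "Dokument Recherche Sprache"))  # 1.0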
  17. Paolillo, J.C.: Linguistics and the information sciences (2009) 0.03
    0.027971694 = product of:
      0.069929235 = sum of:
        0.05648775 = weight(_text_:context in 3840) [ClassicSimilarity], result of:
          0.05648775 = score(doc=3840,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.32054642 = fieldWeight in 3840, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3840)
        0.013441487 = product of:
          0.04032446 = sum of:
            0.04032446 = weight(_text_:22 in 3840) [ClassicSimilarity], result of:
              0.04032446 = score(doc=3840,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.2708308 = fieldWeight in 3840, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3840)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Linguistics is the scientific study of language which emphasizes language spoken in everyday settings by human beings. It has a long history of interdisciplinarity, both internally and in contribution to other fields, including information science. A linguistic perspective is beneficial in many ways in information science, since it examines the relationship between the forms of meaningful expressions and their social, cognitive, institutional, and communicative context, these being two perspectives on information that are actively studied, to different degrees, in information science. Examples of issues relevant to information science are presented for which the approach taken under a linguistic perspective is illustrated.
    Date
    27. 8.2011 14:22:33
  18. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.03
    0.027932769 = product of:
      0.06983192 = sum of:
        0.023299592 = weight(_text_:system in 2541) [ClassicSimilarity], result of:
          0.023299592 = score(doc=2541,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.17398985 = fieldWeight in 2541, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.046532333 = product of:
          0.0697985 = sum of:
            0.029064644 = weight(_text_:29 in 2541) [ClassicSimilarity], result of:
              0.029064644 = score(doc=2541,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.19432661 = fieldWeight in 2541, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
            0.040733855 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
              0.040733855 = score(doc=2541,freq=4.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.27358043 = fieldWeight in 2541, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
          0.6666667 = coord(2/3)
      0.4 = coord(2/5)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
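    A minimal sketch of n-gram based spelling suggestion in the spirit of the above: rank vocabulary words by Dice similarity of character bigrams. The padding and ranking details are illustrative, not NLM's actual ChemSpell algorithm:

      def bigrams(word):
          w = f"#{word.lower()}#"   # pad so first/last letters form bigrams too
          return {w[i:i + 2] for i in range(len(w) - 1)}

      def suggest(term, vocabulary, k=3):
          # Dice coefficient: 2|A & B| / (|A| + |B|) over bigram sets
          t = bigrams(term)
          def dice(v):
              b = bigrams(v)
              return 2 * len(t & b) / (len(t) + len(b))
          return sorted(vocabulary, key=dice, reverse=True)[:k]

      print(suggest("toxicolgy", ["toxicology", "toxin", "ecology"]))
      # ['toxicology', 'toxin', 'ecology'] -- closest spelling first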
  19. Bellaachia, A.; Amor-Tijani, G.: Proper nouns in English-Arabic cross language information retrieval (2008) 0.03
    0.027260004 = product of:
      0.068150006 = sum of:
        0.044850416 = weight(_text_:index in 2372) [ClassicSimilarity], result of:
          0.044850416 = score(doc=2372,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.24139762 = fieldWeight in 2372, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2372)
        0.023299592 = weight(_text_:system in 2372) [ClassicSimilarity], result of:
          0.023299592 = score(doc=2372,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.17398985 = fieldWeight in 2372, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2372)
      0.4 = coord(2/5)
    
    Abstract
    Out-of-vocabulary words, mostly proper nouns and technical terms, are one main source of performance degradation in Cross Language Information Retrieval (CLIR) systems. These are words not found in the dictionary. Bilingual dictionaries in general do not cover most proper nouns, which are usually primary keys in the query. As they are spelling variants of each other in most languages, using an approximate string matching technique against the target database index is the common approach taken to find the target language correspondents of the original query key. The n-gram technique proved to be the most effective among string matching techniques. The issue arises when the languages dealt with have different alphabets; transliteration is then applied based on phonetic similarities between the languages involved. In this study, both transliteration and the n-gram technique are combined to generate possible transliterations in an English-Arabic CLIR system. We refer to this technique as Transliteration N-Gram (TNG). We further enhance TNG by applying Part-Of-Speech disambiguation on the set of transliterations so that words with a similar spelling, but a different meaning, are excluded. Experimental results show that TNG gives promising results, and enhanced TNG further improves performance.
  20. Whitelock, P.; Kilby, K.: Linguistic and computational techniques in machine translation system design : 2nd ed (1995) 0.03
    0.026390245 = product of:
      0.065975614 = sum of:
        0.046599183 = weight(_text_:system in 1750) [ClassicSimilarity], result of:
          0.046599183 = score(doc=1750,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.3479797 = fieldWeight in 1750, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.078125 = fieldNorm(doc=1750)
        0.01937643 = product of:
          0.05812929 = sum of:
            0.05812929 = weight(_text_:29 in 1750) [ClassicSimilarity], result of:
              0.05812929 = score(doc=1750,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.38865322 = fieldWeight in 1750, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1750)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Date
    29. 3.1996 18:28:09

Languages

  • e 198
  • d 54
  • m 3
  • ru 2
  • f 1

Types

  • a 211
  • el 23
  • m 23
  • s 13
  • x 8
  • p 3
  • d 2
  • b 1
  • pat 1
  • r 1