Search (172 results, page 2 of 9)

  • Filter: theme_ss:"Computerlinguistik"
  1. Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.01
    0.014763533 = coord(3/14) × [2 × 0.016822865 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.0390625) + coord(1/2) × 0.07050151 "texts" (tf 2.0, freq 4, idf 5.4822793, fieldNorm 0.0390625)]; each term score = queryWeight × fieldWeight = (idf × queryNorm) × (tf × idf × fieldNorm), with queryNorm 0.03002521 throughout these results
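
    The score detail above reads as Lucene ClassicSimilarity (TF-IDF) explain output. As a rough illustration, the arithmetic can be reproduced in Python; this is a minimal sketch assuming Lucene's documented formula, not this search engine's actual code:

      import math

      def term_score(freq, idf, query_norm, field_norm):
          # queryWeight = idf * queryNorm; fieldWeight = tf * idf * fieldNorm,
          # with tf = sqrt(termFreq) in ClassicSimilarity
          tf = math.sqrt(freq)
          return (idf * query_norm) * (tf * idf * field_norm)

      QUERY_NORM = 0.03002521  # constant for this query, from the detail above
      classification = term_score(freq=2, idf=3.1847067,
                                  query_norm=QUERY_NORM, field_norm=0.0390625)
      texts = term_score(freq=4, idf=5.4822793,
                         query_norm=QUERY_NORM, field_norm=0.0390625)

      # "classification" is counted for two indexed fields, "texts" carries a
      # coord(1/2) factor; the document matches 3 of 14 clauses: coord(3/14).
      score = (3 / 14) * (2 * classification + 0.5 * texts)
      print(f"{score:.9f}")  # ~0.014763533, the score shown for this result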
    
    Abstract
    Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In machine learning applications concerned with the automatic clustering or classification of texts, feature vectors are often needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper concludes with the presentation of some qualitative and quantitative results comparing statistical and morphological methods.
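
    The shared statistical core of the two tasks can be illustrated with a toy term-ranking step. This is a minimal sketch of a domain-vs-reference frequency ratio, one common "termhood" measure, not Witschel's own algorithm:

      from collections import Counter

      def rank_candidates(domain_tokens, reference_freq, reference_size):
          """Rank candidate terms by how much more frequent they are in a
          domain corpus than in a general reference corpus ('weirdness')."""
          counts = Counter(domain_tokens)
          n = len(domain_tokens)
          scores = {t: (f / n) / ((reference_freq.get(t, 0) + 1) / reference_size)
                    for t, f in counts.items()}  # +1 smoothing for unseen terms
          return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

      # Toy usage: domain-specific words outrank general-language ones; the
      # top of such a ranking can feed a term bank or serve as index terms.
      domain = "terminology extraction feeds terminological databases".split()
      print(rank_candidates(domain, reference_freq={"feeds": 50},
                            reference_size=10_000)[:3])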
  2. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.01
    0.014128265 = coord(2/14) × [2 × 0.049448926 "classification" (tf 3.4641016, freq 12, idf 3.1847067, fieldNorm 0.046875)]
    
    Abstract
    (Automatic) document classification is generally defined as content-based assignment of one or more predefined categories to documents. Usually, machine learning, statistical pattern recognition, or neural network approaches are used to construct classifiers automatically. In this paper we thoroughly evaluate a wide variety of these methods on a document classification task for German text. We evaluate different feature construction and selection methods and various classifiers. Our main results are: (1) feature selection is necessary not only to reduce learning and classification time, but also to avoid overfitting (even for Support Vector Machines); (2) surprisingly, our morphological analysis does not improve classification quality compared to a letter 5-gram approach; (3) Support Vector Machines are significantly better than all other classification methods.
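
    A present-day reconstruction of the best-performing configuration reported here (letter 5-grams, feature selection, SVM) might look roughly like the following scikit-learn sketch; the two toy German documents and all parameter values are placeholders, not the paper's setup:

      from sklearn.pipeline import Pipeline
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.feature_selection import SelectKBest, chi2
      from sklearn.svm import LinearSVC

      docs = ["Der Vertrag wurde fristgerecht gekündigt",
              "Die Mannschaft gewann das Endspiel deutlich"]
      labels = ["law", "sports"]

      clf = Pipeline([
          # letter 5-grams instead of morphological analysis, finding (2)
          ("ngrams", TfidfVectorizer(analyzer="char_wb", ngram_range=(5, 5))),
          ("select", SelectKBest(chi2, k=20)),  # feature selection, finding (1)
          ("svm", LinearSVC()),                 # strongest classifier, finding (3)
      ])
      clf.fit(docs, labels)
      print(clf.predict(["Das Gericht bestätigte die Kündigung"]))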
  3. Chibout, K.; Vilnat, A.: Primitive sémantiques, classification des verbes et polysémie (1999) 0.01
    0.013594929 = coord(2/14) × [2 × 0.04758225 "classification" (tf 2.0, freq 4, idf 3.1847067, fieldNorm 0.078125)]
    
    Footnote
    Translated title: Semantic primitives, classification of verbs and polysemy
  4. Sager, J.A.: ¬A practical course in terminology processing : with a bibliography by Blaise Nkwenti-Azeh (1990) 0.01
    0.013458293 = coord(2/14) × [2 × 0.047104023 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.109375)]
    
    Footnote
    Review in: International classification 19(1992) no.3, p.169.
  5. Lewis, D.D.; Sparck Jones, K.: Natural language processing for information retrieval (1997) 0.01
    0.013320256 = coord(2/14) × [2 × 0.046620894 "classification" (tf 2.4494898, freq 6, idf 3.1847067, fieldNorm 0.0625)]
    
    Source
    From classification to 'knowledge organization': Dorking revisited or 'past is prelude'. A collection of reprints to commemorate the fifty-year span between the Dorking Conference (First International Study Conference on Classification Research 1957) and the Sixth International Study Conference on Classification Research (London 1997). Ed.: A. Gilchrist
  6. Tao, J.; Zhou, L.; Hickey, K.: Making sense of the black-boxes : toward interpretable text classification using deep learning models (2023) 0.01
    0.011773554 = coord(2/14) × [2 × 0.041207436 "classification" (tf 3.4641016, freq 12, idf 3.1847067, fieldNorm 0.0390625)]
    
    Abstract
    Text classification is a common task in data science. Despite the superior performance of deep-learning-based models in various text classification tasks, their black-box nature poses significant challenges for wide adoption. The knowledge-to-action framework emphasizes several principles concerning the application and use of knowledge, such as ease-of-use, customization, and feedback. With the guidance of the above principles and the properties of interpretable machine learning, we identify the design requirements for and propose an interpretable deep learning (IDeL) based framework for text classification models. IDeL comprises three main components: feature penetration, instance aggregation, and feature perturbation. We evaluate our implementation of the framework with two distinct case studies: fake news detection and social question categorization. The experimental results provide evidence for the efficacy of IDeL components in enhancing the interpretability of text classification models. Moreover, the findings are generalizable across binary and multi-label, multi-class classification problems. The proposed IDeL framework introduces a unique iField perspective for building trusted models in data science by improving the transparency of and access to advanced black-box models.
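
    Of the three IDeL components, feature perturbation is the easiest to sketch generically: mask one input token at a time and record how the model's confidence moves. The code below is a toy illustration of that general idea; the predict function and the [MASK] token are stand-ins, not the paper's implementation:

      def perturbation_importance(predict_proba, tokens, target_class):
          # Score each token by the confidence drop caused by masking it.
          base = predict_proba(" ".join(tokens))[target_class]
          drops = []
          for i, tok in enumerate(tokens):
              masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
              p = predict_proba(" ".join(masked))[target_class]
              drops.append((tok, base - p))  # large drop => important token
          return sorted(drops, key=lambda kv: kv[1], reverse=True)

      def toy_predict_proba(text):  # stand-in for a trained fake-news model
          p_fake = 0.9 if "shocking" in text else 0.2
          return {"fake": p_fake, "real": 1 - p_fake}

      print(perturbation_importance(toy_predict_proba,
                                    "shocking claim spreads online".split(),
                                    "fake"))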
  7. Flanders, J.; Bauman, S.; Caton, P.; Courname, M.: Names proper and improper : applying the TEI to the classification of proper nouns (1998) 0.01
    0.01153568 = coord(2/14) × [2 × 0.04037488 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.09375)]
    
  8. Hull, D.; Ait-Mokhtar, S.; Chuat, M.; Eisele, A.; Gaussier, E.; Grefenstette, G.; Isabelle, P.; Samulesson, C.; Segand, F.: Language technologies and patent search and classification (2001) 0.01
    0.01153568 = coord(2/14) × [2 × 0.04037488 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.09375)]
    
  9. Moens, M.F.; Dumortier, J.: Use of a text grammar for generating highlight abstracts of magazine articles (2000) 0.01
    0.011293716 = coord(2/14) × [0.029704956 "subject" (tf 1.4142135, freq 2, idf 3.576596, fieldNorm 0.0546875) + coord(1/2) × 0.0987021 "texts" (tf 2.0, freq 4, idf 5.4822793, fieldNorm 0.0546875)]
    
    Abstract
    Browsing a database of article abstracts is one way to select and buy relevant magazine articles online. Our research contributes to the design and development of text grammars for abstracting texts in unlimited subject domains. We developed a system that parses texts based on the text grammar of a specific text type and that extracts sentences and statements which are relevant for inclusion in the abstracts. The system employs knowledge of the discourse patterns that are typical of news stories. The results are encouraging and demonstrate the importance of discourse structures in text summarisation.
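
    The system itself parses articles against a formal text grammar; as a much cruder stand-in for its use of news discourse patterns, an extractive step can favour early sentences and reporting cues. A toy sketch, with invented cue words:

      CUES = ("said", "announced", "according to")  # invented reporting cues

      def highlight_abstract(sentences, max_sentences=2):
          def score(item):
              idx, sent = item
              position = 1.0 / (idx + 1)  # news stories put the gist first
              return position + sum(c in sent.lower() for c in CUES)
          ranked = sorted(enumerate(sentences), key=score, reverse=True)
          # restore document order among the selected sentences
          return [s for _, s in sorted(ranked[:max_sentences])]

      print(highlight_abstract([
          "The council announced a new budget on Monday.",
          "The weather was mild.",
          "Spending will rise by two percent, the mayor said.",
      ]))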
  10. Rahmstorf, G.: Concept structures for large vocabularies (1998) 0.01
    0.011266903 = coord(3/14) × [2 × 0.02018744 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.046875) + coord(1/2) × 0.024408007 "22" (tf 1.4142135, freq 2, idf 3.5018296, fieldNorm 0.046875)]
    
    Abstract
    A technology is described which supports the acquisition, visualisation and manipulation of large vocabularies with associated structures. It is used for dictionary production, terminology data bases, thesauri, library classification systems etc. Essential features of the technology are a lexicographic user interface, variable word description, unlimited list of word readings, a concept language, automatic transformations of formulas into graphic structures, structure manipulation operations and retransformation into formulas. The concept language includes notations for undefined concepts. The structure of defined concepts can be constructed interactively. The technology supports the generation of large vocabularies with structures representing word senses. Concept structures and ordering systems for indexing and retrieval can be constructed separately and connected by associating relations.
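
    As a minimal illustration of concept structures "connected by associating relations", a hypothetical miniature follows; it is not the described system or its concept language:

      from collections import defaultdict

      class ConceptStructure:
          def __init__(self):
              # concept -> set of (relation, other concept) pairs
              self.relations = defaultdict(set)

          def relate(self, concept, relation, other):
              self.relations[concept].add((relation, other))

          def neighbours(self, concept):
              return sorted(self.relations[concept])

      cs = ConceptStructure()
      cs.relate("thesaurus", "built_from", "vocabulary")
      cs.relate("thesaurus", "used_for", "retrieval")
      cs.relate("vocabulary", "has_part", "word_sense")
      print(cs.neighbours("thesaurus"))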
    Date
    30.12.2001 19:01:22
  11. Rorvig, M.; Smith, M.M.; Uemura, A.: ¬The N-gram hypothesis applied to matched sets of visualized Japanese-English technical documents (1999) 0.01
    0.010986518 = coord(2/14) × [0.042009152 "subject" (tf 2.0, freq 4, idf 3.576596, fieldNorm 0.0546875) + coord(1/2) × 0.069792934 "texts" (tf 1.4142135, freq 2, idf 5.4822793, fieldNorm 0.0546875)]
    
    Abstract
    Shape Recovery Analysis (SHERA), a new visual analytical technique, is applied to the N-Gram hypothesis on matched Japanese-English technical documents supplied by the National Center for Science Information Systems (NACSIS) in Japan. The results of the SHERA study reveal compaction in the translation of Japanese subject terms to English subject terms. Surprisingly, the bigram approach to the Japanese data yields a remarkable similarity to the matching visualized English texts.
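
    The bigram comparison at the heart of the N-gram hypothesis can be sketched as character-bigram profiles compared by cosine similarity. This is a toy illustration only; SHERA itself operates on visualized document sets:

      from collections import Counter
      import math

      def bigram_profile(text):
          s = text.replace(" ", "")
          return Counter(s[i:i + 2] for i in range(len(s) - 1))

      def cosine(p, q):
          dot = sum(p[g] * q[g] for g in set(p) & set(q))
          norm = math.sqrt(sum(v * v for v in p.values()))
          norm *= math.sqrt(sum(v * v for v in q.values()))
          return dot / norm if norm else 0.0

      print(cosine(bigram_profile("information retrieval"),
                   bigram_profile("information extraction")))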
  12. Moohebat, M.; Raj, R.G.; Kareem, S.B.A.; Thorleuchter, D.: Identifying ISI-indexed articles by their lexical usage : a text analysis approach (2015) 0.01
    0.009990192 = coord(2/14) × [2 × 0.03496567 "classification" (tf 2.4494898, freq 6, idf 3.1847067, fieldNorm 0.046875)]
    
    Abstract
    This research creates an architecture for investigating the existence of probable lexical divergences between articles, categorized as Institute for Scientific Information (ISI) and non-ISI, and consequently, if such a difference is discovered, to propose the best available classification method. Based on a collection of ISI- and non-ISI-indexed articles in the areas of business and computer science, three classification models are trained. A sensitivity analysis is applied to demonstrate the impact of words in different syntactical forms on the classification decision. The results demonstrate that the lexical domains of ISI and non-ISI articles are distinguishable by machine learning techniques. Our findings indicate that the support vector machine identifies ISI-indexed articles in both disciplines with higher precision than do the Naïve Bayesian and K-Nearest Neighbors techniques.
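
    The three-model comparison translates directly into a few lines of scikit-learn. The four snippets below are invented placeholders for ISI and non-ISI article texts, and the tiny corpus forces n_neighbors=1:

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.svm import LinearSVC

      docs = ["rigorous empirical evaluation of the proposed model",
              "we briefly sketch a simple idea",
              "statistically significant improvements were observed",
              "this short note lists some thoughts"]
      labels = ["isi", "non-isi", "isi", "non-isi"]

      X = TfidfVectorizer().fit_transform(docs)
      for model in (MultinomialNB(),
                    KNeighborsClassifier(n_neighbors=1),
                    LinearSVC()):
          model.fit(X, labels)
          print(type(model).__name__, model.predict(X[:1]))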
  13. Ruge, G.; Schwarz, C.: Term association and computational linguistics (1991) 0.01
    0.009613066 = coord(2/14) × [2 × 0.03364573 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.078125)]
    
    Source
    International classification. 18(1991) no.1, S.19-25
  14. Sparck Jones, K.: Synonymy and semantic classification (1986) 0.01
    0.009613066 = coord(2/14) × [2 × 0.03364573 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.078125)]
    
  15. Zimmermann, H.H.: Language and language technology (1991) 0.01
    0.009613066 = coord(2/14) × [2 × 0.03364573 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.078125)]
    
    Source
    International classification. 18(1991) no.4, S.196-199
  16. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.01
    0.009516451 = coord(2/14) × [2 × 0.033307575 "classification" (tf 2.0, freq 4, idf 3.1847067, fieldNorm 0.0546875)]
    
    Source
    Advances in classification research, vol.10: proceedings of the 10th ASIS SIG/CR Classification Research Workshop. Ed.: Albrechtsen, H. and J.E. Mai
  17. Fóris, A.: Network theory and terminology (2013) 0.01
    0.009389086 = coord(3/14) × [2 × 0.016822865 "classification" (tf 1.4142135, freq 2, idf 3.1847067, fieldNorm 0.0390625) + coord(1/2) × 0.020340007 "22" (tf 1.4142135, freq 2, idf 3.5018296, fieldNorm 0.0390625)]
    
    Abstract
    The paper aims to present the relations of network theory and terminology. The model of scale-free networks, which was developed recently and has been widely applied since, can be effectively used in terminology research as well. Operation based on the principle of networks is a universal characteristic of complex systems. Networks are governed by general laws. The model of scale-free networks can be viewed as a statistical-probability model, and it can be described with mathematical tools. Its main feature is that "everything is connected to everything else," that is, every node is reachable (in a few steps) starting from any other node; this phenomenon is called "the small world phenomenon." The existence of a linguistic network and the general laws of the operation of networks enable us to place issues of language use in the complex system of relations that reveal the deeper connections between phenomena with the help of networks embedded in each other. The realization of the metaphor that language also has a network structure is the basis of the classification methods of the terminological system, and likewise of the ways of creating terminology databases, which serve the purpose of providing easy and versatile accessibility to specialised knowledge.
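
    The small-world behaviour described here is easy to observe on a synthetic scale-free graph; the sketch below uses networkx and stands in for a real terminological network:

      import networkx as nx

      # Barabási-Albert graphs are the textbook scale-free model.
      G = nx.barabasi_albert_graph(n=1000, m=2, seed=42)

      # "Everything is connected to everything else": the average shortest
      # path between any two of the 1000 nodes stays small.
      print(nx.average_shortest_path_length(G))

      # A few highly connected hubs dominate, the scale-free signature.
      print(sorted((d for _, d in G.degree()), reverse=True)[:5])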
    Date
    2. 9.2014 21:22:48
  18. Szpakowicz, S.; Bond, F.; Nakov, P.; Kim, S.N.: On the semantics of noun compounds (2013) 0.01
    0.009228775 = coord(2/14) × [0.029704956 "subject" (tf 1.4142135, freq 2, idf 3.576596, fieldNorm 0.0546875) + coord(1/2) × 0.069792934 "texts" (tf 1.4142135, freq 2, idf 5.4822793, fieldNorm 0.0546875)]
    
    Abstract
    The noun compound - a sequence of nouns which functions as a single noun - is very common in English texts. No language processing system should ignore expressions like steel soup pot cover if it wants to be serious about such high-end applications of computational linguistics as question answering, information extraction, text summarization, machine translation - the list goes on. Processing noun compounds, however, is far from trouble-free. For one thing, they can be bracketed in various ways: is it steel soup, steel pot, or steel cover? Then there are relations inside a compound, annoyingly not signalled by any words: does pot contain soup or is it for cooking soup? These and many other research challenges are the subject of this special issue.
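
    The bracketing ambiguity is concrete enough to enumerate: a four-noun compound such as steel soup pot cover has exactly five binary bracketings (the Catalan numbers). A small sketch:

      def bracketings(words):
          if len(words) == 1:
              return [words[0]]
          return [f"({left} {right})"
                  for i in range(1, len(words))
                  for left in bracketings(words[:i])
                  for right in bracketings(words[i:])]

      for b in bracketings("steel soup pot cover".split()):
          print(b)  # 5 readings, e.g. ((steel soup) (pot cover))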
  19. Corbara, S.; Moreo, A.; Sebastiani, F.: Syllabic quantity patterns as rhythmic features for Latin authorship attribution (2023) 0.01
    0.00834426 = coord(2/14) × [coord(1/2) × 0.05699712 "schemes" (tf 1.4142135, freq 2, idf 5.3512506, fieldNorm 0.046875) + coord(1/2) × 0.059822515 "texts" (tf 1.4142135, freq 2, idf 5.4822793, fieldNorm 0.046875)]
    
    Abstract
    It is well known that, within the Latin production of written text, peculiar metric schemes were followed not only in poetic compositions, but also in many prose works. Such metric patterns were based on so-called syllabic quantity, that is, on the length of the involved syllables, and there is substantial evidence suggesting that certain authors had a preference for certain metric patterns over others. In this research we investigate the possibility of employing syllabic quantity as a basis for deriving rhythmic features for the task of computational authorship attribution of Latin prose texts. We test the impact of these features on the authorship attribution task when combined with other topic-agnostic features. Our experiments, carried out on three different datasets using support vector machines (SVMs), show that rhythmic features based on syllabic quantity are beneficial in discriminating among Latin prose authors.
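
    Once syllabic quantities are annotated, the rhythmic features reduce to n-gram statistics over the quantity sequence. A toy sketch, assuming quantities are already given as 'L'/'S' for long/short syllables (deriving them from raw Latin is the hard part the paper addresses):

      from collections import Counter

      def quantity_ngrams(quantities, n=3):
          # e.g. "LSSL..." -> counts of rhythmic patterns like "LSS", "SSL"
          return Counter(quantities[i:i + n]
                         for i in range(len(quantities) - n + 1))

      features = quantity_ngrams("LSSLLSLLSSLS")
      print(features.most_common(3))  # counts like these can feed an SVM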
  20. Zaitseva, E.M.: Developing linguistic tools of thematic search in library information systems (2023) 0.01
    0.00832516 = coord(2/14) × [2 × 0.029138058 "classification" (tf 2.4494898, freq 6, idf 3.1847067, fieldNorm 0.0390625)]
    
    Abstract
    Within the R&D program "Information support of research by scientists and specialists on the basis of RNPLS&T Open Archive - the system of scientific knowledge aggregation", the RNPLS&T analyzes the use of linguistic tools of thematic search in modern library information systems and the prospects for their development. The author defines the key common characteristics of e-catalogs of the largest Russian libraries revealed at the first stage of the analysis. Based on the specified common characteristics and a detailed comparative analysis, the author outlines and substantiates the vectors for enhancing search interfaces of e-catalogs. The focus is on linguistic tools of thematic search in library information systems; the key vectors are suggested: use of thematic search at different search levels with clear-cut level differentiation; use of combined functionality within the thematic search system; implementation of classification search in all e-catalogs; hierarchical representation of classifications; use of matching systems for classification information retrieval languages and, in the long term, for classification and verbal information retrieval languages and various verbal information retrieval languages. The author formulates practical recommendations to improve thematic search in library information systems.

Languages

  • e 144
  • d 23
  • f 4
  • m 2
  • da 1

Types

  • a 139
  • el 18
  • m 17
  • s 7
  • x 5
  • p 4
  • b 1
  • d 1
