Search (19 results, page 1 of 1)

Plaunt, C.; Norgard, B.A.: An association-based method for automatic indexing with a controlled vocabulary (1998) 0.06
```
0.061366733 = product of:
  0.12273347 = sum of:
    0.12273347 = sum of:
      0.08840178 = weight(_text_:maps in 1794) [ClassicSimilarity], result of:
        0.08840178 = score(doc=1794,freq=2.0), product of:
          0.28477904 = queryWeight, product of:
            5.619245 = idf(docFreq=435, maxDocs=44218)
            0.050679237 = queryNorm
          0.31042236 = fieldWeight in 1794, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.619245 = idf(docFreq=435, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1794)
      0.034331687 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
        0.034331687 = score(doc=1794,freq=2.0), product of:
          0.17747006 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679237 = queryNorm
          0.19345059 = fieldWeight in 1794, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1794)
  0.5 = coord(1/2)
```
Abstract

In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and controlled vocabulary subject headings assigned to those records by human indexers using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial match information retrieval problem. We consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document.
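
The two stages described in this abstract can be illustrated with a minimal Python sketch. The abstract only names "a likelihood ratio statistic", so the log-likelihood ratio (G²) over a 2x2 co-occurrence table used here is an assumption, as are the function names, the toy dictionary, and the choice to rank headings by summing the scores of matching terms.

```python
import math
from collections import defaultdict


def _xlogx(x: float) -> float:
    """Return x * ln(x), with the usual convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """G^2 statistic for a 2x2 table of document counts.

    k11: documents with both the term and the heading
    k12: documents with the term but not the heading
    k21: documents with the heading but not the term
    k22: documents with neither
    """
    n = k11 + k12 + k21 + k22
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    return 2.0 * (cells + _xlogx(n) - rows - cols)


def rank_headings(doc_terms, dictionary):
    """Deployment stage (assumed form): sum association scores per heading."""
    totals = defaultdict(float)
    for term in doc_terms:
        for heading, score in dictionary.get(term, {}).items():
            totals[heading] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical dictionary built from made-up co-occurrence counts.
toy_dictionary = {
    "retrieval": {"Information retrieval": log_likelihood_ratio(40, 10, 5, 945)},
    "indexing": {
        "Indexing": log_likelihood_ratio(30, 5, 10, 955),
        "Information retrieval": log_likelihood_ratio(12, 23, 33, 932),
    },
}

ranking = rank_headings(["indexing", "retrieval"], toy_dictionary)
print(ranking)
```

Note that G² is large for strong negative association as well as positive, so a real system would also need to check the direction of the association before adding a term-heading pair to the dictionary.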

Date

11. 9.2000 19:53:22

Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.05
```
0.049093388 = product of:
  0.098186776 = sum of:
    0.098186776 = sum of:
      0.070721425 = weight(_text_:maps in 5499) [ClassicSimilarity], result of:
        0.070721425 = score(doc=5499,freq=2.0), product of:
          0.28477904 = queryWeight, product of:
            5.619245 = idf(docFreq=435, maxDocs=44218)
            0.050679237 = queryNorm
          0.2483379 = fieldWeight in 5499, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.619245 = idf(docFreq=435, maxDocs=44218)
            0.03125 = fieldNorm(doc=5499)
      0.027465349 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
        0.027465349 = score(doc=5499,freq=2.0), product of:
          0.17747006 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679237 = queryNorm
          0.15476047 = fieldWeight in 5499, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.03125 = fieldNorm(doc=5499)
  0.5 = coord(1/2)
```
Abstract

Purpose: Modern mathematicians and scientists of math-related disciplines often use Document Preparation Systems (DPS) to write and Computer Algebra Systems (CAS) to calculate mathematical expressions. Usually, they translate the expressions manually between DPS and CAS. This process is time-consuming and error-prone. The purpose of this paper is to automate this translation. This paper uses Maple and Mathematica as the CAS, and LaTeX as the DPS.

Design/methodology/approach: Bruce Miller at the National Institute of Standards and Technology (NIST) developed a collection of special LaTeX macros that create links from mathematical symbols to their definitions in the NIST Digital Library of Mathematical Functions (DLMF). The authors are using these macros to perform rule-based translations between the formulae in the DLMF and CAS. Moreover, the authors develop software to ease the creation of new rules and to discover inconsistencies.

Findings: The authors created 396 mappings and translated 58.8 percent of DLMF formulae (2,405 expressions) successfully between Maple and DLMF. For a significant percentage, the special function definitions in Maple and the DLMF were different. An atomic symbol in one system maps to a composite expression in the other system. The translator was also successfully used for automatic verification of mathematical online compendia and CAS. The evaluation techniques discovered two errors in the DLMF and one defect in Maple.

Originality/value: This paper introduces the first translation tool for special functions between LaTeX and CAS. The approach improves error-prone manual translations and can be used to verify mathematical online compendia and CAS.

Date

20. 1.2015 18:30:22
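
The score breakdowns in this listing follow Lucene's ClassicSimilarity (TF-IDF) formula, and the numbers can be checked by hand. The sketch below recomputes the 0.061366733 score of the first result (doc 1794) from the values shown in its breakdown. The function and variable names are illustrative only; the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are ClassicSimilarity's defaults, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the breakdown for doc 1794.
MAX_DOCS = 44218
QUERY_NORM = 0.050679237
FIELD_NORM = 0.0390625

maps_score = term_score(2.0, 435, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term22_score = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The outer coord(1/2) factor halves the sum of the two term scores.
total = (maps_score + term22_score) * 0.5
print(f"maps={maps_score:.8f} 22={term22_score:.8f} total={total:.8f}")
```

The result should agree with the listed score up to floating-point rounding, since Lucene computes these values in single precision. Results that match only one of the two terms carry a second, nested coord(1/2) factor in their breakdowns, which is why their scores are halved once more.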

Salton, G.; Allan, J.; Singhal, A.: Automatic text decomposition and structuring (1996) 0.04

0.035360713 = product of:
  0.070721425 = sum of:
    0.070721425 = product of:
      0.14144285 = sum of:
        0.14144285 = weight(_text_:maps in 4067) [ClassicSimilarity], result of:
          0.14144285 = score(doc=4067,freq=2.0), product of:
            0.28477904 = queryWeight, product of:
              5.619245 = idf(docFreq=435, maxDocs=44218)
              0.050679237 = queryNorm
            0.4966758 = fieldWeight in 4067, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.619245 = idf(docFreq=435, maxDocs=44218)
              0.0625 = fieldNorm(doc=4067)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Sophisticated text similarity measurements are used to determine relationships between natural language texts and text excerpts. The resulting linked hypertext maps can be decomposed into text segments and text themes, and these decompositions can be used to identify different text types and text structures, leading to improved text access and utilization. Gives examples of text decomposition for expository and non-expository texts.

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03

0.027465349 = product of:
  0.054930698 = sum of:
    0.054930698 = product of:
      0.109861396 = sum of:
        0.109861396 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.109861396 = score(doc=402,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information processing and management. 22(1986) no.6, pp.465-476

Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02

0.02403218 = product of:
  0.04806436 = sum of:
    0.04806436 = product of:
      0.09612872 = sum of:
        0.09612872 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.09612872 = score(doc=6265,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information outlook. 9(2005) no.8, pp.22-23

Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.02
```
0.022100445 = product of:
  0.04420089 = sum of:
    0.04420089 = product of:
      0.08840178 = sum of:
        0.08840178 = weight(_text_:maps in 1842) [ClassicSimilarity], result of:
          0.08840178 = score(doc=1842,freq=2.0), product of:
            0.28477904 = queryWeight, product of:
              5.619245 = idf(docFreq=435, maxDocs=44218)
              0.050679237 = queryNorm
            0.31042236 = fieldWeight in 1842, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.619245 = idf(docFreq=435, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1842)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In Machine Learning applications concerned with the automatic clustering or classification of texts, often feature vectors are needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper is concluded by the presentation of some qualitative and quantitative results comparing statistical and morphological methods.

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.02

0.017165843 = product of:
  0.034331687 = sum of:
    0.034331687 = product of:
      0.06866337 = sum of:
        0.06866337 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.06866337 = score(doc=1952,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 16. 8.1998 12:51:22

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02

0.017165843 = product of:
  0.034331687 = sum of:
    0.034331687 = product of:
      0.06866337 = sum of:
        0.06866337 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.06866337 = score(doc=4157,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Ed. by Marlies Ockenfeld and Gerhard J. Mantwill

Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.02

0.017165843 = product of:
  0.034331687 = sum of:
    0.034331687 = product of:
      0.06866337 = sum of:
        0.06866337 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
          0.06866337 = score(doc=2759,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.38690117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 1. 2.2016 18:25:22

Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.01

0.0137326745 = product of:
  0.027465349 = sum of:
    0.027465349 = product of:
      0.054930698 = sum of:
        0.054930698 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
          0.054930698 = score(doc=4709,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.30952093 = fieldWeight in 4709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 31. 7.1996 9:22:19

Riloff, E.: An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01

0.0137326745 = product of:
  0.027465349 = sum of:
    0.027465349 = product of:
      0.054930698 = sum of:
        0.054930698 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
          0.054930698 = score(doc=6752,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.30952093 = fieldWeight in 6752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 6. 3.1997 16:22:15

Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.01

0.01201609 = product of:
  0.02403218 = sum of:
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
          0.04806436 = score(doc=5001,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.2708308 = fieldWeight in 5001, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 14. 3.1996 13:22:21

Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.01

0.01201609 = product of:
  0.02403218 = sum of:
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
          0.04806436 = score(doc=530,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.2708308 = fieldWeight in 530, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=530)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: International forum on information and documentation. 22(1997) no.1, pp.17-28

Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.01

0.01201609 = product of:
  0.02403218 = sum of:
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
          0.04806436 = score(doc=2673,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.2708308 = fieldWeight in 2673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 1. 8.1996 22:08:06

Newman, D.J.; Block, S.: Probabilistic topic decomposition of an eighteenth-century American newspaper (2006) 0.01

0.01201609 = product of:
  0.02403218 = sum of:
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = weight(_text_:22 in 5291) [ClassicSimilarity], result of:
          0.04806436 = score(doc=5291,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.2708308 = fieldWeight in 5291, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5291)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 7.2006 17:32:00

Ward, M.L.: The future of the human indexer (1996) 0.01

0.010299506 = product of:
  0.020599011 = sum of:
    0.020599011 = product of:
      0.041198023 = sum of:
        0.041198023 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
          0.041198023 = score(doc=7244,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.23214069 = fieldWeight in 7244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 9. 2.1997 18:44:22

Milstead, J.L.: Thesauri in a full-text world (1998) 0.01

0.008582922 = product of:
  0.017165843 = sum of:
    0.017165843 = product of:
      0.034331687 = sum of:
        0.034331687 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
          0.034331687 = score(doc=2337,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.19345059 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 9.1997 19:16:05

Martins, A.L.; Souza, R.R.; Ribeiro de Mello, H.: The use of noun phrases in information retrieval : proposing a mechanism for automatic classification (2014) 0.01

0.0068663373 = product of:
  0.0137326745 = sum of:
    0.0137326745 = product of:
      0.027465349 = sum of:
        0.027465349 = weight(_text_:22 in 1441) [ClassicSimilarity], result of:
          0.027465349 = score(doc=1441,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.15476047 = fieldWeight in 1441, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1441)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Mesquita, L.A.P.; Souza, R.R.; Baracho Porto, R.M.A.: Noun phrases in automatic indexing : a structural analysis of the distribution of relevant terms in doctoral theses (2014) 0.01

0.0068663373 = product of:
  0.0137326745 = sum of:
    0.0137326745 = product of:
      0.027465349 = sum of:
        0.027465349 = weight(_text_:22 in 1442) [ClassicSimilarity], result of:
          0.027465349 = score(doc=1442,freq=2.0), product of:
            0.17747006 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679237 = queryNorm
            0.15476047 = fieldWeight in 1442, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1442)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
