-
Plaunt, C.; Norgard, B.A.: An association-based method for automatic indexing with a controlled vocabulary (1998)
- Abstract
- In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and controlled vocabulary subject headings assigned to those records by human indexers, using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial match information retrieval problem. We consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document.
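The two stages the abstract describes can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the association measure is Dunning's log-likelihood ratio over a 2x2 term/heading contingency table, and the record structures are hypothetical.

```python
import math
from collections import Counter

def g2(k11, k12, k21, k22):
    # Dunning's log-likelihood ratio for a 2x2 contingency table:
    # 2 * (H(cells) - H(row sums) - H(column sums)), zero when independent.
    def h(*ks):
        n = sum(ks)
        return sum(k * math.log(k / n) for k in ks if k > 0)
    return 2 * (h(k11, k12, k21, k22)
                - h(k11 + k12, k21 + k22)
                - h(k11 + k21, k12 + k22))

def build_dictionary(records):
    # records: (lexical_terms, assigned_headings) pairs from a training
    # collection; a real system would also filter negatively associated pairs.
    n = len(records)
    term_df, head_df, pair_df = Counter(), Counter(), Counter()
    for terms, heads in records:
        for t in set(terms):
            term_df[t] += 1
        for s in set(heads):
            head_df[s] += 1
        for t in set(terms):
            for s in set(heads):
                pair_df[t, s] += 1
    assoc = {}
    for (t, s), k11 in pair_df.items():
        k12 = term_df[t] - k11        # docs with term, without heading
        k21 = head_df[s] - k11        # docs with heading, without term
        k22 = n - k11 - k12 - k21     # docs with neither
        assoc[t, s] = g2(k11, k12, k21, k22)
    return assoc

def predict(assoc, terms, top=5):
    # Deployment stage: rank headings by summed association with the
    # lexical clues found in a new document.
    scores = Counter()
    for (t, s), w in assoc.items():
        if t in terms:
            scores[s] += w
    return [s for s, _ in scores.most_common(top)]
```

A toy run trains on four records and assigns a heading to an unseen document containing the term "neural".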
- Date
- 11. 9.2000 19:53:22
-
Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019)
- Abstract
- Purpose: Modern mathematicians and scientists of math-related disciplines often use Document Preparation Systems (DPS) to write and Computer Algebra Systems (CAS) to calculate mathematical expressions. Usually, they translate the expressions manually between DPS and CAS. This process is time-consuming and error-prone. The purpose of this paper is to automate this translation. This paper uses Maple and Mathematica as the CAS, and LaTeX as the DPS. Design/methodology/approach: Bruce Miller at the National Institute of Standards and Technology (NIST) developed a collection of special LaTeX macros that create links from mathematical symbols to their definitions in the NIST Digital Library of Mathematical Functions (DLMF). The authors use these macros to perform rule-based translations between the formulae in the DLMF and CAS. Moreover, the authors develop software to ease the creation of new rules and to discover inconsistencies. Findings: The authors created 396 mappings and translated 58.8 percent of DLMF formulae (2,405 expressions) successfully between Maple and DLMF. For a significant percentage, the special function definitions in Maple and the DLMF differ: an atomic symbol in one system maps to a composite expression in the other system. The translator was also successfully used for automatic verification of mathematical online compendia and CAS. The evaluation techniques discovered two errors in the DLMF and one defect in Maple. Originality/value: This paper introduces the first translation tool for special functions between LaTeX and CAS. The approach improves error-prone manual translations and can be used to verify mathematical online compendia and CAS.
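The rule-based translation the abstract describes can be illustrated with a toy regex rewriter. The macro spellings and the rule table below are hypothetical simplifications in the spirit of the DLMF semantic macros and Maple names, not the actual 396 mappings from the paper.

```python
import re

# Hypothetical rule table: each rule maps a semantic-macro pattern
# (DLMF-style LaTeX) to a CAS (Maple-style) function call.
RULES = [
    (re.compile(r'\\BesselJ\{([^}]*)\}@\{([^}]*)\}'), r'BesselJ(\1, \2)'),
    (re.compile(r'\\EulerGamma@\{([^}]*)\}'), r'GAMMA(\1)'),
    (re.compile(r'\\sin@\{([^}]*)\}'), r'sin(\1)'),
]

def translate(expr):
    # Apply every rewrite rule to the input expression; unmatched
    # macros pass through unchanged, which is how a real translator
    # would surface untranslatable (e.g. semantically mismatched) symbols.
    for pat, repl in RULES:
        expr = pat.sub(repl, expr)
    return expr
```

For example, `\EulerGamma@{z}` rewrites to `GAMMA(z)`, while a macro with no rule is left intact for manual inspection.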
- Date
- 20. 1.2015 18:30:22
-
Salton, G.; Allan, J.; Singhal, A.: Automatic text decomposition and structuring (1996)
- Abstract
- Sophisticated text similarity measurements are used to determine relationships between natural language texts and text excerpts. The resulting linked hypertext maps can be decomposed into text segments and text themes, and these decompositions can be used to identify different text types and text structures, leading to improved text access and utilization. Gives examples of text decomposition for expository and non-expository texts.
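The similarity linking behind such hypertext maps can be sketched as a generic cosine-similarity map over bag-of-words vectors. This assumes a simple whitespace tokenizer and an arbitrary linking threshold; it is not Salton et al.'s actual system.

```python
import math
from collections import Counter

def vec(text):
    # Bag-of-words vector: term -> raw frequency.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def text_map(segments, threshold=0.2):
    # Link every pair of text segments whose similarity clears the
    # threshold; the links form the edges of a text relationship map.
    vs = [vec(s) for s in segments]
    return [(i, j) for i in range(len(vs)) for j in range(i + 1, len(vs))
            if cosine(vs[i], vs[j]) >= threshold]
```

Segments on the same theme end up linked; unrelated segments stay isolated, which is what makes theme and segment decomposition possible.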
-
Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986)
- Source
- Information processing and management. 22(1986) no.6, S.465-476
-
Fuhr, N.; Niewelt, B.: Ein Retrievaltest mit automatisch indexierten Dokumenten (1984)
- Date
- 20.10.2000 12:22:23
-
Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005)
- Source
- Information outlook. 9(2005) no.8, S.22-23
-
Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005)
- Abstract
- Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In machine learning applications concerned with the automatic clustering or classification of texts, feature vectors are often needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods is based on statistical, syntactical and especially morphological properties of (index) terms. The paper concludes with qualitative and quantitative results comparing statistical and morphological methods.
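The weighted feature vectors the abstract mentions can be illustrated with plain tf-idf weighting. This is a textbook sketch (whitespace tokenizer, raw term frequency, logarithmic inverse document frequency), not the specific statistical or morphological methods compared in the paper.

```python
import math
from collections import Counter

def tfidf_index_terms(docs, k=3):
    # For each document, pick the k index terms with the highest
    # tf-idf weight: tf(t, d) * log(N / df(t)).
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        for t in set(toks):
            df[t] += 1
    out = []
    for toks in tokenized:
        tf = Counter(toks)
        w = {t: tf[t] * math.log(n / df[t]) for t in tf}
        out.append(sorted(w, key=w.get, reverse=True)[:k])
    return out
```

Terms that occur in every document get weight zero and drop out, so each vector is a short, discriminative description of its document.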
-
Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986)
- Date
- 14. 6.2015 22:12:44
-
Hauer, M.: Automatische Indexierung (2000)
- Source
- Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt
-
Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986)
- Date
- 14. 6.2015 22:12:56
-
Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019)
- Source
- B.I.T.online. 22(2019) H.2, S.163-166
-
Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988)
- Date
- 16. 8.1998 12:51:22
-
Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998)
- Source
- Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
-
Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999)
- Date
- 1. 4.2002 10:22:41
-
Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016)
- Date
- 1. 2.2016 18:25:22
-
Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995)
- Date
- 31. 7.1996 9:22:19
-
Riloff, E.: An empirical study of automated dictionary construction for information extraction in three domains (1996)
- Date
- 6. 3.1997 16:22:15
-
Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006)
- Date
- 24. 3.2006 12:22:02
-
Probst, M.; Mittelbach, J.: Maschinelle Indexierung in der Sacherschließung wissenschaftlicher Bibliotheken (2006)
- Date
- 22. 3.2008 12:35:19
-
Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012)
- Date
- 11. 9.2012 19:43:22