-
Prasad, A.R.D.: PROMETHEUS: an automatic indexing system (1996)
0.01
0.0058254744 = coord(1/14) × 0.08155664 weight(_text_:representation in doc 5189): tf=6.0, idf=4.600994 (docFreq=1206, maxDocs=44218), fieldNorm=0.0625, queryNorm=0.025165197 (the same queryNorm applies to every entry below)
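The relevance figures in this listing follow Lucene's ClassicSimilarity (TF-IDF) scoring. A minimal Python sketch, using only the values reported for this entry, that reproduces the score shown above:

```python
import math

# Values reported for term "representation" in doc 5189 (see the score line above).
freq       = 6.0           # term frequency in the field
idf        = 4.600994      # inverse document frequency
query_norm = 0.025165197   # query normalisation factor
field_norm = 0.0625        # field length normalisation
coord      = 1.0 / 14.0    # 1 of 14 query clauses matched

tf = math.sqrt(freq)                   # 2.4494898
query_weight = idf * query_norm        # 0.11578492
field_weight = tf * idf * field_norm   # 0.7043805
score = coord * query_weight * field_weight

print(round(score, 10))                # ~0.0058254744, matching the entry above
```

The same recipe, with each entry's own tf, idf, fieldNorm and coord values, reproduces the scores shown for the entries below.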
- Abstract
- An automatic indexing system using the tools and techniques of artificial intelligence is described. The paper presents the various components of the system, such as the parser, grammar formalism, lexicon, and the frame-based knowledge representation used for semantic representation. The semantic representation is based on the Ranganathan school of thought, especially the Deep Structure of Subject Indexing Languages enunciated by Bhattacharyya. The various steps in indexing are demonstrated with an illustration.
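The abstract does not spell out the frame layout. As a rough illustration only, here is a toy frame assuming the elementary categories (Discipline, Entity, Action, Property) commonly associated with Bhattacharyya's Deep Structure work; the slot names and the example are assumptions, not the paper's design:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical frame layout; the paper's actual slots are not given in the abstract.
@dataclass
class SubjectFrame:
    discipline: str = ""
    entities:   List[str] = field(default_factory=list)
    actions:    List[str] = field(default_factory=list)
    properties: List[str] = field(default_factory=list)

    def index_string(self) -> str:
        parts = [self.discipline] + self.entities + self.actions + self.properties
        return " / ".join(p for p in parts if p)

frame = SubjectFrame(discipline="Library science",
                     entities=["periodicals"],
                     actions=["indexing"],
                     properties=["automatic"])
print(frame.index_string())   # Library science / periodicals / indexing / automatic
```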
-
Plaunt, C.; Norgard, B.A.: ¬An association-based method for automatic indexing with a controlled vocabulary (1998)
0.01
0.005015969 = coord(2/14) × [0.02942922 weight(_text_:representation in doc 1794): tf=2.0, idf=4.600994, fieldNorm=0.0390625 + coord(1/3) × 0.017047685 weight(_text_:22 in doc 1794): tf=2.0, idf=3.5018296, fieldNorm=0.0390625]
- Abstract
- In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and the controlled vocabulary subject headings assigned to those records by human indexers, using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial-match information retrieval problem. We consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document.
- Date
- 11. 9.2000 19:53:22
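A minimal sketch of the association-dictionary idea described in the abstract. The training pairs are invented, and a log-likelihood-ratio (G-squared) statistic over a 2x2 contingency table stands in for the paper's likelihood ratio measure; treat it as an illustration, not the authors' implementation:

```python
import math
from collections import Counter, defaultdict

def g2(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table."""
    total = k11 + k12 + k21 + k22
    score = 0.0
    for obs, row, col in [(k11, k11 + k12, k11 + k21),
                          (k12, k11 + k12, k12 + k22),
                          (k21, k21 + k22, k11 + k21),
                          (k22, k21 + k22, k12 + k22)]:
        if obs > 0:
            expected = row * col / total
            score += obs * math.log(obs / expected)
    return 2.0 * score

# Toy training data: (document words, human-assigned subject headings).
train = [
    ({"neural", "network", "training"},  {"neural networks"}),
    ({"neural", "classifier", "images"}, {"neural networks", "image processing"}),
    ({"index", "thesaurus", "terms"},    {"subject indexing"}),
    ({"index", "automatic", "terms"},    {"subject indexing"}),
]

term_df, head_df, pair_df = Counter(), Counter(), Counter()
for words, headings in train:
    term_df.update(words)
    head_df.update(headings)
    pair_df.update((w, h) for w in words for h in headings)

n_docs = len(train)
assoc = defaultdict(dict)            # term -> heading -> association strength
for (w, h), k11 in pair_df.items():
    k12 = term_df[w] - k11           # term occurs without the heading
    k21 = head_df[h] - k11           # heading occurs without the term
    k22 = n_docs - k11 - k12 - k21
    assoc[w][h] = g2(k11, k12, k21, k22)

# Deployment stage: score candidate headings for an unseen document.
new_doc = {"automatic", "index", "evaluation"}
scores = Counter()
for w in new_doc:
    for h, s in assoc.get(w, {}).items():
        scores[h] += s
print(scores.most_common(2))         # 'subject indexing' should rank first
```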
-
Paijmans, H.: Comparing the document representation of two IR-systems : CLARIT and TOPIC (1993)
0.00
0.00475648 = coord(1/14) × 0.06659072 weight(_text_:representation in doc 6503): tf=4.0, idf=4.600994, fieldNorm=0.0625
- Abstract
- Discusses the TOPIC and CLARIT information retrieval systems in terms of assigned versus derived and precoordinate versus postcoordinate indexing. Compares the document representation of the two systems. Reports on a test done on a small sample of Wall Street Journal articles. The positive results found for CLARIT in an earlier test on medical documents were not observed in this more general database.
-
Chowdhury, G.G.: Natural language processing and information retrieval : pt.1: basic issues; pt.2: major applications (1991)
0.00
0.0042041745 = coord(1/14) × 0.05885844 weight(_text_:representation in doc 3313): tf=2.0, idf=4.600994, fieldNorm=0.078125
- Abstract
- Reviews the basic issues and procedures involved in natural language processing of textual material for final use in information retrieval. Covers: natural language processing; natural language understanding; syntactic and semantic analysis; parsing; knowledge bases and knowledge representation
-
Lu, K.; Mao, J.; Li, G.: Toward effective automated weighted subject indexing : a comparison of different approaches in different environments (2018)
0.00
0.0042041745 = coord(1/14) × 0.05885844 weight(_text_:representation in doc 4292): tf=8.0, idf=4.600994, fieldNorm=0.0390625
- Abstract
- Subject indexing plays an important role in supporting subject access to information resources. Current subject indexing systems do not make adequate distinctions on the importance of assigned subject descriptors. Assigning numeric weights to subject descriptors to distinguish their importance to the documents can strengthen the role of subject metadata. Automated methods are more cost-effective. This study compares different automated weighting methods in different environments. Two evaluation methods were used to assess the performance. Experiments on three datasets in the biomedical domain suggest the performance of different weighting methods depends on whether it is an abstract or full text environment. Mutual information with bag-of-words representation shows the best average performance in the full text environment, while cosine with bag-of-words representation is the best in an abstract environment. The cosine measure has relatively consistent and robust performance. A direct weighting method, IDF (Inverse Document Frequency), can produce quick and reasonable estimates of the weights. Bag-of-words representation generally outperforms the concept-based representation. Further improvement in performance can be obtained by using the learning-to-rank method to integrate different weighting methods. This study follows up Lu and Mao (Journal of the Association for Information Science and Technology, 66, 1776-1784, 2015), in which an automated weighted subject indexing method was proposed and validated. The findings from this study contribute to more effective weighted subject indexing.
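A toy sketch of two of the weighting ideas compared in the abstract: IDF as a direct, quick weight and cosine similarity over bag-of-words. The documents and the mapping from descriptors to word profiles are assumptions made only for illustration:

```python
import math
from collections import Counter

docs = {
    "d1": "gene expression in tumour cells and gene regulation",
    "d2": "protein folding and structure prediction",
    "d3": "tumour suppressor gene mutations in cancer cells",
}
tokenised = {d: Counter(text.split()) for d, text in docs.items()}

def idf(term):
    df = sum(1 for bag in tokenised.values() if term in bag)
    return math.log(len(tokenised) / df) if df else 0.0

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

# Descriptor profiles as bags of words (hypothetical, e.g. built from descriptor labels).
descriptors = {"gene expression": Counter("gene expression regulation".split()),
               "neoplasms":       Counter("tumour cancer cells".split())}

doc = "d3"
for label, profile in descriptors.items():
    direct_idf = sum(idf(t) for t in profile if t in tokenised[doc])   # IDF as a direct weight
    cos = cosine(profile, tokenised[doc])                              # cosine with bag-of-words
    print(f"{label:15s}  IDF-weight={direct_idf:.3f}  cosine={cos:.3f}")
```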
-
Zhang, Y.; Zhang, C.; Li, J.: Joint modeling of characters, words, and conversation contexts for microblog keyphrase extraction (2020)
0.00
0.0042041745 = coord(1/14) × 0.05885844 weight(_text_:representation in doc 5816): tf=8.0, idf=4.600994, fieldNorm=0.0390625
- Abstract
- Millions of messages are produced on microblog platforms every day, leading to the pressing need for automatic identification of key points from the massive texts. To absorb salient content from the vast bulk of microblog posts, this article focuses on the task of microblog keyphrase extraction. In previous work, most efforts treat messages as independent documents and might suffer from the data sparsity problem exhibited in short and informal microblog posts. In contrast, we propose to enrich contexts via exploiting conversations initialized by target posts and formed by their replies, which are generally centered around topics relevant to the target posts and therefore helpful for keyphrase identification. Concretely, we present a neural keyphrase extraction framework, which has 2 modules: a conversation context encoder and a keyphrase tagger. The conversation context encoder captures indicative representation from the conversation contexts and feeds the representation into the keyphrase tagger, and the keyphrase tagger extracts salient words from target posts. The 2 modules were trained jointly to optimize the conversation context encoding and keyphrase extraction processes. In the conversation context encoder, we leverage hierarchical structures to capture the word-level indicative representation and message-level indicative representation hierarchically. In both of the modules, we apply character-level representations, which enable the model to explore morphological features and deal with the out-of-vocabulary problem caused by the informal language style of microblog messages. Extensive comparison results on real-life data sets indicate that our model outperforms state-of-the-art models from previous studies.
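The paper's model is a jointly trained neural encoder and tagger; the following toy sketch only illustrates the underlying idea of letting the conversation context vote for words of the target post, using a plain frequency heuristic instead of the neural architecture described in the abstract:

```python
from collections import Counter

# Toy stand-in for enriching a target post with its conversation context before
# extracting keyphrases; the paper uses a neural conversation encoder and a
# jointly trained keyphrase tagger, not this frequency heuristic.
target_post = "new keyphrase extraction model for microblogs"
replies = [
    "does the model handle hashtags and informal spelling",
    "keyphrase extraction on short posts is hard without context",
    "we tried a similar model on weibo posts",
]

context_freq = Counter(" ".join([target_post] + replies).split())
stopwords = {"a", "the", "for", "on", "and", "is", "we", "does"}

candidates = [w for w in target_post.split() if w not in stopwords]
ranked = sorted(candidates, key=lambda w: context_freq[w], reverse=True)
print(ranked[:3])   # words of the target post that the conversation keeps echoing
```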
-
Bonzi, S.: Representation of concepts in text : a comparison of within-document frequency, anaphora, and synonymy (1991)
0.00
0.00416192 = coord(1/14) × 0.058266878 weight(_text_:representation in doc 4933): tf=4.0, idf=4.600994, fieldNorm=0.0546875
- Abstract
- Investigates the 3 major ways by which a concept may be represented in text: within-document frequency, anaphoric reference, and synonyms, in order to determine which provides the optimal means of representation. Analyses a sample of 60 abstracts drawn at random from the abstracting journals of 4 disciplines. Results show that, in general, initial within-document frequency is higher for keyword terms. Additionally, the frequency of keyword terms referenced anaphorically or with intellectually related terms is higher than that of other keyword terms. It appears that document length influences both the number and the impact of anaphoric resolutions and intellectually related terms.
-
Liu, G.Z.: Semantic vector space model : implementation and evaluation (1997)
0.00
0.0035673599 = coord(1/14) × 0.049943037 weight(_text_:representation in doc 161): tf=4.0, idf=4.600994, fieldNorm=0.046875
- Abstract
- Presents the Semantic Vector Space Model (SVSM), a text representation and searching technique based on the combination of the Vector Space Model (VSM) with heuristic syntax parsing and distributed representation of semantic case structures. Both documents and queries are represented as semantic matrices. A search mechanism is designed to compute the similarity between 2 semantic matrices to predict relevancy. A prototype system was built to implement this model by modifying the SMART system and using the Xerox Part-of-Speech tagger as the pre-processor for indexing. The prototype system was used in an experimental study to evaluate this technique in terms of precision, recall, and effectiveness of relevance ranking. Results show that if documents and queries were too short, the technique was less effective than VSM. But with longer documents and queries, especially when original documents were used as queries, the system based on this technique was found to perform better than SMART.
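A minimal sketch of the semantic-matrix idea, assuming a small set of case roles and a Frobenius-style cosine as the similarity between two matrices; the actual role inventory and similarity function of SVSM are not given in the abstract:

```python
import math

# Toy semantic matrices: rows are case roles, columns are index terms.
# The role set and the matrix similarity below are assumptions for illustration.
ROLES = ["agent", "action", "object"]
TERMS = ["system", "retrieve", "document", "image"]

def cell(matrix, role, term):
    return matrix.get((role, term), 0.0)

def matrix_cosine(a, b):
    dot = sum(cell(a, r, t) * cell(b, r, t) for r in ROLES for t in TERMS)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

doc   = {("agent", "system"): 1.0, ("action", "retrieve"): 1.0, ("object", "document"): 2.0}
query = {("action", "retrieve"): 1.0, ("object", "document"): 1.0}

print(round(matrix_cosine(doc, query), 3))   # higher when roles *and* terms agree
```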
-
Lassalle, E.: Text retrieval : from a monolingual system to a multilingual system (1993)
0.00
0.0029429218 = coord(1/14) × 0.041200902 weight(_text_:representation in doc 7403): tf=2.0, idf=4.600994, fieldNorm=0.0546875
- Abstract
- Describes the TELMI monolingual text retrieval system and its future extension, a multilingual system. TELMI is designed for medium-sized databases containing short texts. The characteristics of the system are fine-grained natural language processing (NLP); an open domain and a large-scale knowledge base; automated indexing based on conceptual representation of texts; and reusability of the NLP tools. Discusses the French MINITEL service, the MGS information service and the TELMI research system, covering the full-text system; the NLP architecture; the lexical level; the syntactic level; the semantic level; and an example of the use of a generic system.
-
Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.: ¬A picture is worth a thousand (coherent) words : building a natural description of images (2014)
0.00
0.0025486453 = coord(1/14) × 0.03568103 weight(_text_:representation in doc 1874): tf=6.0, idf=4.600994, fieldNorm=0.02734375
- Content
- "People can summarize a complex scene in a few words without thinking twice. It's much more difficult for computers. But we've just gotten a bit closer -- we've developed a machine-learning system that can automatically produce captions (like the three above) to accurately describe images the first time it sees them. This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images. Recent research has greatly improved object detection, classification, and labeling. But accurately describing a complex scene requires a deeper representation of what's going on in the scene, capturing how the various objects relate to one another and translating it all into natural-sounding language. Many efforts to construct computer-generated natural descriptions of images propose combining current state-of-the-art techniques in both computer vision and natural language processing to form a complete image description approach. But what if we instead merged recent computer vision and language models into a single jointly trained system, taking an image and directly producing a human readable sequence of words to describe it? This idea comes from recent advances in machine translation between languages, where a Recurrent Neural Network (RNN) transforms, say, a French sentence into a vector representation, and a second RNN uses that vector representation to generate a target sentence in German. Now, what if we replaced that first RNN and its input words with a deep Convolutional Neural Network (CNN) trained to classify objects in images? Normally, the CNN's last layer is used in a final Softmax among known classes of objects, assigning a probability that each object might be in the image. But if we remove that final layer, we can instead feed the CNN's rich encoding of the image into a RNN designed to produce phrases. We can then train the whole system directly on images and their captions, so it maximizes the likelihood that descriptions it produces best match the training descriptions for each image.
-
Souza, R.R.; Raghavan, K.S.: ¬A methodology for noun phrase-based automatic indexing (2006)
0.00
0.0025225044 = coord(1/14) × 0.03531506 weight(_text_:representation in doc 173): tf=2.0, idf=4.600994, fieldNorm=0.046875
- Abstract
- The scholarly community is increasingly employing the Web both for publication of scholarly output and for locating and accessing relevant scholarly literature. Organization of this vast body of digital information assumes significance in this context. The sheer volume of digital information to be handled makes traditional indexing and knowledge representation strategies ineffective and impractical. It is, therefore, worth exploring new approaches. An approach being discussed considers the intrinsic semantics of texts of documents. Based on the hypothesis that noun phrases in a text are semantically rich in terms of their ability to represent the subject content of the document, this approach seeks to identify and extract noun phrases instead of single keywords, and use them as descriptors. This paper presents a methodology that has been developed for extracting noun phrases from Portuguese texts. The results of an experiment carried out to test the adequacy of the methodology are also presented.
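A minimal sketch of noun-phrase extraction over already POS-tagged tokens, collecting adjective-noun runs; the paper's methodology targets Portuguese with its own tagger and grammar, both of which are assumed away here:

```python
# Toy tagged input; in practice the tags would come from a POS tagger.
tagged = [("automatic", "ADJ"), ("subject", "NOUN"), ("indexing", "NOUN"),
          ("of", "ADP"), ("digital", "ADJ"), ("documents", "NOUN"),
          ("improves", "VERB"), ("retrieval", "NOUN")]

def flush(phrases, current):
    # keep the run only if it actually ends in a noun
    while current and current[-1][1] != "NOUN":
        current.pop()
    if current and any(tag == "NOUN" for _, tag in current):
        phrases.append(" ".join(w for w, _ in current))
    return []

def noun_phrases(tokens):
    """Collect maximal runs of adjectives followed by nouns (a common NP pattern)."""
    phrases, current = [], []
    for word, tag in tokens:
        if tag in ("ADJ", "NOUN"):
            current.append((word, tag))
        else:
            current = flush(phrases, current)
    flush(phrases, current)
    return phrases

print(noun_phrases(tagged))
# ['automatic subject indexing', 'digital documents', 'retrieval']
```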
-
Snajder, J.; Dalbelo Basic, B.D.; Tadic, M.: Automatic acquisition of inflectional lexica for morphological normalisation (2008)
0.00
0.0025225044 = coord(1/14) × 0.03531506 weight(_text_:representation in doc 2910): tf=2.0, idf=4.600994, fieldNorm=0.046875
- Abstract
- Due to natural language morphology, words can take on various morphological forms. Morphological normalisation - often used in information retrieval and text mining systems - conflates morphological variants of a word to a single representative form. In this paper, we describe an approach to lexicon-based inflectional normalisation. This approach is in between stemming and lemmatisation, and is suitable for morphological normalisation of inflectionally complex languages. To eliminate the immense effort required to compile the lexicon by hand, we focus on the problem of acquiring automatically an inflectional morphological lexicon from raw corpora. We propose a convenient and highly expressive morphology representation formalism on which the acquisition procedure is based. Our approach is applied to the morphologically complex Croatian language, but it should be equally applicable to other languages of similar morphological complexity. Experimental results show that our approach can be used to acquire a lexicon whose linguistic quality allows for rather good normalisation performance.
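A minimal sketch of lexicon-based inflectional normalisation: every surface form listed in the lexicon is mapped to one representative form, and unknown tokens are left untouched. The toy paradigms below are hand-written; the paper acquires such lexica automatically from raw corpora:

```python
# Toy inflectional paradigms (Croatian examples); the paper builds these automatically.
paradigms = {
    "knjiga": ["knjiga", "knjige", "knjigu", "knjigom", "knjigama"],   # 'book'
    "grad":   ["grad", "grada", "gradu", "gradovi", "gradovima"],      # 'city'
}

# Inflectional lexicon: surface form -> representative form.
lexicon = {form: lemma for lemma, forms in paradigms.items() for form in forms}

def normalise(token):
    # in between stemming and lemmatisation: known forms are conflated,
    # unknown tokens simply fall back to themselves
    return lexicon.get(token, token)

text = "gradovi knjige biblioteka".split()
print([normalise(t) for t in text])   # ['grad', 'knjiga', 'biblioteka']
```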
-
Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997)
0.00
0.002283341 = coord(1/14) × coord(2/3) × [0.024083402 weight(_text_:29 in doc 2673): tf=2.0, idf=3.5176873, fieldNorm=0.0546875 + 0.023866756 weight(_text_:22 in doc 2673): tf=2.0, idf=3.5018296, fieldNorm=0.0546875]
- Date
- 1. 8.1996 22:08:06
- Source
- Computer networks and ISDN systems. 29(1997) no.8, S.1147-1156
-
Franke-Maier, M.: Anforderungen an die Qualität der Inhaltserschließung im Spannungsfeld von intellektuell und automatisch erzeugten Metadaten (2018)
0.00
0.002283341 = coord(1/14) × coord(2/3) × [0.024083402 weight(_text_:29 in doc 5344): tf=2.0, idf=3.5176873, fieldNorm=0.0546875 + 0.023866756 weight(_text_:22 in doc 5344): tf=2.0, idf=3.5018296, fieldNorm=0.0546875]
- Abstract
- Since the Deutscher Bibliothekartag 2018 at the latest, the discussion about the automatic subject indexing procedures of the Deutsche Nationalbibliothek has turned from a politically driven debate into a debate about quality. The following contribution deals with questions of the quality of subject indexing in the digital age, in which heterogeneous products of different methods meet, and attempts to define important requirements for quality. This conference paper summarises the ideas presented by the author as impulses at the workshop of the FAG "Erschließung und Informationsvermittlung" of the GBV on 29 August 2018 in Kiel. The workshop took place within the framework of the 22nd Verbundkonferenz of the GBV.
-
Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints
0.00
0.0021020873 = coord(1/14) × 0.02942922 weight(_text_:representation in doc 4309): tf=2.0, idf=4.600994, fieldNorm=0.0390625
- Abstract
- Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to the relevance of particular subjects, disregarding indicators of insufficient content representation at the document-level. Therefore, we propose a novel approach that detects documents rather than concepts where quality criteria are met. Our approach uses a deep, multi-layered regression architecture, which comprises a variety of content-based indicators. We evaluated multiple configurations using text collections from law and economics, where the available content is restricted to very short texts. Notably, we demonstrate that the proposed quality estimation technique can determine subsets of the previously unseen data where considerable gains in document-level recall can be achieved, while upholding precision at the same time. Hence, the approach effectively performs a filtering that ensures high data quality standards in operative information retrieval systems.
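A minimal sketch of the document-level filtering idea: given a quality score per document and gold judgements on a held-out set, choose the lowest acceptance threshold that still meets a precision target, so that as many documents as possible are kept. The scores, labels and target below are invented for illustration, not taken from the paper:

```python
# Held-out documents: (quality_score, indexing_was_correct).
validation = [
    (0.95, True), (0.91, True), (0.88, True), (0.82, False),
    (0.77, True), (0.70, False), (0.64, True), (0.55, False),
]

def choose_threshold(items, precision_target=0.8):
    # try thresholds from lowest to highest: the first one that satisfies the
    # precision constraint keeps the largest accepted subset (maximal coverage);
    # returns None if no threshold meets the target
    for threshold, _ in sorted(items):
        accepted = [ok for score, ok in items if score >= threshold]
        precision = sum(accepted) / len(accepted)
        if precision >= precision_target:
            return threshold, precision, len(accepted)
    return None

threshold, precision, n_accepted = choose_threshold(validation)
print(f"accept documents with score >= {threshold}: "
      f"precision={precision:.2f} on {n_accepted} of {len(validation)} documents")
```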
-
Kuhlen, R.: Morphologische Relationen durch Reduktionsalgorithmen (1974)
0.00
0.0016218609 = coord(1/14) × coord(1/3) × 0.068118155 weight(_text_:29 in doc 4251): tf=4.0, idf=3.5176873, fieldNorm=0.109375
- Date
- 29. 1.2011 14:56:29
-
Panyr, J.: STEINADLER: ein Verfahren zur automatischen Deskribierung und zur automatischen thematischen Klassifikation (1978)
0.00
0.0013106616 = coord(1/14) × coord(1/3) × 0.055047777 weight(_text_:29 in doc 5169): tf=2.0, idf=3.5176873, fieldNorm=0.125
- Source
- Nachrichten für Dokumentation. 29(1978), S.92-96
-
Salton, G.; Yang, C.S.: On the specification of term values in automatic indexing (1973)
0.00
0.0013106616 = coord(1/14) × coord(1/3) × 0.055047777 weight(_text_:29 in doc 5476): tf=2.0, idf=3.5176873, fieldNorm=0.125
- Source
- Journal of documentation. 29(1973), S.351-372
-
Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986)
0.00
0.0012988712 = coord(1/14) × coord(1/3) × 0.05455259 weight(_text_:22 in doc 402): tf=2.0, idf=3.5018296, fieldNorm=0.125
- Source
- Information processing and management. 22(1986) no.6, S.465-476
-
Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984)
0.00
0.0011365123 = coord(1/14) × coord(1/3) × 0.04773351 weight(_text_:22 in doc 262): tf=2.0, idf=3.5018296, fieldNorm=0.109375
- Date
- 20.10.2000 12:22:23