Search (40 results, page 1 of 2)

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02

0.017099358 = product of:
  0.05129807 = sum of:
    0.05129807 = product of:
      0.10259614 = sum of:
        0.10259614 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.10259614 = score(doc=402,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information processing and management. 22(1986) no.6, S.465-476
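
The score tree for this first hit can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are hypothetical, and all input values are copied from the explain output for doc 402.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming ClassicSimilarity's formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: the square root of the raw frequency."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recombine the factors in the same order as the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                       # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm    # fieldWeight
    score = query_weight * field_weight                   # weight(_text_:22)
    for coord in coords:                                  # coord(1/2), coord(1/3)
        score *= coord
    return score


# Values taken from the explain output for doc 402 (term "22").
score = explain_score(
    freq=2.0,
    doc_freq=3622,
    max_docs=44218,
    query_norm=0.047327764,
    field_norm=0.125,
    coords=(0.5, 1.0 / 3.0),
)
print(f"{score:.6f}")
```

The result agrees with the listed 0.017099358 up to floating-point rounding (Lucene computes in 32-bit floats). Among the hits on the term "22", only fieldNorm changes from one document to the next, which is why their scores fall in a few discrete steps.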

Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.01

0.014961937 = product of:
  0.04488581 = sum of:
    0.04488581 = product of:
      0.08977162 = sum of:
        0.08977162 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
          0.08977162 = score(doc=262,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.5416616 = fieldWeight in 262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=262)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 20.10.2000 12:22:23

Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.01

0.014961937 = product of:
  0.04488581 = sum of:
    0.04488581 = product of:
      0.08977162 = sum of:
        0.08977162 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.08977162 = score(doc=6265,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information outlook. 9(2005) no.8, S.22-23

Mars, N.J.I.: ¬The management of scientific information, or, how to cope with the flood (1996) 0.01

0.014937594 = product of:
  0.04481278 = sum of:
    0.04481278 = product of:
      0.08962556 = sum of:
        0.08962556 = weight(_text_:group in 7414) [ClassicSimilarity], result of:
          0.08962556 = score(doc=7414,freq=2.0), product of:
            0.21906674 = queryWeight, product of:
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.047327764 = queryNorm
            0.40912446 = fieldWeight in 7414, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.0625 = fieldNorm(doc=7414)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Research in the Knowledge-Based Systems Group of the University of Twente in the Netherlands is aimed at reducing information overload. One approach is to support indexing by the traditional method of assigning content descriptions to find documents. A second way is to use a computer program to determine what the document says without descriptors. Discusses automated indexing and direct access to information

Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.01

0.012824517 = product of:
  0.03847355 = sum of:
    0.03847355 = product of:
      0.0769471 = sum of:
        0.0769471 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
          0.0769471 = score(doc=58,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.46428138 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 14. 6.2015 22:12:44

Hauer, M.: Automatische Indexierung (2000) 0.01

0.012824517 = product of:
  0.03847355 = sum of:
    0.03847355 = product of:
      0.0769471 = sum of:
        0.0769471 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
          0.0769471 = score(doc=5887,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.46428138 = fieldWeight in 5887, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5887)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt

Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.01

0.012824517 = product of:
  0.03847355 = sum of:
    0.03847355 = product of:
      0.0769471 = sum of:
        0.0769471 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
          0.0769471 = score(doc=2051,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.46428138 = fieldWeight in 2051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=2051)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 14. 6.2015 22:12:56

Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019) 0.01

0.012824517 = product of:
  0.03847355 = sum of:
    0.03847355 = product of:
      0.0769471 = sum of:
        0.0769471 = weight(_text_:22 in 5629) [ClassicSimilarity], result of:
          0.0769471 = score(doc=5629,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.46428138 = fieldWeight in 5629, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5629)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: B.I.T.online. 22(2019) H.2, S.163-166

Medelyan, O.; Witten, I.H.: Domain-independent automatic keyphrase indexing with small training sets (2008) 0.01
```
0.011203195 = product of:
  0.033609584 = sum of:
    0.033609584 = product of:
      0.06721917 = sum of:
        0.06721917 = weight(_text_:group in 1871) [ClassicSimilarity], result of:
          0.06721917 = score(doc=1871,freq=2.0), product of:
            0.21906674 = queryWeight, product of:
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.047327764 = queryNorm
            0.30684334 = fieldWeight in 1871, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.046875 = fieldNorm(doc=1871)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract: Keyphrases are widely used in both physical and digital libraries as a brief, but precise, summary of documents. They help organize material based on content, provide thematic access, represent search results, and assist with navigation. Manual assignment is expensive because trained human indexers must reach an understanding of the document and select appropriate descriptors according to defined cataloging rules. We propose a new method that enhances automatic keyphrase extraction by using semantic information about terms and phrases gleaned from a domain-specific thesaurus. The key advantage of the new approach is that it performs well with very little training data. We evaluate it on a large set of manually indexed documents in the domain of agriculture, compare its consistency with a group of six professional indexers, and explore its performance on smaller collections of documents in other domains and of French and Spanish documents.

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: ¬The automatic indexing system AIR/PHYS : from research to application (1988) 0.01

0.010687098 = product of:
  0.032061294 = sum of:
    0.032061294 = product of:
      0.06412259 = sum of:
        0.06412259 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.06412259 = score(doc=1952,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 16. 8.1998 12:51:22

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.01

0.010687098 = product of:
  0.032061294 = sum of:
    0.032061294 = product of:
      0.06412259 = sum of:
        0.06412259 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.06412259 = score(doc=4157,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill

Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.01

0.010687098 = product of:
  0.032061294 = sum of:
    0.032061294 = product of:
      0.06412259 = sum of:
        0.06412259 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
          0.06412259 = score(doc=374,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.38690117 = fieldWeight in 374, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=374)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 1. 4.2002 10:22:41

Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.01

0.010687098 = product of:
  0.032061294 = sum of:
    0.032061294 = product of:
      0.06412259 = sum of:
        0.06412259 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
          0.06412259 = score(doc=2759,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.38690117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 1. 2.2016 18:25:22

Ahlgren, P.; Kekäläinen, J.: Indexing strategies for Swedish full text retrieval under different user scenarios (2007) 0.01
```
0.009335997 = product of:
  0.028007988 = sum of:
    0.028007988 = product of:
      0.056015976 = sum of:
        0.056015976 = weight(_text_:group in 896) [ClassicSimilarity], result of:
          0.056015976 = score(doc=896,freq=2.0), product of:
            0.21906674 = queryWeight, product of:
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.047327764 = queryNorm
            0.2557028 = fieldWeight in 896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.0390625 = fieldNorm(doc=896)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract: This paper deals with Swedish full text retrieval and the problem of morphological variation of query terms in the document database. The effects of combination of indexing strategies with query terms on retrieval effectiveness were studied. Three of five tested combinations involved indexing strategies that used conflation, in the form of normalization. Further, two of these three combinations used indexing strategies that employed compound splitting. Normalization and compound splitting were performed by SWETWOL, a morphological analyzer for the Swedish language. A fourth combination attempted to group related terms by right hand truncation of query terms. The four combinations were compared to each other and to a baseline combination, where no attempt was made to counteract the problem of morphological variation of query terms in the document database. The five combinations were evaluated under six different user scenarios, where each scenario simulated a certain user type. The four alternative combinations outperformed the baseline, for each user scenario. The truncation combination had the best performance under each user scenario. The main conclusion of the paper is that normalization and right hand truncation (performed by a search expert) enhanced retrieval effectiveness in comparison to the baseline. The performance of the three combinations of indexing strategies with query terms based on normalization was not far below the performance of the truncation combination.

Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.01
```
0.009335997 = product of:
  0.028007988 = sum of:
    0.028007988 = product of:
      0.056015976 = sum of:
        0.056015976 = weight(_text_:group in 3627) [ClassicSimilarity], result of:
          0.056015976 = score(doc=3627,freq=2.0), product of:
            0.21906674 = queryWeight, product of:
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.047327764 = queryNorm
            0.2557028 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract: A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).

Golub, K.: Automatic subject indexing of text (2019) 0.01
```
0.009335997 = product of:
  0.028007988 = sum of:
    0.028007988 = product of:
      0.056015976 = sum of:
        0.056015976 = weight(_text_:group in 5268) [ClassicSimilarity], result of:
          0.056015976 = score(doc=5268,freq=2.0), product of:
            0.21906674 = queryWeight, product of:
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.047327764 = queryNorm
            0.2557028 = fieldWeight in 5268, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.628715 = idf(docFreq=1173, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5268)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract: Automatic subject indexing addresses problems of scale and sustainability and can be at the same time used to enrich existing metadata records, establish more connections across and between resources from various metadata and resource collections, and enhance consistency of the metadata. In this work, automatic subject indexing focuses on assigning index terms or classes from established knowledge organization systems (KOSs) for subject indexing like thesauri, subject headings systems and classification systems. The following major approaches are discussed, in terms of their similarities and differences, advantages and disadvantages for automatic assigned indexing from KOSs: "text categorization," "document clustering," and "document classification." Text categorization is perhaps the most widespread, machine-learning approach with what seems generally good reported performance. Document clustering automatically both creates groups of related documents and extracts names of subjects depicting the group at hand. Document classification reuses the intellectual effort invested into creating a KOS for subject indexing and even simple string-matching algorithms have been reported to achieve good results, because one concept can be described using a number of different terms, including equivalent, related, narrower and broader terms. Finally, applicability of automatic subject indexing to operative information systems and challenges of evaluation are outlined, suggesting the need for more research.

Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.01

0.008549679 = product of:
  0.025649035 = sum of:
    0.025649035 = product of:
      0.05129807 = sum of:
        0.05129807 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
          0.05129807 = score(doc=4709,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.30952093 = fieldWeight in 4709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 31. 7.1996 9:22:19

Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01

0.008549679 = product of:
  0.025649035 = sum of:
    0.025649035 = product of:
      0.05129807 = sum of:
        0.05129807 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
          0.05129807 = score(doc=6752,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.30952093 = fieldWeight in 6752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 6. 3.1997 16:22:15

Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.01

0.008549679 = product of:
  0.025649035 = sum of:
    0.025649035 = product of:
      0.05129807 = sum of:
        0.05129807 = weight(_text_:22 in 3581) [ClassicSimilarity], result of:
          0.05129807 = score(doc=3581,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.30952093 = fieldWeight in 3581, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3581)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 24. 3.2006 12:22:02

Probst, M.; Mittelbach, J.: Maschinelle Indexierung in der Sacherschließung wissenschaftlicher Bibliotheken (2006) 0.01

0.008549679 = product of:
  0.025649035 = sum of:
    0.025649035 = product of:
      0.05129807 = sum of:
        0.05129807 = weight(_text_:22 in 1755) [ClassicSimilarity], result of:
          0.05129807 = score(doc=1755,freq=2.0), product of:
            0.16573377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047327764 = queryNorm
            0.30952093 = fieldWeight in 1755, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1755)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 22. 3.2008 12:35:19
