Search (207 results, page 1 of 11)

  • theme_ss:"Automatisches Indexieren"
  1. Milstead, J.L.: Thesauri in a full-text world (1998) 0.07
    0.07340328 = product of:
      0.12845573 = sum of:
        0.020927707 = weight(_text_:systems in 2337) [ClassicSimilarity], result of:
          0.020927707 = score(doc=2337,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.1697705 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.0108718425 = product of:
          0.021743685 = sum of:
            0.021743685 = weight(_text_:science in 2337) [ClassicSimilarity], result of:
              0.021743685 = score(doc=2337,freq=4.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.20579056 = fieldWeight in 2337, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
          0.5 = coord(1/2)
        0.0265347 = weight(_text_:library in 2337) [ClassicSimilarity], result of:
          0.0265347 = score(doc=2337,freq=6.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.25158736 = fieldWeight in 2337, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.07012148 = sum of:
          0.042948496 = weight(_text_:applications in 2337) [ClassicSimilarity], result of:
            0.042948496 = score(doc=2337,freq=2.0), product of:
              0.17659263 = queryWeight, product of:
                4.4025097 = idf(docFreq=1471, maxDocs=44218)
                0.04011181 = queryNorm
              0.2432066 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4025097 = idf(docFreq=1471, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.027172983 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
            0.027172983 = score(doc=2337,freq=2.0), product of:
              0.14046472 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04011181 = queryNorm
              0.19345059 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
      0.5714286 = coord(4/7)
    
    Abstract
    Despite early claims to the contrary, thesauri continue to find use as access tools for information in the full-text environment. Their mode of use is changing, but this change actually represents an expansion rather than a contradiction of their utility. Thesauri and similar vocabulary tools can complement full-text access by aiding users in focusing their searches, by supplementing the linguistic analysis of the text search engine, and even by serving as one of the tools used by the linguistic engine for its analysis. While human indexing continues to be used for many databases, the trend is to increase the use of machine aids for this purpose. All machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection. In the 21st century, the balance of effort between human and machine will change at both input and output, but thesauri will continue to play an important role for the foreseeable future.
    Date
    22. 9.1997 19:16:05
    Imprint
    Urbana-Champaign, IL : Illinois University at Urbana-Champaign, Graduate School of Library and Information Science
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
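Milstead's observation that machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection can be illustrated with a minimal sketch. The thesaurus entries, the sample document, and the function name below are invented for illustration, not taken from the paper:

```python
# Minimal sketch of thesaurus-based term selection for machine-aided
# indexing (MAI): document phrases are matched against lead-in
# (non-preferred) entries that point to preferred thesaurus terms.
# The thesaurus and sample text here are invented.
THESAURUS = {
    # non-preferred entry -> preferred term (a "USE" reference)
    "automatic indexing": "Automatisches Indexieren",
    "machine indexing": "Automatisches Indexieren",
    "thesauri": "Thesauri",
    "full-text": "Volltext",
}

def suggest_terms(text):
    """Return preferred terms whose lead-in entries occur in the text."""
    lowered = text.lower()
    return {pref for entry, pref in THESAURUS.items() if entry in lowered}

doc = "Thesauri can complement full-text access and machine indexing."
print(sorted(suggest_terms(doc)))
```

A real MAI system would add tokenization, stemming, and disambiguation on top of this lookup, but the thesaurus remains the source of candidate terms.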
  2. Kim, P.K.: An automatic indexing of compound words based on mutual information for Korean text retrieval (1995) 0.05
    0.054200415 = product of:
      0.12646763 = sum of:
        0.033484332 = weight(_text_:systems in 620) [ClassicSimilarity], result of:
          0.033484332 = score(doc=620,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.2716328 = fieldWeight in 620, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=620)
        0.068471596 = sum of:
          0.02460017 = weight(_text_:science in 620) [ClassicSimilarity], result of:
            0.02460017 = score(doc=620,freq=2.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.23282544 = fieldWeight in 620, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.0625 = fieldNorm(doc=620)
          0.043871425 = weight(_text_:29 in 620) [ClassicSimilarity], result of:
            0.043871425 = score(doc=620,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.31092256 = fieldWeight in 620, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0625 = fieldNorm(doc=620)
        0.024511702 = weight(_text_:library in 620) [ClassicSimilarity], result of:
          0.024511702 = score(doc=620,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.23240642 = fieldWeight in 620, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0625 = fieldNorm(doc=620)
      0.42857143 = coord(3/7)
    
    Abstract
    Presents an automatic indexing technique for compound words suitable for an agglutinative language, specifically Korean. Discusses some construction conditions for compound words and the rules for decomposing compound words to enhance the exhaustivity of indexing, demonstrating that this mutual-information-based system enhances both the exhaustivity of indexing and the specificity of terms. Suggests that the construction conditions and rules for decomposition presented may be used in multilingual information retrieval systems to translate the indexing terms of the specific language into those of the language required.
    Source
    Library and information science. 1995, no.34, S.29-38
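Kim's use of mutual information to decide where a compound word decomposes can be sketched with pointwise mutual information (PMI) over corpus counts. The counts below are invented, and PMI stands in for whatever MI variant the paper actually uses:

```python
import math

# Sketch: score candidate split points of a compound with pointwise
# mutual information (PMI) from unigram/bigram corpus counts. A split
# is plausible when both halves are real words; among those, high PMI
# between the halves marks a cohesive compound. Counts are invented.
UNIGRAMS = {"data": 50, "base": 40}
BIGRAMS = {("data", "base"): 25}
N = 1000  # total tokens in the (hypothetical) corpus

def pmi(left, right):
    p_l = UNIGRAMS.get(left, 0) / N
    p_r = UNIGRAMS.get(right, 0) / N
    p_lr = BIGRAMS.get((left, right), 0) / N
    if min(p_l, p_r, p_lr) == 0:
        return float("-inf")
    return math.log2(p_lr / (p_l * p_r))

def best_split(compound):
    """Try every split point; keep the highest-scoring decomposition."""
    candidates = [(pmi(compound[:i], compound[i:]), compound[:i], compound[i:])
                  for i in range(1, len(compound))]
    return max(candidates)

score, left, right = best_split("database")
print(left, right, round(score, 2))
```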
  3. Thiel, T.J.: Automated indexing of information stored on optical disk electronic document image management systems (1994) 0.05
    0.052722093 = product of:
      0.12301821 = sum of:
        0.05859758 = weight(_text_:systems in 1260) [ClassicSimilarity], result of:
          0.05859758 = score(doc=1260,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.47535738 = fieldWeight in 1260, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.109375 = fieldNorm(doc=1260)
        0.02152515 = product of:
          0.0430503 = sum of:
            0.0430503 = weight(_text_:science in 1260) [ClassicSimilarity], result of:
              0.0430503 = score(doc=1260,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.40744454 = fieldWeight in 1260, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1260)
          0.5 = coord(1/2)
        0.04289548 = weight(_text_:library in 1260) [ClassicSimilarity], result of:
          0.04289548 = score(doc=1260,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.40671125 = fieldWeight in 1260, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.109375 = fieldNorm(doc=1260)
      0.42857143 = coord(3/7)
    
    Source
    Encyclopedia of library and information science. Vol.54, [=Suppl.17]
  4. Banerjee, K.; Johnson, M.: Improving access to archival collections with automated entity extraction (2015) 0.04
    0.040931392 = product of:
      0.09550658 = sum of:
        0.0513537 = sum of:
          0.018450128 = weight(_text_:science in 2144) [ClassicSimilarity], result of:
            0.018450128 = score(doc=2144,freq=2.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.17461908 = fieldWeight in 2144, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.046875 = fieldNorm(doc=2144)
          0.03290357 = weight(_text_:29 in 2144) [ClassicSimilarity], result of:
            0.03290357 = score(doc=2144,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.23319192 = fieldWeight in 2144, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.046875 = fieldNorm(doc=2144)
        0.018383777 = weight(_text_:library in 2144) [ClassicSimilarity], result of:
          0.018383777 = score(doc=2144,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.17430481 = fieldWeight in 2144, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.046875 = fieldNorm(doc=2144)
        0.025769096 = product of:
          0.05153819 = sum of:
            0.05153819 = weight(_text_:applications in 2144) [ClassicSimilarity], result of:
              0.05153819 = score(doc=2144,freq=2.0), product of:
                0.17659263 = queryWeight, product of:
                  4.4025097 = idf(docFreq=1471, maxDocs=44218)
                  0.04011181 = queryNorm
                0.2918479 = fieldWeight in 2144, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4025097 = idf(docFreq=1471, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2144)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    The complexity and diversity of archival resources make constructing rich metadata records time consuming and expensive, which in turn limits access to these valuable materials. However, significant automation of the metadata creation process would dramatically reduce the cost of providing access points, improve access to individual resources, and establish connections between resources that would otherwise remain unknown. Using a case study at Oregon Health & Science University as a lens to examine the conceptual and technical challenges associated with automated extraction of access points, we discuss using publicly accessible APIs to extract entities (e.g., people, places, concepts) from digital and digitized objects. We describe why Linked Open Data is not well suited for a use case such as ours. We conclude with recommendations about how this method can be used in archives as well as for other library applications.
    Source
    Code4Lib journal. Issue 29(2015), [http://journal.code4lib.org/issues/issues/issue29]
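The entity-extraction step Banerjee and Johnson describe relies on external APIs; a local stand-in can convey the idea. The gazetteer-plus-heuristic approach below is a simplification of what any real service does, and everything except the university name (which appears in the abstract) is invented:

```python
import re

# Sketch of entity extraction for archival access points. The paper
# calls publicly accessible APIs; this stand-in combines a tiny local
# gazetteer with a naive capitalized-phrase heuristic. The gazetteer
# entries and sample sentence are invented for illustration.
GAZETTEER = {
    "Oregon Health & Science University": "organization",
    "Portland": "place",
}

def extract_entities(text):
    found = {}
    remaining = text
    for name, etype in GAZETTEER.items():
        if name in remaining:
            found[name] = etype
            remaining = remaining.replace(name, " ")
    # Fallback: flag remaining runs of capitalized words as candidates.
    for m in re.finditer(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b", remaining):
        found.setdefault(m.group(), "candidate")
    return found

text = "Records were digitized at Oregon Health & Science University in Portland."
print(extract_entities(text))
```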
  5. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.04
    0.037590347 = product of:
      0.08771081 = sum of:
        0.029596249 = weight(_text_:systems in 5400) [ClassicSimilarity], result of:
          0.029596249 = score(doc=5400,freq=4.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.24009174 = fieldWeight in 5400, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
        0.04279475 = sum of:
          0.0153751075 = weight(_text_:science in 5400) [ClassicSimilarity], result of:
            0.0153751075 = score(doc=5400,freq=2.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.1455159 = fieldWeight in 5400, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5400)
          0.027419642 = weight(_text_:29 in 5400) [ClassicSimilarity], result of:
            0.027419642 = score(doc=5400,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.19432661 = fieldWeight in 5400, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5400)
        0.015319815 = weight(_text_:library in 5400) [ClassicSimilarity], result of:
          0.015319815 = score(doc=5400,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.14525402 = fieldWeight in 5400, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
      0.42857143 = coord(3/7)
    
    Abstract
    Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.
    Date
    29. 9.2019 12:18:42
    Footnote
    Contribution to a special issue: Research Information Systems and Science Classifications; including papers from "Trajectories for Research: Fathoming the Promise of the NARCIS Classification," 27-28 September 2018, The Hague, The Netherlands.
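Wang and Koopman's two-step scheme, embed first and then predict, can be reduced to a toy nearest-neighbour illustration: once documents and subjects live in one vector space, prediction is similarity search. The vectors and labels below are invented, and the paper's non-parametric second stage is simplified to a 1-NN lookup:

```python
import math

# Toy sketch of "embed first, then predict": embedded documents carry
# subject labels; a new document receives the subject of its most
# similar neighbour by cosine similarity. Vectors and labels invented.
EMBEDDED = {
    "doc_a": ([0.9, 0.1, 0.0], "Automatisches Indexieren"),
    "doc_b": ([0.8, 0.2, 0.1], "Automatisches Indexieren"),
    "doc_c": ([0.0, 0.1, 0.9], "Retrievalstudien"),
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def predict_subject(vec):
    """Return the subject label of the most similar embedded document."""
    _, label = max(EMBEDDED.values(), key=lambda dv: cosine(vec, dv[0]))
    return label

print(predict_subject([0.85, 0.15, 0.05]))
```

In the paper this step must scale to millions of entities, which is where the sparsity and scalability challenges mentioned in the abstract arise.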
  6. Schuegraf, E.J.; Bommel, M.F.van: An automatic document indexing system based on cooperating expert systems : design and development (1993) 0.04
    0.03607105 = product of:
      0.08416578 = sum of:
        0.047353994 = weight(_text_:systems in 6504) [ClassicSimilarity], result of:
          0.047353994 = score(doc=6504,freq=4.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.38414678 = fieldWeight in 6504, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=6504)
        0.012300085 = product of:
          0.02460017 = sum of:
            0.02460017 = weight(_text_:science in 6504) [ClassicSimilarity], result of:
              0.02460017 = score(doc=6504,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.23282544 = fieldWeight in 6504, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6504)
          0.5 = coord(1/2)
        0.024511702 = weight(_text_:library in 6504) [ClassicSimilarity], result of:
          0.024511702 = score(doc=6504,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.23240642 = fieldWeight in 6504, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0625 = fieldNorm(doc=6504)
      0.42857143 = coord(3/7)
    
    Abstract
    Discusses the design of an automatic indexing system based on two cooperating expert systems and the investigation related to its development. The design combines statistical and artificial intelligence techniques. Examines choice of content indicators, the effect of stemming and the identification of characteristic vocabularies for given subject areas. Presents experimental results. Discusses the application of machine learning algorithms to the identification of vocabularies
    Source
    Canadian journal of information and library science. 18(1993) no.2, S.32-50
  7. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.04
    0.035862528 = product of:
      0.06275942 = sum of:
        0.033484332 = weight(_text_:systems in 5499) [ClassicSimilarity], result of:
          0.033484332 = score(doc=5499,freq=8.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.2716328 = fieldWeight in 5499, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03125 = fieldNorm(doc=5499)
        0.0061500426 = product of:
          0.012300085 = sum of:
            0.012300085 = weight(_text_:science in 5499) [ClassicSimilarity], result of:
              0.012300085 = score(doc=5499,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.11641272 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.5 = coord(1/2)
        0.012255851 = weight(_text_:library in 5499) [ClassicSimilarity], result of:
          0.012255851 = score(doc=5499,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.11620321 = fieldWeight in 5499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.03125 = fieldNorm(doc=5499)
        0.010869193 = product of:
          0.021738386 = sum of:
            0.021738386 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
              0.021738386 = score(doc=5499,freq=2.0), product of:
                0.14046472 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04011181 = queryNorm
                0.15476047 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Abstract
    Purpose: Modern mathematicians and scientists of math-related disciplines often use Document Preparation Systems (DPS) to write and Computer Algebra Systems (CAS) to calculate mathematical expressions. Usually, they translate the expressions manually between DPS and CAS. This process is time-consuming and error-prone. The purpose of this paper is to automate this translation. This paper uses Maple and Mathematica as the CAS, and LaTeX as the DPS.
    Design/methodology/approach: Bruce Miller at the National Institute of Standards and Technology (NIST) developed a collection of special LaTeX macros that create links from mathematical symbols to their definitions in the NIST Digital Library of Mathematical Functions (DLMF). The authors are using these macros to perform rule-based translations between the formulae in the DLMF and CAS. Moreover, the authors develop software to ease the creation of new rules and to discover inconsistencies.
    Findings: The authors created 396 mappings and translated 58.8 percent of DLMF formulae (2,405 expressions) successfully between Maple and DLMF. For a significant percentage, the special function definitions in Maple and the DLMF were different. An atomic symbol in one system maps to a composite expression in the other system. The translator was also successfully used for automatic verification of mathematical online compendia and CAS. The evaluation techniques discovered two errors in the DLMF and one defect in Maple.
    Originality/value: This paper introduces the first translation tool for special functions between LaTeX and CAS. The approach improves error-prone manual translations and can be used to verify mathematical online compendia and CAS.
    Date
    20. 1.2015 18:30:22
    Footnote
    Contribution to a special issue: Information Science in the German-speaking Countries.
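The rule-based LaTeX-to-CAS translation Greiner-Petter et al. describe can be sketched as pattern rewriting. The rule set below is invented and vastly smaller than the paper's 396 DLMF mappings; it only shows the mechanism:

```python
import re

# Sketch of rule-based translation from LaTeX to a Maple-like CAS
# syntax, in the spirit of the DLMF/CAS mappings described above.
# These four rules are invented illustrations, not the paper's rules.
RULES = [
    (r"\\frac\{([^{}]+)\}\{([^{}]+)\}", r"((\1)/(\2))"),
    (r"\\sin\b", "sin"),
    (r"\\cos\b", "cos"),
    (r"\\pi\b", "Pi"),
]

def latex_to_cas(expr):
    """Apply each rewrite rule in order to the LaTeX expression."""
    for pattern, repl in RULES:
        expr = re.sub(pattern, repl, expr)
    return expr

print(latex_to_cas(r"\frac{\pi}{2} \sin(x)"))  # ((Pi)/(2)) sin(x)
```

Real mappings are harder than this: as the abstract notes, an atomic symbol in one system may correspond to a composite expression in the other, which simple textual rewriting cannot capture.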
  8. Losee, R.M.: A Gray code based ordering for documents on shelves : classification for browsing and retrieval (1992) 0.03
    0.031562172 = product of:
      0.07364506 = sum of:
        0.041434746 = weight(_text_:systems in 2335) [ClassicSimilarity], result of:
          0.041434746 = score(doc=2335,freq=4.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.33612844 = fieldWeight in 2335, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2335)
        0.010762575 = product of:
          0.02152515 = sum of:
            0.02152515 = weight(_text_:science in 2335) [ClassicSimilarity], result of:
              0.02152515 = score(doc=2335,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.20372227 = fieldWeight in 2335, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2335)
          0.5 = coord(1/2)
        0.02144774 = weight(_text_:library in 2335) [ClassicSimilarity], result of:
          0.02144774 = score(doc=2335,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.20335563 = fieldWeight in 2335, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2335)
      0.42857143 = coord(3/7)
    
    Abstract
    A document classifier places documents together in a linear arrangement for browsing or high-speed access by human or computerised information retrieval systems. Requirements for document classification and browsing systems are developed from similarity measures, distance measures, and the notion of subject aboutness. A requirement that documents be arranged in decreasing order of similarity as the distance from a given document increases can often not be met. Based on these requirements, information-theoretic considerations, and the Gray code, a classification system is proposed that can classify documents without human intervention. A measure of classifier performance is developed, and used to evaluate experimental results comparing the distance between subject headings assigned to documents given classifications from the proposed system and the Library of Congress Classification (LCC) system
    Source
    Journal of the American Society for Information Science. 43(1992) no.4, S.312-322
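Losee's core idea, ordering documents so that shelf neighbours differ in as few subject features as possible, follows from a property of the binary-reflected Gray code: consecutive codes differ in exactly one bit. A minimal sketch, with invented documents and feature bits:

```python
# Sketch of Gray-code shelf ordering: each document is a bit vector of
# subject features; sorting by the decoded (inverse) Gray code places
# documents so that shelf neighbours differ in few features. The
# documents and their feature bits below are invented.

def gray_to_rank(g):
    """Decode a binary-reflected Gray code to its ordinal position."""
    rank = 0
    while g:
        rank ^= g
        g >>= 1
    return rank

DOCS = {  # document -> feature bits (subjects present/absent)
    "doc_w": 0b000,
    "doc_x": 0b001,
    "doc_y": 0b011,
    "doc_z": 0b010,
}

shelf = sorted(DOCS, key=lambda d: gray_to_rank(DOCS[d]))
print(shelf)  # ['doc_w', 'doc_x', 'doc_y', 'doc_z']
```

Read in shelf order, the feature vectors run 000, 001, 011, 010: each adjacent pair differs in a single feature, which is the browsing property the paper is after.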
  9. Haas, S.; He, S.: Toward the automatic identification of sublanguage vocabulary (1993) 0.03
    0.029478008 = product of:
      0.103173025 = sum of:
        0.07866132 = sum of:
          0.034789898 = weight(_text_:science in 4891) [ClassicSimilarity], result of:
            0.034789898 = score(doc=4891,freq=4.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.3292649 = fieldWeight in 4891, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.0625 = fieldNorm(doc=4891)
          0.043871425 = weight(_text_:29 in 4891) [ClassicSimilarity], result of:
            0.043871425 = score(doc=4891,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.31092256 = fieldWeight in 4891, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0625 = fieldNorm(doc=4891)
        0.024511702 = weight(_text_:library in 4891) [ClassicSimilarity], result of:
          0.024511702 = score(doc=4891,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.23240642 = fieldWeight in 4891, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0625 = fieldNorm(doc=4891)
      0.2857143 = coord(2/7)
    
    Abstract
    Describes a method developed for automatic identification of sublanguage vocabulary words as they occur in abstracts. Describes the sublanguage vocabulary identification procedures using abstracts from computer science and library and information science as sublanguage sources. Evaluates the results using three criteria. Discusses the practical and theoretical significance of this research and plans for further experiments.
    Source
    Information processing and management. 29(1993) no.6, S.721-744
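A common way to operationalize the kind of sublanguage vocabulary identification Haas and He describe is to keep words whose relative frequency in the sublanguage corpus clearly exceeds their relative frequency in a general reference corpus. The corpora, the smoothing, and the threshold below are all invented, not the paper's actual criteria:

```python
from collections import Counter

# Sketch of sublanguage vocabulary identification: a word belongs to
# the sublanguage vocabulary if its relative frequency in the
# sublanguage corpus exceeds its (smoothed) relative frequency in a
# general corpus by some factor. Corpora and threshold are invented.
def sublanguage_vocab(sub_corpus, general_corpus, ratio=1.5):
    sub = Counter(sub_corpus)
    gen = Counter(general_corpus)
    n_sub, n_gen = len(sub_corpus), len(general_corpus)
    vocab = set()
    for word, count in sub.items():
        rel_sub = count / n_sub
        rel_gen = (gen.get(word, 0) + 1) / (n_gen + 1)  # add-one smoothing
        if rel_sub / rel_gen >= ratio:
            vocab.add(word)
    return vocab

sub = "indexing thesaurus indexing retrieval indexing the".split()
gen = "the of the and to in the of a is".split()
print(sorted(sublanguage_vocab(sub, gen)))
```

Function words like "the" occur in both corpora at similar rates and are filtered out, while corpus-specific terms survive.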
  10. Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.03
    0.02893441 = product of:
      0.06751362 = sum of:
        0.02929879 = weight(_text_:systems in 2673) [ClassicSimilarity], result of:
          0.02929879 = score(doc=2673,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.23767869 = fieldWeight in 2673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
        0.019193748 = product of:
          0.038387496 = sum of:
            0.038387496 = weight(_text_:29 in 2673) [ClassicSimilarity], result of:
              0.038387496 = score(doc=2673,freq=2.0), product of:
                0.14110081 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04011181 = queryNorm
                0.27205724 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.5 = coord(1/2)
        0.019021088 = product of:
          0.038042177 = sum of:
            0.038042177 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
              0.038042177 = score(doc=2673,freq=2.0), product of:
                0.14046472 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04011181 = queryNorm
                0.2708308 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Date
    1. 8.1996 22:08:06
    Source
    Computer networks and ISDN systems. 29(1997) no.8, S.1147-1156
  11. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.03
    0.02761136 = product of:
      0.09663975 = sum of:
        0.05859758 = weight(_text_:systems in 6265) [ClassicSimilarity], result of:
          0.05859758 = score(doc=6265,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.47535738 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
        0.038042177 = product of:
          0.07608435 = sum of:
            0.07608435 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.07608435 = score(doc=6265,freq=2.0), product of:
                0.14046472 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04011181 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
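The explain trees throughout these results follow Lucene's ClassicSimilarity (TF-IDF) formulas: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf × queryNorm, and fieldWeight = tf × idf × fieldNorm. As a minimal sketch (the helper names are illustrative, not from the page), the `systems` term weight shown in entry 11 can be reproduced from the values in its tree:

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # ClassicSimilarity: tf = sqrt(freq)
    return math.sqrt(freq)

# Inputs shown in entry 11 for weight(_text_:systems in 6265)
idf = classic_idf(5561, 44218)        # explain tree shows 3.0731742
tf = classic_tf(2.0)                  # explain tree shows 1.4142135
query_norm = 0.04011181               # queryNorm, verbatim from the page
field_norm = 0.109375                 # fieldNorm, verbatim from the page

query_weight = idf * query_norm       # explain tree shows 0.12327058
field_weight = tf * idf * field_norm  # explain tree shows 0.47535738
score = query_weight * field_weight   # explain tree shows 0.05859758
```

Any deviation in the last digit against the printed values comes from Lucene carrying these intermediates as 32-bit floats.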
  12. Humphrey, S.M.; Névéol, A.; Browne, A.; Gobeil, J.; Ruch, P.; Darmoni, S.J.: Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty (2009) 0.03
    0.026546717 = product of:
      0.06194234 = sum of:
        0.029596249 = weight(_text_:systems in 3300) [ClassicSimilarity], result of:
          0.029596249 = score(doc=3300,freq=4.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.24009174 = fieldWeight in 3300, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3300)
        0.0108718425 = product of:
          0.021743685 = sum of:
            0.021743685 = weight(_text_:science in 3300) [ClassicSimilarity], result of:
              0.021743685 = score(doc=3300,freq=4.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.20579056 = fieldWeight in 3300, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3300)
          0.5 = coord(1/2)
        0.021474248 = product of:
          0.042948496 = sum of:
            0.042948496 = weight(_text_:applications in 3300) [ClassicSimilarity], result of:
              0.042948496 = score(doc=3300,freq=2.0), product of:
                0.17659263 = queryWeight, product of:
                  4.4025097 = idf(docFreq=1471, maxDocs=44218)
                  0.04011181 = queryNorm
                0.2432066 = fieldWeight in 3300, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4025097 = idf(docFreq=1471, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3300)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
Automatic document categorization is an important research problem in Information Science and Natural Language Processing. Many applications, including Word Sense Disambiguation and Information Retrieval in large collections, can benefit from such categorization. This paper focuses on automatic categorization of documents from the biomedical literature into broad discipline-based categories. Two different systems are described and contrasted: CISMeF, which uses rules based on human indexing of the documents by the Medical Subject Headings (MeSH) controlled vocabulary in order to assign metaterms (MTs), and Journal Descriptor Indexing (JDI), based on human categorization of about 4,000 journals and statistical associations between journal descriptors (JDs) and textwords in the documents. We evaluate and compare the performance of these systems against a gold standard of humanly assigned categories for 100 MEDLINE documents, using six measures selected from trec_eval. The results show that for five of the measures performance is comparable, and for one measure JDI is superior. We conclude that these results favor JDI, given the significantly greater intellectual overhead involved in human indexing and maintaining a rule base for mapping MeSH terms to MTs. We also note a JDI method that associates JDs with MeSH indexing rather than textwords, and it may be worthwhile to investigate whether this JDI method (statistical) and CISMeF (rule-based) might be combined and then evaluated, showing they are complementary to one another.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.12, S.2530-2539
  13. Krutulis, J.D.; Jacob, E.K.: ¬A theoretical model for the study of emergent structure in adaptive information networks (1995) 0.03
    0.026361046 = product of:
      0.061509106 = sum of:
        0.02929879 = weight(_text_:systems in 3353) [ClassicSimilarity], result of:
          0.02929879 = score(doc=3353,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.23767869 = fieldWeight in 3353, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3353)
        0.010762575 = product of:
          0.02152515 = sum of:
            0.02152515 = weight(_text_:science in 3353) [ClassicSimilarity], result of:
              0.02152515 = score(doc=3353,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.20372227 = fieldWeight in 3353, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3353)
          0.5 = coord(1/2)
        0.02144774 = weight(_text_:library in 3353) [ClassicSimilarity], result of:
          0.02144774 = score(doc=3353,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.20335563 = fieldWeight in 3353, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3353)
      0.42857143 = coord(3/7)
    
    Imprint
    Alberta : Alberta University, School of Library and Information Studies
    Source
    Connectedness: information, systems, people, organizations. Proceedings of CAIS/ACSI 95, the proceedings of the 23rd Annual Conference of the Canadian Association for Information Science. Ed. by Hope A. Olson and Denis B. Ward
  14. Schulz, K.U.; Brunner, L.: Vollautomatische thematische Verschlagwortung großer Textkollektionen mittels semantischer Netze (2017) 0.03
    0.025488982 = product of:
      0.089211434 = sum of:
        0.02929879 = weight(_text_:systems in 3493) [ClassicSimilarity], result of:
          0.02929879 = score(doc=3493,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.23767869 = fieldWeight in 3493, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3493)
        0.059912644 = sum of:
          0.02152515 = weight(_text_:science in 3493) [ClassicSimilarity], result of:
            0.02152515 = score(doc=3493,freq=2.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.20372227 = fieldWeight in 3493, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3493)
          0.038387496 = weight(_text_:29 in 3493) [ClassicSimilarity], result of:
            0.038387496 = score(doc=3493,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.27205724 = fieldWeight in 3493, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3493)
      0.2857143 = coord(2/7)
    
    Source
    Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber
  15. Böhm, A.; Seifert, C.; Schlötterer, J.; Granitzer, M.: Identifying tweets from the economic domain (2017) 0.03
    0.025488982 = product of:
      0.089211434 = sum of:
        0.02929879 = weight(_text_:systems in 3495) [ClassicSimilarity], result of:
          0.02929879 = score(doc=3495,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.23767869 = fieldWeight in 3495, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3495)
        0.059912644 = sum of:
          0.02152515 = weight(_text_:science in 3495) [ClassicSimilarity], result of:
            0.02152515 = score(doc=3495,freq=2.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.20372227 = fieldWeight in 3495, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3495)
          0.038387496 = weight(_text_:29 in 3495) [ClassicSimilarity], result of:
            0.038387496 = score(doc=3495,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.27205724 = fieldWeight in 3495, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3495)
      0.2857143 = coord(2/7)
    
    Source
    Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber
  16. Kempf, A.O.: Neue Verfahrenswege der Wissensorganisation : eine Evaluation automatischer Indexierung in der sozialwissenschaftlichen Fachinformation (2017) 0.03
    0.025488982 = product of:
      0.089211434 = sum of:
        0.02929879 = weight(_text_:systems in 3497) [ClassicSimilarity], result of:
          0.02929879 = score(doc=3497,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.23767869 = fieldWeight in 3497, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3497)
        0.059912644 = sum of:
          0.02152515 = weight(_text_:science in 3497) [ClassicSimilarity], result of:
            0.02152515 = score(doc=3497,freq=2.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.20372227 = fieldWeight in 3497, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3497)
          0.038387496 = weight(_text_:29 in 3497) [ClassicSimilarity], result of:
            0.038387496 = score(doc=3497,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.27205724 = fieldWeight in 3497, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3497)
      0.2857143 = coord(2/7)
    
    Source
    Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber
  17. Jones, S.; Paynter, G.W.: Automatic extraction of document keyphrases for use in digital libraries : evaluations and applications (2002) 0.03
    0.025182022 = product of:
      0.05875805 = sum of:
        0.029596249 = weight(_text_:systems in 601) [ClassicSimilarity], result of:
          0.029596249 = score(doc=601,freq=4.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.24009174 = fieldWeight in 601, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
        0.0076875538 = product of:
          0.0153751075 = sum of:
            0.0153751075 = weight(_text_:science in 601) [ClassicSimilarity], result of:
              0.0153751075 = score(doc=601,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.1455159 = fieldWeight in 601, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=601)
          0.5 = coord(1/2)
        0.021474248 = product of:
          0.042948496 = sum of:
            0.042948496 = weight(_text_:applications in 601) [ClassicSimilarity], result of:
              0.042948496 = score(doc=601,freq=2.0), product of:
                0.17659263 = queryWeight, product of:
                  4.4025097 = idf(docFreq=1471, maxDocs=44218)
                  0.04011181 = queryNorm
                0.2432066 = fieldWeight in 601, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4025097 = idf(docFreq=1471, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=601)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.
    Source
    Journal of the American Society for Information Science and technology. 53(2002) no.8, S.653-677
  18. Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.02
    0.024031213 = product of:
      0.08410924 = sum of:
        0.02511325 = weight(_text_:systems in 1660) [ClassicSimilarity], result of:
          0.02511325 = score(doc=1660,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.2037246 = fieldWeight in 1660, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=1660)
        0.058995992 = sum of:
          0.026092423 = weight(_text_:science in 1660) [ClassicSimilarity], result of:
            0.026092423 = score(doc=1660,freq=4.0), product of:
              0.10565929 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.04011181 = queryNorm
              0.24694869 = fieldWeight in 1660, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.046875 = fieldNorm(doc=1660)
          0.03290357 = weight(_text_:29 in 1660) [ClassicSimilarity], result of:
            0.03290357 = score(doc=1660,freq=2.0), product of:
              0.14110081 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.04011181 = queryNorm
              0.23319192 = fieldWeight in 1660, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.046875 = fieldNorm(doc=1660)
      0.2857143 = coord(2/7)
    
    Abstract
A corpus of 242 abstracts of Arabic documents on computer science and information systems, using the Proceedings of the Saudi Arabian National Conferences as a source, was put together. Reports on the design and building of an automatic information retrieval system from scratch to handle Arabic data. Both automatic and manual indexing techniques were implemented. Experiments using measures of recall and precision have demonstrated that automatic indexing is at least as effective as manual indexing, and more effective in some cases. Automatic indexing is both cheaper and faster. The results suggest that a wider coverage of the literature can be achieved with less money while producing results as good as those of manual indexing. Compares the retrieval results using words as index terms versus stems and roots, and confirms the results obtained by Al-Kharashi and Abu-Salem with smaller corpora that root indexing is more effective than word indexing.
    Date
    29. 7.1998 17:40:01
    Source
    Journal of the American Society for Information Science. 48(1997) no.10, S.867-881
  19. Griffiths, A.; Luckhurst, H.C.; Willett, P.: Using interdocument similarity information in document retrieval systems (1986) 0.02
    0.02289221 = product of:
      0.08012273 = sum of:
        0.05859758 = weight(_text_:systems in 2415) [ClassicSimilarity], result of:
          0.05859758 = score(doc=2415,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.47535738 = fieldWeight in 2415, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.109375 = fieldNorm(doc=2415)
        0.02152515 = product of:
          0.0430503 = sum of:
            0.0430503 = weight(_text_:science in 2415) [ClassicSimilarity], result of:
              0.0430503 = score(doc=2415,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.40744454 = fieldWeight in 2415, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2415)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Source
    Journal of the American Society for Information Science. 37(1986) no.1, S.3-11
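The top-level score of each entry combines the per-term scores with coordination factors: the inner coord(1/2) halves a term wrapped in a single-clause sub-query, and the outer coord(k/7) scales the sum by the fraction of the seven query terms that matched. A sketch reproducing entry 19's total of 0.02289221 from its two matched terms (the helper name is illustrative):

```python
import math

def term_score(doc_freq, freq, field_norm, query_norm=0.04011181, max_docs=44218):
    # ClassicSimilarity per-term score: queryWeight * fieldWeight
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    return (idf * query_norm) * (math.sqrt(freq) * idf * field_norm)

# Entry 19 (Griffiths/Luckhurst/Willett): two of seven query terms matched
s_systems = term_score(doc_freq=5561, freq=2.0, field_norm=0.109375)        # 0.05859758
s_science = term_score(doc_freq=8627, freq=2.0, field_norm=0.109375) * 0.5  # inner coord(1/2): 0.02152515
total = (s_systems + s_science) * (2 / 7)  # outer coord(2/7): 0.02289221
```

The same arithmetic accounts for every top-level score on this page; only freq, docFreq, fieldNorm, and the coord fractions vary per entry.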
  20. Benson, A.C.: Image descriptions and their relational expressions : a review of the literature and the issues (2015) 0.02
    0.022595182 = product of:
      0.05272209 = sum of:
        0.02511325 = weight(_text_:systems in 1867) [ClassicSimilarity], result of:
          0.02511325 = score(doc=1867,freq=2.0), product of:
            0.12327058 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.04011181 = queryNorm
            0.2037246 = fieldWeight in 1867, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=1867)
        0.009225064 = product of:
          0.018450128 = sum of:
            0.018450128 = weight(_text_:science in 1867) [ClassicSimilarity], result of:
              0.018450128 = score(doc=1867,freq=2.0), product of:
                0.10565929 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.04011181 = queryNorm
                0.17461908 = fieldWeight in 1867, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1867)
          0.5 = coord(1/2)
        0.018383777 = weight(_text_:library in 1867) [ClassicSimilarity], result of:
          0.018383777 = score(doc=1867,freq=2.0), product of:
            0.10546913 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.04011181 = queryNorm
            0.17430481 = fieldWeight in 1867, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.046875 = fieldNorm(doc=1867)
      0.42857143 = coord(3/7)
    
    Abstract
Purpose - The purpose of this paper is to survey the treatment of relationships, relationship expressions and the ways in which they manifest themselves in image descriptions. Design/methodology/approach - The term "relationship" is construed in the broadest possible way to include spatial relationships ("to the right of"), temporal ("in 1936," "at noon"), meronymic ("part of"), and attributive ("has color," "has dimension"). The intersection of these vaguely delimited categories with image information, image creation, and description in libraries and archives is complex and in need of explanation. Findings - The review brings into question many generally held beliefs about the relationship problem, such as the belief that the semantics of relationships are somehow embedded in the relationship term itself and that image search and retrieval solutions can be found through refinement of word-matching systems. Originality/value - This review has no hope of systematically examining all evidence in all disciplines pertaining to this topic. It instead focusses on a general description of a theoretical treatment in Library and Information Science.
