Document (#16032)

Author
Losee, R.M.
Title
Text windows and phrases differing by discipline, location in document, and syntactic structure
Source
Information processing and management. 32(1996) no.6, pp.747-767
Year
1996
Abstract
Knowledge of window style, content, location, and grammatical structure may be used to classify documents as originating within a particular discipline or to place a document on a theory vs. practice spectrum. Examines characteristics of phrases and text windows, including their number, location in documents, and grammatical construction, and studies how these window characteristics vary across disciplines. Examines some of the linguistic regularities for individual disciplines, and suggests families of regularities that may prove helpful for the automatic classification of documents, as well as for information retrieval and filtering applications.
Theme
Automatic classification

Similar documents (author)

  1. Losee, R.M.: ¬A Gray code based ordering for documents on shelves : classification for browsing and retrieval (1992) 5.18
    5.184806 = sum of:
      5.184806 = weight(author_txt:losee in 2335) [ClassicSimilarity], result of:
        5.184806 = fieldWeight in 2335, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.625 = fieldNorm(doc=2335)
    
  2. Losee, R.M.: ¬The relative shelf location of circulated books : a study of classification, users, and browsing (1993) 5.18
    5.184806 = sum of:
      5.184806 = weight(author_txt:losee in 4485) [ClassicSimilarity], result of:
        5.184806 = fieldWeight in 4485, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.625 = fieldNorm(doc=4485)
    
  3. Losee, R.M.: Seven fundamental questions for the science of library classification (1993) 5.18
    5.184806 = sum of:
      5.184806 = weight(author_txt:losee in 4508) [ClassicSimilarity], result of:
        5.184806 = fieldWeight in 4508, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.625 = fieldNorm(doc=4508)
    
  4. Losee, R.M.: Term dependence : truncating the Bahadur Lazarsfeld expansion (1994) 5.18
    5.184806 = sum of:
      5.184806 = weight(author_txt:losee in 7390) [ClassicSimilarity], result of:
        5.184806 = fieldWeight in 7390, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.625 = fieldNorm(doc=7390)
    
  5. Losee, R.M.: Upper bounds for retrieval performance and their use measuring performance and generating optimal queries : can it get any better than this? (1994) 5.18
    5.184806 = sum of:
      5.184806 = weight(author_txt:losee in 7418) [ClassicSimilarity], result of:
        5.184806 = fieldWeight in 7418, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.625 = fieldNorm(doc=7418)
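
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the tf and idf definitions of Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and are not part of the catalog software. The inputs are taken from the listing (freq=1.0, docFreq=29, maxDocs=44218, fieldNorm=0.625).

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw term count.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as shown in the explain output.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values from the author_txt:losee entries above.
author_idf = classic_idf(doc_freq=29, max_docs=44218)
author_score = field_weight(freq=1.0, doc_freq=29, max_docs=44218, field_norm=0.625)
print(round(author_idf, 5), round(author_score, 5))
```

All five author matches share the same term frequency, idf, and fieldNorm, which is why they receive the identical score of 5.18.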
    

Similar documents (content)

  1. Losee, R.M.: Learning syntactic rules and tags with genetic algorithms for information retrieval and filtering : an empirical basis for grammatical rules (1996) 0.17
    0.17293407 = sum of:
      0.17293407 = product of:
        0.72055864 = sum of:
          0.061455753 = weight(abstract_txt:syntactic in 4068) [ClassicSimilarity], result of:
            0.061455753 = score(doc=4068,freq=1.0), product of:
              0.12053095 = queryWeight, product of:
                1.0017344 = boost
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.018436229 = queryNorm
              0.5098753 = fieldWeight in 4068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.078125 = fieldNorm(doc=4068)
          0.087368935 = weight(abstract_txt:filtering in 4068) [ClassicSimilarity], result of:
            0.087368935 = score(doc=4068,freq=2.0), product of:
              0.12095345 = queryWeight, product of:
                1.0034885 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.018436229 = queryNorm
              0.7223352 = fieldWeight in 4068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.078125 = fieldNorm(doc=4068)
          0.016761651 = weight(abstract_txt:used in 4068) [ClassicSimilarity], result of:
            0.016761651 = score(doc=4068,freq=1.0), product of:
              0.06386723 = queryWeight, product of:
                1.0312343 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.018436229 = queryNorm
              0.26244524 = fieldWeight in 4068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.078125 = fieldNorm(doc=4068)
          0.049459483 = weight(abstract_txt:document in 4068) [ClassicSimilarity], result of:
            0.049459483 = score(doc=4068,freq=2.0), product of:
              0.104285344 = queryWeight, product of:
                1.3177406 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018436229 = queryNorm
              0.4742707 = fieldWeight in 4068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=4068)
          0.051389057 = weight(abstract_txt:characteristics in 4068) [ClassicSimilarity], result of:
            0.051389057 = score(doc=4068,freq=1.0), product of:
              0.13478678 = queryWeight, product of:
                1.4981039 = boost
                4.8801513 = idf(docFreq=912, maxDocs=44218)
                0.018436229 = queryNorm
              0.38126183 = fieldWeight in 4068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8801513 = idf(docFreq=912, maxDocs=44218)
                0.078125 = fieldNorm(doc=4068)
          0.4541238 = weight(abstract_txt:grammatical in 4068) [ClassicSimilarity], result of:
            0.4541238 = score(doc=4068,freq=4.0), product of:
              0.36293575 = queryWeight, product of:
                2.4582903 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.018436229 = queryNorm
              1.2512512 = fieldWeight in 4068, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.078125 = fieldNorm(doc=4068)
        0.24 = coord(6/25)
    
  2. Haas, S.W.; Losee, R.M.: Looking in text windows : their size and composition (1994) 0.14
    0.13861084 = sum of:
      0.13861084 = product of:
        0.57754517 = sum of:
          0.0611371 = weight(abstract_txt:style in 8525) [ClassicSimilarity], result of:
            0.0611371 = score(doc=8525,freq=1.0), product of:
              0.12011395 = queryWeight, product of:
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.018436229 = queryNorm
              0.5089925 = fieldWeight in 8525, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.078125 = fieldNorm(doc=8525)
          0.016761651 = weight(abstract_txt:used in 8525) [ClassicSimilarity], result of:
            0.016761651 = score(doc=8525,freq=1.0), product of:
              0.06386723 = queryWeight, product of:
                1.0312343 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.018436229 = queryNorm
              0.26244524 = fieldWeight in 8525, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.078125 = fieldNorm(doc=8525)
          0.06538021 = weight(abstract_txt:text in 8525) [ClassicSimilarity], result of:
            0.06538021 = score(doc=8525,freq=5.0), product of:
              0.09254957 = queryWeight, product of:
                1.2413821 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.018436229 = queryNorm
              0.7064345 = fieldWeight in 8525, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=8525)
          0.036595885 = weight(abstract_txt:structure in 8525) [ClassicSimilarity], result of:
            0.036595885 = score(doc=8525,freq=1.0), product of:
              0.107486784 = queryWeight, product of:
                1.3378142 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.018436229 = queryNorm
              0.3404687 = fieldWeight in 8525, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.078125 = fieldNorm(doc=8525)
          0.19519156 = weight(abstract_txt:windows in 8525) [ClassicSimilarity], result of:
            0.19519156 = score(doc=8525,freq=3.0), product of:
              0.22751 = queryWeight, product of:
                1.9463392 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.018436229 = queryNorm
              0.8579472 = fieldWeight in 8525, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.078125 = fieldNorm(doc=8525)
          0.2024788 = weight(abstract_txt:window in 8525) [ClassicSimilarity], result of:
            0.2024788 = score(doc=8525,freq=1.0), product of:
              0.336243 = queryWeight, product of:
                2.3661644 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.018436229 = queryNorm
              0.60217994 = fieldWeight in 8525, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.078125 = fieldNorm(doc=8525)
        0.24 = coord(6/25)
    
  3. Zhu, J.; Song, D.; Rüger, S.: Integrating multiple windows and document features for expert finding (2009) 0.12
    0.12457878 = sum of:
      0.12457878 = product of:
        0.51907825 = sum of:
          0.013409321 = weight(abstract_txt:used in 2755) [ClassicSimilarity], result of:
            0.013409321 = score(doc=2755,freq=1.0), product of:
              0.06386723 = queryWeight, product of:
                1.0312343 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.018436229 = queryNorm
              0.2099562 = fieldWeight in 2755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=2755)
          0.06853307 = weight(abstract_txt:document in 2755) [ClassicSimilarity], result of:
            0.06853307 = score(doc=2755,freq=6.0), product of:
              0.104285344 = queryWeight, product of:
                1.3177406 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018436229 = queryNorm
              0.65716875 = fieldWeight in 2755, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2755)
          0.029276708 = weight(abstract_txt:structure in 2755) [ClassicSimilarity], result of:
            0.029276708 = score(doc=2755,freq=1.0), product of:
              0.107486784 = queryWeight, product of:
                1.3378142 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.018436229 = queryNorm
              0.27237496 = fieldWeight in 2755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=2755)
          0.037141167 = weight(abstract_txt:documents in 2755) [ClassicSimilarity], result of:
            0.037141167 = score(doc=2755,freq=1.0), product of:
              0.14419195 = queryWeight, product of:
                1.89773 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018436229 = queryNorm
              0.2575814 = fieldWeight in 2755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2755)
          0.09015512 = weight(abstract_txt:windows in 2755) [ClassicSimilarity], result of:
            0.09015512 = score(doc=2755,freq=1.0), product of:
              0.22751 = queryWeight, product of:
                1.9463392 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.018436229 = queryNorm
              0.3962688 = fieldWeight in 2755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.0625 = fieldNorm(doc=2755)
          0.28056285 = weight(abstract_txt:window in 2755) [ClassicSimilarity], result of:
            0.28056285 = score(doc=2755,freq=3.0), product of:
              0.336243 = queryWeight, product of:
                2.3661644 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.018436229 = queryNorm
              0.834405 = fieldWeight in 2755, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=2755)
        0.24 = coord(6/25)
    
  4. Losee, R.: ¬A performance model of the length and number of subject headings and index phrases (2004) 0.12
    0.12434642 = sum of:
      0.12434642 = product of:
        0.5181101 = sum of:
          0.023704555 = weight(abstract_txt:used in 3725) [ClassicSimilarity], result of:
            0.023704555 = score(doc=3725,freq=2.0), product of:
              0.06386723 = queryWeight, product of:
                1.0312343 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.018436229 = queryNorm
              0.37115362 = fieldWeight in 3725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.04135007 = weight(abstract_txt:text in 3725) [ClassicSimilarity], result of:
            0.04135007 = score(doc=3725,freq=2.0), product of:
              0.09254957 = queryWeight, product of:
                1.2413821 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.018436229 = queryNorm
              0.44678837 = fieldWeight in 3725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.049459483 = weight(abstract_txt:document in 3725) [ClassicSimilarity], result of:
            0.049459483 = score(doc=3725,freq=2.0), product of:
              0.104285344 = queryWeight, product of:
                1.3177406 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018436229 = queryNorm
              0.4742707 = fieldWeight in 3725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.051389057 = weight(abstract_txt:characteristics in 3725) [ClassicSimilarity], result of:
            0.051389057 = score(doc=3725,freq=1.0), product of:
              0.13478678 = queryWeight, product of:
                1.4981039 = boost
                4.8801513 = idf(docFreq=912, maxDocs=44218)
                0.018436229 = queryNorm
              0.38126183 = fieldWeight in 3725, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8801513 = idf(docFreq=912, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.06565692 = weight(abstract_txt:documents in 3725) [ClassicSimilarity], result of:
            0.06565692 = score(doc=3725,freq=2.0), product of:
              0.14419195 = queryWeight, product of:
                1.89773 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018436229 = queryNorm
              0.4553439 = fieldWeight in 3725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
          0.28655002 = weight(abstract_txt:phrases in 3725) [ClassicSimilarity], result of:
            0.28655002 = score(doc=3725,freq=4.0), product of:
              0.2670016 = queryWeight, product of:
                2.1085079 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.018436229 = queryNorm
              1.0732147 = fieldWeight in 3725, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.078125 = fieldNorm(doc=3725)
        0.24 = coord(6/25)
    
  5. Fagan, J.L.: ¬The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval (1989) 0.10
    0.10444435 = sum of:
      0.10444435 = product of:
        0.43518478 = sum of:
          0.0983292 = weight(abstract_txt:syntactic in 1845) [ClassicSimilarity], result of:
            0.0983292 = score(doc=1845,freq=4.0), product of:
              0.12053095 = queryWeight, product of:
                1.0017344 = boost
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.018436229 = queryNorm
              0.8158004 = fieldWeight in 1845, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.013409321 = weight(abstract_txt:used in 1845) [ClassicSimilarity], result of:
            0.013409321 = score(doc=1845,freq=1.0), product of:
              0.06386723 = queryWeight, product of:
                1.0312343 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.018436229 = queryNorm
              0.2099562 = fieldWeight in 1845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.033080056 = weight(abstract_txt:text in 1845) [ClassicSimilarity], result of:
            0.033080056 = score(doc=1845,freq=2.0), product of:
              0.09254957 = queryWeight, product of:
                1.2413821 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.018436229 = queryNorm
              0.3574307 = fieldWeight in 1845, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.06256185 = weight(abstract_txt:document in 1845) [ClassicSimilarity], result of:
            0.06256185 = score(doc=1845,freq=5.0), product of:
              0.104285344 = queryWeight, product of:
                1.3177406 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018436229 = queryNorm
              0.59991026 = fieldWeight in 1845, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.029276708 = weight(abstract_txt:structure in 1845) [ClassicSimilarity], result of:
            0.029276708 = score(doc=1845,freq=1.0), product of:
              0.107486784 = queryWeight, product of:
                1.3378142 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.018436229 = queryNorm
              0.27237496 = fieldWeight in 1845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
          0.19852766 = weight(abstract_txt:phrases in 1845) [ClassicSimilarity], result of:
            0.19852766 = score(doc=1845,freq=3.0), product of:
              0.2670016 = queryWeight, product of:
                2.1085079 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.018436229 = queryNorm
              0.7435449 = fieldWeight in 1845, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.0625 = fieldNorm(doc=1845)
        0.24 = coord(6/25)
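
The content scores combine several matching terms. As a rough sketch under the same ClassicSimilarity assumptions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), each term contributes queryWeight * fieldWeight, where queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm; the sum is then scaled by coord (matched terms / query terms). The helper names below are hypothetical, and the inputs are copied from the first content entry (doc 4068).

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.018436229   # queryNorm from the listing
FIELD_NORM = 0.078125      # fieldNorm(doc=4068)

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost, doc_freq, freq, field_norm=FIELD_NORM):
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM           # queryWeight
    field_weight = math.sqrt(freq) * idf * field_norm  # fieldWeight
    return query_weight * field_weight

# (term, boost, docFreq, freq) for the six terms matched in doc 4068.
matched_terms = [
    ("syntactic",       1.0017344, 175,  1.0),
    ("filtering",       1.0034885, 173,  2.0),
    ("used",            1.0312343, 4177, 1.0),
    ("document",        1.3177406, 1642, 2.0),
    ("characteristics", 1.4981039, 912,  1.0),
    ("grammatical",     2.4582903, 39,   4.0),
]

term_sum = sum(term_score(b, df, f) for _, b, df, f in matched_terms)
coord = 6 / 25   # 6 of the 25 query terms matched
total_score = term_sum * coord
print(round(term_sum, 4), round(total_score, 4))
```

Because every entry in this list matches 6 of 25 query terms, coord is constant at 0.24, and the ranking is driven by which rare, highly boosted terms (such as "grammatical", "window", or "phrases") each abstract contains.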