Document (#33385)

Author
Britt, B.L.
Berry, M.W.
Browne, M.
Merrell, M.A.
Kolpack, J.
Title
Document classification techniques for automated technology readiness level analysis
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.4, pp.675-680
Year
2008
Abstract
The overhead of assessing technology readiness for deployment and investment purposes can be costly to both large and small businesses. Recent advances in the automatic interpretation of technology readiness levels (TRLs) of a given technology can substantially reduce the risk and associated cost of bringing these new technologies to market. Using vector-space information-retrieval models, such as latent semantic indexing, it is feasible to group similar technology descriptions by exploiting the latent structure of term usage within textual documents. Once the documents have been semantically clustered (or grouped), they can be classified based on the TRL scores of (known) nearest-neighbor documents. Three automated (no human curation) strategies for assigning TRLs to documents are discussed, with accuracies as high as 86% achieved for two-class problems.
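
The pipeline outlined in the abstract (latent semantic indexing followed by nearest-neighbor classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the two class labels "low" and "high", the raw term-count weighting, the rank k=2, the cosine similarity measure, and all function names are assumptions made for illustration only.

```python
from collections import Counter
import numpy as np

# Invented toy corpus: three "low readiness" and three "high readiness" texts.
DOCS = [
    "concept study basic principles observed",
    "laboratory concept formulated basic research",
    "early concept laboratory experiments principles",
    "system flight proven operational deployment",
    "operational system qualified flight tests",
    "prototype system demonstrated operational environment",
]
LABELS = ["low", "low", "low", "high", "high", "high"]

def term_document_matrix(docs):
    """Raw term-count matrix (terms x documents) and the term -> row map."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, index

def lsi_fit(A, k):
    """Truncated SVD: keep the k largest singular values and left vectors."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k]

def fold_in(U_k, s_k, term_vec):
    """Project a term-count vector into the k-dimensional latent space."""
    return (U_k.T @ term_vec) / s_k

def knn_label(query_vec, doc_vecs, labels, n_neighbors=3):
    """Majority label among the n_neighbors most cosine-similar documents."""
    eps = 1e-12  # guards against division by zero for empty vectors
    sims = (doc_vecs @ query_vec) / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + eps
    )
    nearest = np.argsort(-sims)[:n_neighbors]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def classify(text, docs, labels, k=2, n_neighbors=3):
    """Label a new text from the labels of its nearest latent-space neighbors."""
    A, index = term_document_matrix(docs)
    U_k, s_k = lsi_fit(A, k)
    doc_vecs = np.array([fold_in(U_k, s_k, A[:, j]) for j in range(A.shape[1])])
    q = np.zeros(A.shape[0])
    for w in text.lower().split():
        if w in index:  # words outside the vocabulary are ignored
            q[index[w]] += 1.0
    return knn_label(fold_in(U_k, s_k, q), doc_vecs, labels, n_neighbors)

print(classify("basic laboratory concept research", DOCS, LABELS))
print(classify("operational flight system", DOCS, LABELS))
```

On a real collection the rank k and the number of neighbors would need tuning, and a weighted term matrix (for example tf-idf) is a common substitute for raw counts.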

Similar documents (author)

  1. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 6.49
    6.4855547 = sum of:
      6.4855547 = sum of:
        3.2003305 = weight(author_txt:browne in 5777) [ClassicSimilarity], result of:
          3.2003305 = score(doc=5777,freq=1.0), product of:
            0.7009094 = queryWeight, product of:
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.07675363 = queryNorm
            4.565969 = fieldWeight in 5777, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.5 = fieldNorm(doc=5777)
        3.2852242 = weight(author_txt:berry in 5777) [ClassicSimilarity], result of:
          3.2852242 = score(doc=5777,freq=1.0), product of:
            0.71325034 = queryWeight, product of:
              1.0087651 = boost
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.07675363 = queryNorm
            4.6059904 = fieldWeight in 5777, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.5 = fieldNorm(doc=5777)
    
  2. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 6.49
    6.4855547 = sum of:
      6.4855547 = sum of:
        3.2003305 = weight(author_txt:browne in 7) [ClassicSimilarity], result of:
          3.2003305 = score(doc=7,freq=1.0), product of:
            0.7009094 = queryWeight, product of:
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.07675363 = queryNorm
            4.565969 = fieldWeight in 7, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.5 = fieldNorm(doc=7)
        3.2852242 = weight(author_txt:berry in 7) [ClassicSimilarity], result of:
          3.2852242 = score(doc=7,freq=1.0), product of:
            0.71325034 = queryWeight, product of:
              1.0087651 = boost
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.07675363 = queryNorm
            4.6059904 = fieldWeight in 7, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.211981 = idf(docFreq=11, maxDocs=44218)
              0.5 = fieldNorm(doc=7)
    
  3. Berry, J.: CD-ROM: the medium for the moment (1992) 2.05
    2.0532653 = sum of:
      2.0532653 = product of:
        4.1065307 = sum of:
          4.1065307 = weight(author_txt:berry in 3635) [ClassicSimilarity], result of:
            4.1065307 = score(doc=3635,freq=1.0), product of:
              0.71325034 = queryWeight, product of:
                1.0087651 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.07675363 = queryNorm
              5.7574883 = fieldWeight in 3635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.625 = fieldNorm(doc=3635)
        0.5 = coord(1/2)
    
  4. Browne, G.: Scope notes for LISA subject headings (1992) 2.00
    2.0002065 = sum of:
      2.0002065 = product of:
        4.000413 = sum of:
          4.000413 = weight(author_txt:browne in 1430) [ClassicSimilarity], result of:
            4.000413 = score(doc=1430,freq=1.0), product of:
              0.7009094 = queryWeight, product of:
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.07675363 = queryNorm
              5.7074614 = fieldWeight in 1430, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.625 = fieldNorm(doc=1430)
        0.5 = coord(1/2)
    
  5. Browne, G.: Professional liability of indexers (1996) 2.00
    2.0002065 = sum of:
      2.0002065 = product of:
        4.000413 = sum of:
          4.000413 = weight(author_txt:browne in 3643) [ClassicSimilarity], result of:
            4.000413 = score(doc=3643,freq=1.0), product of:
              0.7009094 = queryWeight, product of:
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.07675363 = queryNorm
              5.7074614 = fieldWeight in 3643, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.625 = fieldNorm(doc=3643)
        0.5 = coord(1/2)
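
The per-term arithmetic in the explain trees above can be reproduced with a short sketch. It assumes the standard Lucene ClassicSimilarity formulas, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq); the function names are hypothetical, and queryNorm, fieldNorm, and boost are copied from the listing rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """score = queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values taken from entry 1 of the author list (doc=5777).
browne = term_score(freq=1.0, doc_freq=12, max_docs=44218,
                    query_norm=0.07675363, field_norm=0.5)
berry = term_score(freq=1.0, doc_freq=11, max_docs=44218,
                   query_norm=0.07675363, field_norm=0.5, boost=1.0087651)

# The two term scores add up to the document score shown in the listing.
print(browne, berry, browne + berry)
```

The results agree with the listed 3.2003305, 3.2852242, and 6.4855547 up to single-precision rounding. For entries 3 to 5, where only one of the two author terms matches, the listing additionally multiplies the sum by coord(1/2) = 0.5.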
    

Similar documents (content)

  1. Sun, J.: Why different people prefer different systems for different tasks : an activity perspective on technology adoption in a dynamic user environment (2012) 0.07
    0.07215293 = sum of:
      0.07215293 = product of:
        0.9019116 = sum of:
          0.079044715 = weight(abstract_txt:technology in 4961) [ClassicSimilarity], result of:
            0.079044715 = score(doc=4961,freq=1.0), product of:
              0.23606804 = queryWeight, product of:
                3.2239468 = boost
                4.2859354 = idf(docFreq=1653, maxDocs=44218)
                0.017084556 = queryNorm
              0.3348387 = fieldWeight in 4961, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2859354 = idf(docFreq=1653, maxDocs=44218)
                0.078125 = fieldNorm(doc=4961)
          0.8228669 = weight(abstract_txt:readiness in 4961) [ClassicSimilarity], result of:
            0.8228669 = score(doc=4961,freq=4.0), product of:
              0.5980059 = queryWeight, product of:
                3.9746387 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.017084556 = queryNorm
              1.376018 = fieldWeight in 4961, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=4961)
        0.08 = coord(2/25)
    
  2. Kim, J.-H.; Choi, K.-S.: Patent document categorization based on semantic structural information (2007) 0.06
    0.057418406 = sum of:
      0.057418406 = product of:
        0.35886505 = sum of:
          0.052237015 = weight(abstract_txt:semantically in 933) [ClassicSimilarity], result of:
            0.052237015 = score(doc=933,freq=1.0), product of:
              0.121541396 = queryWeight, product of:
                1.0345379 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.017084556 = queryNorm
              0.42978784 = fieldWeight in 933, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0625 = fieldNorm(doc=933)
          0.109758645 = weight(abstract_txt:clustered in 933) [ClassicSimilarity], result of:
            0.109758645 = score(doc=933,freq=2.0), product of:
              0.15825392 = queryWeight, product of:
                1.1804879 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.017084556 = queryNorm
              0.69356036 = fieldWeight in 933, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=933)
          0.08669132 = weight(abstract_txt:nearest in 933) [ClassicSimilarity], result of:
            0.08669132 = score(doc=933,freq=1.0), product of:
              0.1703684 = queryWeight, product of:
                1.2248385 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.017084556 = queryNorm
              0.5088462 = fieldWeight in 933, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.0625 = fieldNorm(doc=933)
          0.11017806 = weight(abstract_txt:documents in 933) [ClassicSimilarity], result of:
            0.11017806 = score(doc=933,freq=6.0), product of:
              0.17462441 = queryWeight, product of:
                2.480086 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017084556 = queryNorm
              0.63094306 = fieldWeight in 933, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=933)
        0.16 = coord(4/25)
    
  3. Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.06
    0.0567148 = sum of:
      0.0567148 = product of:
        0.3544675 = sum of:
          0.09229908 = weight(abstract_txt:investment in 393) [ClassicSimilarity], result of:
            0.09229908 = score(doc=393,freq=2.0), product of:
              0.1409917 = queryWeight, product of:
                1.114246 = boost
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.017084556 = queryNorm
              0.6546419 = fieldWeight in 393, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.406428 = idf(docFreq=72, maxDocs=44218)
                0.0625 = fieldNorm(doc=393)
          0.08669132 = weight(abstract_txt:nearest in 393) [ClassicSimilarity], result of:
            0.08669132 = score(doc=393,freq=1.0), product of:
              0.1703684 = queryWeight, product of:
                1.2248385 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.017084556 = queryNorm
              0.5088462 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.0625 = fieldNorm(doc=393)
          0.1118658 = weight(abstract_txt:neighbor in 393) [ClassicSimilarity], result of:
            0.1118658 = score(doc=393,freq=1.0), product of:
              0.20193125 = queryWeight, product of:
                1.3334786 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.017084556 = queryNorm
              0.55397964 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.0625 = fieldNorm(doc=393)
          0.06361133 = weight(abstract_txt:documents in 393) [ClassicSimilarity], result of:
            0.06361133 = score(doc=393,freq=2.0), product of:
              0.17462441 = queryWeight, product of:
                2.480086 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017084556 = queryNorm
              0.36427513 = fieldWeight in 393, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=393)
        0.16 = coord(4/25)
    
  4. Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.05
    0.054364726 = sum of:
      0.054364726 = product of:
        0.33977956 = sum of:
          0.07761108 = weight(abstract_txt:clustered in 979) [ClassicSimilarity], result of:
            0.07761108 = score(doc=979,freq=1.0), product of:
              0.15825392 = queryWeight, product of:
                1.1804879 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.017084556 = queryNorm
              0.49042124 = fieldWeight in 979, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=979)
          0.08669132 = weight(abstract_txt:nearest in 979) [ClassicSimilarity], result of:
            0.08669132 = score(doc=979,freq=1.0), product of:
              0.1703684 = queryWeight, product of:
                1.2248385 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.017084556 = queryNorm
              0.5088462 = fieldWeight in 979, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.0625 = fieldNorm(doc=979)
          0.1118658 = weight(abstract_txt:neighbor in 979) [ClassicSimilarity], result of:
            0.1118658 = score(doc=979,freq=1.0), product of:
              0.20193125 = queryWeight, product of:
                1.3334786 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.017084556 = queryNorm
              0.55397964 = fieldWeight in 979, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.0625 = fieldNorm(doc=979)
          0.06361133 = weight(abstract_txt:documents in 979) [ClassicSimilarity], result of:
            0.06361133 = score(doc=979,freq=2.0), product of:
              0.17462441 = queryWeight, product of:
                2.480086 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017084556 = queryNorm
              0.36427513 = fieldWeight in 979, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=979)
        0.16 = coord(4/25)
    
  5. Mu, T.; Goulermas, J.Y.; Korkontzelos, I.; Ananiadou, S.: Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities (2016) 0.05
    0.05284587 = sum of:
      0.05284587 = product of:
        0.33028668 = sum of:
          0.04717796 = weight(abstract_txt:scores in 2496) [ClassicSimilarity], result of:
            0.04717796 = score(doc=2496,freq=1.0), product of:
              0.11356158 = queryWeight, product of:
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.017084556 = queryNorm
              0.41543946 = fieldWeight in 2496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.0625 = fieldNorm(doc=2496)
          0.052237015 = weight(abstract_txt:semantically in 2496) [ClassicSimilarity], result of:
            0.052237015 = score(doc=2496,freq=1.0), product of:
              0.121541396 = queryWeight, product of:
                1.0345379 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.017084556 = queryNorm
              0.42978784 = fieldWeight in 2496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0625 = fieldNorm(doc=2496)
          0.1118658 = weight(abstract_txt:neighbor in 2496) [ClassicSimilarity], result of:
            0.1118658 = score(doc=2496,freq=1.0), product of:
              0.20193125 = queryWeight, product of:
                1.3334786 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.017084556 = queryNorm
              0.55397964 = fieldWeight in 2496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.0625 = fieldNorm(doc=2496)
          0.119005896 = weight(abstract_txt:documents in 2496) [ClassicSimilarity], result of:
            0.119005896 = score(doc=2496,freq=7.0), product of:
              0.17462441 = queryWeight, product of:
                2.480086 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017084556 = queryNorm
              0.6814963 = fieldWeight in 2496, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2496)
        0.16 = coord(4/25)
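
The content-based scores above combine several matching terms and then scale the sum by a coordination factor, coord(matched/total). A minimal sketch of that combination follows, again assuming the ClassicSimilarity formulas; the function names and the tuple layout are hypothetical, and the numeric inputs are copied from entry 1 of the content list (doc=4961).

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.017084556  # copied from the listing, not derived here

def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """queryWeight * fieldWeight for one matching term."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

def document_score(matches, total_query_terms):
    """Sum the term scores, then scale by coord = matched / total."""
    coord = len(matches) / total_query_terms
    return coord * sum(term_score(*m) for m in matches)

# (freq, docFreq, boost, fieldNorm) for the two terms matched in doc 4961.
matches_4961 = [
    (1.0, 1653, 3.2239468, 0.078125),  # abstract_txt:technology
    (4.0, 17, 3.9746387, 0.078125),    # abstract_txt:readiness
]

# 2 of the 25 query terms match, so coord = 2/25 = 0.08.
print(document_score(matches_4961, total_query_terms=25))
```

The result agrees with the listed 0.07215293 up to rounding. Because coord penalizes documents that match few query terms, a document matching 4 of 25 terms (coord 0.16) can still rank below one matching only 2 terms when the latter contains a rare, heavily weighted term such as "readiness".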