Document (#33386)

Author
Britt, B.L.
Berry, M.W.
Browne, M.
Merrell, M.A.
Kolpack, J.
Title
Document classification techniques for automated technology readiness level analysis
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.4, p.675-680
Year
2008
Abstract
The overhead of assessing technology readiness for deployment and investment purposes can be costly to both large and small businesses. Recent advances in the automatic interpretation of technology readiness levels (TRLs) of a given technology can substantially reduce the risk and associated cost of bringing these new technologies to market. Using vector-space information-retrieval models, such as latent semantic indexing, it is feasible to group similar technology descriptions by exploiting the latent structure of term usage within textual documents. Once the documents have been semantically clustered (or grouped), they can be classified based on the TRL scores of (known) nearest-neighbor documents. Three automated (no human curation) strategies for assigning TRLs to documents are discussed, with accuracies as high as 86% achieved for two-class problems.
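
The nearest-neighbor step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it skips the latent semantic indexing projection and compares raw term-count vectors by cosine similarity, then takes a majority vote among the k closest labeled documents. The function names, the toy documents, and the "low"/"high" labels (standing in for a two-class TRL problem) are all hypothetical.

```python
import math
from collections import Counter

def tf_vector(text):
    """Map lowercased whitespace tokens to raw term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify_knn(query, labeled_docs, k=3):
    """Return the majority label among the k labeled documents most similar to query."""
    q = tf_vector(query)
    ranked = sorted(labeled_docs,
                    key=lambda d: cosine(q, tf_vector(d[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two classes standing in for low and high TRL.
labeled = [
    ("concept study basic principles observed", "low"),
    ("laboratory analysis of basic concept", "low"),
    ("early concept formulated in laboratory", "low"),
    ("system qualified and flight proven in operation", "high"),
    ("prototype demonstrated in operational environment", "high"),
    ("system proven through successful operation", "high"),
]

print(classify_knn("basic concept observed in laboratory", labeled))  # low
print(classify_knn("system proven in operation", labeled))            # high
```

In a fuller version, the term-count vectors would first be projected into a low-rank space (for example by a truncated SVD) before the similarity comparison, which is the part this sketch leaves out.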

Similar documents (author)

  1. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 6.49
    6.4898148 = sum of:
      6.4898148 = sum of:
        3.2449074 = weight(author_txt:browne in 778) [ClassicSimilarity], result of:
          3.2449074 = score(doc=778,freq=1.0), product of:
            0.70710677 = queryWeight, product of:
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.0770438 = queryNorm
            4.588992 = fieldWeight in 778, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.5 = fieldNorm(doc=778)
        3.2449074 = weight(author_txt:berry in 778) [ClassicSimilarity], result of:
          3.2449074 = score(doc=778,freq=1.0), product of:
            0.70710677 = queryWeight, product of:
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.0770438 = queryNorm
            4.588992 = fieldWeight in 778, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.5 = fieldNorm(doc=778)
    
  2. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 6.49
    6.4898148 = sum of:
      6.4898148 = sum of:
        3.2449074 = weight(author_txt:browne in 2008) [ClassicSimilarity], result of:
          3.2449074 = score(doc=2008,freq=1.0), product of:
            0.70710677 = queryWeight, product of:
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.0770438 = queryNorm
            4.588992 = fieldWeight in 2008, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.5 = fieldNorm(doc=2008)
        3.2449074 = weight(author_txt:berry in 2008) [ClassicSimilarity], result of:
          3.2449074 = score(doc=2008,freq=1.0), product of:
            0.70710677 = queryWeight, product of:
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.0770438 = queryNorm
            4.588992 = fieldWeight in 2008, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.177984 = idf(docFreq=11, maxDocs=42740)
              0.5 = fieldNorm(doc=2008)
    
  3. Berry, J.: CD-ROM: the medium for the moment (1992) 2.03
    2.028067 = sum of:
      2.028067 = product of:
        4.056134 = sum of:
          4.056134 = weight(author_txt:berry in 3635) [ClassicSimilarity], result of:
            4.056134 = score(doc=3635,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.0770438 = queryNorm
              5.7362404 = fieldWeight in 3635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.625 = fieldNorm(doc=3635)
        0.5 = coord(1/2)
    
  4. Browne, G.: Scope notes for LISA subject headings (1992) 2.03
    2.028067 = sum of:
      2.028067 = product of:
        4.056134 = sum of:
          4.056134 = weight(author_txt:browne in 1499) [ClassicSimilarity], result of:
            4.056134 = score(doc=1499,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.0770438 = queryNorm
              5.7362404 = fieldWeight in 1499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.625 = fieldNorm(doc=1499)
        0.5 = coord(1/2)
    
  5. Browne, G.: Professional liability of indexers (1996) 2.03
    2.028067 = sum of:
      2.028067 = product of:
        4.056134 = sum of:
          4.056134 = weight(author_txt:browne in 4644) [ClassicSimilarity], result of:
            4.056134 = score(doc=4644,freq=1.0), product of:
              0.70710677 = queryWeight, product of:
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.0770438 = queryNorm
              5.7362404 = fieldWeight in 4644, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.177984 = idf(docFreq=11, maxDocs=42740)
                0.625 = fieldNorm(doc=4644)
        0.5 = coord(1/2)
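
The author-match scores above follow the classic Lucene TF-IDF arithmetic shown in the explain trees. The sketch below recomputes two of them from the listed inputs, assuming the usual ClassicSimilarity definitions idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The function names are illustrative and are not Lucene API calls.

```python
import math

def idf(doc_freq, max_docs):
    """Inverse document frequency as used by classic TF-IDF scoring."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one matched term: queryWeight * fieldWeight."""
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm                   # 0.70710677 in the trees above
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight

MAX_DOCS = 42740
QUERY_NORM = 0.0770438

# Results 1 and 2: both author terms (berry, browne) match, fieldNorm = 0.5.
both = 2 * term_score(1.0, 11, MAX_DOCS, QUERY_NORM, 0.5)

# Results 3 to 5: one of two author terms matches, fieldNorm = 0.625,
# and the sum is scaled by coord(1/2).
one = term_score(1.0, 11, MAX_DOCS, QUERY_NORM, 0.625) * (1 / 2)

print(round(both, 2))  # 6.49
print(round(one, 2))   # 2.03
```

The coord factor is what separates the two groups: a record matching only one of the two queried author names keeps half of its summed term score.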
    

Similar documents (content)

  1. Sun, J.: Why different people prefer different systems for different tasks : an activity perspective on technology adoption in a dynamic user environment (2012) 0.07
    0.073224775 = sum of:
      0.073224775 = product of:
        0.9153097 = sum of:
          0.07865923 = weight(abstract_txt:technology in 1962) [ClassicSimilarity], result of:
            0.07865923 = score(doc=1962,freq=1.0), product of:
              0.23437658 = queryWeight, product of:
                3.2223382 = boost
                4.2958136 = idf(docFreq=1582, maxDocs=42740)
                0.016931586 = queryNorm
              0.33561045 = fieldWeight in 1962, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2958136 = idf(docFreq=1582, maxDocs=42740)
                0.078125 = fieldNorm(doc=1962)
          0.8366505 = weight(abstract_txt:readiness in 1962) [ClassicSimilarity], result of:
            0.8366505 = score(doc=1962,freq=4.0), product of:
              0.60229266 = queryWeight, product of:
                4.0012293 = boost
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.016931586 = queryNorm
              1.3891096 = fieldWeight in 1962, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.078125 = fieldNorm(doc=1962)
        0.08 = coord(2/25)
    
  2. Kim, J.-H.; Choi, K.-S.: Patent document categorization based on semantic structural information (2007) 0.06
    0.056709 = sum of:
      0.056709 = product of:
        0.35443124 = sum of:
          0.0515973 = weight(abstract_txt:semantically in 2934) [ClassicSimilarity], result of:
            0.0515973 = score(doc=2934,freq=1.0), product of:
              0.12007403 = queryWeight, product of:
                1.0314628 = boost
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.016931586 = queryNorm
              0.4297124 = fieldWeight in 2934, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.0625 = fieldNorm(doc=2934)
          0.10795443 = weight(abstract_txt:clustered in 2934) [ClassicSimilarity], result of:
            0.10795443 = score(doc=2934,freq=2.0), product of:
              0.1559007 = queryWeight, product of:
                1.1753117 = boost
                7.834249 = idf(docFreq=45, maxDocs=42740)
                0.016931586 = queryNorm
              0.6924563 = fieldWeight in 2934, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.834249 = idf(docFreq=45, maxDocs=42740)
                0.0625 = fieldNorm(doc=2934)
          0.086461455 = weight(abstract_txt:nearest in 2934) [ClassicSimilarity], result of:
            0.086461455 = score(doc=2934,freq=1.0), product of:
              0.16939975 = queryWeight, product of:
                1.2251391 = boost
                8.166383 = idf(docFreq=32, maxDocs=42740)
                0.016931586 = queryNorm
              0.5103989 = fieldWeight in 2934, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.166383 = idf(docFreq=32, maxDocs=42740)
                0.0625 = fieldNorm(doc=2934)
          0.108418055 = weight(abstract_txt:documents in 2934) [ClassicSimilarity], result of:
            0.108418055 = score(doc=2934,freq=6.0), product of:
              0.17208184 = queryWeight, product of:
                2.4695995 = boost
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.016931586 = queryNorm
              0.6300377 = fieldWeight in 2934, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.0625 = fieldNorm(doc=2934)
        0.16 = coord(4/25)
    
  3. Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.06
    0.056255568 = sum of:
      0.056255568 = product of:
        0.3515973 = sum of:
          0.09098725 = weight(abstract_txt:investment in 394) [ClassicSimilarity], result of:
            0.09098725 = score(doc=394,freq=2.0), product of:
              0.13910459 = queryWeight, product of:
                1.1101962 = boost
                7.400211 = idf(docFreq=70, maxDocs=42740)
                0.016931586 = queryNorm
              0.6540924 = fieldWeight in 394, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.400211 = idf(docFreq=70, maxDocs=42740)
                0.0625 = fieldNorm(doc=394)
          0.086461455 = weight(abstract_txt:nearest in 394) [ClassicSimilarity], result of:
            0.086461455 = score(doc=394,freq=1.0), product of:
              0.16939975 = queryWeight, product of:
                1.2251391 = boost
                8.166383 = idf(docFreq=32, maxDocs=42740)
                0.016931586 = queryNorm
              0.5103989 = fieldWeight in 394, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.166383 = idf(docFreq=32, maxDocs=42740)
                0.0625 = fieldNorm(doc=394)
          0.11155341 = weight(abstract_txt:neighbor in 394) [ClassicSimilarity], result of:
            0.11155341 = score(doc=394,freq=1.0), product of:
              0.20076422 = queryWeight, product of:
                1.3337431 = boost
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.016931586 = queryNorm
              0.55564386 = fieldWeight in 394, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.0625 = fieldNorm(doc=394)
          0.06259519 = weight(abstract_txt:documents in 394) [ClassicSimilarity], result of:
            0.06259519 = score(doc=394,freq=2.0), product of:
              0.17208184 = queryWeight, product of:
                2.4695995 = boost
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.016931586 = queryNorm
              0.36375242 = fieldWeight in 394, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.0625 = fieldNorm(doc=394)
        0.16 = coord(4/25)
    
  4. Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.05
    0.053911254 = sum of:
      0.053911254 = product of:
        0.33694535 = sum of:
          0.07633531 = weight(abstract_txt:clustered in 2980) [ClassicSimilarity], result of:
            0.07633531 = score(doc=2980,freq=1.0), product of:
              0.1559007 = queryWeight, product of:
                1.1753117 = boost
                7.834249 = idf(docFreq=45, maxDocs=42740)
                0.016931586 = queryNorm
              0.48964056 = fieldWeight in 2980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.834249 = idf(docFreq=45, maxDocs=42740)
                0.0625 = fieldNorm(doc=2980)
          0.086461455 = weight(abstract_txt:nearest in 2980) [ClassicSimilarity], result of:
            0.086461455 = score(doc=2980,freq=1.0), product of:
              0.16939975 = queryWeight, product of:
                1.2251391 = boost
                8.166383 = idf(docFreq=32, maxDocs=42740)
                0.016931586 = queryNorm
              0.5103989 = fieldWeight in 2980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.166383 = idf(docFreq=32, maxDocs=42740)
                0.0625 = fieldNorm(doc=2980)
          0.11155341 = weight(abstract_txt:neighbor in 2980) [ClassicSimilarity], result of:
            0.11155341 = score(doc=2980,freq=1.0), product of:
              0.20076422 = queryWeight, product of:
                1.3337431 = boost
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.016931586 = queryNorm
              0.55564386 = fieldWeight in 2980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.0625 = fieldNorm(doc=2980)
          0.06259519 = weight(abstract_txt:documents in 2980) [ClassicSimilarity], result of:
            0.06259519 = score(doc=2980,freq=2.0), product of:
              0.17208184 = queryWeight, product of:
                2.4695995 = boost
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.016931586 = queryNorm
              0.36375242 = fieldWeight in 2980, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.0625 = fieldNorm(doc=2980)
        0.16 = coord(4/25)
    
  5. Mu, T.; Goulermas, J.Y.; Korkontzelos, I.; Ananiadou, S.: Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities (2016) 0.05
    0.052529052 = sum of:
      0.052529052 = product of:
        0.3283066 = sum of:
          0.048051022 = weight(abstract_txt:scores in 4497) [ClassicSimilarity], result of:
            0.048051022 = score(doc=4497,freq=1.0), product of:
              0.11450721 = queryWeight, product of:
                1.0072689 = boost
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.016931586 = queryNorm
              0.41963315 = fieldWeight in 4497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.0625 = fieldNorm(doc=4497)
          0.0515973 = weight(abstract_txt:semantically in 4497) [ClassicSimilarity], result of:
            0.0515973 = score(doc=4497,freq=1.0), product of:
              0.12007403 = queryWeight, product of:
                1.0314628 = boost
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.016931586 = queryNorm
              0.4297124 = fieldWeight in 4497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8753986 = idf(docFreq=119, maxDocs=42740)
                0.0625 = fieldNorm(doc=4497)
          0.11155341 = weight(abstract_txt:neighbor in 4497) [ClassicSimilarity], result of:
            0.11155341 = score(doc=4497,freq=1.0), product of:
              0.20076422 = queryWeight, product of:
                1.3337431 = boost
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.016931586 = queryNorm
              0.55564386 = fieldWeight in 4497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.890302 = idf(docFreq=15, maxDocs=42740)
                0.0625 = fieldNorm(doc=4497)
          0.117104866 = weight(abstract_txt:documents in 4497) [ClassicSimilarity], result of:
            0.117104866 = score(doc=4497,freq=7.0), product of:
              0.17208184 = queryWeight, product of:
                2.4695995 = boost
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.016931586 = queryNorm
              0.68051845 = fieldWeight in 4497, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.0625 = fieldNorm(doc=4497)
        0.16 = coord(4/25)
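
The content-match scores add a per-term boost and a coord factor over the 25 query terms. As a hedged check of the arithmetic, the sketch below recomputes the score of result 1 (doc 1962) from the values printed in its explain tree, under the same assumed definitions of idf and tf as classic Lucene TF-IDF. The helper names are illustrative only.

```python
import math

MAX_DOCS = 42740          # maxDocs from the explain output
QUERY_NORM = 0.016931586  # queryNorm from the explain output

def idf(doc_freq, max_docs=MAX_DOCS):
    """idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def weighted_term(freq, doc_freq, boost, field_norm):
    """queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm)."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight

# Matched terms for result 1: (freq, docFreq, boost, fieldNorm)
matched = {
    "technology": (1.0, 1582, 3.2223382, 0.078125),
    "readiness": (4.0, 15, 4.0012293, 0.078125),
}

partial = {term: weighted_term(*args) for term, args in matched.items()}
coord = len(matched) / 25  # 2 of the 25 query terms matched
score = sum(partial.values()) * coord

print(round(score, 2))  # 0.07
```

Most of the score comes from "readiness": it is rare (docFreq=15), occurs four times in the abstract, and carries the larger boost, while the low coord of 2/25 keeps the overall score small.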