Search (3 results, page 1 of 1)

Chen, C.; Cribbin, T.; Macredie, R.; Morar, S.: Visualizing and tracking the growth of competing paradigms : two case studies (2002) 0.00
```
0.0026849252 = product of:
  0.0053698504 = sum of:
    0.0053698504 = product of:
      0.010739701 = sum of:
        0.010739701 = weight(_text_:a in 602) [ClassicSimilarity], result of:
          0.010739701 = score(doc=602,freq=14.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20223314 = fieldWeight in 602, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=602)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
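The score tree above can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function name and its parameters are hypothetical and chosen only for illustration. The input values are copied from the tree for doc 602.

```python
import math

def classic_similarity_score(freq, doc_freq, max_docs, field_norm,
                             query_norm, coords=(0.5, 0.5)):
    """Recompute a single-term ClassicSimilarity score (illustrative helper)."""
    tf = math.sqrt(freq)                             # tf(freq=14.0) -> about 3.7417
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                  # queryWeight
    field_weight = tf * idf * field_norm             # fieldWeight
    score = query_weight * field_weight              # weight(_text_:a in doc)
    for c in coords:                                 # the two coord(1/2) factors
        score *= c
    return score

# Values taken from the explain output for doc 602.
score_602 = classic_similarity_score(
    freq=14.0, doc_freq=37942, max_docs=44218,
    field_norm=0.046875, query_norm=0.046056706,
)
print(f"{score_602:.10f}")
```

The result should land very close to the 0.0026849252 reported at the top of the tree; small differences in the last digits are expected because Lucene computes in single precision. Both the idf and the two coord factors explain why the displayed score rounds to 0.00: the only matched term, `a`, occurs in 37942 of 44218 documents and so carries almost no weight.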
Abstract

In this article we demonstrate the use of an integrative approach to visualizing and tracking the development of scientific paradigms. This approach is designed to reveal the long-term process of competing scientific paradigms. We assume that a cluster of highly cited and cocited scientific publications in a cocitation network represents the core of a predominant scientific paradigm. The growth of a paradigm is depicted and animated through the rise of citation rates and the movement of its core cluster towards the center of the cocitation network. We study two cases of competing scientific paradigms in the real world: (1) the causes of mass extinctions, and (2) the connections between mad cow disease and a new variant of a brain disease in humans, vCJD. Various theoretical and practical issues concerning this approach are discussed.

Type

a
Cribbin, T.: Discovering latent topical structure by second-order similarity analysis (2011) 0.00
```
0.0020714647 = product of:
  0.0041429293 = sum of:
    0.0041429293 = product of:
      0.008285859 = sum of:
        0.008285859 = weight(_text_:a in 4470) [ClassicSimilarity], result of:
          0.008285859 = score(doc=4470,freq=12.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.15602624 = fieldWeight in 4470, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4470)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Computing document similarity directly from a "bag of words" vector space model can be problematic because term independence causes the relationships between synonymous terms and the contextual influences that determine the sense of polysemous terms to be ignored. This study compares two methods that potentially address these problems by deriving the higher order relationships that lie latent within the original first-order space. The first is latent semantic analysis (LSA), a dimension reduction method that is a well-known means of addressing the vocabulary mismatch problem in information retrieval systems. The second is the lesser known yet conceptually simple approach of second-order similarity (SOS) analysis, whereby latent similarity is measured in terms of mutual first-order similarity. Nearest neighbour tests show that SOS analysis derives similarity models that are superior to both first-order and LSA-derived models at both coarse and fine levels of semantic granularity. SOS analysis has been criticized for its computational complexity. A second contribution is the novel application of vector truncation to reduce run-time by a constant factor. Speed-ups of 4 to 10 times are achievable without compromising the structural gains achieved by full-vector SOS analysis.

Type

a
Westerman, S.J.; Cribbin, T.; Collins, J.: Human assessments of document similarity (2010) 0.00
```
0.001757696 = product of:
  0.003515392 = sum of:
    0.003515392 = product of:
      0.007030784 = sum of:
        0.007030784 = weight(_text_:a in 3915) [ClassicSimilarity], result of:
          0.007030784 = score(doc=3915,freq=6.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.13239266 = fieldWeight in 3915, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=3915)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Two studies are reported that examined the reliability of human assessments of document similarity and the association between human ratings and the results of n-gram automatic text analysis (ATA). Human interassessor reliability (IAR) was moderate to poor. However, correlations between average human ratings and n-gram solutions were strong. The average correlation between ATA and individual human solutions was greater than IAR. N-gram length influenced the strength of association, but optimum string length depended on the nature of the text (technical vs. nontechnical). We conclude that the methodology applied in previous studies may have led to overoptimistic views on human reliability, but that an optimal n-gram solution can provide a good approximation of the average human assessment of document similarity, a result that has important implications for future development of document visualization systems.

Type

a
