Search (6 results, page 1 of 1)

  • author_ss:"Koppel, M."
  1. Stover, J.A.; Winter, Y.; Koppel, M.; Kestemont, M.: Computational authorship verification method attributes a new work to a major 2nd century African author (2016) 0.05
    
    Abstract
    We discuss a real-world application of a recently proposed machine learning method for authorship verification. Authorship verification is considered an extremely difficult task in computational text classification, because it does not assume that the correct author of an anonymous text is included among the available candidates. To determine whether two documents have been written by the same author, the verification method discussed uses repeated feature subsampling and a pool of impostor authors. We use this technique to attribute a newly discovered Latin text from antiquity (the Compendiosa expositio) to Apuleius. This North African writer was one of the most important authors of the Roman Empire in the 2nd century and authored one of the world's first novels. This attribution has profound and wide-reaching cultural value, because it has been over a century since a new text by a major author from antiquity was discovered. This research therefore illustrates the rapidly growing potential of computational methods for studying the global textual heritage.
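
The verification technique summarized above (repeated feature subsampling against a pool of impostors) can be sketched roughly as follows. This is a minimal illustration, not the published method: the dict-based feature representation, cosine similarity, and all default parameters (`n_trials`, `frac`, `threshold`) are assumptions made for the sketch.

```python
import math
import random

def cosine(a, b, features):
    """Cosine similarity between two feature-weight dicts, restricted to `features`."""
    dot = sum(a.get(f, 0.0) * b.get(f, 0.0) for f in features)
    na = math.sqrt(sum(a.get(f, 0.0) ** 2 for f in features))
    nb = math.sqrt(sum(b.get(f, 0.0) ** 2 for f in features))
    return dot / (na * nb) if na and nb else 0.0

def verify_same_author(doc_x, doc_y, impostors, n_trials=100, frac=0.5, threshold=0.6):
    """Accept 'same author' if doc_y outscores every impostor as the closest
    match to doc_x in at least `threshold` of the feature-subsampling trials."""
    features = sorted(set(doc_x) | set(doc_y))
    k = max(1, int(len(features) * frac))
    wins = 0
    for _ in range(n_trials):
        subset = random.sample(features, k)  # keep only a random feature subset
        score_y = cosine(doc_x, doc_y, subset)
        best_impostor = max(cosine(doc_x, imp, subset) for imp in impostors)
        if score_y > best_impostor:
            wins += 1
    return wins / n_trials >= threshold
```

The repeated subsampling is the key idea: a genuine same-author pair tends to stay closer than the impostors under many different feature subsets, while a spurious match wins only under a few.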
  2. Koppel, M.; Schweitzer, N.: Measuring direct and indirect authorial influence in historical corpora (2014) 0.04
    
    Abstract
    We show how automatically extracted citations in historical corpora can be used to measure the direct and indirect influence of authors on each other. These measures can in turn be used to determine an author's overall prominence in the corpus and to identify distinct schools of thought. We apply our methods to two major historical corpora. Using scholarly consensus as a gold standard, we demonstrate empirically the superiority of indirect influence over direct influence as a basis for various measures of authorial impact.
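
The abstract does not spell out the influence formulas; one plausible reading, sketched here as an assumption, is that direct influence counts citation edges pointing at an author, while indirect influence also credits longer citation chains, discounted by a decay factor per hop.

```python
def direct_influence(citations, author):
    """Direct influence: number of citation edges pointing at `author`.
    `citations` is a list of (citing_author, cited_author) pairs."""
    return sum(1 for _, cited in citations if cited == author)

def indirect_influence(citations, author, decay=0.5, max_len=4):
    """Indirect influence: every citation chain ending at `author` contributes
    decay**(chain_length - 1), so longer chains count for less."""
    citers = {}  # cited author -> list of authors who cite them
    for citing, cited in citations:
        citers.setdefault(cited, []).append(citing)
    total = 0.0
    frontier = [(author, 0)]  # (node, chain length so far)
    while frontier:
        node, depth = frontier.pop()
        if depth >= max_len:
            continue
        for citer in citers.get(node, []):
            total += decay ** depth
            frontier.append((citer, depth + 1))
    return total
```

With citations [("B", "A"), ("C", "B"), ("D", "C")], A's direct influence is 1, but the chains C→B→A and D→C→B→A lift its indirect influence to 1 + 0.5 + 0.25 = 1.75, which is the sense in which indirect influence captures prominence that edge counts miss.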
  3. Koppel, M.; Schler, J.; Argamon, S.: Computational methods in authorship attribution (2009) 0.02
    
    Abstract
    Statistical authorship attribution has a long history, culminating in the use of modern machine learning classification methods. Nevertheless, most of this work suffers from the limitation of assuming a small closed set of candidate authors and essentially unlimited training text for each. Real-life authorship attribution problems, however, typically fall short of this ideal. Thus, following detailed discussion of previous work, three scenarios are considered here for which solutions to the basic attribution problem are inadequate. In the first variant, the profiling problem, there is no candidate set at all; in this case, the challenge is to provide as much demographic or psychological information as possible about the author. In the second variant, the needle-in-a-haystack problem, there are many thousands of candidates for each of whom we might have a very limited writing sample. In the third variant, the verification problem, there is no closed candidate set but there is one suspect; in this case, the challenge is to determine if the suspect is or is not the author. For each variant, it is shown how machine learning methods can be adapted to handle the special challenges of that variant.
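
The needle-in-a-haystack variant described above reduces, in its simplest form, to ranking a large candidate pool by stylistic similarity to the anonymous text. A minimal sketch, where character 4-gram profiles and cosine similarity are illustrative assumptions rather than the survey's prescribed features:

```python
import math
from collections import Counter

def char_ngrams(text, n=4):
    """Character n-gram frequency profile of a text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two n-gram Counters."""
    dot = sum(a[g] * b[g] for g in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_candidates(anonymous_text, candidates):
    """Rank candidate authors by profile similarity to the anonymous text.
    `candidates` maps author name -> known writing sample."""
    anon = char_ngrams(anonymous_text)
    scores = {name: cosine(anon, char_ngrams(sample))
              for name, sample in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

With thousands of candidates and tiny samples, the challenge the abstract points to is that the top-ranked candidate may still be wrong, so a practical system must also decide when to return "none of the above".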
  4. Koppel, M.; Akiva, N.; Dagan, I.: Feature instability as a criterion for selecting potential style markers (2006) 0.01
    
  5. Akiva, N.; Koppel, M.: A generic unsupervised method for decomposing multi-author documents (2013) 0.01
    
  6. Koppel, M.; Winter, Y.: Determining if two documents are written by the same author (2014) 0.01