Document (#41433)

Author
Stathopoulos, Y.
Baker, S.
Rei, M.
Teufel, S.
Title
Variable typing : assigning meaning to variables in mathematical text
Source
Proceedings of NAACL-HLT 2018, Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, Louisiana, June 1 - 6, 2018. (Long Papers)
Imprint
Berlin : Springer
Year
2018
Pages
pp. 303-312
Abstract
Information about the meaning of mathematical variables in text is useful in NLP/IR tasks such as symbol disambiguation, topic modeling and mathematical information retrieval (MIR). We introduce variable typing, the task of assigning one mathematical type (multiword technical terms referring to mathematical concepts) to each variable in a sentence of mathematical text. As part of this work, we also introduce a new annotated data set composed of 33,524 data points extracted from scientific documents published on arXiv. Our intrinsic evaluation demonstrates that our data set is sufficient to successfully train and evaluate current classifiers from three different model architectures. The best performing model is evaluated on an extrinsic task: MIR, by producing a typed formula index. Our results show that the best performing MIR models make use of our typed index, compared to a formula index only containing raw symbols, thereby demonstrating the usefulness of variable typing.
Content
Cf.: http://aclweb.org/anthology/N18-1028.
Field
Mathematics

Similar documents (author)

  1. Baker, S.L.: Will fiction classification schemes increase use? (1988) 5.30
    5.298757 = sum of:
      5.298757 = weight(author_txt:baker in 1023) [ClassicSimilarity], result of:
        5.298757 = fieldWeight in 1023, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.625 = fieldNorm(doc=1023)
    
  2. Baker, B.: A conceptual framework for teaching online catalog use (1986) 5.30
    5.298757 = sum of:
      5.298757 = weight(author_txt:baker in 1576) [ClassicSimilarity], result of:
        5.298757 = fieldWeight in 1576, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.625 = fieldNorm(doc=1576)
    
  3. Baker, C.: A marriage of high-tech and fine art : the National Gallery's micro gallery project (1993) 5.30
    5.298757 = sum of:
      5.298757 = weight(author_txt:baker in 7007) [ClassicSimilarity], result of:
        5.298757 = fieldWeight in 7007, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.625 = fieldNorm(doc=7007)
    
  4. Baker, T.: A multilingual registry for Dublin Core elements and qualifiers (2000) 5.30
    5.298757 = sum of:
      5.298757 = weight(author_txt:baker in 4447) [ClassicSimilarity], result of:
        5.298757 = fieldWeight in 4447, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.625 = fieldNorm(doc=4447)
    
  5. Baker, T.: A grammar of Dublin Core (2000) 5.30
    5.298757 = sum of:
      5.298757 = weight(author_txt:baker in 1236) [ClassicSimilarity], result of:
        5.298757 = fieldWeight in 1236, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.625 = fieldNorm(doc=1236)
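
The author-match scores above can be reproduced by hand. The following is a minimal sketch, not the catalog's actual code: it assumes the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures shown in these explanations. The function names are hypothetical, and because Lucene works in single precision the result only matches the listed value to about five or six digits.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency, in the variant assumed here:
    # 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as listed in the explanation tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explanation of "author_txt:baker".
score = field_weight(freq=1.0, doc_freq=24, max_docs=44218, field_norm=0.625)

# Compare against the value printed in the record (5.298757).
assert math.isclose(score, 5.298757, rel_tol=1e-4)
print(score)
```

All five author matches share the same term frequency, document frequency and field norm, which is why they all receive the same score of 5.30.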
    

Similar documents (content)

  1. Aizawa, A.; Kohlhase, M.: Mathematical information retrieval (2021) 0.13
    0.12655717 = sum of:
      0.12655717 = product of:
        0.79098237 = sum of:
          0.04992833 = weight(abstract_txt:task in 667) [ClassicSimilarity], result of:
            0.04992833 = score(doc=667,freq=1.0), product of:
              0.092946395 = queryWeight, product of:
                1.3162117 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.014378393 = queryNorm
              0.5371734 = fieldWeight in 667, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.109375 = fieldNorm(doc=667)
          0.09683207 = weight(abstract_txt:introduce in 667) [ClassicSimilarity], result of:
            0.09683207 = score(doc=667,freq=1.0), product of:
              0.14454862 = queryWeight, product of:
                1.6414077 = boost
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.014378393 = queryNorm
              0.66989267 = fieldWeight in 667, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.109375 = fieldNorm(doc=667)
          0.18642814 = weight(abstract_txt:formula in 667) [ClassicSimilarity], result of:
            0.18642814 = score(doc=667,freq=1.0), product of:
              0.2237049 = queryWeight, product of:
                2.04196 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.014378393 = queryNorm
              0.8333664 = fieldWeight in 667, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.109375 = fieldNorm(doc=667)
          0.45779383 = weight(abstract_txt:mathematical in 667) [ClassicSimilarity], result of:
            0.45779383 = score(doc=667,freq=2.0), product of:
              0.46609905 = queryWeight, product of:
                5.1051583 = boost
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.014378393 = queryNorm
              0.98218143 = fieldWeight in 667, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.109375 = fieldNorm(doc=667)
        0.16 = coord(4/25)
    
  2. Ibáñez, A.; Armañanzas, R.; Bielza, C.; Larrañaga, P.: Genetic algorithms and Gaussian Bayesian networks to uncover the predictive core set of bibliometric indices (2016) 0.12
    0.11699087 = sum of:
      0.11699087 = product of:
        0.48746198 = sum of:
          0.015254852 = weight(abstract_txt:model in 3041) [ClassicSimilarity], result of:
            0.015254852 = score(doc=3041,freq=1.0), product of:
              0.061230134 = queryWeight, product of:
                1.0682971 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.014378393 = queryNorm
              0.24913962 = fieldWeight in 3041, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=3041)
          0.01897317 = weight(abstract_txt:data in 3041) [ClassicSimilarity], result of:
            0.01897317 = score(doc=3041,freq=2.0), product of:
              0.064338885 = queryWeight, product of:
                1.3411949 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014378393 = queryNorm
              0.29489428 = fieldWeight in 3041, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=3041)
          0.052645516 = weight(abstract_txt:best in 3041) [ClassicSimilarity], result of:
            0.052645516 = score(doc=3041,freq=3.0), product of:
              0.0969528 = queryWeight, product of:
                1.3442798 = boost
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.014378393 = queryNorm
              0.5430015 = fieldWeight in 3041, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.0625 = fieldNorm(doc=3041)
          0.08420864 = weight(abstract_txt:variables in 3041) [ClassicSimilarity], result of:
            0.08420864 = score(doc=3041,freq=2.0), product of:
              0.1517939 = queryWeight, product of:
                1.6820412 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.014378393 = queryNorm
              0.5547564 = fieldWeight in 3041, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0625 = fieldNorm(doc=3041)
          0.067013815 = weight(abstract_txt:index in 3041) [ClassicSimilarity], result of:
            0.067013815 = score(doc=3041,freq=3.0), product of:
              0.13035452 = queryWeight, product of:
                1.9090538 = boost
                4.74895 = idf(docFreq=1040, maxDocs=44218)
                0.014378393 = queryNorm
              0.5140889 = fieldWeight in 3041, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.74895 = idf(docFreq=1040, maxDocs=44218)
                0.0625 = fieldNorm(doc=3041)
          0.24936602 = weight(abstract_txt:variable in 3041) [ClassicSimilarity], result of:
            0.24936602 = score(doc=3041,freq=2.0), product of:
              0.39438286 = queryWeight, product of:
                3.8342772 = boost
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.014378393 = queryNorm
              0.63229424 = fieldWeight in 3041, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.0625 = fieldNorm(doc=3041)
        0.24 = coord(6/25)
    
  3. Malesios, C.: Some variations on the standard theoretical models for the h-index : a comparative analysis (2015) 0.10
    0.09985768 = sum of:
      0.09985768 = product of:
        0.6241105 = sum of:
          0.022882279 = weight(abstract_txt:model in 2267) [ClassicSimilarity], result of:
            0.022882279 = score(doc=2267,freq=1.0), product of:
              0.061230134 = queryWeight, product of:
                1.0682971 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.014378393 = queryNorm
              0.37370944 = fieldWeight in 2267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.09375 = fieldNorm(doc=2267)
          0.020124085 = weight(abstract_txt:data in 2267) [ClassicSimilarity], result of:
            0.020124085 = score(doc=2267,freq=1.0), product of:
              0.064338885 = queryWeight, product of:
                1.3411949 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014378393 = queryNorm
              0.31278262 = fieldWeight in 2267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=2267)
          0.10052073 = weight(abstract_txt:index in 2267) [ClassicSimilarity], result of:
            0.10052073 = score(doc=2267,freq=3.0), product of:
              0.13035452 = queryWeight, product of:
                1.9090538 = boost
                4.74895 = idf(docFreq=1040, maxDocs=44218)
                0.014378393 = queryNorm
              0.7711334 = fieldWeight in 2267, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.74895 = idf(docFreq=1040, maxDocs=44218)
                0.09375 = fieldNorm(doc=2267)
          0.48058343 = weight(abstract_txt:mathematical in 2267) [ClassicSimilarity], result of:
            0.48058343 = score(doc=2267,freq=3.0), product of:
              0.46609905 = queryWeight, product of:
                5.1051583 = boost
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.014378393 = queryNorm
              1.0310757 = fieldWeight in 2267, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.09375 = fieldNorm(doc=2267)
        0.16 = coord(4/25)
    
  4. Zhang, X.; Liu, J.; Cole, M.; Belkin, N.: Predicting users' domain knowledge in information retrieval using multiple regression analysis of search behaviors (2015) 0.09
    0.09169507 = sum of:
      0.09169507 = product of:
        0.3820628 = sum of:
          0.037366606 = weight(abstract_txt:model in 1822) [ClassicSimilarity], result of:
            0.037366606 = score(doc=1822,freq=6.0), product of:
              0.061230134 = queryWeight, product of:
                1.0682971 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.014378393 = queryNorm
              0.61026496 = fieldWeight in 1822, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=1822)
          0.040348187 = weight(abstract_txt:task in 1822) [ClassicSimilarity], result of:
            0.040348187 = score(doc=1822,freq=2.0), product of:
              0.092946395 = queryWeight, product of:
                1.3162117 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.014378393 = queryNorm
              0.43410167 = fieldWeight in 1822, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.0625 = fieldNorm(doc=1822)
          0.0134160565 = weight(abstract_txt:data in 1822) [ClassicSimilarity], result of:
            0.0134160565 = score(doc=1822,freq=1.0), product of:
              0.064338885 = queryWeight, product of:
                1.3411949 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014378393 = queryNorm
              0.20852174 = fieldWeight in 1822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=1822)
          0.030394902 = weight(abstract_txt:best in 1822) [ClassicSimilarity], result of:
            0.030394902 = score(doc=1822,freq=1.0), product of:
              0.0969528 = queryWeight, product of:
                1.3442798 = boost
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.014378393 = queryNorm
              0.31350204 = fieldWeight in 1822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0160327 = idf(docFreq=796, maxDocs=44218)
                0.0625 = fieldNorm(doc=1822)
          0.08420864 = weight(abstract_txt:variables in 1822) [ClassicSimilarity], result of:
            0.08420864 = score(doc=1822,freq=2.0), product of:
              0.1517939 = queryWeight, product of:
                1.6820412 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.014378393 = queryNorm
              0.5547564 = fieldWeight in 1822, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0625 = fieldNorm(doc=1822)
          0.17632839 = weight(abstract_txt:variable in 1822) [ClassicSimilarity], result of:
            0.17632839 = score(doc=1822,freq=1.0), product of:
              0.39438286 = queryWeight, product of:
                3.8342772 = boost
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.014378393 = queryNorm
              0.44709954 = fieldWeight in 1822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.0625 = fieldNorm(doc=1822)
        0.24 = coord(6/25)
    
  5. Guns, R.: Tracing the origins of the semantic web (2013) 0.08
    0.083661824 = sum of:
      0.083661824 = product of:
        0.5228864 = sum of:
          0.0134160565 = weight(abstract_txt:data in 1093) [ClassicSimilarity], result of:
            0.0134160565 = score(doc=1093,freq=1.0), product of:
              0.064338885 = queryWeight, product of:
                1.3411949 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014378393 = queryNorm
              0.20852174 = fieldWeight in 1093, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=1093)
          0.042115606 = weight(abstract_txt:meaning in 1093) [ClassicSimilarity], result of:
            0.042115606 = score(doc=1093,freq=1.0), product of:
              0.12050042 = queryWeight, product of:
                1.4986622 = boost
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.014378393 = queryNorm
              0.34950587 = fieldWeight in 1093, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.0625 = fieldNorm(doc=1093)
          0.14675795 = weight(abstract_txt:typed in 1093) [ClassicSimilarity], result of:
            0.14675795 = score(doc=1093,freq=1.0), product of:
              0.27696675 = queryWeight, product of:
                2.2720783 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.014378393 = queryNorm
              0.5298757 = fieldWeight in 1093, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.0625 = fieldNorm(doc=1093)
          0.3205968 = weight(abstract_txt:typing in 1093) [ClassicSimilarity], result of:
            0.3205968 = score(doc=1093,freq=2.0), product of:
              0.42366225 = queryWeight, product of:
                3.4416363 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.014378393 = queryNorm
              0.75672734 = fieldWeight in 1093, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0625 = fieldNorm(doc=1093)
        0.16 = coord(4/25)
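
The content-match scores combine several term scores. As a rough sketch under the same assumptions as the formulas visible in the trees above (queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final coord factor for the fraction of query terms matched), the score of entry 1 (doc 667) can be recomputed as follows. The helper names are hypothetical, and the boost, frequency and document-frequency values are copied from the explanation, not derived.

```python
import math

QUERY_NORM = 0.014378393  # queryNorm as listed in the explanation
MAX_DOCS = 44218          # maxDocs as listed in the explanation


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed idf variant: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    # Per-term score = queryWeight * fieldWeight.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (boost, freq, docFreq) for the four query terms matched in doc 667.
terms = [
    (1.3162117, 1.0, 884),  # task
    (1.6414077, 1.0, 262),  # introduce
    (2.04196, 1.0, 58),     # formula
    (5.1051583, 2.0, 209),  # mathematical
]
FIELD_NORM = 0.109375  # fieldNorm(doc=667)
coord = 4 / 25         # 4 of the 25 query terms matched

total = coord * sum(term_score(b, f, df, FIELD_NORM) for b, f, df in terms)

# Compare against the value printed in the record (0.12655717).
assert math.isclose(total, 0.12655717, rel_tol=1e-3)
print(total)
```

The coord factor explains why entry 2, with six matching terms, can rank close to entry 1 even though its individual term scores are lower.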