Document (#40897)

Author
Neumann, M.
Steinberg, J.
Schaer, P.
Title
Web-scraping for non-programmers : introducing OXPath for digital library metadata harvesting
Source
Code4Lib journal. Issue 38(2017), [http://journal.code4lib.org]
Year
2017
Abstract
Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted, which is usually done with the help of software developers because it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website, custom web scrapers are needed. This may be the case for small to medium-size publishers, research institutes or funding agencies. Data curation is typically done by people with a library and information science background, who are usually proficient with XML technologies but are not full-stack programmers. Therefore, we present a web scraping tool that does not require digital library curators to program custom web scrapers from scratch: the open-source tool OXPath, an extension of XPath that allows the user to define the data to be extracted from websites in a declarative way. Taking one of our own use cases as an example, we walk through the process of creating an OXPath wrapper for metadata harvesting in more detail. We also point out some practical things to consider when creating a web scraper (with OXPath). In addition, we present a syntax highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.
Content
Cf.: http://journal.code4lib.org/articles/13007.
Theme
Metadata

Similar documents (author)

  1. Schaer, P.: Integration von Open-Access-Repositorien in Fachportale (2010) 2.20
    2.1998017 = sum of:
      2.1998017 = product of:
        4.3996034 = sum of:
          4.3996034 = weight(author_txt:schaer in 3500) [ClassicSimilarity], result of:
            4.3996034 = score(doc=3500,freq=1.0), product of:
              0.7740188 = queryWeight, product of:
                1.1056513 = boost
                9.094566 = idf(docFreq=12, maxDocs=42596)
                0.07697529 = queryNorm
              5.684104 = fieldWeight in 3500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.094566 = idf(docFreq=12, maxDocs=42596)
                0.625 = fieldNorm(doc=3500)
        0.5 = coord(1/2)
    
  2. Munkelt, J.; Schaer, P.: Towards an IR test collection for the German National Library (2018) 1.76
    1.7598414 = sum of:
      1.7598414 = product of:
        3.519683 = sum of:
          3.519683 = weight(author_txt:schaer in 781) [ClassicSimilarity], result of:
            3.519683 = score(doc=781,freq=1.0), product of:
              0.7740188 = queryWeight, product of:
                1.1056513 = boost
                9.094566 = idf(docFreq=12, maxDocs=42596)
                0.07697529 = queryNorm
              4.547283 = fieldWeight in 781, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.094566 = idf(docFreq=12, maxDocs=42596)
                0.5 = fieldNorm(doc=781)
        0.5 = coord(1/2)
    
  3. Neumann, M.: HAL: Hyperspace Analogue to Language (2012) 1.63
    1.6275301 = sum of:
      1.6275301 = product of:
        3.2550602 = sum of:
          3.2550602 = weight(author_txt:neumann in 966) [ClassicSimilarity], result of:
            3.2550602 = score(doc=966,freq=1.0), product of:
              0.6331625 = queryWeight, product of:
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.07697529 = queryNorm
              5.1409554 = fieldWeight in 966, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.625 = fieldNorm(doc=966)
        0.5 = coord(1/2)
    
  4. Neumann, G.: Studienanleitung für das Lehrgebiet alphabetische Katalogisierung (1986) 1.63
    1.6275301 = sum of:
      1.6275301 = product of:
        3.2550602 = sum of:
          3.2550602 = weight(author_txt:neumann in 6011) [ClassicSimilarity], result of:
            3.2550602 = score(doc=6011,freq=1.0), product of:
              0.6331625 = queryWeight, product of:
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.07697529 = queryNorm
              5.1409554 = fieldWeight in 6011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.625 = fieldNorm(doc=6011)
        0.5 = coord(1/2)
    
  5. Neumann, G.: Das ISBC erschließt Btx für die Schulbibliotheken (1994) 1.63
    1.6275301 = sum of:
      1.6275301 = product of:
        3.2550602 = sum of:
          3.2550602 = weight(author_txt:neumann in 285) [ClassicSimilarity], result of:
            3.2550602 = score(doc=285,freq=1.0), product of:
              0.6331625 = queryWeight, product of:
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.07697529 = queryNorm
              5.1409554 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.225529 = idf(docFreq=30, maxDocs=42596)
                0.625 = fieldNorm(doc=285)
        0.5 = coord(1/2)
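
    Note on reading the score trees: each tree above appears to follow the TF-IDF pattern of Lucene's ClassicSimilarity, where queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the final score is their product times the coord factor. The following is a minimal sketch in Python that recomputes the first entry from the values printed in its tree; the function name single_term_score is hypothetical and chosen for illustration only, and a boost of 1.0 is assumed wherever a tree shows no boost line.

```python
def single_term_score(idf, query_norm, tf, field_norm, coord, boost=1.0):
    """Recompute a one-term score from the factors shown in an explain tree."""
    query_weight = boost * idf * query_norm   # "queryWeight, product of:"
    field_weight = tf * idf * field_norm      # "fieldWeight in <doc>, product of:"
    return query_weight * field_weight * coord


# Values copied from entry 1 (author_txt:schaer in doc 3500).
schaer_score = single_term_score(
    idf=9.094566,
    query_norm=0.07697529,
    tf=1.0,
    field_norm=0.625,
    coord=0.5,
    boost=1.1056513,
)

# Values copied from entry 3 (author_txt:neumann in doc 966); no boost line shown.
neumann_score = single_term_score(
    idf=8.225529,
    query_norm=0.07697529,
    tf=1.0,
    field_norm=0.625,
    coord=0.5,
)

print(f"schaer:  {schaer_score:.4f}")
print(f"neumann: {neumann_score:.4f}")
```

    The two results should agree with the listed scores (2.1998017 and 1.6275301) up to rounding, since the engine works in single precision while Python uses double precision.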
    

Similar documents (content)

  1. Harlow, C.: Data munging tools in Preparation for RDF : Catmandu and LODRefine (2015) 0.16
    0.16223039 = sum of:
      0.16223039 = product of:
        0.50697 = sum of:
          0.02499326 = weight(abstract_txt:library in 3278) [ClassicSimilarity], result of:
            0.02499326 = score(doc=3278,freq=3.0), product of:
              0.07255036 = queryWeight, product of:
                1.0954504 = boost
                3.1823115 = idf(docFreq=4803, maxDocs=42596)
                0.020811535 = queryNorm
              0.34449533 = fieldWeight in 3278, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1823115 = idf(docFreq=4803, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
          0.061096575 = weight(abstract_txt:metadata in 3278) [ClassicSimilarity], result of:
            0.061096575 = score(doc=3278,freq=3.0), product of:
              0.11501076 = queryWeight, product of:
                1.1261508 = boost
                4.907245 = idf(docFreq=855, maxDocs=42596)
                0.020811535 = queryNorm
              0.53122485 = fieldWeight in 3278, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.907245 = idf(docFreq=855, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
          0.0519771 = weight(abstract_txt:tool in 3278) [ClassicSimilarity], result of:
            0.0519771 = score(doc=3278,freq=2.0), product of:
              0.11820404 = queryWeight, product of:
                1.1416776 = boost
                4.974904 = idf(docFreq=799, maxDocs=42596)
                0.020811535 = queryNorm
              0.43972355 = fieldWeight in 3278, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.974904 = idf(docFreq=799, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
          0.016518574 = weight(abstract_txt:with in 3278) [ClassicSimilarity], result of:
            0.016518574 = score(doc=3278,freq=3.0), product of:
              0.060587857 = queryWeight, product of:
                1.1559396 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.020811535 = queryNorm
              0.27263835 = fieldWeight in 3278, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
          0.055161785 = weight(abstract_txt:cases in 3278) [ClassicSimilarity], result of:
            0.055161785 = score(doc=3278,freq=1.0), product of:
              0.15495057 = queryWeight, product of:
                1.3071455 = boost
                5.695936 = idf(docFreq=388, maxDocs=42596)
                0.020811535 = queryNorm
              0.355996 = fieldWeight in 3278, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.695936 = idf(docFreq=388, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
          0.014511462 = weight(abstract_txt:that in 3278) [ClassicSimilarity], result of:
            0.014511462 = score(doc=3278,freq=2.0), product of:
              0.068529636 = queryWeight, product of:
                1.374474 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.020811535 = queryNorm
              0.21175455 = fieldWeight in 3278, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
          0.17950188 = weight(abstract_txt:programmers in 3278) [ClassicSimilarity], result of:
            0.17950188 = score(doc=3278,freq=1.0), product of:
              0.34026214 = queryWeight, product of:
                1.9370202 = boost
                8.4406395 = idf(docFreq=24, maxDocs=42596)
                0.020811535 = queryNorm
              0.52753997 = fieldWeight in 3278, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.4406395 = idf(docFreq=24, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
          0.103209324 = weight(abstract_txt:data in 3278) [ClassicSimilarity], result of:
            0.103209324 = score(doc=3278,freq=9.0), product of:
              0.16313225 = queryWeight, product of:
                2.3230464 = boost
                3.3742545 = idf(docFreq=3964, maxDocs=42596)
                0.020811535 = queryNorm
              0.6326727 = fieldWeight in 3278, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.3742545 = idf(docFreq=3964, maxDocs=42596)
                0.0625 = fieldNorm(doc=3278)
        0.32 = coord(8/25)
    
  2. Fox, B.; Fox, C.J.: Efficient stemmer generation (2002) 0.15
    0.14644364 = sum of:
      0.14644364 = product of:
        0.9152727 = sum of:
          0.016689755 = weight(abstract_txt:with in 3586) [ClassicSimilarity], result of:
            0.016689755 = score(doc=3586,freq=1.0), product of:
              0.060587857 = queryWeight, product of:
                1.1559396 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.020811535 = queryNorm
              0.2754637 = fieldWeight in 3586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.109375 = fieldNorm(doc=3586)
          0.017957019 = weight(abstract_txt:that in 3586) [ClassicSimilarity], result of:
            0.017957019 = score(doc=3586,freq=1.0), product of:
              0.068529636 = queryWeight, product of:
                1.374474 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.020811535 = queryNorm
              0.2620329 = fieldWeight in 3586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.109375 = fieldNorm(doc=3586)
          0.31412828 = weight(abstract_txt:programmers in 3586) [ClassicSimilarity], result of:
            0.31412828 = score(doc=3586,freq=1.0), product of:
              0.34026214 = queryWeight, product of:
                1.9370202 = boost
                8.4406395 = idf(docFreq=24, maxDocs=42596)
                0.020811535 = queryNorm
              0.92319494 = fieldWeight in 3586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.4406395 = idf(docFreq=24, maxDocs=42596)
                0.109375 = fieldNorm(doc=3586)
          0.5664976 = weight(abstract_txt:custom in 3586) [ClassicSimilarity], result of:
            0.5664976 = score(doc=3586,freq=2.0), product of:
              0.45803085 = queryWeight, product of:
                2.7524557 = boost
                7.995954 = idf(docFreq=38, maxDocs=42596)
                0.020811535 = queryNorm
              1.236811 = fieldWeight in 3586, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.995954 = idf(docFreq=38, maxDocs=42596)
                0.109375 = fieldNorm(doc=3586)
        0.16 = coord(4/25)
    
  3. Foulonneau, M.: Information redundancy across metadata collections (2007) 0.14
    0.13523044 = sum of:
      0.13523044 = product of:
        0.4225951 = sum of:
          0.020406911 = weight(abstract_txt:library in 2095) [ClassicSimilarity], result of:
            0.020406911 = score(doc=2095,freq=2.0), product of:
              0.07255036 = queryWeight, product of:
                1.0954504 = boost
                3.1823115 = idf(docFreq=4803, maxDocs=42596)
                0.020811535 = queryNorm
              0.28127927 = fieldWeight in 2095, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1823115 = idf(docFreq=4803, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
          0.11699104 = weight(abstract_txt:metadata in 2095) [ClassicSimilarity], result of:
            0.11699104 = score(doc=2095,freq=11.0), product of:
              0.11501076 = queryWeight, product of:
                1.1261508 = boost
                4.907245 = idf(docFreq=855, maxDocs=42596)
                0.020811535 = queryNorm
              1.0172182 = fieldWeight in 2095, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                4.907245 = idf(docFreq=855, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
          0.009537003 = weight(abstract_txt:with in 2095) [ClassicSimilarity], result of:
            0.009537003 = score(doc=2095,freq=1.0), product of:
              0.060587857 = queryWeight, product of:
                1.1559396 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.020811535 = queryNorm
              0.15740784 = fieldWeight in 2095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
          0.014511462 = weight(abstract_txt:that in 2095) [ClassicSimilarity], result of:
            0.014511462 = score(doc=2095,freq=2.0), product of:
              0.068529636 = queryWeight, product of:
                1.374474 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.020811535 = queryNorm
              0.21175455 = fieldWeight in 2095, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
          0.0375751 = weight(abstract_txt:present in 2095) [ClassicSimilarity], result of:
            0.0375751 = score(doc=2095,freq=1.0), product of:
              0.13731927 = queryWeight, product of:
                1.5070883 = boost
                4.37813 = idf(docFreq=1452, maxDocs=42596)
                0.020811535 = queryNorm
              0.27363312 = fieldWeight in 2095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.37813 = idf(docFreq=1452, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
          0.05346714 = weight(abstract_txt:digital in 2095) [ClassicSimilarity], result of:
            0.05346714 = score(doc=2095,freq=2.0), product of:
              0.13788362 = queryWeight, product of:
                1.510182 = boost
                4.3871174 = idf(docFreq=1439, maxDocs=42596)
                0.020811535 = queryNorm
              0.38777006 = fieldWeight in 2095, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3871174 = idf(docFreq=1439, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
          0.13570331 = weight(abstract_txt:harvesting in 2095) [ClassicSimilarity], result of:
            0.13570331 = score(doc=2095,freq=1.0), product of:
              0.28237608 = queryWeight, product of:
                1.7645798 = boost
                7.689224 = idf(docFreq=52, maxDocs=42596)
                0.020811535 = queryNorm
              0.4805765 = fieldWeight in 2095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.689224 = idf(docFreq=52, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
          0.034403108 = weight(abstract_txt:data in 2095) [ClassicSimilarity], result of:
            0.034403108 = score(doc=2095,freq=1.0), product of:
              0.16313225 = queryWeight, product of:
                2.3230464 = boost
                3.3742545 = idf(docFreq=3964, maxDocs=42596)
                0.020811535 = queryNorm
              0.2108909 = fieldWeight in 2095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3742545 = idf(docFreq=3964, maxDocs=42596)
                0.0625 = fieldNorm(doc=2095)
        0.32 = coord(8/25)
    
  4. Güven, S.; Feiner, S.: A hypermedia authoring tool for augmented and virtual reality (2003) 0.14
    0.13516934 = sum of:
      0.13516934 = product of:
        0.48274767 = sum of:
          0.044896472 = weight(abstract_txt:task in 115) [ClassicSimilarity], result of:
            0.044896472 = score(doc=115,freq=1.0), product of:
              0.11640432 = queryWeight, product of:
                1.1329529 = boost
                4.936886 = idf(docFreq=830, maxDocs=42596)
                0.020811535 = queryNorm
              0.3856942 = fieldWeight in 115, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.936886 = idf(docFreq=830, maxDocs=42596)
                0.078125 = fieldNorm(doc=115)
          0.0459417 = weight(abstract_txt:tool in 115) [ClassicSimilarity], result of:
            0.0459417 = score(doc=115,freq=1.0), product of:
              0.11820404 = queryWeight, product of:
                1.1416776 = boost
                4.974904 = idf(docFreq=799, maxDocs=42596)
                0.020811535 = queryNorm
              0.38866436 = fieldWeight in 115, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.974904 = idf(docFreq=799, maxDocs=42596)
                0.078125 = fieldNorm(doc=115)
          0.0168592 = weight(abstract_txt:with in 115) [ClassicSimilarity], result of:
            0.0168592 = score(doc=115,freq=2.0), product of:
              0.060587857 = queryWeight, product of:
                1.1559396 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.020811535 = queryNorm
              0.27826038 = fieldWeight in 115, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.078125 = fieldNorm(doc=115)
          0.09087762 = weight(abstract_txt:creating in 115) [ClassicSimilarity], result of:
            0.09087762 = score(doc=115,freq=2.0), product of:
              0.14783897 = queryWeight, product of:
                1.2767968 = boost
                5.563691 = idf(docFreq=443, maxDocs=42596)
                0.020811535 = queryNorm
              0.6147068 = fieldWeight in 115, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.563691 = idf(docFreq=443, maxDocs=42596)
                0.078125 = fieldNorm(doc=115)
          0.012826442 = weight(abstract_txt:that in 115) [ClassicSimilarity], result of:
            0.012826442 = score(doc=115,freq=1.0), product of:
              0.068529636 = queryWeight, product of:
                1.374474 = boost
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.020811535 = queryNorm
              0.18716635 = fieldWeight in 115, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3957293 = idf(docFreq=10548, maxDocs=42596)
                0.078125 = fieldNorm(doc=115)
          0.046968874 = weight(abstract_txt:present in 115) [ClassicSimilarity], result of:
            0.046968874 = score(doc=115,freq=1.0), product of:
              0.13731927 = queryWeight, product of:
                1.5070883 = boost
                4.37813 = idf(docFreq=1452, maxDocs=42596)
                0.020811535 = queryNorm
              0.3420414 = fieldWeight in 115, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.37813 = idf(docFreq=1452, maxDocs=42596)
                0.078125 = fieldNorm(doc=115)
          0.22437735 = weight(abstract_txt:programmers in 115) [ClassicSimilarity], result of:
            0.22437735 = score(doc=115,freq=1.0), product of:
              0.34026214 = queryWeight, product of:
                1.9370202 = boost
                8.4406395 = idf(docFreq=24, maxDocs=42596)
                0.020811535 = queryNorm
              0.65942496 = fieldWeight in 115, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.4406395 = idf(docFreq=24, maxDocs=42596)
                0.078125 = fieldNorm(doc=115)
        0.28 = coord(7/25)
    
  5. Shreeves, S.L.; Kaczmarek, J.S.; Cole, T.W.: Harvesting cultural heritage metadata using OAI Protocol (2003) 0.12
    0.122849375 = sum of:
      0.122849375 = product of:
        0.5118724 = sum of:
          0.02550864 = weight(abstract_txt:library in 5776) [ClassicSimilarity], result of:
            0.02550864 = score(doc=5776,freq=2.0), product of:
              0.07255036 = queryWeight, product of:
                1.0954504 = boost
                3.1823115 = idf(docFreq=4803, maxDocs=42596)
                0.020811535 = queryNorm
              0.3515991 = fieldWeight in 5776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1823115 = idf(docFreq=4803, maxDocs=42596)
                0.078125 = fieldNorm(doc=5776)
          0.12471286 = weight(abstract_txt:metadata in 5776) [ClassicSimilarity], result of:
            0.12471286 = score(doc=5776,freq=8.0), product of:
              0.11501076 = queryWeight, product of:
                1.1261508 = boost
                4.907245 = idf(docFreq=855, maxDocs=42596)
                0.020811535 = queryNorm
              1.0843582 = fieldWeight in 5776, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.907245 = idf(docFreq=855, maxDocs=42596)
                0.078125 = fieldNorm(doc=5776)
          0.011921254 = weight(abstract_txt:with in 5776) [ClassicSimilarity], result of:
            0.011921254 = score(doc=5776,freq=1.0), product of:
              0.060587857 = queryWeight, product of:
                1.1559396 = boost
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.020811535 = queryNorm
              0.19675979 = fieldWeight in 5776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5185254 = idf(docFreq=9329, maxDocs=42596)
                0.078125 = fieldNorm(doc=5776)
          0.06683392 = weight(abstract_txt:digital in 5776) [ClassicSimilarity], result of:
            0.06683392 = score(doc=5776,freq=2.0), product of:
              0.13788362 = queryWeight, product of:
                1.510182 = boost
                4.3871174 = idf(docFreq=1439, maxDocs=42596)
                0.020811535 = queryNorm
              0.48471257 = fieldWeight in 5776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3871174 = idf(docFreq=1439, maxDocs=42596)
                0.078125 = fieldNorm(doc=5776)
          0.23989183 = weight(abstract_txt:harvesting in 5776) [ClassicSimilarity], result of:
            0.23989183 = score(doc=5776,freq=2.0), product of:
              0.28237608 = queryWeight, product of:
                1.7645798 = boost
                7.689224 = idf(docFreq=52, maxDocs=42596)
                0.020811535 = queryNorm
              0.84954727 = fieldWeight in 5776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.689224 = idf(docFreq=52, maxDocs=42596)
                0.078125 = fieldNorm(doc=5776)
          0.043003887 = weight(abstract_txt:data in 5776) [ClassicSimilarity], result of:
            0.043003887 = score(doc=5776,freq=1.0), product of:
              0.16313225 = queryWeight, product of:
                2.3230464 = boost
                3.3742545 = idf(docFreq=3964, maxDocs=42596)
                0.020811535 = queryNorm
              0.26361364 = fieldWeight in 5776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3742545 = idf(docFreq=3964, maxDocs=42596)
                0.078125 = fieldNorm(doc=5776)
        0.24 = coord(6/25)
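
    Note on the multi-term trees: in the content-based trees, the per-term weights are summed and then scaled by coord (matched query terms divided by total query terms, here out of 25). The idf and tf values shown are consistent with the ClassicSimilarity definitions idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The sketch below derives them from the raw statistics and recomputes entry 2 (Fox, B.; Fox, C.J., doc 3586); all function and variable names are hypothetical, and the boosts, queryNorm, and fieldNorm are copied from the tree rather than derived.

```python
import math

MAX_DOCS = 42596          # maxDocs as printed in the trees
QUERY_NORM = 0.020811535  # queryNorm for the content query, copied from the tree


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Weight of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, total_query_terms):
    """Sum the term weights, then apply coord = matched / total."""
    total = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    coord = len(terms) / total_query_terms
    return total * coord


# (termFreq, docFreq, boost) for the four terms matched in doc 3586.
fox_terms = [
    (1.0, 9329, 1.1559396),   # with
    (1.0, 10548, 1.374474),   # that
    (1.0, 24, 1.9370202),     # programmers
    (2.0, 38, 2.7524557),     # custom
]

fox_score = document_score(fox_terms, field_norm=0.109375, total_query_terms=25)
print(f"{fox_score:.4f}")
```

    The result should match the listed 0.14644364 up to rounding. The same helper can be pointed at any other entry by copying its term statistics and fieldNorm.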