Document (#40897)

Author
Neumann, M.
Steinberg, J.
Schaer, P.
Title
Web-scraping for non-programmers : introducing OXPath for digital library metadata harvesting
Source
Code4Lib journal. Issue 38(2017), [http://journal.code4lib.org]
Year
2017
Abstract
Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted, which is usually done with the help of software developers, as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website, custom web scrapers are needed. This may be the case for small to medium-size publishers, research institutes, or funding agencies. Data curation is typically done by people with a library and information science background, who are usually proficient with XML technologies but are not full-stack programmers. We therefore present a web scraping tool that does not require digital library curators to program custom web scrapers from scratch: the open-source tool OXPath, an extension of XPath that allows the user to define the data to be extracted from websites in a declarative way. Taking one of our own use cases as an example, we walk through the process of creating an OXPath wrapper for metadata harvesting in more detail. We also point out some practical things to consider when creating a web scraper (with OXPath). Finally, we present a syntax highlighting plugin for the popular text editor Atom, which we developed to further support OXPath users and to simplify the authoring process.
Content
Cf.: http://journal.code4lib.org/articles/13007.
Theme
Metadata

Similar documents (author)

  1. Schaer, P.: Integration von Open-Access-Repositorien in Fachportale (2010) 2.20
    2.2005491 = sum of:
      2.2005491 = product of:
        4.4010983 = sum of:
          4.4010983 = weight(author_txt:schaer in 4321) [ClassicSimilarity], result of:
            4.4010983 = score(doc=4321,freq=1.0), product of:
              0.77399457 = queryWeight, product of:
                1.105608 = boost
                9.097941 = idf(docFreq=12, maxDocs=42740)
                0.07694734 = queryNorm
              5.6862135 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.097941 = idf(docFreq=12, maxDocs=42740)
                0.625 = fieldNorm(doc=4321)
        0.5 = coord(1/2)
    
  2. Munkelt, J.; Schaer, P.: Towards an IR test collection for the German National Library (2018) 1.76
    1.7604393 = sum of:
      1.7604393 = product of:
        3.5208786 = sum of:
          3.5208786 = weight(author_txt:schaer in 781) [ClassicSimilarity], result of:
            3.5208786 = score(doc=781,freq=1.0), product of:
              0.77399457 = queryWeight, product of:
                1.105608 = boost
                9.097941 = idf(docFreq=12, maxDocs=42740)
                0.07694734 = queryNorm
              4.5489707 = fieldWeight in 781, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.097941 = idf(docFreq=12, maxDocs=42740)
                0.5 = fieldNorm(doc=781)
        0.5 = coord(1/2)
    
  3. Neumann, M.: HAL: Hyperspace Analogue to Language (2012) 1.63
    1.6282744 = sum of:
      1.6282744 = product of:
        3.256549 = sum of:
          3.256549 = weight(author_txt:neumann in 966) [ClassicSimilarity], result of:
            3.256549 = score(doc=966,freq=1.0), product of:
              0.63319224 = queryWeight, product of:
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.07694734 = queryNorm
              5.143065 = fieldWeight in 966, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.625 = fieldNorm(doc=966)
        0.5 = coord(1/2)
    
  4. Neumann, G.: Studienanleitung für das Lehrgebiet alphabetische Katalogisierung (1986) 1.63
    1.6282744 = sum of:
      1.6282744 = product of:
        3.256549 = sum of:
          3.256549 = weight(author_txt:neumann in 6011) [ClassicSimilarity], result of:
            3.256549 = score(doc=6011,freq=1.0), product of:
              0.63319224 = queryWeight, product of:
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.07694734 = queryNorm
              5.143065 = fieldWeight in 6011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.625 = fieldNorm(doc=6011)
        0.5 = coord(1/2)
    
  5. Neumann, G.: ¬Das *¬ISBC# erschließt Btx für die Schulbibliotheken (1994) 1.63
    1.6282744 = sum of:
      1.6282744 = product of:
        3.256549 = sum of:
          3.256549 = weight(author_txt:neumann in 285) [ClassicSimilarity], result of:
            3.256549 = score(doc=285,freq=1.0), product of:
              0.63319224 = queryWeight, product of:
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.07694734 = queryNorm
              5.143065 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.228904 = idf(docFreq=30, maxDocs=42740)
                0.625 = fieldNorm(doc=285)
        0.5 = coord(1/2)
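
The score trees above follow the classic Lucene TF-IDF arithmetic: a term score is queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), and the result is scaled by the coord factor. The following is a minimal sketch that recomputes the first entry (doc 4321) from the numbers in its tree. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the values listed here; the function names are illustrative and not part of any library.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Assumed tf formula: square root of the term frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one term in one document: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight


# Values copied from the tree for "author_txt:schaer in 4321".
schaer_score = term_score(
    freq=1.0,
    doc_freq=12,
    max_docs=42740,
    boost=1.105608,
    query_norm=0.07694734,
    field_norm=0.625,
)

# coord(1/2): one of the two query clauses matched.
final_score = schaer_score * (1 / 2)
print(f"term score: {schaer_score:.4f}, final score: {final_score:.4f}")
```

Lucene computes these values in 32-bit floats, so the recomputed figures should agree with the listing only up to small rounding differences. For the entries without a boost line (the "neumann" trees), a boost of 1.0 can be passed.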
    

Similar documents (content)

  1. Harlow, C.: Data munging tools in Preparation for RDF : Catmandu and LODRefine (2015) 0.16
    0.16241361 = sum of:
      0.16241361 = product of:
        0.50754255 = sum of:
          0.02504512 = weight(abstract_txt:library in 4278) [ClassicSimilarity], result of:
            0.02504512 = score(doc=4278,freq=3.0), product of:
              0.0726808 = queryWeight, product of:
                1.0953293 = boost
                3.1831915 = idf(docFreq=4815, maxDocs=42740)
                0.020845497 = queryNorm
              0.34459057 = fieldWeight in 4278, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1831915 = idf(docFreq=4815, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
          0.061124466 = weight(abstract_txt:metadata in 4278) [ClassicSimilarity], result of:
            0.061124466 = score(doc=4278,freq=3.0), product of:
              0.11509345 = queryWeight, product of:
                1.1254196 = boost
                4.905958 = idf(docFreq=859, maxDocs=42740)
                0.020845497 = queryNorm
              0.53108555 = fieldWeight in 4278, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.905958 = idf(docFreq=859, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
          0.052030217 = weight(abstract_txt:tool in 4278) [ClassicSimilarity], result of:
            0.052030217 = score(doc=4278,freq=2.0), product of:
              0.11833359 = queryWeight, product of:
                1.1411513 = boost
                4.974536 = idf(docFreq=802, maxDocs=42740)
                0.020845497 = queryNorm
              0.439691 = fieldWeight in 4278, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.974536 = idf(docFreq=802, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
          0.016521338 = weight(abstract_txt:with in 4278) [ClassicSimilarity], result of:
            0.016521338 = score(doc=4278,freq=3.0), product of:
              0.060619734 = queryWeight, product of:
                1.1550777 = boost
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.020845497 = queryNorm
              0.2725406 = fieldWeight in 4278, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
          0.055253908 = weight(abstract_txt:cases in 4278) [ClassicSimilarity], result of:
            0.055253908 = score(doc=4278,freq=1.0), product of:
              0.15518734 = queryWeight, product of:
                1.3068247 = boost
                5.696744 = idf(docFreq=389, maxDocs=42740)
                0.020845497 = queryNorm
              0.3560465 = fieldWeight in 4278, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.696744 = idf(docFreq=389, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
          0.014510045 = weight(abstract_txt:that in 4278) [ClassicSimilarity], result of:
            0.014510045 = score(doc=4278,freq=2.0), product of:
              0.06855358 = queryWeight, product of:
                1.373328 = boost
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.020845497 = queryNorm
              0.21165991 = fieldWeight in 4278, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
          0.1799409 = weight(abstract_txt:programmers in 4278) [ClassicSimilarity], result of:
            0.1799409 = score(doc=4278,freq=1.0), product of:
              0.340958 = queryWeight, product of:
                1.9370446 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.020845497 = queryNorm
              0.5277509 = fieldWeight in 4278, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
          0.10311657 = weight(abstract_txt:data in 4278) [ClassicSimilarity], result of:
            0.10311657 = score(doc=4278,freq=9.0), product of:
              0.16310209 = queryWeight, product of:
                2.3204894 = boost
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.020845497 = queryNorm
              0.63222104 = fieldWeight in 4278, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.0625 = fieldNorm(doc=4278)
        0.32 = coord(8/25)
    
  2. Fox, B.; Fox, C.J.: Efficient stemmer generation (2002) 0.15
    0.14679445 = sum of:
      0.14679445 = product of:
        0.9174653 = sum of:
          0.016692549 = weight(abstract_txt:with in 3586) [ClassicSimilarity], result of:
            0.016692549 = score(doc=3586,freq=1.0), product of:
              0.060619734 = queryWeight, product of:
                1.1550777 = boost
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.020845497 = queryNorm
              0.27536494 = fieldWeight in 3586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.109375 = fieldNorm(doc=3586)
          0.017955264 = weight(abstract_txt:that in 3586) [ClassicSimilarity], result of:
            0.017955264 = score(doc=3586,freq=1.0), product of:
              0.06855358 = queryWeight, product of:
                1.373328 = boost
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.020845497 = queryNorm
              0.26191577 = fieldWeight in 3586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.109375 = fieldNorm(doc=3586)
          0.31489655 = weight(abstract_txt:programmers in 3586) [ClassicSimilarity], result of:
            0.31489655 = score(doc=3586,freq=1.0), product of:
              0.340958 = queryWeight, product of:
                1.9370446 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.020845497 = queryNorm
              0.9235641 = fieldWeight in 3586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.109375 = fieldNorm(doc=3586)
          0.567921 = weight(abstract_txt:custom in 3586) [ClassicSimilarity], result of:
            0.567921 = score(doc=3586,freq=2.0), product of:
              0.45898795 = queryWeight, product of:
                2.7525516 = boost
                7.999329 = idf(docFreq=38, maxDocs=42740)
                0.020845497 = queryNorm
              1.237333 = fieldWeight in 3586, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.999329 = idf(docFreq=38, maxDocs=42740)
                0.109375 = fieldNorm(doc=3586)
        0.16 = coord(4/25)
    
  3. Güven, S.; Feiner, S.: ¬A hypermedia authoring tool for augmented and virtual reality (2003) 0.14
    0.1353764 = sum of:
      0.1353764 = product of:
        0.48348713 = sum of:
          0.04494608 = weight(abstract_txt:task in 936) [ClassicSimilarity], result of:
            0.04494608 = score(doc=936,freq=1.0), product of:
              0.11653834 = queryWeight, product of:
                1.1324619 = boost
                4.936657 = idf(docFreq=833, maxDocs=42740)
                0.020845497 = queryNorm
              0.38567632 = fieldWeight in 936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.936657 = idf(docFreq=833, maxDocs=42740)
                0.078125 = fieldNorm(doc=936)
          0.045988653 = weight(abstract_txt:tool in 936) [ClassicSimilarity], result of:
            0.045988653 = score(doc=936,freq=1.0), product of:
              0.11833359 = queryWeight, product of:
                1.1411513 = boost
                4.974536 = idf(docFreq=802, maxDocs=42740)
                0.020845497 = queryNorm
              0.38863564 = fieldWeight in 936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.974536 = idf(docFreq=802, maxDocs=42740)
                0.078125 = fieldNorm(doc=936)
          0.016862022 = weight(abstract_txt:with in 936) [ClassicSimilarity], result of:
            0.016862022 = score(doc=936,freq=2.0), product of:
              0.060619734 = queryWeight, product of:
                1.1550777 = boost
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.020845497 = queryNorm
              0.2781606 = fieldWeight in 936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.078125 = fieldNorm(doc=936)
          0.09093577 = weight(abstract_txt:creating in 936) [ClassicSimilarity], result of:
            0.09093577 = score(doc=936,freq=2.0), product of:
              0.14796333 = queryWeight, product of:
                1.2760458 = boost
                5.5625715 = idf(docFreq=445, maxDocs=42740)
                0.020845497 = queryNorm
              0.61458313 = fieldWeight in 936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5625715 = idf(docFreq=445, maxDocs=42740)
                0.078125 = fieldNorm(doc=936)
          0.01282519 = weight(abstract_txt:that in 936) [ClassicSimilarity], result of:
            0.01282519 = score(doc=936,freq=1.0), product of:
              0.06855358 = queryWeight, product of:
                1.373328 = boost
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.020845497 = queryNorm
              0.18708271 = fieldWeight in 936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.078125 = fieldNorm(doc=936)
          0.047003277 = weight(abstract_txt:present in 936) [ClassicSimilarity], result of:
            0.047003277 = score(doc=936,freq=1.0), product of:
              0.13744326 = queryWeight, product of:
                1.5062482 = boost
                4.377384 = idf(docFreq=1458, maxDocs=42740)
                0.020845497 = queryNorm
              0.34198314 = fieldWeight in 936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.377384 = idf(docFreq=1458, maxDocs=42740)
                0.078125 = fieldNorm(doc=936)
          0.22492613 = weight(abstract_txt:programmers in 936) [ClassicSimilarity], result of:
            0.22492613 = score(doc=936,freq=1.0), product of:
              0.340958 = queryWeight, product of:
                1.9370446 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.020845497 = queryNorm
              0.65968865 = fieldWeight in 936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.078125 = fieldNorm(doc=936)
        0.28 = coord(7/25)
    
  4. Foulonneau, M.: Information redundancy across metadata collections (2007) 0.13
    0.13497825 = sum of:
      0.13497825 = product of:
        0.42180702 = sum of:
          0.020449257 = weight(abstract_txt:library in 2916) [ClassicSimilarity], result of:
            0.020449257 = score(doc=2916,freq=2.0), product of:
              0.0726808 = queryWeight, product of:
                1.0953293 = boost
                3.1831915 = idf(docFreq=4815, maxDocs=42740)
                0.020845497 = queryNorm
              0.28135705 = fieldWeight in 2916, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1831915 = idf(docFreq=4815, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
          0.11704445 = weight(abstract_txt:metadata in 2916) [ClassicSimilarity], result of:
            0.11704445 = score(doc=2916,freq=11.0), product of:
              0.11509345 = queryWeight, product of:
                1.1254196 = boost
                4.905958 = idf(docFreq=859, maxDocs=42740)
                0.020845497 = queryNorm
              1.0169514 = fieldWeight in 2916, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                4.905958 = idf(docFreq=859, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
          0.009538599 = weight(abstract_txt:with in 2916) [ClassicSimilarity], result of:
            0.009538599 = score(doc=2916,freq=1.0), product of:
              0.060619734 = queryWeight, product of:
                1.1550777 = boost
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.020845497 = queryNorm
              0.15735139 = fieldWeight in 2916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
          0.014510045 = weight(abstract_txt:that in 2916) [ClassicSimilarity], result of:
            0.014510045 = score(doc=2916,freq=2.0), product of:
              0.06855358 = queryWeight, product of:
                1.373328 = boost
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.020845497 = queryNorm
              0.21165991 = fieldWeight in 2916, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3946586 = idf(docFreq=10595, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
          0.037602622 = weight(abstract_txt:present in 2916) [ClassicSimilarity], result of:
            0.037602622 = score(doc=2916,freq=1.0), product of:
              0.13744326 = queryWeight, product of:
                1.5062482 = boost
                4.377384 = idf(docFreq=1458, maxDocs=42740)
                0.020845497 = queryNorm
              0.2735865 = fieldWeight in 2916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.377384 = idf(docFreq=1458, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
          0.053228136 = weight(abstract_txt:digital in 2916) [ClassicSimilarity], result of:
            0.053228136 = score(doc=2916,freq=2.0), product of:
              0.1375294 = queryWeight, product of:
                1.5067202 = boost
                4.3787556 = idf(docFreq=1456, maxDocs=42740)
                0.020845497 = queryNorm
              0.38703096 = fieldWeight in 2916, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3787556 = idf(docFreq=1456, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
          0.13506176 = weight(abstract_txt:harvesting in 2916) [ClassicSimilarity], result of:
            0.13506176 = score(doc=2916,freq=1.0), product of:
              0.28160208 = queryWeight, product of:
                1.7603829 = boost
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.020845497 = queryNorm
              0.47961915 = fieldWeight in 2916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
          0.034372192 = weight(abstract_txt:data in 2916) [ClassicSimilarity], result of:
            0.034372192 = score(doc=2916,freq=1.0), product of:
              0.16310209 = queryWeight, product of:
                2.3204894 = boost
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.020845497 = queryNorm
              0.21074034 = fieldWeight in 2916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.0625 = fieldNorm(doc=2916)
        0.32 = coord(8/25)
    
  5. Shreeves, S.L.; Kaczmarek, J.S.; Cole, T.W.: Harvesting cultural heritage metadata using OAI Protocol (2003) 0.12
    0.12252305 = sum of:
      0.12252305 = product of:
        0.5105127 = sum of:
          0.02556157 = weight(abstract_txt:library in 776) [ClassicSimilarity], result of:
            0.02556157 = score(doc=776,freq=2.0), product of:
              0.0726808 = queryWeight, product of:
                1.0953293 = boost
                3.1831915 = idf(docFreq=4815, maxDocs=42740)
                0.020845497 = queryNorm
              0.3516963 = fieldWeight in 776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1831915 = idf(docFreq=4815, maxDocs=42740)
                0.078125 = fieldNorm(doc=776)
          0.1247698 = weight(abstract_txt:metadata in 776) [ClassicSimilarity], result of:
            0.1247698 = score(doc=776,freq=8.0), product of:
              0.11509345 = queryWeight, product of:
                1.1254196 = boost
                4.905958 = idf(docFreq=859, maxDocs=42740)
                0.020845497 = queryNorm
              1.0840739 = fieldWeight in 776, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.905958 = idf(docFreq=859, maxDocs=42740)
                0.078125 = fieldNorm(doc=776)
          0.011923249 = weight(abstract_txt:with in 776) [ClassicSimilarity], result of:
            0.011923249 = score(doc=776,freq=1.0), product of:
              0.060619734 = queryWeight, product of:
                1.1550777 = boost
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.020845497 = queryNorm
              0.19668923 = fieldWeight in 776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5176222 = idf(docFreq=9369, maxDocs=42740)
                0.078125 = fieldNorm(doc=776)
          0.06653517 = weight(abstract_txt:digital in 776) [ClassicSimilarity], result of:
            0.06653517 = score(doc=776,freq=2.0), product of:
              0.1375294 = queryWeight, product of:
                1.5067202 = boost
                4.3787556 = idf(docFreq=1456, maxDocs=42740)
                0.020845497 = queryNorm
              0.4837887 = fieldWeight in 776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3787556 = idf(docFreq=1456, maxDocs=42740)
                0.078125 = fieldNorm(doc=776)
          0.2387577 = weight(abstract_txt:harvesting in 776) [ClassicSimilarity], result of:
            0.2387577 = score(doc=776,freq=2.0), product of:
              0.28160208 = queryWeight, product of:
                1.7603829 = boost
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.020845497 = queryNorm
              0.84785485 = fieldWeight in 776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.078125 = fieldNorm(doc=776)
          0.04296524 = weight(abstract_txt:data in 776) [ClassicSimilarity], result of:
            0.04296524 = score(doc=776,freq=1.0), product of:
              0.16310209 = queryWeight, product of:
                2.3204894 = boost
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.020845497 = queryNorm
              0.26342544 = fieldWeight in 776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3718455 = idf(docFreq=3987, maxDocs=42740)
                0.078125 = fieldNorm(doc=776)
        0.24 = coord(6/25)
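
For the content-based matches, each tree sums the scores of the matching abstract terms and multiplies the sum by coord(matched/total), the fraction of the 25 query terms found in the document. The sketch below reproduces that last step for the second entry (Fox and Fox, doc 3586) using the four term scores from its tree; `combined_score` is a hypothetical helper name, not a Lucene API.

```python
def combined_score(term_scores, total_query_terms):
    """Sum the per-term scores and scale by the coordination factor.

    coord = (number of matching terms) / (number of query terms).
    """
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord


# Per-term scores copied from the tree for doc 3586.
fox_term_scores = [
    0.016692549,  # abstract_txt:with
    0.017955264,  # abstract_txt:that
    0.31489655,   # abstract_txt:programmers
    0.567921,     # abstract_txt:custom
]

fox_score = combined_score(fox_term_scores, total_query_terms=25)
print(f"combined score: {fox_score:.4f}")
```

Because coord rewards documents that match more of the query terms, a document with a few high-scoring rare terms (such as "custom" here) can still rank below one that matches many common terms.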