Document (#41599)

Author
Gasser, M.
Wanger, R.
Prada, I.
Title
Wenn Algorithmen Zeitschriften lesen : vom Mehrwert automatisierter Textanreicherung
Source
o-bib: Das offene Bibliotheksjournal. 5(2018) Nr.4, S.181-192
Year
2018
Abstract
In collaboration with the Institute of Computational Linguistics at the University of Zurich (ICL UZH), the ETH-Bibliothek Zürich launched a pilot project on automated text enrichment. The pilot was based on full-text files from the Swiss journal platform E-Periodica. Using a selected corpus of this OCR data, automated methods were tested for OCR correction, for recognizing the names of persons, places and countries, and for linking identified persons to the Gemeinsame Normdatei (GND). Overall, the results were very positive. The system used now serves as the basis for further building up the ETH-Bibliothek's expertise in this field. The entire existing content of the E-Periodica platform is to be enriched automatically and extended with new functionality, with the aim of offering researchers added value when searching for information. This article explains the project's content, methodology and results and outlines the next steps.
Content
Paper presented at the 107th Deutscher Bibliothekartag 2018 in Berlin, topic area "Fokus Erschließen & Bewahren". https://www.o-bib.de/article/view/5382. https://doi.org/10.5282/o-bib/2018H4S181-192.
Form
Zeitungen
Location
CH

Similar documents (author)

  1. Palfrey, J.; Gasser, U.: Generation Internet : die Digital Natives: Wie sie leben - Was sie denken - Wie sie arbeiten (2008) 4.88
    4.8754888 = sum of:
      4.8754888 = weight(author_txt:gasser in 6056) [ClassicSimilarity], result of:
        4.8754888 = fieldWeight in 6056, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.5 = fieldNorm(doc=6056)
    
  2. Stvilia, B.; Gasser, L.: Value-based metadata quality assessment (2008) 4.88
    4.8754888 = sum of:
      4.8754888 = weight(author_txt:gasser in 252) [ClassicSimilarity], result of:
        4.8754888 = fieldWeight in 252, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.5 = fieldNorm(doc=252)
    
  3. Gasser, U.; Thurman, J.: Themen und Herausforderungen der Regulierung von Suchmaschinen (2007) 4.88
    4.8754888 = sum of:
      4.8754888 = weight(author_txt:gasser in 382) [ClassicSimilarity], result of:
        4.8754888 = fieldWeight in 382, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.5 = fieldNorm(doc=382)
    
  4. Stvilia, B.; Gasser, L.; Twidale, M.B.; Smith, L.C.: ¬A framework for information quality assessment (2007) 3.05
    3.0471804 = sum of:
      3.0471804 = weight(author_txt:gasser in 610) [ClassicSimilarity], result of:
        3.0471804 = fieldWeight in 610, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.3125 = fieldNorm(doc=610)
    
  5. Stvilia, B.; Twidale, M.B.; Smith, L.C.; Gasser, L.: Information quality work organization in wikipedia (2008) 3.05
    3.0471804 = sum of:
      3.0471804 = weight(author_txt:gasser in 1859) [ClassicSimilarity], result of:
        3.0471804 = fieldWeight in 1859, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.3125 = fieldNorm(doc=1859)
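
The fieldWeight figures in the author listing above can be checked by hand. What follows is a minimal Python sketch, assuming the usual ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative only and are not part of any Lucene API; fieldNorm is copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming ClassicSimilarity's formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as laid out in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the listing: author_txt:gasser, docFreq=6, maxDocs=44218.
weight_two_authors = field_weight(1.0, 6, 44218, 0.5)      # compare with 4.8754888
weight_four_authors = field_weight(1.0, 6, 44218, 0.3125)  # compare with 3.0471804
print(weight_two_authors, weight_four_authors)
```

The only factor that differs between the entries is fieldNorm, which is smaller for the records with longer author fields (0.3125 for four authors against 0.5 for two); this is why entries 4 and 5 rank below entries 1 to 3 although the term matches once in each.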
    

Similar documents (content)

  1. Bubenhofer, N.: Einführung in die Korpuslinguistik : Praktische Grundlagen und Werkzeuge (2006) 0.09
    0.09313931 = sum of:
      0.09313931 = product of:
        0.77616096 = sum of:
          0.1506626 = weight(abstract_txt:computerlinguistik in 3126) [ClassicSimilarity], result of:
            0.1506626 = score(doc=3126,freq=1.0), product of:
              0.16169898 = queryWeight, product of:
                1.0463418 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.018140681 = queryNorm
              0.9317474 = fieldWeight in 3126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.109375 = fieldNorm(doc=3126)
          0.21679255 = weight(abstract_txt:korpus in 3126) [ClassicSimilarity], result of:
            0.21679255 = score(doc=3126,freq=1.0), product of:
              0.20609456 = queryWeight, product of:
                1.181281 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.018140681 = queryNorm
              1.0519081 = fieldWeight in 3126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.109375 = fieldNorm(doc=3126)
          0.40870583 = weight(abstract_txt:zürich in 3126) [ClassicSimilarity], result of:
            0.40870583 = score(doc=3126,freq=2.0), product of:
              0.31451705 = queryWeight, product of:
                2.0637498 = boost
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.018140681 = queryNorm
              1.2994711 = fieldWeight in 3126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.109375 = fieldNorm(doc=3126)
        0.12 = coord(3/25)
    
  2. Jutzi, U.; Keller, A.: Dissertationen Online an der ETH-Bibliothek Zürich (2001) 0.08
    0.078202054 = sum of:
      0.078202054 = product of:
        0.6516838 = sum of:
          0.11925478 = weight(abstract_txt:schweizer in 5670) [ClassicSimilarity], result of:
            0.11925478 = score(doc=5670,freq=1.0), product of:
              0.15333879 = queryWeight, product of:
                1.0189338 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.018140681 = queryNorm
              0.7777209 = fieldWeight in 5670, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.09375 = fieldNorm(doc=5670)
          0.1033773 = weight(abstract_txt:bibliothek in 5670) [ClassicSimilarity], result of:
            0.1033773 = score(doc=5670,freq=3.0), product of:
              0.12178334 = queryWeight, product of:
                1.2841887 = boost
                5.227637 = idf(docFreq=644, maxDocs=44218)
                0.018140681 = queryNorm
              0.8488624 = fieldWeight in 5670, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.227637 = idf(docFreq=644, maxDocs=44218)
                0.09375 = fieldNorm(doc=5670)
          0.42905176 = weight(abstract_txt:zürich in 5670) [ClassicSimilarity], result of:
            0.42905176 = score(doc=5670,freq=3.0), product of:
              0.31451705 = queryWeight, product of:
                2.0637498 = boost
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.018140681 = queryNorm
              1.3641605 = fieldWeight in 5670, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.09375 = fieldNorm(doc=5670)
        0.12 = coord(3/25)
    
  3. Jensen, N.: Evaluierung von mehrsprachigem Web-Retrieval : Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF) (2006) 0.08
    0.07607569 = sum of:
      0.07607569 = product of:
        0.47547305 = sum of:
          0.11272959 = weight(abstract_txt:erzielt in 5964) [ClassicSimilarity], result of:
            0.11272959 = score(doc=5964,freq=1.0), product of:
              0.14769307 = queryWeight, product of:
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.018140681 = queryNorm
              0.7632693 = fieldWeight in 5964, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.09375 = fieldNorm(doc=5964)
          0.04120407 = weight(abstract_txt:sowie in 5964) [ClassicSimilarity], result of:
            0.04120407 = score(doc=5964,freq=1.0), product of:
              0.09512724 = queryWeight, product of:
                1.1349778 = boost
                4.6202335 = idf(docFreq=1183, maxDocs=44218)
                0.018140681 = queryNorm
              0.4331469 = fieldWeight in 5964, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6202335 = idf(docFreq=1183, maxDocs=44218)
                0.09375 = fieldNorm(doc=5964)
          0.26279226 = weight(abstract_txt:korpus in 5964) [ClassicSimilarity], result of:
            0.26279226 = score(doc=5964,freq=2.0), product of:
              0.20609456 = queryWeight, product of:
                1.181281 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.018140681 = queryNorm
              1.2751052 = fieldWeight in 5964, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.09375 = fieldNorm(doc=5964)
          0.05874711 = weight(abstract_txt:wurden in 5964) [ClassicSimilarity], result of:
            0.05874711 = score(doc=5964,freq=1.0), product of:
              0.12050429 = queryWeight, product of:
                1.2774273 = boost
                5.2001123 = idf(docFreq=662, maxDocs=44218)
                0.018140681 = queryNorm
              0.48751053 = fieldWeight in 5964, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2001123 = idf(docFreq=662, maxDocs=44218)
                0.09375 = fieldNorm(doc=5964)
        0.16 = coord(4/25)
    
  4. Beck, C.: ¬Die Qualität der Fremddatenanreicherung FRED (2021) 0.07
    0.07288071 = sum of:
      0.07288071 = product of:
        0.45550442 = sum of:
          0.02746938 = weight(abstract_txt:sowie in 377) [ClassicSimilarity], result of:
            0.02746938 = score(doc=377,freq=1.0), product of:
              0.09512724 = queryWeight, product of:
                1.1349778 = boost
                4.6202335 = idf(docFreq=1183, maxDocs=44218)
                0.018140681 = queryNorm
              0.2887646 = fieldWeight in 377, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6202335 = idf(docFreq=1183, maxDocs=44218)
                0.0625 = fieldNorm(doc=377)
          0.03916474 = weight(abstract_txt:wurden in 377) [ClassicSimilarity], result of:
            0.03916474 = score(doc=377,freq=1.0), product of:
              0.12050429 = queryWeight, product of:
                1.2774273 = boost
                5.2001123 = idf(docFreq=662, maxDocs=44218)
                0.018140681 = queryNorm
              0.32500702 = fieldWeight in 377, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2001123 = idf(docFreq=662, maxDocs=44218)
                0.0625 = fieldNorm(doc=377)
          0.1553241 = weight(abstract_txt:resultate in 377) [ClassicSimilarity], result of:
            0.1553241 = score(doc=377,freq=1.0), product of:
              0.3019244 = queryWeight, product of:
                2.0220134 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.018140681 = queryNorm
              0.514447 = fieldWeight in 377, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0625 = fieldNorm(doc=377)
          0.2335462 = weight(abstract_txt:zürich in 377) [ClassicSimilarity], result of:
            0.2335462 = score(doc=377,freq=2.0), product of:
              0.31451705 = queryWeight, product of:
                2.0637498 = boost
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.018140681 = queryNorm
              0.74255496 = fieldWeight in 377, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.0625 = fieldNorm(doc=377)
        0.16 = coord(4/25)
    
  5. Buurman, G.M.: Wissenterritorien : ein Werkzeug zur Visualisierung wissenschaftlicher Diskurse (2001) 0.07
    0.06594765 = sum of:
      0.06594765 = product of:
        0.32973826 = sum of:
          0.0753313 = weight(abstract_txt:computerlinguistik in 5889) [ClassicSimilarity], result of:
            0.0753313 = score(doc=5889,freq=1.0), product of:
              0.16169898 = queryWeight, product of:
                1.0463418 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.018140681 = queryNorm
              0.4658737 = fieldWeight in 5889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5889)
          0.024035707 = weight(abstract_txt:sowie in 5889) [ClassicSimilarity], result of:
            0.024035707 = score(doc=5889,freq=1.0), product of:
              0.09512724 = queryWeight, product of:
                1.1349778 = boost
                4.6202335 = idf(docFreq=1183, maxDocs=44218)
                0.018140681 = queryNorm
              0.25266904 = fieldWeight in 5889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6202335 = idf(docFreq=1183, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5889)
          0.03426915 = weight(abstract_txt:wurden in 5889) [ClassicSimilarity], result of:
            0.03426915 = score(doc=5889,freq=1.0), product of:
              0.12050429 = queryWeight, product of:
                1.2774273 = boost
                5.2001123 = idf(docFreq=662, maxDocs=44218)
                0.018140681 = queryNorm
              0.28438115 = fieldWeight in 5889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2001123 = idf(docFreq=662, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5889)
          0.051602732 = weight(abstract_txt:grundlage in 5889) [ClassicSimilarity], result of:
            0.051602732 = score(doc=5889,freq=1.0), product of:
              0.15831259 = queryWeight, product of:
                1.4641739 = boost
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.018140681 = queryNorm
              0.3259547 = fieldWeight in 5889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5889)
          0.14449935 = weight(abstract_txt:zürich in 5889) [ClassicSimilarity], result of:
            0.14449935 = score(doc=5889,freq=1.0), product of:
              0.31451705 = queryWeight, product of:
                2.0637498 = boost
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.018140681 = queryNorm
              0.45943245 = fieldWeight in 5889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5889)
        0.2 = coord(5/25)
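
The content scores above combine more factors than the author scores: each matching term contributes queryWeight * fieldWeight, and the sum is scaled by coord (matched query terms divided by the 25 query terms). The Python sketch below recomputes the score of the first content hit (doc=3126) under the same assumed ClassicSimilarity formulas; the function and variable names are hypothetical, and boost, queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44218          # maxDocs from the listing
QUERY_NORM = 0.018140681  # queryNorm from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming ClassicSimilarity's formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float, matched: int, total_terms: int) -> float:
    """Sum of the term scores, scaled by coord = matched / total_terms."""
    coord = matched / total_terms
    return coord * sum(term_score(b, f, df, field_norm) for b, f, df in terms)


# (boost, freq, docFreq) for the three terms matched in doc=3126.
DOC_3126_TERMS = [
    (1.0463418, 1.0, 23),  # abstract_txt:computerlinguistik
    (1.181281, 1.0, 7),    # abstract_txt:korpus
    (2.0637498, 2.0, 26),  # abstract_txt:zürich
]

score_3126 = document_score(DOC_3126_TERMS, 0.109375, matched=3, total_terms=25)
print(score_3126)  # compare with 0.09313931 in the listing
```

Because coord is only 3/25 to 5/25 for these hits, even documents sharing rare terms such as "korpus" end up with scores below 0.1.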