Document (#34482)

Author
Auer, S.
Lehmann, J.
Title
What have Innsbruck and Leipzig in common? : extracting semantics from Wiki content
Source
http://www.informatik.uni-leipzig.de/~auer/publication/ExtractingSemantics.pdf
Year
2007
Abstract
Wikis are an established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating by far the largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems currently in use.
Content
Contribution to WWW 2007.
Theme
Semantic Web
Object
DBpedia
Wikipedia

Similar documents (author)

  1. Auer, S.; Lehmann, J.: Making the Web a data washing machine : creating knowledge out of interlinked data (2010) 5.54
    5.542895 = sum of:
      5.542895 = sum of:
        2.0196323 = weight(author_txt:lehmann in 112) [ClassicSimilarity], result of:
          2.0196323 = score(doc=112,freq=1.0), product of:
            0.567957 = queryWeight, product of:
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.07985987 = queryNorm
            3.55596 = fieldWeight in 112, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.5 = fieldNorm(doc=112)
        3.5232625 = weight(author_txt:auer in 112) [ClassicSimilarity], result of:
          3.5232625 = score(doc=112,freq=1.0), product of:
            0.82305825 = queryWeight, product of:
              1.2038089 = boost
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.07985987 = queryNorm
            4.2806964 = fieldWeight in 112, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.5 = fieldNorm(doc=112)
    
  2. Auer, S.; Lehmann, J.; Bizer, C.: Semantische Mashups auf Basis Vernetzter Daten (2009) 4.16
    4.1571712 = sum of:
      4.1571712 = sum of:
        1.5147243 = weight(author_txt:lehmann in 4868) [ClassicSimilarity], result of:
          1.5147243 = score(doc=4868,freq=1.0), product of:
            0.567957 = queryWeight, product of:
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.07985987 = queryNorm
            2.66697 = fieldWeight in 4868, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.375 = fieldNorm(doc=4868)
        2.6424468 = weight(author_txt:auer in 4868) [ClassicSimilarity], result of:
          2.6424468 = score(doc=4868,freq=1.0), product of:
            0.82305825 = queryWeight, product of:
              1.2038089 = boost
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.07985987 = queryNorm
            3.2105222 = fieldWeight in 4868, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.375 = fieldNorm(doc=4868)
    
  3. Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; Ives, Z.: DBpedia: a nucleus for a Web of open data (2007) 2.77
    2.7714474 = sum of:
      2.7714474 = sum of:
        1.0098162 = weight(author_txt:lehmann in 4260) [ClassicSimilarity], result of:
          1.0098162 = score(doc=4260,freq=1.0), product of:
            0.567957 = queryWeight, product of:
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.07985987 = queryNorm
            1.77798 = fieldWeight in 4260, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.25 = fieldNorm(doc=4260)
        1.7616313 = weight(author_txt:auer in 4260) [ClassicSimilarity], result of:
          1.7616313 = score(doc=4260,freq=1.0), product of:
            0.82305825 = queryWeight, product of:
              1.2038089 = boost
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.07985987 = queryNorm
            2.1403482 = fieldWeight in 4260, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.25 = fieldNorm(doc=4260)
    
  4. Bizer, C.; Lehmann, J.; Kobilarov, G.; Auer, S.; Becker, C.; Cyganiak, R.; Hellmann, S.: DBpedia: a crystallization point for the Web of Data. (2009) 2.77
    2.7714474 = sum of:
      2.7714474 = sum of:
        1.0098162 = weight(author_txt:lehmann in 1643) [ClassicSimilarity], result of:
          1.0098162 = score(doc=1643,freq=1.0), product of:
            0.567957 = queryWeight, product of:
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.07985987 = queryNorm
            1.77798 = fieldWeight in 1643, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.25 = fieldNorm(doc=1643)
        1.7616313 = weight(author_txt:auer in 1643) [ClassicSimilarity], result of:
          1.7616313 = score(doc=1643,freq=1.0), product of:
            0.82305825 = queryWeight, product of:
              1.2038089 = boost
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.07985987 = queryNorm
            2.1403482 = fieldWeight in 1643, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.25 = fieldNorm(doc=1643)
    
  5. Auer, G.: Sitzung der EG Online-Kataloge in Heidelberg (1993) 2.20
    2.202039 = sum of:
      2.202039 = product of:
        4.404078 = sum of:
          4.404078 = weight(author_txt:auer in 4692) [ClassicSimilarity], result of:
            4.404078 = score(doc=4692,freq=1.0), product of:
              0.82305825 = queryWeight, product of:
                1.2038089 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.07985987 = queryNorm
              5.3508706 = fieldWeight in 4692, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.625 = fieldNorm(doc=4692)
        0.5 = coord(1/2)
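
The arithmetic in the author listings above can be checked by hand. The following is a minimal sketch, not the catalog's own code: it recomputes the score of the first entry (doc=112) from the listed factors. The function names idf and term_score are hypothetical, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption inferred from the listed idf values; tf is taken as the square root of the term frequency, as the listings for freq=2.0 and freq=3.0 in the content section suggest.

```python
import math

MAX_DOCS = 44218          # maxDocs, as listed
QUERY_NORM = 0.07985987   # queryNorm, copied from the author listings


def idf(doc_freq, max_docs=MAX_DOCS):
    """Assumed idf formula; it reproduces the listed idf values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """score = queryWeight * fieldWeight, following the listing layout."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM            # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Entry 1 (doc=112): both author terms match once, fieldNorm = 0.5.
lehmann = term_score(freq=1.0, doc_freq=97, field_norm=0.5)
auer = term_score(freq=1.0, doc_freq=22, field_norm=0.5, boost=1.2038089)
total = lehmann + auer

print(f"lehmann={lehmann:.4f} auer={auer:.4f} total={total:.4f}")
```

Under these assumptions the sum agrees with the listed 5.542895 up to rounding; small differences in the last digits are expected because the listings appear to use single-precision values. Entries 2 to 4 differ only in fieldNorm (0.375 and 0.25), which is why their scores shrink in the same proportion.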
    

Similar documents (content)

  1. Shah, U.; Finin, T.; Joshi, A.; Cost, R.S.; Mayfield, J.: Information retrieval on the Semantic Web (2002) 0.12
    0.12353493 = sum of:
      0.12353493 = product of:
        0.61767465 = sum of:
          0.06784517 = weight(abstract_txt:query in 696) [ClassicSimilarity], result of:
            0.06784517 = score(doc=696,freq=2.0), product of:
              0.10764529 = queryWeight, product of:
                1.3499401 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01677424 = queryNorm
              0.6302661 = fieldWeight in 696, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.1452156 = weight(abstract_txt:semantically in 696) [ClassicSimilarity], result of:
            0.1452156 = score(doc=696,freq=1.0), product of:
              0.22525162 = queryWeight, product of:
                1.9527693 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.01677424 = queryNorm
              0.64468175 = fieldWeight in 696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.17047706 = weight(abstract_txt:enriched in 696) [ClassicSimilarity], result of:
            0.17047706 = score(doc=696,freq=1.0), product of:
              0.25067037 = queryWeight, product of:
                2.060006 = boost
                7.2542357 = idf(docFreq=84, maxDocs=44218)
                0.01677424 = queryNorm
              0.6800846 = fieldWeight in 696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2542357 = idf(docFreq=84, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.15260412 = weight(abstract_txt:structured in 696) [ClassicSimilarity], result of:
            0.15260412 = score(doc=696,freq=2.0), product of:
              0.21153894 = queryWeight, product of:
                2.317703 = boost
                5.4411373 = idf(docFreq=520, maxDocs=44218)
                0.01677424 = queryNorm
              0.72139966 = fieldWeight in 696, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4411373 = idf(docFreq=520, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.081532694 = weight(abstract_txt:content in 696) [ClassicSimilarity], result of:
            0.081532694 = score(doc=696,freq=1.0), product of:
              0.2080624 = queryWeight, product of:
                2.9674525 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.01677424 = queryNorm
              0.39186656 = fieldWeight in 696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
        0.2 = coord(5/25)
    
  2. Ebersbach, A.; Glaser, M.; Heigl, R.; Warta, A.: Wiki : Kooperation im Web (2008) 0.12
    0.117986254 = sum of:
      0.117986254 = product of:
        0.7374141 = sum of:
          0.28511763 = weight(abstract_txt:wikis in 651) [ClassicSimilarity], result of:
            0.28511763 = score(doc=651,freq=3.0), product of:
              0.27653843 = queryWeight, product of:
                2.1636884 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.01677424 = queryNorm
              1.0310235 = fieldWeight in 651, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.078125 = fieldNorm(doc=651)
          0.1374335 = weight(abstract_txt:wikipedia in 651) [ClassicSimilarity], result of:
            0.1374335 = score(doc=651,freq=1.0), product of:
              0.28067604 = queryWeight, product of:
                2.6697173 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.01677424 = queryNorm
              0.48965168 = fieldWeight in 651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.078125 = fieldNorm(doc=651)
          0.067943916 = weight(abstract_txt:content in 651) [ClassicSimilarity], result of:
            0.067943916 = score(doc=651,freq=1.0), product of:
              0.2080624 = queryWeight, product of:
                2.9674525 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.01677424 = queryNorm
              0.3265555 = fieldWeight in 651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.078125 = fieldNorm(doc=651)
          0.24691913 = weight(abstract_txt:wiki in 651) [ClassicSimilarity], result of:
            0.24691913 = score(doc=651,freq=1.0), product of:
              0.41480768 = queryWeight, product of:
                3.2455328 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.01677424 = queryNorm
              0.5952617 = fieldWeight in 651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.078125 = fieldNorm(doc=651)
        0.16 = coord(4/25)
    
  3. Mehler, A.; Waltinger, U.: Automatic enrichment of metadata (2009) 0.12
    0.11572404 = sum of:
      0.11572404 = product of:
        0.5786202 = sum of:
          0.055969413 = weight(abstract_txt:query in 4840) [ClassicSimilarity], result of:
            0.055969413 = score(doc=4840,freq=1.0), product of:
              0.10764529 = queryWeight, product of:
                1.3499401 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01677424 = queryNorm
              0.519943 = fieldWeight in 4840, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.109375 = fieldNorm(doc=4840)
          0.06570419 = weight(abstract_txt:means in 4840) [ClassicSimilarity], result of:
            0.06570419 = score(doc=4840,freq=1.0), product of:
              0.11979073 = queryWeight, product of:
                1.424061 = boost
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.01677424 = queryNorm
              0.5484914 = fieldWeight in 4840, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.109375 = fieldNorm(doc=4840)
          0.16941822 = weight(abstract_txt:semantically in 4840) [ClassicSimilarity], result of:
            0.16941822 = score(doc=4840,freq=1.0), product of:
              0.22525162 = queryWeight, product of:
                1.9527693 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.01677424 = queryNorm
              0.7521287 = fieldWeight in 4840, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.109375 = fieldNorm(doc=4840)
          0.1924069 = weight(abstract_txt:wikipedia in 4840) [ClassicSimilarity], result of:
            0.1924069 = score(doc=4840,freq=1.0), product of:
              0.28067604 = queryWeight, product of:
                2.6697173 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.01677424 = queryNorm
              0.68551236 = fieldWeight in 4840, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.109375 = fieldNorm(doc=4840)
          0.09512148 = weight(abstract_txt:content in 4840) [ClassicSimilarity], result of:
            0.09512148 = score(doc=4840,freq=1.0), product of:
              0.2080624 = queryWeight, product of:
                2.9674525 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.01677424 = queryNorm
              0.45717767 = fieldWeight in 4840, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.109375 = fieldNorm(doc=4840)
        0.2 = coord(5/25)
    
  4. Ehlen, D.: Semantic Wiki : Konzeption eines Semantic MediaWiki für das Reallexikon zur Deutschen Kunstgeschichte (2010) 0.11
    0.11489868 = sum of:
      0.11489868 = product of:
        0.957489 = sum of:
          0.27935708 = weight(abstract_txt:wikis in 3689) [ClassicSimilarity], result of:
            0.27935708 = score(doc=3689,freq=2.0), product of:
              0.27653843 = queryWeight, product of:
                2.1636884 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.01677424 = queryNorm
              1.0101926 = fieldWeight in 3689, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.09375 = fieldNorm(doc=3689)
          0.16492018 = weight(abstract_txt:wikipedia in 3689) [ClassicSimilarity], result of:
            0.16492018 = score(doc=3689,freq=1.0), product of:
              0.28067604 = queryWeight, product of:
                2.6697173 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.01677424 = queryNorm
              0.587582 = fieldWeight in 3689, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.09375 = fieldNorm(doc=3689)
          0.5132117 = weight(abstract_txt:wiki in 3689) [ClassicSimilarity], result of:
            0.5132117 = score(doc=3689,freq=3.0), product of:
              0.41480768 = queryWeight, product of:
                3.2455328 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.01677424 = queryNorm
              1.2372282 = fieldWeight in 3689, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.09375 = fieldNorm(doc=3689)
        0.12 = coord(3/25)
    
  5. Fuchs, U.: Freie Inhalte? : Idee und Realisierung am Beispiel der Wikipedia (2006) 0.11
    0.105105 = sum of:
      0.105105 = product of:
        0.875875 = sum of:
          0.4561882 = weight(abstract_txt:wikis in 5698) [ClassicSimilarity], result of:
            0.4561882 = score(doc=5698,freq=3.0), product of:
              0.27653843 = queryWeight, product of:
                2.1636884 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.01677424 = queryNorm
              1.6496376 = fieldWeight in 5698, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.125 = fieldNorm(doc=5698)
          0.31097648 = weight(abstract_txt:wikipedia in 5698) [ClassicSimilarity], result of:
            0.31097648 = score(doc=5698,freq=2.0), product of:
              0.28067604 = queryWeight, product of:
                2.6697173 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.01677424 = queryNorm
              1.1079552 = fieldWeight in 5698, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.125 = fieldNorm(doc=5698)
          0.10871027 = weight(abstract_txt:content in 5698) [ClassicSimilarity], result of:
            0.10871027 = score(doc=5698,freq=1.0), product of:
              0.2080624 = queryWeight, product of:
                2.9674525 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.01677424 = queryNorm
              0.5224888 = fieldWeight in 5698, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.125 = fieldNorm(doc=5698)
        0.12 = coord(3/25)
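
The content listings add one step to the per-term arithmetic: the summed term scores are multiplied by a coord factor, the share of query terms that matched. The sketch below recomputes entry 5 (doc=5698) under stated assumptions: the helper term_score and the constant QUERY_TERMS are hypothetical names, the value 25 is read off the coord(3/25) line, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is inferred from the listed values rather than taken from the catalog software.

```python
import math

MAX_DOCS = 44218          # maxDocs, as listed
QUERY_NORM = 0.01677424   # queryNorm, copied from the content listings
QUERY_TERMS = 25          # denominator of coord(3/25)
FIELD_NORM = 0.125        # fieldNorm(doc=5698)


def term_score(freq, doc_freq, boost, field_norm=FIELD_NORM):
    """queryWeight * fieldWeight for one matching term."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))  # assumed idf formula
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


# (term, freq, docFreq, boost) as listed for doc=5698.
matches = [
    ("wikis", 3.0, 58, 2.1636884),
    ("wikipedia", 2.0, 227, 2.6697173),
    ("content", 1.0, 1838, 2.9674525),
]

raw = sum(term_score(freq, doc_freq, boost) for _, freq, doc_freq, boost in matches)
coord = len(matches) / QUERY_TERMS  # 3 of 25 query terms matched
final = coord * raw

print(f"raw={raw:.4f} coord={coord:.2f} final={final:.4f}")
```

With these inputs the raw sum comes out close to the listed 0.875875 and the final score close to 0.105105. The coord factor explains why a record with strong matches on only three terms can rank below one with weaker matches on five.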