Document (#36750)

Author
Iorio, A. di
Peroni, S.
Vitali, F.
Title
¬A Semantic Web approach to everyday overlapping markup
Source
Journal of the American Society for Information Science and Technology. 62(2011) no.9, pp. 1696-1716
Year
2011
Abstract
Overlapping structures in XML are not symptoms of a misunderstanding of the intrinsic characteristics of a text document, nor evidence of extreme scholarly requirements far beyond those needed by the most common XML-based applications. On the contrary, overlaps have started to appear in a large number of highly popular applications, hidden under the guise of syntactical tricks applied to the basic hierarchy of the XML data format. Unfortunately, syntactical tricks have the drawback that the affected structures require complicated workarounds to support even the simplest query or usage. In this article, we present Extremely Annotational Resource Description Framework (RDF) Markup (EARMARK), an approach to overlapping markup that simplifies and streamlines the management of multiple hierarchies on the same content, and supports sophisticated queries and usages over such structures without the need for ad hoc applications, simply by using Semantic Web tools and languages. We show that relevant tasks (e.g., identifying the contribution of an author in a word processor document) are substantially complex when using the original data format and become more or less trivial when using EARMARK. Finally, we evaluate the memory and disk requirements of EARMARK documents, which compare favorably with those of the Open Office and Microsoft Word XML-based formats.
Theme
Semantic Web
Knowledge representation
Object
RDF
EARMARK

Similar documents (content)

  1. Shah, U.; Finin, T.; Joshi, A.; Cost, R.S.; Mayfield, J.: Information retrieval on the Semantic Web (2002) 0.10
    0.09597389 = sum of:
      0.09597389 = product of:
        0.47986943 = sum of:
          0.053916756 = weight(abstract_txt:when in 696) [ClassicSimilarity], result of:
            0.053916756 = score(doc=696,freq=3.0), product of:
              0.080042094 = queryWeight, product of:
                1.0629841 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.01815174 = queryNorm
              0.673605 = fieldWeight in 696, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.03449107 = weight(abstract_txt:document in 696) [ClassicSimilarity], result of:
            0.03449107 = score(doc=696,freq=1.0), product of:
              0.08570657 = queryWeight, product of:
                1.0999542 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.01815174 = queryNorm
              0.40243202 = fieldWeight in 696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.05523818 = weight(abstract_txt:semantic in 696) [ClassicSimilarity], result of:
            0.05523818 = score(doc=696,freq=2.0), product of:
              0.09311635 = queryWeight, product of:
                1.146517 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.01815174 = queryNorm
              0.5932168 = fieldWeight in 696, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.034364 = weight(abstract_txt:approach in 696) [ClassicSimilarity], result of:
            0.034364 = score(doc=696,freq=1.0), product of:
              0.09786842 = queryWeight, product of:
                1.4395756 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01815174 = queryNorm
              0.3511245 = fieldWeight in 696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
          0.3018594 = weight(abstract_txt:markup in 696) [ClassicSimilarity], result of:
            0.3018594 = score(doc=696,freq=2.0), product of:
              0.33069927 = queryWeight, product of:
                2.6462436 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.01815174 = queryNorm
              0.91279125 = fieldWeight in 696, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.09375 = fieldNorm(doc=696)
        0.2 = coord(5/25)
    
  2. Spinning the Semantic Web : bringing the World Wide Web to its full potential (2003) 0.09
    0.09104674 = sum of:
      0.09104674 = product of:
        0.37936145 = sum of:
          0.022994045 = weight(abstract_txt:document in 1981) [ClassicSimilarity], result of:
            0.022994045 = score(doc=1981,freq=1.0), product of:
              0.08570657 = queryWeight, product of:
                1.0999542 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.01815174 = queryNorm
              0.26828802 = fieldWeight in 1981, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=1981)
          0.05207906 = weight(abstract_txt:semantic in 1981) [ClassicSimilarity], result of:
            0.05207906 = score(doc=1981,freq=4.0), product of:
              0.09311635 = queryWeight, product of:
                1.146517 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.01815174 = queryNorm
              0.5592902 = fieldWeight in 1981, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=1981)
          0.038433064 = weight(abstract_txt:format in 1981) [ClassicSimilarity], result of:
            0.038433064 = score(doc=1981,freq=1.0), product of:
              0.12070924 = queryWeight, product of:
                1.3053826 = boost
                5.0942993 = idf(docFreq=736, maxDocs=44218)
                0.01815174 = queryNorm
              0.3183937 = fieldWeight in 1981, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0942993 = idf(docFreq=736, maxDocs=44218)
                0.0625 = fieldNorm(doc=1981)
          0.018111106 = weight(abstract_txt:using in 1981) [ClassicSimilarity], result of:
            0.018111106 = score(doc=1981,freq=1.0), product of:
              0.08367536 = queryWeight, product of:
                1.3311039 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.01815174 = queryNorm
              0.21644491 = fieldWeight in 1981, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=1981)
          0.04650459 = weight(abstract_txt:applications in 1981) [ClassicSimilarity], result of:
            0.04650459 = score(doc=1981,freq=1.0), product of:
              0.15690309 = queryWeight, product of:
                1.8227576 = boost
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.01815174 = queryNorm
              0.29639053 = fieldWeight in 1981, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.0625 = fieldNorm(doc=1981)
          0.20123959 = weight(abstract_txt:markup in 1981) [ClassicSimilarity], result of:
            0.20123959 = score(doc=1981,freq=2.0), product of:
              0.33069927 = queryWeight, product of:
                2.6462436 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.01815174 = queryNorm
              0.6085275 = fieldWeight in 1981, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.0625 = fieldNorm(doc=1981)
        0.24 = coord(6/25)
    
  3. Köhler, J.; Philippi, S.; Specht, M.; Rüegg, A.: Ontology based text indexing and querying for the semantic web (2006) 0.09
    0.08979319 = sum of:
      0.08979319 = product of:
        0.3741383 = sum of:
          0.02075257 = weight(abstract_txt:when in 3280) [ClassicSimilarity], result of:
            0.02075257 = score(doc=3280,freq=1.0), product of:
              0.080042094 = queryWeight, product of:
                1.0629841 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.01815174 = queryNorm
              0.2592707 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.02603953 = weight(abstract_txt:semantic in 3280) [ClassicSimilarity], result of:
            0.02603953 = score(doc=3280,freq=1.0), product of:
              0.09311635 = queryWeight, product of:
                1.146517 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.01815174 = queryNorm
              0.2796451 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.06601766 = weight(abstract_txt:word in 3280) [ClassicSimilarity], result of:
            0.06601766 = score(doc=3280,freq=2.0), product of:
              0.13741493 = queryWeight, product of:
                1.3927864 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.01815174 = queryNorm
              0.48042563 = fieldWeight in 3280, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.03239869 = weight(abstract_txt:approach in 3280) [ClassicSimilarity], result of:
            0.03239869 = score(doc=3280,freq=2.0), product of:
              0.09786842 = queryWeight, product of:
                1.4395756 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01815174 = queryNorm
              0.33104333 = fieldWeight in 3280, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.04650459 = weight(abstract_txt:applications in 3280) [ClassicSimilarity], result of:
            0.04650459 = score(doc=3280,freq=1.0), product of:
              0.15690309 = queryWeight, product of:
                1.8227576 = boost
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.01815174 = queryNorm
              0.29639053 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.18242525 = weight(abstract_txt:syntactical in 3280) [ClassicSimilarity], result of:
            0.18242525 = score(doc=3280,freq=1.0), product of:
              0.3409263 = queryWeight, product of:
                2.1938038 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.01815174 = queryNorm
              0.53508705 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
        0.24 = coord(6/25)
    
  4. Cui, H.; Heidorn, P.B.: ¬The reusability of induced knowledge for the automatic semantic markup of taxonomic descriptions (2007) 0.08
    0.08167531 = sum of:
      0.08167531 = product of:
        0.40837657 = sum of:
          0.036825456 = weight(abstract_txt:semantic in 84) [ClassicSimilarity], result of:
            0.036825456 = score(doc=84,freq=2.0), product of:
              0.09311635 = queryWeight, product of:
                1.146517 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.01815174 = queryNorm
              0.39547786 = fieldWeight in 84, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.038433064 = weight(abstract_txt:format in 84) [ClassicSimilarity], result of:
            0.038433064 = score(doc=84,freq=1.0), product of:
              0.12070924 = queryWeight, product of:
                1.3053826 = boost
                5.0942993 = idf(docFreq=736, maxDocs=44218)
                0.01815174 = queryNorm
              0.3183937 = fieldWeight in 84, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0942993 = idf(docFreq=736, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.025612973 = weight(abstract_txt:using in 84) [ClassicSimilarity], result of:
            0.025612973 = score(doc=84,freq=2.0), product of:
              0.08367536 = queryWeight, product of:
                1.3311039 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.01815174 = queryNorm
              0.30609933 = fieldWeight in 84, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.022909334 = weight(abstract_txt:approach in 84) [ClassicSimilarity], result of:
            0.022909334 = score(doc=84,freq=1.0), product of:
              0.09786842 = queryWeight, product of:
                1.4395756 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01815174 = queryNorm
              0.234083 = fieldWeight in 84, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.28459576 = weight(abstract_txt:markup in 84) [ClassicSimilarity], result of:
            0.28459576 = score(doc=84,freq=4.0), product of:
              0.33069927 = queryWeight, product of:
                2.6462436 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.01815174 = queryNorm
              0.86058784 = fieldWeight in 84, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
        0.2 = coord(5/25)
    
  5. Saruladha, K.; Aghila, G.; Penchala, S.K.: Design of new indexing techniques based on ontology for information retrieval systems (2010) 0.08
    0.0814773 = sum of:
      0.0814773 = product of:
        0.33948874 = sum of:
          0.02075257 = weight(abstract_txt:when in 4317) [ClassicSimilarity], result of:
            0.02075257 = score(doc=4317,freq=1.0), product of:
              0.080042094 = queryWeight, product of:
                1.0629841 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.01815174 = queryNorm
              0.2592707 = fieldWeight in 4317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0625 = fieldNorm(doc=4317)
          0.022994045 = weight(abstract_txt:document in 4317) [ClassicSimilarity], result of:
            0.022994045 = score(doc=4317,freq=1.0), product of:
              0.08570657 = queryWeight, product of:
                1.0999542 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.01815174 = queryNorm
              0.26828802 = fieldWeight in 4317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=4317)
          0.02603953 = weight(abstract_txt:semantic in 4317) [ClassicSimilarity], result of:
            0.02603953 = score(doc=4317,freq=1.0), product of:
              0.09311635 = queryWeight, product of:
                1.146517 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.01815174 = queryNorm
              0.2796451 = fieldWeight in 4317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=4317)
          0.040595807 = weight(abstract_txt:requirements in 4317) [ClassicSimilarity], result of:
            0.040595807 = score(doc=4317,freq=1.0), product of:
              0.12519625 = queryWeight, product of:
                1.3294231 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.01815174 = queryNorm
              0.32425737 = fieldWeight in 4317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0625 = fieldNorm(doc=4317)
          0.046681534 = weight(abstract_txt:word in 4317) [ClassicSimilarity], result of:
            0.046681534 = score(doc=4317,freq=1.0), product of:
              0.13741493 = queryWeight, product of:
                1.3927864 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.01815174 = queryNorm
              0.33971223 = fieldWeight in 4317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=4317)
          0.18242525 = weight(abstract_txt:syntactical in 4317) [ClassicSimilarity], result of:
            0.18242525 = score(doc=4317,freq=1.0), product of:
              0.3409263 = queryWeight, product of:
                2.1938038 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.01815174 = queryNorm
              0.53508705 = fieldWeight in 4317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0625 = fieldNorm(doc=4317)
        0.24 = coord(6/25)
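
The score trees above all follow the same arithmetic: each matching term contributes queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), with tf shown as the square root of the term frequency, and the sum is then scaled by the coord factor. The following is a minimal sketch that recomputes the score of result 1 (doc=696) from the numbers printed in its tree. The function and constant names are hypothetical, and the idf formula, 1 + ln(maxDocs / (docFreq + 1)), is an assumption that is not stated in the listing; it was chosen because it reproduces the idf values shown.

```python
import math

# Values copied from the explanation tree for result 1 (doc=696).
MAX_DOCS = 44218
QUERY_NORM = 0.01815174
FIELD_NORM = 0.09375

# (term, boost, docFreq, freq) for each matching query term in doc 696.
TERMS = [
    ("when", 1.0629841, 1897, 3.0),
    ("document", 1.0999542, 1642, 1.0),
    ("semantic", 1.146517, 1369, 2.0),
    ("approach", 1.4395756, 2839, 1.0),
    ("markup", 2.6462436, 122, 2.0),
]


def idf(doc_freq, max_docs):
    # Assumed idf form; it matches the idf values printed in the tree.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(boost, doc_freq, freq, field_norm):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def classic_score(terms, field_norm, matched, total):
    # coord(matched/total) scales the sum of the per-term weights.
    coord = matched / total
    return coord * sum(term_weight(b, df, f, field_norm) for _, b, df, f in terms)


# Result 1 matches 5 of the 25 query terms: coord(5/25).
score = classic_score(TERMS, FIELD_NORM, matched=5, total=25)
print(score)
```

Under these assumptions the printed value should land very close to the 0.09597389 listed for result 1; any small difference would come from the listing's single-precision rounding. The same function can be pointed at the other results by swapping in their boosts, frequencies, fieldNorm (0.0625), and coord (for example 6/25).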