Document (#34649)

Author
Chen, M.
Liu, X.
Qin, J.
Title
Semantic relation extraction from socially-generated tags : a methodology for metadata generation
Source
Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
Imprint
Göttingen : Univ.-Verl.
Year
2008
Pages
pp. 117-127
Abstract
The growing predominance of social semantics in the form of tagging presents the metadata community with both opportunities and challenges in leveraging this new form of information content representation and retrieval. One key challenge is the absence of contextual information associated with these tags. This paper presents an experiment working with Flickr tags as an example of utilizing social semantics sources for enriching subject metadata. The procedure included four steps: 1) collecting a sample of Flickr tags, 2) calculating co-occurrences between tags through mutual information, 3) tracing contextual information of tag pairs via Google search results, 4) applying natural language processing and machine learning techniques to extract semantic relations between tags. The experiment helped us to build a context sentence collection from the Google search results, which was then processed by natural language processing and machine learning algorithms. This new approach achieved a reasonably good rate of accuracy in assigning semantic relations to tag pairs. This paper also explores the implications of this approach for using social semantics to enrich subject metadata.
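Step 2 of the procedure in the abstract can be illustrated with a minimal sketch. The abstract does not state which mutual information formula the authors used, so the sketch assumes pointwise mutual information (PMI) over tag sets, where each set holds the tags of one photo. The function name `pointwise_mutual_information` and the toy tag sets are hypothetical and are not data from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def pointwise_mutual_information(tag_sets):
    """Return PMI (base 2) for every tag pair that co-occurs at least once.

    Assumption: PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with
    probabilities estimated as relative frequencies over the tag sets.
    """
    n = len(tag_sets)
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in tag_sets:
        unique = sorted(set(tags))  # sort so each pair has one canonical order
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return {
        (a, b): math.log2((c / n) / ((tag_counts[a] / n) * (tag_counts[b] / n)))
        for (a, b), c in pair_counts.items()
    }


# Invented toy data: one list of tags per photo.
toy_photos = [
    ["beach", "sunset", "sea"],
    ["beach", "sea"],
    ["city", "night"],
    ["sunset", "city"],
]

pmi = pointwise_mutual_information(toy_photos)
for pair, value in sorted(pmi.items(), key=lambda item: -item[1]):
    print(pair, round(value, 3))
```

Pairs with a high PMI co-occur more often than their individual frequencies would suggest, which makes them plausible candidates for the context tracing described in step 3.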
Content
See: http://dcpapers.dublincore.org/ojs/pubs/article/view/924/920.
Theme
Social tagging

Similar documents (author)

  1. Chen, Y.N.; Chen, S.J.: ¬A metadata practice of the IFLA FRBR model : a case study for the National Palace Museum in Taipei (2004) 4.35
    4.3499155 = sum of:
      4.3499155 = weight(author_txt:chen in 3384) [ClassicSimilarity], result of:
        4.3499155 = fieldWeight in 3384, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.5 = fieldNorm(doc=3384)
    
  2. Chen, C.C.; Chen, H.H.; Chen, K.H.: ¬The design of the XML/Metadata management system (2000) 4.00
    3.9956524 = sum of:
      3.9956524 = weight(author_txt:chen in 4633) [ClassicSimilarity], result of:
        3.9956524 = fieldWeight in 4633, product of:
          1.7320508 = tf(freq=3.0), with freq of:
            3.0 = termFreq=3.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.375 = fieldNorm(doc=4633)
    
  3. Chen, W.Y.: Observations on cataloguing and classification (1991) 3.84
    3.8448186 = sum of:
      3.8448186 = weight(author_txt:chen in 4184) [ClassicSimilarity], result of:
        3.8448186 = fieldWeight in 4184, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.625 = fieldNorm(doc=4184)
    
  4. Chen, H.: Knowledge-based document retrieval : framework and design (1992) 3.84
    3.8448186 = sum of:
      3.8448186 = weight(author_txt:chen in 5283) [ClassicSimilarity], result of:
        3.8448186 = fieldWeight in 5283, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.625 = fieldNorm(doc=5283)
    
  5. Chen, P.S.: On inference rules of logic-based information retrieval systems (1994) 3.84
    3.8448186 = sum of:
      3.8448186 = weight(author_txt:chen in 6731) [ClassicSimilarity], result of:
        3.8448186 = fieldWeight in 6731, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.625 = fieldNorm(doc=6731)
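
The author scores above can be reproduced from the numbers in the listing. The sketch below assumes the standard Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes the fieldNorm values directly from the listing rather than deriving them. The function names are illustrative only.

```python
import math


def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # ClassicSimilarity term frequency: square root of the raw count.
    return math.sqrt(freq)


def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# (doc id, termFreq, fieldNorm) copied from the listing; docFreq=255, maxDocs=44218.
for doc, freq, norm in [(3384, 2.0, 0.5), (4633, 3.0, 0.375), (4184, 1.0, 0.625)]:
    print(doc, field_weight(freq, doc_freq=255, max_docs=44218, field_norm=norm))
```

The results agree with the listed 4.3499155, 3.9956524 and 3.8448186 up to rounding, since Lucene computes in 32-bit floats and Python in 64-bit.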
    

Similar documents (content)

  1. Huang, H.; Jörgensen, C.: Characterizing user tagging and Co-occurring metadata in general and specialized metadata collections (2013) 0.29
    0.28979644 = sum of:
      0.28979644 = product of:
        1.4489822 = sum of:
          0.0085142655 = weight(abstract_txt:information in 1046) [ClassicSimilarity], result of:
            0.0085142655 = score(doc=1046,freq=1.0), product of:
              0.05627066 = queryWeight, product of:
                1.2054517 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.019281775 = queryNorm
              0.15130915 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.010538605 = weight(abstract_txt:this in 1046) [ClassicSimilarity], result of:
            0.010538605 = score(doc=1046,freq=1.0), product of:
              0.06987835 = queryWeight, product of:
                1.5018796 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.019281775 = queryNorm
              0.1508136 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.4783798 = weight(abstract_txt:flickr in 1046) [ClassicSimilarity], result of:
            0.4783798 = score(doc=1046,freq=10.0), product of:
              0.3041042 = queryWeight, product of:
                1.98155 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.019281775 = queryNorm
              1.5730785 = fieldWeight in 1046, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.19739175 = weight(abstract_txt:metadata in 1046) [ClassicSimilarity], result of:
            0.19739175 = score(doc=1046,freq=8.0), product of:
              0.22875637 = queryWeight, product of:
                2.4304988 = boost
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.019281775 = queryNorm
              0.8628907 = fieldWeight in 1046, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.7541578 = weight(abstract_txt:tags in 1046) [ClassicSimilarity], result of:
            0.7541578 = score(doc=1046,freq=11.0), product of:
              0.57551706 = queryWeight, product of:
                4.7215385 = boost
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.019281775 = queryNorm
              1.3104004 = fieldWeight in 1046, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
        0.2 = coord(5/25)
    
  2. Wang, Y.; Tai, Y.; Yang, Y.: Determination of semantic types of tags in social tagging systems (2018) 0.27
    0.2683704 = sum of:
      0.2683704 = product of:
        0.9584657 = sum of:
          0.02743159 = weight(abstract_txt:language in 4648) [ClassicSimilarity], result of:
            0.02743159 = score(doc=4648,freq=1.0), product of:
              0.08395912 = queryWeight, product of:
                1.0411847 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.019281775 = queryNorm
              0.32672557 = fieldWeight in 4648, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=4648)
          0.010642831 = weight(abstract_txt:information in 4648) [ClassicSimilarity], result of:
            0.010642831 = score(doc=4648,freq=1.0), product of:
              0.05627066 = queryWeight, product of:
                1.2054517 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.019281775 = queryNorm
              0.18913643 = fieldWeight in 4648, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=4648)
          0.063985 = weight(abstract_txt:relations in 4648) [ClassicSimilarity], result of:
            0.063985 = score(doc=4648,freq=1.0), product of:
              0.14766786 = queryWeight, product of:
                1.3808192 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.019281775 = queryNorm
              0.43330348 = fieldWeight in 4648, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.078125 = fieldNorm(doc=4648)
          0.018629797 = weight(abstract_txt:this in 4648) [ClassicSimilarity], result of:
            0.018629797 = score(doc=4648,freq=2.0), product of:
              0.06987835 = queryWeight, product of:
                1.5018796 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.019281775 = queryNorm
              0.2666033 = fieldWeight in 4648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=4648)
          0.05968569 = weight(abstract_txt:social in 4648) [ClassicSimilarity], result of:
            0.05968569 = score(doc=4648,freq=2.0), product of:
              0.12808584 = queryWeight, product of:
                1.5750343 = boost
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.019281775 = queryNorm
              0.46598196 = fieldWeight in 4648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.078125 = fieldNorm(doc=4648)
          0.14252448 = weight(abstract_txt:semantic in 4648) [ClassicSimilarity], result of:
            0.14252448 = score(doc=4648,freq=8.0), product of:
              0.14415419 = queryWeight, product of:
                1.6709101 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.019281775 = queryNorm
              0.98869467 = fieldWeight in 4648, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=4648)
          0.6355663 = weight(abstract_txt:tags in 4648) [ClassicSimilarity], result of:
            0.6355663 = score(doc=4648,freq=5.0), product of:
              0.57551706 = queryWeight, product of:
                4.7215385 = boost
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.019281775 = queryNorm
              1.1043396 = fieldWeight in 4648, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.078125 = fieldNorm(doc=4648)
        0.28 = coord(7/25)
    
  3. Park, M.S.: Understanding characteristics of semantic associations in health consumer generated knowledge representation in social media (2019) 0.25
    0.25195917 = sum of:
      0.25195917 = product of:
        0.8998542 = sum of:
          0.021945274 = weight(abstract_txt:language in 5413) [ClassicSimilarity], result of:
            0.021945274 = score(doc=5413,freq=1.0), product of:
              0.08395912 = queryWeight, product of:
                1.0411847 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.019281775 = queryNorm
              0.26138046 = fieldWeight in 5413, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=5413)
          0.051187996 = weight(abstract_txt:relations in 5413) [ClassicSimilarity], result of:
            0.051187996 = score(doc=5413,freq=1.0), product of:
              0.14766786 = queryWeight, product of:
                1.3808192 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.019281775 = queryNorm
              0.3466428 = fieldWeight in 5413, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0625 = fieldNorm(doc=5413)
          0.014903838 = weight(abstract_txt:this in 5413) [ClassicSimilarity], result of:
            0.014903838 = score(doc=5413,freq=2.0), product of:
              0.06987835 = queryWeight, product of:
                1.5018796 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.019281775 = queryNorm
              0.21328263 = fieldWeight in 5413, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=5413)
          0.033763327 = weight(abstract_txt:social in 5413) [ClassicSimilarity], result of:
            0.033763327 = score(doc=5413,freq=1.0), product of:
              0.12808584 = queryWeight, product of:
                1.5750343 = boost
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.019281775 = queryNorm
              0.26359922 = fieldWeight in 5413, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.0625 = fieldNorm(doc=5413)
          0.10665555 = weight(abstract_txt:semantic in 5413) [ClassicSimilarity], result of:
            0.10665555 = score(doc=5413,freq=7.0), product of:
              0.14415419 = queryWeight, product of:
                1.6709101 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.019281775 = queryNorm
              0.7398713 = fieldWeight in 5413, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=5413)
          0.06978852 = weight(abstract_txt:metadata in 5413) [ClassicSimilarity], result of:
            0.06978852 = score(doc=5413,freq=1.0), product of:
              0.22875637 = queryWeight, product of:
                2.4304988 = boost
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.019281775 = queryNorm
              0.30507794 = fieldWeight in 5413, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.0625 = fieldNorm(doc=5413)
          0.6016097 = weight(abstract_txt:tags in 5413) [ClassicSimilarity], result of:
            0.6016097 = score(doc=5413,freq=7.0), product of:
              0.57551706 = queryWeight, product of:
                4.7215385 = boost
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.019281775 = queryNorm
              1.0453378 = fieldWeight in 5413, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.0625 = fieldNorm(doc=5413)
        0.28 = coord(7/25)
    
  4. Catarino, M.E.; Baptista, A.A.: Relating folksonomies with Dublin Core (2008) 0.24
    0.23632316 = sum of:
      0.23632316 = product of:
        0.8440113 = sum of:
          0.023863504 = weight(abstract_txt:presents in 2652) [ClassicSimilarity], result of:
            0.023863504 = score(doc=2652,freq=1.0), product of:
              0.08878304 = queryWeight, product of:
                1.0706779 = boost
                4.300552 = idf(docFreq=1629, maxDocs=44218)
                0.019281775 = queryNorm
              0.2687845 = fieldWeight in 2652, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.300552 = idf(docFreq=1629, maxDocs=44218)
                0.0625 = fieldNorm(doc=2652)
          0.0085142655 = weight(abstract_txt:information in 2652) [ClassicSimilarity], result of:
            0.0085142655 = score(doc=2652,freq=1.0), product of:
              0.05627066 = queryWeight, product of:
                1.2054517 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.019281775 = queryNorm
              0.15130915 = fieldWeight in 2652, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2652)
          0.010538605 = weight(abstract_txt:this in 2652) [ClassicSimilarity], result of:
            0.010538605 = score(doc=2652,freq=1.0), product of:
              0.06987835 = queryWeight, product of:
                1.5018796 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.019281775 = queryNorm
              0.1508136 = fieldWeight in 2652, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=2652)
          0.04774855 = weight(abstract_txt:social in 2652) [ClassicSimilarity], result of:
            0.04774855 = score(doc=2652,freq=2.0), product of:
              0.12808584 = queryWeight, product of:
                1.5750343 = boost
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.019281775 = queryNorm
              0.37278557 = fieldWeight in 2652, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.0625 = fieldNorm(doc=2652)
          0.04031201 = weight(abstract_txt:semantic in 2652) [ClassicSimilarity], result of:
            0.04031201 = score(doc=2652,freq=1.0), product of:
              0.14415419 = queryWeight, product of:
                1.6709101 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.019281775 = queryNorm
              0.2796451 = fieldWeight in 2652, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=2652)
          0.15605189 = weight(abstract_txt:metadata in 2652) [ClassicSimilarity], result of:
            0.15605189 = score(doc=2652,freq=5.0), product of:
              0.22875637 = queryWeight, product of:
                2.4304988 = boost
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.019281775 = queryNorm
              0.68217504 = fieldWeight in 2652, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.0625 = fieldNorm(doc=2652)
          0.55698246 = weight(abstract_txt:tags in 2652) [ClassicSimilarity], result of:
            0.55698246 = score(doc=2652,freq=6.0), product of:
              0.57551706 = queryWeight, product of:
                4.7215385 = boost
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.019281775 = queryNorm
              0.96779484 = fieldWeight in 2652, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.0625 = fieldNorm(doc=2652)
        0.28 = coord(7/25)
    
  5. Syn, S.Y.; Spring, M.B.: Finding subject terms for classificatory metadata from user-generated social tags (2013) 0.23
    0.23434383 = sum of:
      0.23434383 = product of:
        0.9764326 = sum of:
          0.035990734 = weight(abstract_txt:processing in 745) [ClassicSimilarity], result of:
            0.035990734 = score(doc=745,freq=1.0), product of:
              0.11676186 = queryWeight, product of:
                1.2278472 = boost
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.019281775 = queryNorm
              0.3082405 = fieldWeight in 745, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.0625 = fieldNorm(doc=745)
          0.010538605 = weight(abstract_txt:this in 745) [ClassicSimilarity], result of:
            0.010538605 = score(doc=745,freq=1.0), product of:
              0.06987835 = queryWeight, product of:
                1.5018796 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.019281775 = queryNorm
              0.1508136 = fieldWeight in 745, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=745)
          0.0754971 = weight(abstract_txt:social in 745) [ClassicSimilarity], result of:
            0.0754971 = score(doc=745,freq=5.0), product of:
              0.12808584 = queryWeight, product of:
                1.5750343 = boost
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.019281775 = queryNorm
              0.5894258 = fieldWeight in 745, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.2175875 = idf(docFreq=1770, maxDocs=44218)
                0.0625 = fieldNorm(doc=745)
          0.04031201 = weight(abstract_txt:semantic in 745) [ClassicSimilarity], result of:
            0.04031201 = score(doc=745,freq=1.0), product of:
              0.14415419 = queryWeight, product of:
                1.6709101 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.019281775 = queryNorm
              0.2796451 = fieldWeight in 745, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=745)
          0.17094627 = weight(abstract_txt:metadata in 745) [ClassicSimilarity], result of:
            0.17094627 = score(doc=745,freq=6.0), product of:
              0.22875637 = queryWeight, product of:
                2.4304988 = boost
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.019281775 = queryNorm
              0.7472853 = fieldWeight in 745, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.0625 = fieldNorm(doc=745)
          0.6431479 = weight(abstract_txt:tags in 745) [ClassicSimilarity], result of:
            0.6431479 = score(doc=745,freq=8.0), product of:
              0.57551706 = queryWeight, product of:
                4.7215385 = boost
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.019281775 = queryNorm
              1.1175132 = fieldWeight in 745, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.321609 = idf(docFreq=215, maxDocs=44218)
                0.0625 = fieldNorm(doc=745)
        0.24 = coord(6/25)
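
The content scores combine a query-side weight, a document-side weight and a coordination factor. As a sketch under the same ClassicSimilarity assumptions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), the following recomputes the 0.29 score of the first content hit (doc 1046). The boost, queryNorm and fieldNorm values are copied from the listing, not derived, and the function names are illustrative only.

```python
import math

QUERY_NORM = 0.019281775  # queryNorm as shown in the listing
MAX_DOCS = 44218


def idf(doc_freq, max_docs=MAX_DOCS):
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm            # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight


def document_score(terms, field_norm, matched, query_terms):
    # Sum of per-term scores, scaled by coord(matched / query_terms).
    total = sum(term_score(f, df, b, field_norm) for _, f, df, b in terms)
    return total * (matched / query_terms)


# (term, termFreq, docFreq, boost) for doc 1046, fieldNorm=0.0625.
terms_1046 = [
    ("information", 1.0, 10677, 1.2054517),
    ("this", 1.0, 10762, 1.5018796),
    ("flickr", 10.0, 41, 1.98155),
    ("metadata", 8.0, 911, 2.4304988),
    ("tags", 11.0, 215, 4.7215385),
]

score_1046 = document_score(terms_1046, 0.0625, matched=5, query_terms=25)
print(score_1046)
```

The result agrees with the listed 0.28979644 up to floating-point rounding. The coord factor explains why a document matching 5 of the 25 query terms can outrank one matching 7: the rare, heavily boosted terms "flickr" and "tags" dominate the sum.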