Document (#33267)

Author
Shirky, C.
Title
Ontology is overrated : categories, links, and tags
Source
http://www.shirky.com/writings/ontology_overrated.html
Year
2005
Series
Clay Shirky's writings about the Internet
Abstract
Today I want to talk about categorization, and I want to convince you that a lot of what we think we know about categorization is wrong. In particular, I want to convince you that many of the ways we're attempting to apply categorization to the electronic world are actually a bad fit, because we've adopted habits of mind that are left over from earlier strategies. I also want to convince you that what we're seeing when we see the Web is actually a radical break with previous categorization strategies, rather than an extension of them. The second part of the talk is more speculative, because it is often the case that old systems get broken before people know what's going to take their place. (Anyone watching the music industry can see this at work today.) That's what I think is happening with categorization. What I think is coming instead are much more organic ways of organizing information than our current categorization schemes allow, based on two units -- the link, which can point to anything, and the tag, which is a way of attaching labels to links. The strategy of tagging -- free-form labeling, without regard to categorical constraints -- seems like a recipe for disaster, but as the Web has shown us, you can extract a surprising amount of value from big messy data sets.
Footnote
This piece is based on two talks I gave in the spring of 2005 -- one at the O'Reilly ETech conference in March, entitled "Ontology Is Overrated", and one at the IMCExpo in April entitled "Folksonomies & Tags: The rise of user-developed classification." The written version is a heavily edited concatenation of those two talks.
Theme
Folksonomies
Social tagging

Similar documents (content)

  1. Pioneers in library and information science (2004) 0.11
    0.11423425 = sum of:
      0.11423425 = product of:
        0.40797946 = sum of:
          0.07607808 = weight(abstract_txt:happening in 2025) [ClassicSimilarity], result of:
            0.07607808 = score(doc=2025,freq=1.0), product of:
              0.13621144 = queryWeight, product of:
                1.0447762 = boost
                8.936469 = idf(docFreq=14, maxDocs=41962)
                0.014588961 = queryNorm
              0.5585293 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.936469 = idf(docFreq=14, maxDocs=41962)
                0.0625 = fieldNorm(doc=2025)
          0.024875132 = weight(abstract_txt:because in 2025) [ClassicSimilarity], result of:
            0.024875132 = score(doc=2025,freq=1.0), product of:
              0.081450574 = queryWeight, product of:
                1.1425587 = boost
                4.886425 = idf(docFreq=860, maxDocs=41962)
                0.014588961 = queryNorm
              0.30540156 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.886425 = idf(docFreq=860, maxDocs=41962)
                0.0625 = fieldNorm(doc=2025)
          0.010580314 = weight(abstract_txt:that in 2025) [ClassicSimilarity], result of:
            0.010580314 = score(doc=2025,freq=2.0), product of:
              0.04962337 = queryWeight, product of:
                1.4100827 = boost
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.014588961 = queryNorm
              0.21321233 = fieldWeight in 2025, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.0625 = fieldNorm(doc=2025)
          0.049136586 = weight(abstract_txt:know in 2025) [ClassicSimilarity], result of:
            0.049136586 = score(doc=2025,freq=1.0), product of:
              0.12822928 = queryWeight, product of:
                1.4335903 = boost
                6.131091 = idf(docFreq=247, maxDocs=41962)
                0.014588961 = queryNorm
              0.3831932 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.131091 = idf(docFreq=247, maxDocs=41962)
                0.0625 = fieldNorm(doc=2025)
          0.061687957 = weight(abstract_txt:actually in 2025) [ClassicSimilarity], result of:
            0.061687957 = score(doc=2025,freq=1.0), product of:
              0.14922817 = queryWeight, product of:
                1.5465246 = boost
                6.614082 = idf(docFreq=152, maxDocs=41962)
                0.014588961 = queryNorm
              0.41338012 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.614082 = idf(docFreq=152, maxDocs=41962)
                0.0625 = fieldNorm(doc=2025)
          0.08708446 = weight(abstract_txt:what in 2025) [ClassicSimilarity], result of:
            0.08708446 = score(doc=2025,freq=6.0), product of:
              0.13020787 = queryWeight, product of:
                2.0429845 = boost
                4.368655 = idf(docFreq=1444, maxDocs=41962)
                0.014588961 = queryNorm
              0.668811 = fieldWeight in 2025, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.368655 = idf(docFreq=1444, maxDocs=41962)
                0.0625 = fieldNorm(doc=2025)
          0.09853693 = weight(abstract_txt:think in 2025) [ClassicSimilarity], result of:
            0.09853693 = score(doc=2025,freq=1.0), product of:
              0.2334248 = queryWeight, product of:
                2.368921 = boost
                6.7541704 = idf(docFreq=132, maxDocs=41962)
                0.014588961 = queryNorm
              0.42213565 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7541704 = idf(docFreq=132, maxDocs=41962)
                0.0625 = fieldNorm(doc=2025)
        0.28 = coord(7/25)
    
  2. Müller, J.F.: A librarian's guide to the Internet : a guide to searching and evaluating information (2003) 0.11
    0.10692192 = sum of:
      0.10692192 = product of:
        0.445508 = sum of:
          0.044090867 = weight(abstract_txt:strategies in 503) [ClassicSimilarity], result of:
            0.044090867 = score(doc=503,freq=1.0), product of:
              0.09103788 = queryWeight, product of:
                1.2079321 = boost
                5.16601 = idf(docFreq=650, maxDocs=41962)
                0.014588961 = queryNorm
              0.48431343 = fieldWeight in 503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.16601 = idf(docFreq=650, maxDocs=41962)
                0.09375 = fieldNorm(doc=503)
          0.0458206 = weight(abstract_txt:links in 503) [ClassicSimilarity], result of:
            0.0458206 = score(doc=503,freq=1.0), product of:
              0.093403585 = queryWeight, product of:
                1.2235261 = boost
                5.2327013 = idf(docFreq=608, maxDocs=41962)
                0.014588961 = queryNorm
              0.49056575 = fieldWeight in 503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2327013 = idf(docFreq=608, maxDocs=41962)
                0.09375 = fieldNorm(doc=503)
          0.0112221185 = weight(abstract_txt:that in 503) [ClassicSimilarity], result of:
            0.0112221185 = score(doc=503,freq=1.0), product of:
              0.04962337 = queryWeight, product of:
                1.4100827 = boost
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.014588961 = queryNorm
              0.22614583 = fieldWeight in 503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.09375 = fieldNorm(doc=503)
          0.073704876 = weight(abstract_txt:know in 503) [ClassicSimilarity], result of:
            0.073704876 = score(doc=503,freq=1.0), product of:
              0.12822928 = queryWeight, product of:
                1.4335903 = boost
                6.131091 = idf(docFreq=247, maxDocs=41962)
                0.014588961 = queryNorm
              0.57478976 = fieldWeight in 503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.131091 = idf(docFreq=247, maxDocs=41962)
                0.09375 = fieldNorm(doc=503)
          0.09236701 = weight(abstract_txt:what in 503) [ClassicSimilarity], result of:
            0.09236701 = score(doc=503,freq=3.0), product of:
              0.13020787 = queryWeight, product of:
                2.0429845 = boost
                4.368655 = idf(docFreq=1444, maxDocs=41962)
                0.014588961 = queryNorm
              0.70938116 = fieldWeight in 503, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.368655 = idf(docFreq=1444, maxDocs=41962)
                0.09375 = fieldNorm(doc=503)
          0.17830253 = weight(abstract_txt:want in 503) [ClassicSimilarity], result of:
            0.17830253 = score(doc=503,freq=1.0), product of:
              0.2911419 = queryWeight, product of:
                3.0549128 = boost
                6.5325317 = idf(docFreq=165, maxDocs=41962)
                0.014588961 = queryNorm
              0.61242485 = fieldWeight in 503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5325317 = idf(docFreq=165, maxDocs=41962)
                0.09375 = fieldNorm(doc=503)
        0.24 = coord(6/25)
    
  3. Allo, P.; Baumgaertner, B.; D'Alfonso, S.; Fresco, N.; Gobbo, F.; Grubaugh, C.; Iliadis, A.; Illari, P.; Kerr, E.; Primiero, G.; Russo, F.; Schulz, C.; Taddeo, M.; Turilli, M.; Vakarelov, O.; Zenil, H.: The philosophy of information : an introduction (2013) 0.09
    0.09200851 = sum of:
      0.09200851 = product of:
        0.3833688 = sum of:
          0.015546958 = weight(abstract_txt:because in 381) [ClassicSimilarity], result of:
            0.015546958 = score(doc=381,freq=1.0), product of:
              0.081450574 = queryWeight, product of:
                1.1425587 = boost
                4.886425 = idf(docFreq=860, maxDocs=41962)
                0.014588961 = queryNorm
              0.19087598 = fieldWeight in 381, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.886425 = idf(docFreq=860, maxDocs=41962)
                0.0390625 = fieldNorm(doc=381)
          0.008098866 = weight(abstract_txt:that in 381) [ClassicSimilarity], result of:
            0.008098866 = score(doc=381,freq=3.0), product of:
              0.04962337 = queryWeight, product of:
                1.4100827 = boost
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.014588961 = queryNorm
              0.16320668 = fieldWeight in 381, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.0390625 = fieldNorm(doc=381)
          0.055206604 = weight(abstract_txt:talk in 381) [ClassicSimilarity], result of:
            0.055206604 = score(doc=381,freq=1.0), product of:
              0.18957943 = queryWeight, product of:
                1.7431191 = boost
                7.454865 = idf(docFreq=65, maxDocs=41962)
                0.014588961 = queryNorm
              0.29120567 = fieldWeight in 381, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.454865 = idf(docFreq=65, maxDocs=41962)
                0.0390625 = fieldNorm(doc=381)
          0.11235545 = weight(abstract_txt:we're in 381) [ClassicSimilarity], result of:
            0.11235545 = score(doc=381,freq=1.0), product of:
              0.30445746 = queryWeight, product of:
                2.208995 = boost
                9.447295 = idf(docFreq=8, maxDocs=41962)
                0.014588961 = queryNorm
              0.36903498 = fieldWeight in 381, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.447295 = idf(docFreq=8, maxDocs=41962)
                0.0390625 = fieldNorm(doc=381)
          0.08709516 = weight(abstract_txt:think in 381) [ClassicSimilarity], result of:
            0.08709516 = score(doc=381,freq=2.0), product of:
              0.2334248 = queryWeight, product of:
                2.368921 = boost
                6.7541704 = idf(docFreq=132, maxDocs=41962)
                0.014588961 = queryNorm
              0.3731187 = fieldWeight in 381, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7541704 = idf(docFreq=132, maxDocs=41962)
                0.0390625 = fieldNorm(doc=381)
          0.10506578 = weight(abstract_txt:want in 381) [ClassicSimilarity], result of:
            0.10506578 = score(doc=381,freq=2.0), product of:
              0.2911419 = queryWeight, product of:
                3.0549128 = boost
                6.5325317 = idf(docFreq=165, maxDocs=41962)
                0.014588961 = queryNorm
              0.3608748 = fieldWeight in 381, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5325317 = idf(docFreq=165, maxDocs=41962)
                0.0390625 = fieldNorm(doc=381)
        0.24 = coord(6/25)
    
  4. Goren-Bar, D.; Kuflik, T.: Supporting user-subjective categorization with self-organizing maps and learning vector quantization (2005) 0.09
    0.086741105 = sum of:
      0.086741105 = product of:
        0.7228426 = sum of:
          0.012958185 = weight(abstract_txt:that in 4326) [ClassicSimilarity], result of:
            0.012958185 = score(doc=4326,freq=3.0), product of:
              0.04962337 = queryWeight, product of:
                1.4100827 = boost
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.014588961 = queryNorm
              0.2611307 = fieldWeight in 4326, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4122221 = idf(docFreq=10221, maxDocs=41962)
                0.0625 = fieldNorm(doc=4326)
          0.046863407 = weight(abstract_txt:today in 4326) [ClassicSimilarity], result of:
            0.046863407 = score(doc=4326,freq=1.0), product of:
              0.124243334 = queryWeight, product of:
                1.4111332 = boost
                6.035048 = idf(docFreq=272, maxDocs=41962)
                0.014588961 = queryNorm
              0.3771905 = fieldWeight in 4326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.035048 = idf(docFreq=272, maxDocs=41962)
                0.0625 = fieldNorm(doc=4326)
          0.66302097 = weight(abstract_txt:categorization in 4326) [ClassicSimilarity], result of:
            0.66302097 = score(doc=4326,freq=12.0), product of:
              0.45784175 = queryWeight, product of:
                4.691911 = boost
                6.6886926 = idf(docFreq=141, maxDocs=41962)
                0.014588961 = queryNorm
              1.4481444 = fieldWeight in 4326, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                6.6886926 = idf(docFreq=141, maxDocs=41962)
                0.0625 = fieldNorm(doc=4326)
        0.12 = coord(3/25)
    
  5. Stephens, O.: Introduction to OpenRefine (2014) 0.09
    0.08637565 = sum of:
      0.08637565 = product of:
        0.43187824 = sum of:
          0.023700397 = weight(abstract_txt:ways in 4885) [ClassicSimilarity], result of:
            0.023700397 = score(doc=4885,freq=1.0), product of:
              0.0788656 = queryWeight, product of:
                1.124282 = boost
                4.8082604 = idf(docFreq=930, maxDocs=41962)
                0.014588961 = queryNorm
              0.30051628 = fieldWeight in 4885, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8082604 = idf(docFreq=930, maxDocs=41962)
                0.0625 = fieldNorm(doc=4885)
          0.09725011 = weight(abstract_txt:messy in 4885) [ClassicSimilarity], result of:
            0.09725011 = score(doc=4885,freq=1.0), product of:
              0.16043556 = queryWeight, product of:
                1.1338792 = boost
                9.698609 = idf(docFreq=6, maxDocs=41962)
                0.014588961 = queryNorm
              0.6061631 = fieldWeight in 4885, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.698609 = idf(docFreq=6, maxDocs=41962)
                0.0625 = fieldNorm(doc=4885)
          0.06948963 = weight(abstract_txt:know in 4885) [ClassicSimilarity], result of:
            0.06948963 = score(doc=4885,freq=2.0), product of:
              0.12822928 = queryWeight, product of:
                1.4335903 = boost
                6.131091 = idf(docFreq=247, maxDocs=41962)
                0.014588961 = queryNorm
              0.541917 = fieldWeight in 4885, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.131091 = idf(docFreq=247, maxDocs=41962)
                0.0625 = fieldNorm(doc=4885)
          0.03555208 = weight(abstract_txt:what in 4885) [ClassicSimilarity], result of:
            0.03555208 = score(doc=4885,freq=1.0), product of:
              0.13020787 = queryWeight, product of:
                2.0429845 = boost
                4.368655 = idf(docFreq=1444, maxDocs=41962)
                0.014588961 = queryNorm
              0.27304095 = fieldWeight in 4885, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.368655 = idf(docFreq=1444, maxDocs=41962)
                0.0625 = fieldNorm(doc=4885)
          0.20588602 = weight(abstract_txt:want in 4885) [ClassicSimilarity], result of:
            0.20588602 = score(doc=4885,freq=3.0), product of:
              0.2911419 = queryWeight, product of:
                3.0549128 = boost
                6.5325317 = idf(docFreq=165, maxDocs=41962)
                0.014588961 = queryNorm
              0.70716727 = fieldWeight in 4885, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5325317 = idf(docFreq=165, maxDocs=41962)
                0.0625 = fieldNorm(doc=4885)
        0.2 = coord(5/25)
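
Note on reading the score trees

The arithmetic in the trees above can be retraced with a short Python sketch. It assumes the usual TF-IDF structure that the listed numbers follow: tf is the square root of the term frequency, queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the document score is the sum of the term scores multiplied by coord (matched terms / query terms). The idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption; it reproduces the listed idf values but is not stated in the record. The function names are hypothetical, and the input numbers are copied from entry 4 (doc=4326).

```python
import math

QUERY_NORM = 0.014588961   # queryNorm, identical in every tree above
MAX_DOCS = 41962           # maxDocs, identical in every tree above


def idf(doc_freq, max_docs):
    # Assumed formula; it agrees with the idf values listed in the trees.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, query_term_count):
    """Sum of the term scores, scaled by coord = matched / query terms."""
    coord = len(term_scores) / query_term_count
    return sum(term_scores) * coord


# Entry 4 (doc=4326): term -> (freq, docFreq, boost), copied from its tree.
entry4_terms = {
    "that": (3.0, 10221, 1.4100827),
    "today": (1.0, 272, 1.4111332),
    "categorization": (12.0, 141, 4.691911),
}
ENTRY4_FIELD_NORM = 0.0625

entry4_scores = {
    term: term_score(freq, doc_freq, boost, ENTRY4_FIELD_NORM)
    for term, (freq, doc_freq, boost) in entry4_terms.items()
}
entry4_total = document_score(list(entry4_scores.values()), 25)
```

Under these assumptions, entry4_total agrees with the 0.086741105 listed for entry 4 up to floating-point rounding. Most of that score comes from the single term "categorization" (freq=12.0), which is why the entry ranks as similar even though only 3 of the 25 query terms match.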