Document (#35007)

Author
Malak, P.
Title
Is the Artificial Intelligence applicable for the libraries purposes?
Source
Librarianship in the information age: Proceedings of the 13th BOBCATSSS Symposium, 31 January - 2 February 2005 in Budapest, Hungary. Eds.: Marte Langeland et al.
Imprint
Budapest : ELTE
Year
2005
Pages
pp. 232-237
Abstract
Artificial Intelligence (well known from the "Matrix" movie) is a high-level tool and technology. Sophisticated methods such as genetic algorithms, neural networks, and self-learning systems are the leading fields of AI research. What are the potential relations between a very classical science and service, libraries, and a very futuristic one, AI? If we consider the library as a service that is still developing, we can use the achievements of AI to improve that service. Some parts of AI research are concerned with text analysis (neural networks) and document classification, both for classifying information and (mostly) for meeting user information needs in search systems. Another possibility is self-learning systems designed to estimate the value of information or to prepare document abstracts or summaries automatically. By combining AI tools and methods with linguistic rules and principles, we can design a virtual librarian: a search agent that works not only with the contents of documents (as existing systems already do) but also, and even mainly, with the information itself, evaluating its value and marking a document as valuable or not for further use. One of the linguistic tools used, a very simple one, is word frequencies and their distribution within a given natural language. Basic frequency computation is a sufficient basis for further, more extensive linguistic studies. Knowing the current frequencies of the signals (words) in a document, one can estimate the probability of occurrence of a particular word and, based on these probabilities, compute the entropy. Calculating the entropy of a document allows its information value to be estimated. The relations between the number of words and their frequencies, on the one hand, and the entropy, on the other, have been studied for various scientific documents available on the web.
The research yielded some interesting observations and facts concerning contemporary scientific language and the accessibility of information on the Internet. The results can be useful for intelligent information retrieval systems, enabling them to meet user information needs properly and to answer user queries precisely, as well as for document acquisition in research libraries. Each discipline uses different words in its documents. This concerns especially the so-called characteristic vocabulary, i.e. entries that are frequent but limited in occurrence to a specific type of publication (e.g. technical, biological, mathematical, or sociological terms).
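
The step from word frequencies to entropy described in the abstract can be sketched as follows. This is only an illustration, not the author's procedure: the abstract does not say how words were tokenized or which logarithm base was used, so the regular-expression tokenizer, the base-2 logarithm, and the function name `word_entropy` are all assumptions.

```python
import math
import re
from collections import Counter


def word_entropy(text: str) -> float:
    """Shannon entropy, in bits per word, of the word distribution of `text`.

    Assumption: a "word" is a lowercase run of letters; the abstract does
    not specify the tokenization that was actually used.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)          # absolute frequency of each word
    total = len(words)
    # Relative frequency c / total serves as the estimated probability
    # of each word; entropy is the negated sum of p * log2(p).
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Two equally frequent words -> 1.0 bit per word
    print(word_entropy("library agent library agent"))
```

A text that repeats a single word has entropy 0, while a text in which every word is distinct reaches the maximum for its length, which is one way to read the relation between word counts, frequencies, and entropy that the abstract mentions.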

Similar documents (content)

  1. Lan, K.C.; Ho, K.S.; Luk, R.W.P.; Leong, H.V.: Dialogue act recognition using maximum entropy (2008) 0.15
    0.15072198 = sum of:
      0.15072198 = product of:
        0.62800825 = sum of:
          0.08329413 = weight(abstract_txt:occurrence in 1717) [ClassicSimilarity], result of:
            0.08329413 = score(doc=1717,freq=1.0), product of:
              0.1964417 = queryWeight, product of:
                1.2278451 = boost
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.023582477 = queryNorm
              0.4240145 = fieldWeight in 1717, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.021189608 = weight(abstract_txt:systems in 1717) [ClassicSimilarity], result of:
            0.021189608 = score(doc=1717,freq=1.0), product of:
              0.09936864 = queryWeight, product of:
                1.2349983 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.023582477 = queryNorm
              0.2132424 = fieldWeight in 1717, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.06745539 = weight(abstract_txt:very in 1717) [ClassicSimilarity], result of:
            0.06745539 = score(doc=1717,freq=2.0), product of:
              0.15506802 = queryWeight, product of:
                1.3360832 = boost
                4.921521 = idf(docFreq=875, maxDocs=44218)
                0.023582477 = queryNorm
              0.43500513 = fieldWeight in 1717, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.921521 = idf(docFreq=875, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.021411287 = weight(abstract_txt:information in 1717) [ClassicSimilarity], result of:
            0.021411287 = score(doc=1717,freq=2.0), product of:
              0.100060485 = queryWeight, product of:
                1.752621 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.023582477 = queryNorm
              0.21398345 = fieldWeight in 1717, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.20175053 = weight(abstract_txt:entropy in 1717) [ClassicSimilarity], result of:
            0.20175053 = score(doc=1717,freq=1.0), product of:
              0.40556857 = queryWeight, product of:
                2.16075 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.023582477 = queryNorm
              0.4974511 = fieldWeight in 1717, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.2329073 = weight(abstract_txt:frequencies in 1717) [ClassicSimilarity], result of:
            0.2329073 = score(doc=1717,freq=1.0), product of:
              0.4912352 = queryWeight, product of:
                2.745911 = boost
                7.5860133 = idf(docFreq=60, maxDocs=44218)
                0.023582477 = queryNorm
              0.47412583 = fieldWeight in 1717, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5860133 = idf(docFreq=60, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
        0.24 = coord(6/25)
    
  2. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.15
    0.14937496 = sum of:
      0.14937496 = product of:
        0.6223957 = sum of:
          0.10514437 = weight(abstract_txt:proper in 604) [ClassicSimilarity], result of:
            0.10514437 = score(doc=604,freq=2.0), product of:
              0.1821118 = queryWeight, product of:
                1.1822131 = boost
                6.532101 = idf(docFreq=174, maxDocs=44218)
                0.023582477 = queryNorm
              0.57736164 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.532101 = idf(docFreq=174, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.021189608 = weight(abstract_txt:systems in 604) [ClassicSimilarity], result of:
            0.021189608 = score(doc=604,freq=1.0), product of:
              0.09936864 = queryWeight, product of:
                1.2349983 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.023582477 = queryNorm
              0.2132424 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.12275155 = weight(abstract_txt:words in 604) [ClassicSimilarity], result of:
            0.12275155 = score(doc=604,freq=4.0), product of:
              0.18345061 = queryWeight, product of:
                1.4532219 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.023582477 = queryNorm
              0.66912585 = fieldWeight in 604, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.015140067 = weight(abstract_txt:information in 604) [ClassicSimilarity], result of:
            0.015140067 = score(doc=604,freq=1.0), product of:
              0.100060485 = queryWeight, product of:
                1.752621 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.023582477 = queryNorm
              0.15130915 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.1252628 = weight(abstract_txt:documents in 604) [ClassicSimilarity], result of:
            0.1252628 = score(doc=604,freq=5.0), product of:
              0.21748161 = queryWeight, product of:
                2.2376833 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.023582477 = queryNorm
              0.5759696 = fieldWeight in 604, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.2329073 = weight(abstract_txt:frequencies in 604) [ClassicSimilarity], result of:
            0.2329073 = score(doc=604,freq=1.0), product of:
              0.4912352 = queryWeight, product of:
                2.745911 = boost
                7.5860133 = idf(docFreq=60, maxDocs=44218)
                0.023582477 = queryNorm
              0.47412583 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5860133 = idf(docFreq=60, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
        0.24 = coord(6/25)
    
  3. Hutchins, W.J.; Somers, H.L.: ¬An introduction to machine translation (1992) 0.15
    0.14920303 = sum of:
      0.14920303 = product of:
        0.53286797 = sum of:
          0.1136518 = weight(abstract_txt:intelligence in 4512) [ClassicSimilarity], result of:
            0.1136518 = score(doc=4512,freq=2.0), product of:
              0.14637631 = queryWeight, product of:
                1.0598933 = boost
                5.8562455 = idf(docFreq=343, maxDocs=44218)
                0.023582477 = queryNorm
              0.77643573 = fieldWeight in 4512, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8562455 = idf(docFreq=343, maxDocs=44218)
                0.09375 = fieldNorm(doc=4512)
          0.035784356 = weight(abstract_txt:well in 4512) [ClassicSimilarity], result of:
            0.035784356 = score(doc=4512,freq=1.0), product of:
              0.09770627 = queryWeight, product of:
                1.0605559 = boost
                3.9066048 = idf(docFreq=2416, maxDocs=44218)
                0.023582477 = queryNorm
              0.3662442 = fieldWeight in 4512, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9066048 = idf(docFreq=2416, maxDocs=44218)
                0.09375 = fieldNorm(doc=4512)
          0.12628624 = weight(abstract_txt:artificial in 4512) [ClassicSimilarity], result of:
            0.12628624 = score(doc=4512,freq=2.0), product of:
              0.15703288 = queryWeight, product of:
                1.0977969 = boost
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.023582477 = queryNorm
              0.80420256 = fieldWeight in 4512, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.09375 = fieldNorm(doc=4512)
          0.025500592 = weight(abstract_txt:research in 4512) [ClassicSimilarity], result of:
            0.025500592 = score(doc=4512,freq=1.0), product of:
              0.08579726 = queryWeight, product of:
                1.147568 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.023582477 = queryNorm
              0.2972192 = fieldWeight in 4512, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.09375 = fieldNorm(doc=4512)
          0.031784408 = weight(abstract_txt:systems in 4512) [ClassicSimilarity], result of:
            0.031784408 = score(doc=4512,freq=1.0), product of:
              0.09936864 = queryWeight, product of:
                1.2349983 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.023582477 = queryNorm
              0.3198636 = fieldWeight in 4512, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.09375 = fieldNorm(doc=4512)
          0.1677436 = weight(abstract_txt:linguistic in 4512) [ClassicSimilarity], result of:
            0.1677436 = score(doc=4512,freq=2.0), product of:
              0.21721058 = queryWeight, product of:
                1.5812948 = boost
                5.8247695 = idf(docFreq=354, maxDocs=44218)
                0.023582477 = queryNorm
              0.7722626 = fieldWeight in 4512, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8247695 = idf(docFreq=354, maxDocs=44218)
                0.09375 = fieldNorm(doc=4512)
          0.03211693 = weight(abstract_txt:information in 4512) [ClassicSimilarity], result of:
            0.03211693 = score(doc=4512,freq=2.0), product of:
              0.100060485 = queryWeight, product of:
                1.752621 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.023582477 = queryNorm
              0.32097518 = fieldWeight in 4512, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.09375 = fieldNorm(doc=4512)
        0.28 = coord(7/25)
    
  4. Deerwester, S.; Dumais, S.; Landauer, T.; Furnas, G.; Beck, L.: Improving information retrieval with latent semantic indexing (1988) 0.15
    0.14841688 = sum of:
      0.14841688 = product of:
        0.6184037 = sum of:
          0.068266876 = weight(abstract_txt:relations in 2396) [ClassicSimilarity], result of:
            0.068266876 = score(doc=2396,freq=1.0), product of:
              0.1312915 = queryWeight, product of:
                1.003795 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.023582477 = queryNorm
              0.5199642 = fieldWeight in 2396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.09375 = fieldNorm(doc=2396)
          0.047474302 = weight(abstract_txt:document in 2396) [ClassicSimilarity], result of:
            0.047474302 = score(doc=2396,freq=1.0), product of:
              0.1179685 = queryWeight, product of:
                1.1653473 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.023582477 = queryNorm
              0.40243202 = fieldWeight in 2396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.09375 = fieldNorm(doc=2396)
          0.031784408 = weight(abstract_txt:systems in 2396) [ClassicSimilarity], result of:
            0.031784408 = score(doc=2396,freq=1.0), product of:
              0.09936864 = queryWeight, product of:
                1.2349983 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.023582477 = queryNorm
              0.3198636 = fieldWeight in 2396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.09375 = fieldNorm(doc=2396)
          0.022710102 = weight(abstract_txt:information in 2396) [ClassicSimilarity], result of:
            0.022710102 = score(doc=2396,freq=1.0), product of:
              0.100060485 = queryWeight, product of:
                1.752621 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.023582477 = queryNorm
              0.22696373 = fieldWeight in 2396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.09375 = fieldNorm(doc=2396)
          0.3026258 = weight(abstract_txt:entropy in 2396) [ClassicSimilarity], result of:
            0.3026258 = score(doc=2396,freq=1.0), product of:
              0.40556857 = queryWeight, product of:
                2.16075 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.023582477 = queryNorm
              0.74617666 = fieldWeight in 2396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.09375 = fieldNorm(doc=2396)
          0.14554219 = weight(abstract_txt:documents in 2396) [ClassicSimilarity], result of:
            0.14554219 = score(doc=2396,freq=3.0), product of:
              0.21748161 = queryWeight, product of:
                2.2376833 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.023582477 = queryNorm
              0.6692161 = fieldWeight in 2396, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.09375 = fieldNorm(doc=2396)
        0.24 = coord(6/25)
    
  5. Saarikoski, J.; Laurikkala, J.; Järvelin, K.; Juhola, M.: ¬A study of the use of self-organising maps in information retrieval (2009) 0.14
    0.14096077 = sum of:
      0.14096077 = product of:
        0.5034313 = sum of:
          0.11238856 = weight(abstract_txt:self in 2836) [ClassicSimilarity], result of:
            0.11238856 = score(doc=2836,freq=6.0), product of:
              0.1320044 = queryWeight, product of:
                1.0065166 = boost
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.023582477 = queryNorm
              0.85140014 = fieldWeight in 2836, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.017000394 = weight(abstract_txt:research in 2836) [ClassicSimilarity], result of:
            0.017000394 = score(doc=2836,freq=1.0), product of:
              0.08579726 = queryWeight, product of:
                1.147568 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.023582477 = queryNorm
              0.19814612 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.07752521 = weight(abstract_txt:document in 2836) [ClassicSimilarity], result of:
            0.07752521 = score(doc=2836,freq=6.0), product of:
              0.1179685 = queryWeight, product of:
                1.1653473 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.023582477 = queryNorm
              0.65716875 = fieldWeight in 2836, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.033447407 = weight(abstract_txt:value in 2836) [ClassicSimilarity], result of:
            0.033447407 = score(doc=2836,freq=1.0), product of:
              0.122394755 = queryWeight, product of:
                1.1870083 = boost
                4.3723974 = idf(docFreq=1516, maxDocs=44218)
                0.023582477 = queryNorm
              0.27327484 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3723974 = idf(docFreq=1516, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.08863349 = weight(abstract_txt:neural in 2836) [ClassicSimilarity], result of:
            0.08863349 = score(doc=2836,freq=1.0), product of:
              0.2047494 = queryWeight, product of:
                1.2535396 = boost
                6.926203 = idf(docFreq=117, maxDocs=44218)
                0.023582477 = queryNorm
              0.43288767 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.926203 = idf(docFreq=117, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.026223365 = weight(abstract_txt:information in 2836) [ClassicSimilarity], result of:
            0.026223365 = score(doc=2836,freq=3.0), product of:
              0.100060485 = queryWeight, product of:
                1.752621 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.023582477 = queryNorm
              0.26207513 = fieldWeight in 2836, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.14821292 = weight(abstract_txt:documents in 2836) [ClassicSimilarity], result of:
            0.14821292 = score(doc=2836,freq=7.0), product of:
              0.21748161 = queryWeight, product of:
                2.2376833 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.023582477 = queryNorm
              0.6814963 = fieldWeight in 2836, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
        0.28 = coord(7/25)
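
The score breakdowns above follow the TF-IDF formula of Lucene's ClassicSimilarity: each term contributes queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm), with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and the sum is multiplied by the coord factor (matched terms / query terms). The sketch below recomputes the first entry from the numbers printed in its tree. The function names are hypothetical, and because Lucene works in single precision, the results should agree only to about five decimal places.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as shown in the explain trees."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score contribution of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, matched, query_terms, max_docs, query_norm, field_norm):
    """Sum of the term scores, scaled by coord(matched / query_terms)."""
    total = sum(
        term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm)
        for freq, doc_freq, boost in terms
    )
    return total * (matched / query_terms)


if __name__ == "__main__":
    # (freq, docFreq, boost) for the six terms matched in doc 1717 (entry 1)
    doc_1717 = [
        (1.0, 135, 1.2278451),    # occurrence
        (1.0, 3963, 1.2349983),   # systems
        (2.0, 875, 1.3360832),    # very
        (2.0, 10677, 1.752621),   # information
        (1.0, 41, 2.16075),       # entropy
        (1.0, 60, 2.745911),      # frequencies
    ]
    print(document_score(doc_1717, 6, 25, 44218, 0.023582477, 0.0625))
```

Read this way, the ranking is driven largely by rare terms such as "entropy" and "frequencies", whose high idf values outweigh the more frequent but common "information" and "systems".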