Document (#35008)

Author
Malak, P.
Title
Is the Artificial Intelligence applicable for the libraries purposes?
Source
Librarianship in the information age: Proceedings of the 13th BOBCATSSS Symposium, 31 January - 2 February 2005 in Budapest, Hungary. Eds.: Marte Langeland et al.
Imprint
Budapest : ELTE
Year
2005
Pages
pp. 232-237
Abstract
Artificial Intelligence (well introduced in the "Matrix" movie) is a high-level tool and technology. Sophisticated methods such as genetic algorithms, neural networks, and self-learning systems are the leading fields of AI research. What are the potential relations between a very classical science and service, libraries, and a very futuristic one, AI? If we consider the library as a service that is still developing, we can use the achievements of AI to improve that service. Some parts of AI research are concerned with text analysis (neural networks) and with document classification, both for classifying information and, mostly, for meeting user information needs in search systems. Another possibility is self-learning systems designed to estimate the value of information or to prepare document abstracts or summaries automatically. Combining AI tools and methods with linguistic rules and principles, we can design a virtual librarian: a search agent that works not only with the contents of documents (as existing systems already do), but also, and even mainly, with the information itself, evaluating its value and marking a document as valuable or not for further use. One of the linguistic tools used, a very simple one, is word frequencies and their distribution within a given natural language. Computing basic frequencies is a sufficient basis for further, more extensive linguistic studies. Knowing the current frequencies of the signals (words) in a document, one can evaluate the probability of a particular word's occurrence and, based on these probabilities, evaluate the entropy. Calculating the entropy of a document allows its information value to be estimated. The relations between the number of words and their frequencies, on the one hand, and the entropy, on the other, have been studied for different scientific documents available on the web.
The research resulted in some interesting observations and facts concerning contemporary scientific language and the accessibility of information on the Internet. The results can be useful for intelligent information retrieval systems, enabling them to meet user information needs properly and to give strict, proper answers to user queries, as well as for document acquisition in research libraries. Each discipline uses different words in its documents. This concerns especially the so-called characteristic vocabulary, i.e. entries that are frequent but limited in occurrence to a specific type of publication (e.g. technical, biological, mathematical, or sociological terms).
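
The chain the abstract describes (word frequencies, then occurrence probabilities, then entropy) can be illustrated with a minimal sketch. The abstract does not state how the paper tokenizes text or which logarithm base it uses, so the lowercase ASCII tokenizer, the base-2 logarithm, and the function name `word_entropy` are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def word_entropy(text: str) -> float:
    """Shannon entropy, in bits per word, of the word distribution in `text`.

    Assumptions (not taken from the paper): words are runs of ASCII
    letters, compared case-insensitively, and the logarithm is base 2.
    """
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)   # absolute frequency of each word
    total = len(words)
    entropy = 0.0
    for count in counts.values():
        p = count / total     # estimated probability of occurrence
        entropy -= p * math.log2(p)
    return entropy
```

Under these assumptions, a text that repeats a single word has entropy 0, while a text whose words are all distinct reaches the maximum for its length: the base-2 logarithm of the number of words.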

Similar documents (content)

  1. Hutchins, W.J.; Somers, H.L.: ¬An introduction to machine translation (1992) 0.15
    0.15257384 = sum of:
      0.15257384 = product of:
        0.54490656 = sum of:
          0.03627688 = weight(abstract_txt:well in 5513) [ClassicSimilarity], result of:
            0.03627688 = score(doc=5513,freq=1.0), product of:
              0.09845935 = queryWeight, product of:
                1.0680177 = boost
                3.930083 = idf(docFreq=2281, maxDocs=42740)
                0.023457233 = queryNorm
              0.36844528 = fieldWeight in 5513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.930083 = idf(docFreq=2281, maxDocs=42740)
                0.09375 = fieldNorm(doc=5513)
          0.117466375 = weight(abstract_txt:intelligence in 5513) [ClassicSimilarity], result of:
            0.117466375 = score(doc=5513,freq=2.0), product of:
              0.14941894 = queryWeight, product of:
                1.0742545 = boost
                5.929549 = idf(docFreq=308, maxDocs=42740)
                0.023457233 = queryNorm
              0.78615457 = fieldWeight in 5513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.929549 = idf(docFreq=308, maxDocs=42740)
                0.09375 = fieldNorm(doc=5513)
          0.13102673 = weight(abstract_txt:artificial in 5513) [ClassicSimilarity], result of:
            0.13102673 = score(doc=5513,freq=2.0), product of:
              0.16070762 = queryWeight, product of:
                1.1140959 = boost
                6.1494617 = idf(docFreq=247, maxDocs=42740)
                0.023457233 = queryNorm
              0.8153112 = fieldWeight in 5513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1494617 = idf(docFreq=247, maxDocs=42740)
                0.09375 = fieldNorm(doc=5513)
          0.026454007 = weight(abstract_txt:research in 5513) [ClassicSimilarity], result of:
            0.026454007 = score(doc=5513,freq=1.0), product of:
              0.08779657 = queryWeight, product of:
                1.16455 = boost
                3.2139761 = idf(docFreq=4669, maxDocs=42740)
                0.023457233 = queryNorm
              0.30131027 = fieldWeight in 5513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2139761 = idf(docFreq=4669, maxDocs=42740)
                0.09375 = fieldNorm(doc=5513)
          0.031724285 = weight(abstract_txt:systems in 5513) [ClassicSimilarity], result of:
            0.031724285 = score(doc=5513,freq=1.0), product of:
              0.09910095 = queryWeight, product of:
                1.2372522 = boost
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.023457233 = queryNorm
              0.3201209 = fieldWeight in 5513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.09375 = fieldNorm(doc=5513)
          0.16961509 = weight(abstract_txt:linguistic in 5513) [ClassicSimilarity], result of:
            0.16961509 = score(doc=5513,freq=2.0), product of:
              0.21850935 = queryWeight, product of:
                1.5910543 = boost
                5.8547482 = idf(docFreq=332, maxDocs=42740)
                0.023457233 = queryNorm
              0.77623725 = fieldWeight in 5513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8547482 = idf(docFreq=332, maxDocs=42740)
                0.09375 = fieldNorm(doc=5513)
          0.032343213 = weight(abstract_txt:information in 5513) [ClassicSimilarity], result of:
            0.032343213 = score(doc=5513,freq=2.0), product of:
              0.10038573 = queryWeight, product of:
                1.7610446 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.023457233 = queryNorm
              0.32218933 = fieldWeight in 5513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.09375 = fieldNorm(doc=5513)
        0.28 = coord(7/25)
    
  2. Lan, K.C.; Ho, K.S.; Luk, R.W.P.; Leong, H.V.: Dialogue act recognition using maximum entropy (2008) 0.15
    0.15026367 = sum of:
      0.15026367 = product of:
        0.62609863 = sum of:
          0.08362934 = weight(abstract_txt:occurrence in 3718) [ClassicSimilarity], result of:
            0.08362934 = score(doc=3718,freq=1.0), product of:
              0.19668588 = queryWeight, product of:
                1.2325114 = boost
                6.803078 = idf(docFreq=128, maxDocs=42740)
                0.023457233 = queryNorm
              0.4251924 = fieldWeight in 3718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.803078 = idf(docFreq=128, maxDocs=42740)
                0.0625 = fieldNorm(doc=3718)
          0.021149524 = weight(abstract_txt:systems in 3718) [ClassicSimilarity], result of:
            0.021149524 = score(doc=3718,freq=1.0), product of:
              0.09910095 = queryWeight, product of:
                1.2372522 = boost
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.023457233 = queryNorm
              0.21341394 = fieldWeight in 3718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.0625 = fieldNorm(doc=3718)
          0.06744343 = weight(abstract_txt:very in 3718) [ClassicSimilarity], result of:
            0.06744343 = score(doc=3718,freq=2.0), product of:
              0.15482733 = queryWeight, product of:
                1.3392874 = boost
                4.928299 = idf(docFreq=840, maxDocs=42740)
                0.023457233 = queryNorm
              0.43560418 = fieldWeight in 3718, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.928299 = idf(docFreq=840, maxDocs=42740)
                0.0625 = fieldNorm(doc=3718)
          0.02156214 = weight(abstract_txt:information in 3718) [ClassicSimilarity], result of:
            0.02156214 = score(doc=3718,freq=2.0), product of:
              0.10038573 = queryWeight, product of:
                1.7610446 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.023457233 = queryNorm
              0.21479288 = fieldWeight in 3718, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.0625 = fieldNorm(doc=3718)
          0.20200588 = weight(abstract_txt:entropy in 3718) [ClassicSimilarity], result of:
            0.20200588 = score(doc=3718,freq=1.0), product of:
              0.4053285 = queryWeight, product of:
                2.1669734 = boost
                7.974011 = idf(docFreq=39, maxDocs=42740)
                0.023457233 = queryNorm
              0.49837568 = fieldWeight in 3718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.974011 = idf(docFreq=39, maxDocs=42740)
                0.0625 = fieldNorm(doc=3718)
          0.2303083 = weight(abstract_txt:frequencies in 3718) [ClassicSimilarity], result of:
            0.2303083 = score(doc=3718,freq=1.0), product of:
              0.4868746 = queryWeight, product of:
                2.7423818 = boost
                7.568546 = idf(docFreq=59, maxDocs=42740)
                0.023457233 = queryNorm
              0.4730341 = fieldWeight in 3718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.568546 = idf(docFreq=59, maxDocs=42740)
                0.0625 = fieldNorm(doc=3718)
        0.24 = coord(6/25)
    
  3. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.15
    0.14851171 = sum of:
      0.14851171 = product of:
        0.6187988 = sum of:
          0.10530904 = weight(abstract_txt:proper in 2605) [ClassicSimilarity], result of:
            0.10530904 = score(doc=2605,freq=2.0), product of:
              0.18204044 = queryWeight, product of:
                1.1857368 = boost
                6.5448966 = idf(docFreq=166, maxDocs=42740)
                0.023457233 = queryNorm
              0.5784926 = fieldWeight in 2605, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5448966 = idf(docFreq=166, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.021149524 = weight(abstract_txt:systems in 2605) [ClassicSimilarity], result of:
            0.021149524 = score(doc=2605,freq=1.0), product of:
              0.09910095 = queryWeight, product of:
                1.2372522 = boost
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.023457233 = queryNorm
              0.21341394 = fieldWeight in 2605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.122596785 = weight(abstract_txt:words in 2605) [ClassicSimilarity], result of:
            0.122596785 = score(doc=2605,freq=4.0), product of:
              0.18303348 = queryWeight, product of:
                1.4561807 = boost
                5.358442 = idf(docFreq=546, maxDocs=42740)
                0.023457233 = queryNorm
              0.6698052 = fieldWeight in 2605, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.358442 = idf(docFreq=546, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.015246736 = weight(abstract_txt:information in 2605) [ClassicSimilarity], result of:
            0.015246736 = score(doc=2605,freq=1.0), product of:
              0.10038573 = queryWeight, product of:
                1.7610446 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.023457233 = queryNorm
              0.1518815 = fieldWeight in 2605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.12418844 = weight(abstract_txt:documents in 2605) [ClassicSimilarity], result of:
            0.12418844 = score(doc=2605,freq=5.0), product of:
              0.21592616 = queryWeight, product of:
                2.2367508 = boost
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.023457233 = queryNorm
              0.5751431 = fieldWeight in 2605, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
          0.2303083 = weight(abstract_txt:frequencies in 2605) [ClassicSimilarity], result of:
            0.2303083 = score(doc=2605,freq=1.0), product of:
              0.4868746 = queryWeight, product of:
                2.7423818 = boost
                7.568546 = idf(docFreq=59, maxDocs=42740)
                0.023457233 = queryNorm
              0.4730341 = fieldWeight in 2605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.568546 = idf(docFreq=59, maxDocs=42740)
                0.0625 = fieldNorm(doc=2605)
        0.24 = coord(6/25)
    
  4. Deerwester, S.; Dumais, S.; Landauer, T.; Furnas, G.; Beck, L.: Improving information retrieval with latent semantic indexing (1988) 0.15
    0.14834595 = sum of:
      0.14834595 = product of:
        0.61810815 = sum of:
          0.06933084 = weight(abstract_txt:relations in 4397) [ClassicSimilarity], result of:
            0.06933084 = score(doc=4397,freq=1.0), product of:
              0.13246188 = queryWeight, product of:
                1.0114626 = boost
                5.5829573 = idf(docFreq=436, maxDocs=42740)
                0.023457233 = queryNorm
              0.5234022 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5829573 = idf(docFreq=436, maxDocs=42740)
                0.09375 = fieldNorm(doc=4397)
          0.04688019 = weight(abstract_txt:document in 4397) [ClassicSimilarity], result of:
            0.04688019 = score(doc=4397,freq=1.0), product of:
              0.11681445 = queryWeight, product of:
                1.1633176 = boost
                4.280766 = idf(docFreq=1606, maxDocs=42740)
                0.023457233 = queryNorm
              0.40132183 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.280766 = idf(docFreq=1606, maxDocs=42740)
                0.09375 = fieldNorm(doc=4397)
          0.031724285 = weight(abstract_txt:systems in 4397) [ClassicSimilarity], result of:
            0.031724285 = score(doc=4397,freq=1.0), product of:
              0.09910095 = queryWeight, product of:
                1.2372522 = boost
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.023457233 = queryNorm
              0.3201209 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.414623 = idf(docFreq=3820, maxDocs=42740)
                0.09375 = fieldNorm(doc=4397)
          0.022870103 = weight(abstract_txt:information in 4397) [ClassicSimilarity], result of:
            0.022870103 = score(doc=4397,freq=1.0), product of:
              0.10038573 = queryWeight, product of:
                1.7610446 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.023457233 = queryNorm
              0.22782224 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.09375 = fieldNorm(doc=4397)
          0.30300882 = weight(abstract_txt:entropy in 4397) [ClassicSimilarity], result of:
            0.30300882 = score(doc=4397,freq=1.0), product of:
              0.4053285 = queryWeight, product of:
                2.1669734 = boost
                7.974011 = idf(docFreq=39, maxDocs=42740)
                0.023457233 = queryNorm
              0.74756354 = fieldWeight in 4397, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.974011 = idf(docFreq=39, maxDocs=42740)
                0.09375 = fieldNorm(doc=4397)
          0.14429392 = weight(abstract_txt:documents in 4397) [ClassicSimilarity], result of:
            0.14429392 = score(doc=4397,freq=3.0), product of:
              0.21592616 = queryWeight, product of:
                2.2367508 = boost
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.023457233 = queryNorm
              0.66825587 = fieldWeight in 4397, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.09375 = fieldNorm(doc=4397)
        0.24 = coord(6/25)
    
  5. Saarikoski, J.; Laurikkala, J.; Järvelin, K.; Juhola, M.: ¬A study of the use of self-organising maps in information retrieval (2009) 0.14
    0.14243878 = sum of:
      0.14243878 = product of:
        0.50870997 = sum of:
          0.1149192 = weight(abstract_txt:self in 4837) [ClassicSimilarity], result of:
            0.1149192 = score(doc=4837,freq=6.0), product of:
              0.13378642 = queryWeight, product of:
                1.016507 = boost
                5.610801 = idf(docFreq=424, maxDocs=42740)
                0.023457233 = queryNorm
              0.85897505 = fieldWeight in 4837, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.610801 = idf(docFreq=424, maxDocs=42740)
                0.0625 = fieldNorm(doc=4837)
          0.07655503 = weight(abstract_txt:document in 4837) [ClassicSimilarity], result of:
            0.07655503 = score(doc=4837,freq=6.0), product of:
              0.11681445 = queryWeight, product of:
                1.1633176 = boost
                4.280766 = idf(docFreq=1606, maxDocs=42740)
                0.023457233 = queryNorm
              0.6553558 = fieldWeight in 4837, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.280766 = idf(docFreq=1606, maxDocs=42740)
                0.0625 = fieldNorm(doc=4837)
          0.017636005 = weight(abstract_txt:research in 4837) [ClassicSimilarity], result of:
            0.017636005 = score(doc=4837,freq=1.0), product of:
              0.08779657 = queryWeight, product of:
                1.16455 = boost
                3.2139761 = idf(docFreq=4669, maxDocs=42740)
                0.023457233 = queryNorm
              0.20087351 = fieldWeight in 4837, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2139761 = idf(docFreq=4669, maxDocs=42740)
                0.0625 = fieldNorm(doc=4837)
          0.034042157 = weight(abstract_txt:value in 4837) [ClassicSimilarity], result of:
            0.034042157 = score(doc=4837,freq=1.0), product of:
              0.12366379 = queryWeight, product of:
                1.196937 = boost
                4.4044785 = idf(docFreq=1419, maxDocs=42740)
                0.023457233 = queryNorm
              0.2752799 = fieldWeight in 4837, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4044785 = idf(docFreq=1419, maxDocs=42740)
                0.0625 = fieldNorm(doc=4837)
          0.09220777 = weight(abstract_txt:neural in 4837) [ClassicSimilarity], result of:
            0.09220777 = score(doc=4837,freq=1.0), product of:
              0.2099161 = queryWeight, product of:
                1.2732897 = boost
                7.0281615 = idf(docFreq=102, maxDocs=42740)
                0.023457233 = queryNorm
              0.4392601 = fieldWeight in 4837, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0281615 = idf(docFreq=102, maxDocs=42740)
                0.0625 = fieldNorm(doc=4837)
          0.026408121 = weight(abstract_txt:information in 4837) [ClassicSimilarity], result of:
            0.026408121 = score(doc=4837,freq=3.0), product of:
              0.10038573 = queryWeight, product of:
                1.7610446 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.023457233 = queryNorm
              0.26306647 = fieldWeight in 4837, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.0625 = fieldNorm(doc=4837)
          0.14694174 = weight(abstract_txt:documents in 4837) [ClassicSimilarity], result of:
            0.14694174 = score(doc=4837,freq=7.0), product of:
              0.21592616 = queryWeight, product of:
                2.2367508 = boost
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.023457233 = queryNorm
              0.68051845 = fieldWeight in 4837, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.115389 = idf(docFreq=1895, maxDocs=42740)
                0.0625 = fieldNorm(doc=4837)
        0.28 = coord(7/25)
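
The score trees above can be checked by hand. The sketch below recomputes one term score and one document score from the numbers in the listing. It assumes the classic Lucene TF-IDF formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which the printed values are consistent with. The function names are hypothetical and are not part of any Lucene API.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed classic form: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(term_scores, matched_terms, query_terms):
    """Sum of the term scores, scaled by coord = matched / total query terms."""
    return sum(term_scores) * (matched_terms / query_terms)


# Term "well" in doc 5513, values copied from the first entry above.
well = term_score(freq=1.0, doc_freq=2281, max_docs=42740,
                  boost=1.0680177, query_norm=0.023457233,
                  field_norm=0.09375)

# The seven term scores listed for doc 5513, with coord(7/25).
doc_5513 = document_score(
    [0.03627688, 0.117466375, 0.13102673, 0.026454007,
     0.031724285, 0.16961509, 0.032343213],
    matched_terms=7, query_terms=25)
```

Here `well` should agree with the listed 0.03627688 and `doc_5513` with the listed 0.15257384, up to small rounding differences, since Lucene computes these values in single precision.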