Document (#27207)

Author
Al-Sughaiyer, I.A.
Al-Kharashi, I.A.
Title
Arabic morphological analysis techniques : a comprehensive survey
Source
Journal of the American Society for Information Science and Technology. 55(2004) no.3, pp.189-213
Year
2004
Abstract
After several decades of heavy research activity on English stemmers, Arabic morphological analysis techniques have become a popular area of research. The Arabic language is one of the Semitic languages; it exhibits a very systematic but complex morphological structure based on root-pattern schemes. As a consequence, a survey of such techniques proves to be more necessary. The aim of this paper is to summarize and organize the information available in the literature in an attempt to motivate researchers to look into these techniques and try to develop more advanced ones. This paper introduces, classifies, and surveys Arabic morphological analysis techniques. Furthermore, conclusions, open areas, and future directions are provided at the end.
Theme
Computational linguistics

Similar documents (content)

  1. Kanaan, G.; Al-Shalabi, R.; Ghwanmeh, S.; Al-Ma'adeed, H.: A comparison of text-classification techniques applied to Arabic text (2009) 0.20
    0.1992601 = sum of:
      0.1992601 = product of:
        0.99630046 = sum of:
          0.019534934 = weight(abstract_txt:research in 3096) [ClassicSimilarity], result of:
            0.019534934 = score(doc=3096,freq=2.0), product of:
              0.04647508 = queryWeight, product of:
                1.0403472 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.014090819 = queryNorm
              0.4203314 = fieldWeight in 3096, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.09375 = fieldNorm(doc=3096)
          0.017069345 = weight(abstract_txt:more in 3096) [ClassicSimilarity], result of:
            0.017069345 = score(doc=3096,freq=1.0), product of:
              0.053518023 = queryWeight, product of:
                1.1163961 = boost
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.014090819 = queryNorm
              0.31894574 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.09375 = fieldNorm(doc=3096)
          0.018071039 = weight(abstract_txt:paper in 3096) [ClassicSimilarity], result of:
            0.018071039 = score(doc=3096,freq=1.0), product of:
              0.055591818 = queryWeight, product of:
                1.1378204 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.014090819 = queryNorm
              0.3250665 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.09375 = fieldNorm(doc=3096)
          0.100732975 = weight(abstract_txt:techniques in 3096) [ClassicSimilarity], result of:
            0.100732975 = score(doc=3096,freq=1.0), product of:
              0.23720105 = queryWeight, product of:
                3.7161787 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.014090819 = queryNorm
              0.42467338 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.09375 = fieldNorm(doc=3096)
          0.8408922 = weight(abstract_txt:arabic in 3096) [ClassicSimilarity], result of:
            0.8408922 = score(doc=3096,freq=5.0), product of:
              0.5299103 = queryWeight, product of:
                4.968033 = boost
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.014090819 = queryNorm
              1.5868576 = fieldWeight in 3096, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.09375 = fieldNorm(doc=3096)
        0.2 = coord(5/25)
    
  2. Anizi, M.; Dichy, J.: Improving information retrieval in Arabic through a multi-agent approach and a rich lexical resource (2011) 0.17
    0.16723695 = sum of:
      0.16723695 = product of:
        0.83618474 = sum of:
          0.009208856 = weight(abstract_txt:research in 4738) [ClassicSimilarity], result of:
            0.009208856 = score(doc=4738,freq=1.0), product of:
              0.04647508 = queryWeight, product of:
                1.0403472 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.014090819 = queryNorm
              0.19814612 = fieldWeight in 4738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.0625 = fieldNorm(doc=4738)
          0.012047359 = weight(abstract_txt:paper in 4738) [ClassicSimilarity], result of:
            0.012047359 = score(doc=4738,freq=1.0), product of:
              0.055591818 = queryWeight, product of:
                1.1378204 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.014090819 = queryNorm
              0.216711 = fieldWeight in 4738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0625 = fieldNorm(doc=4738)
          0.036617134 = weight(abstract_txt:analysis in 4738) [ClassicSimilarity], result of:
            0.036617134 = score(doc=4738,freq=3.0), product of:
              0.09258257 = queryWeight, product of:
                1.7983677 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.014090819 = queryNorm
              0.39550784 = fieldWeight in 4738, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.0625 = fieldNorm(doc=4738)
          0.35455132 = weight(abstract_txt:arabic in 4738) [ClassicSimilarity], result of:
            0.35455132 = score(doc=4738,freq=2.0), product of:
              0.5299103 = queryWeight, product of:
                4.968033 = boost
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.014090819 = queryNorm
              0.66907793 = fieldWeight in 4738, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.0625 = fieldNorm(doc=4738)
          0.42376012 = weight(abstract_txt:morphological in 4738) [ClassicSimilarity], result of:
            0.42376012 = score(doc=4738,freq=2.0), product of:
              0.5968013 = queryWeight, product of:
                5.272276 = boost
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.014090819 = queryNorm
              0.7100523 = fieldWeight in 4738, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.0625 = fieldNorm(doc=4738)
        0.2 = coord(5/25)
    
  3. Mansour, N.; Haraty, R.A.; Daher, W.; Houri, M.: An auto-indexing method for Arabic text (2008) 0.17
    0.16632722 = sum of:
      0.16632722 = product of:
        0.83163613 = sum of:
          0.011379564 = weight(abstract_txt:more in 2103) [ClassicSimilarity], result of:
            0.011379564 = score(doc=2103,freq=1.0), product of:
              0.053518023 = queryWeight, product of:
                1.1163961 = boost
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.014090819 = queryNorm
              0.2126305 = fieldWeight in 2103, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.0625 = fieldNorm(doc=2103)
          0.012047359 = weight(abstract_txt:paper in 2103) [ClassicSimilarity], result of:
            0.012047359 = score(doc=2103,freq=1.0), product of:
              0.055591818 = queryWeight, product of:
                1.1378204 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.014090819 = queryNorm
              0.216711 = fieldWeight in 2103, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0625 = fieldNorm(doc=2103)
          0.029897764 = weight(abstract_txt:analysis in 2103) [ClassicSimilarity], result of:
            0.029897764 = score(doc=2103,freq=2.0), product of:
              0.09258257 = queryWeight, product of:
                1.7983677 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.014090819 = queryNorm
              0.3229308 = fieldWeight in 2103, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.0625 = fieldNorm(doc=2103)
          0.35455132 = weight(abstract_txt:arabic in 2103) [ClassicSimilarity], result of:
            0.35455132 = score(doc=2103,freq=2.0), product of:
              0.5299103 = queryWeight, product of:
                4.968033 = boost
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.014090819 = queryNorm
              0.66907793 = fieldWeight in 2103, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.0625 = fieldNorm(doc=2103)
          0.42376012 = weight(abstract_txt:morphological in 2103) [ClassicSimilarity], result of:
            0.42376012 = score(doc=2103,freq=2.0), product of:
              0.5968013 = queryWeight, product of:
                5.272276 = boost
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.014090819 = queryNorm
              0.7100523 = fieldWeight in 2103, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.0625 = fieldNorm(doc=2103)
        0.2 = coord(5/25)
    
  4. Abdelali, A.: Localization in modern standard Arabic (2004) 0.12
    0.12103253 = sum of:
      0.12103253 = product of:
        0.75645334 = sum of:
          0.014224454 = weight(abstract_txt:more in 2066) [ClassicSimilarity], result of:
            0.014224454 = score(doc=2066,freq=1.0), product of:
              0.053518023 = queryWeight, product of:
                1.1163961 = boost
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.014090819 = queryNorm
              0.2657881 = fieldWeight in 2066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.078125 = fieldNorm(doc=2066)
          0.015059198 = weight(abstract_txt:paper in 2066) [ClassicSimilarity], result of:
            0.015059198 = score(doc=2066,freq=1.0), product of:
              0.055591818 = queryWeight, product of:
                1.1378204 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.014090819 = queryNorm
              0.27088875 = fieldWeight in 2066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.078125 = fieldNorm(doc=2066)
          0.026426138 = weight(abstract_txt:analysis in 2066) [ClassicSimilarity], result of:
            0.026426138 = score(doc=2066,freq=1.0), product of:
              0.09258257 = queryWeight, product of:
                1.7983677 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.014090819 = queryNorm
              0.2854332 = fieldWeight in 2066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.078125 = fieldNorm(doc=2066)
          0.70074356 = weight(abstract_txt:arabic in 2066) [ClassicSimilarity], result of:
            0.70074356 = score(doc=2066,freq=5.0), product of:
              0.5299103 = queryWeight, product of:
                4.968033 = boost
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.014090819 = queryNorm
              1.3223814 = fieldWeight in 2066, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.078125 = fieldNorm(doc=2066)
        0.16 = coord(4/25)
    
  5. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.11
    0.105113536 = sum of:
      0.105113536 = product of:
        0.87594616 = sum of:
          0.18982637 = weight(abstract_txt:stemmers in 2950) [ClassicSimilarity], result of:
            0.18982637 = score(doc=2950,freq=2.0), product of:
              0.18968253 = queryWeight, product of:
                1.4861647 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.014090819 = queryNorm
              1.0007583 = fieldWeight in 2950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.037372205 = weight(abstract_txt:analysis in 2950) [ClassicSimilarity], result of:
            0.037372205 = score(doc=2950,freq=2.0), product of:
              0.09258257 = queryWeight, product of:
                1.7983677 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.014090819 = queryNorm
              0.40366352 = fieldWeight in 2950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
          0.64874756 = weight(abstract_txt:morphological in 2950) [ClassicSimilarity], result of:
            0.64874756 = score(doc=2950,freq=3.0), product of:
              0.5968013 = queryWeight, product of:
                5.272276 = boost
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.014090819 = queryNorm
              1.0870411 = fieldWeight in 2950, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.033325 = idf(docFreq=38, maxDocs=44218)
                0.078125 = fieldNorm(doc=2950)
        0.12 = coord(3/25)
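
The score trees above follow the classic Lucene TF-IDF scheme: each matching term contributes queryWeight times fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched query terms over total query terms). As a minimal sketch, the Python below recomputes the 0.1992601 score of the first similar document (doc 3096) from the numbers shown in its tree. The function and constant names are hypothetical and chosen only for illustration; the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are those of ClassicSimilarity. The listing uses single-precision floats, so the recomputed values should agree only to about four or five significant digits.

```python
import math

# Values copied from the explain tree of doc 3096 (first similar document).
MAX_DOCS = 44218          # maxDocs in every idf(...) line
QUERY_NORM = 0.014090819  # queryNorm, identical for all terms of the query
FIELD_NORM = 0.09375      # fieldNorm(doc=3096)

# (term, termFreq, docFreq, boost) for the five matching query terms.
MATCHED_TERMS = [
    ("research", 2.0, 5046, 1.0403472),
    ("more", 1.0, 4002, 1.1163961),
    ("paper", 1.0, 3749, 1.1378204),
    ("techniques", 1.0, 1295, 3.7161787),
    ("arabic", 5.0, 61, 4.968033),
]


def tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm, max_docs):
    """Score of one term: queryWeight * fieldWeight, as in the tree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total_clauses):
    """Sum the per-term scores, then apply coord(matched/total_clauses)."""
    raw = sum(
        term_score(freq, doc_freq, boost, field_norm, QUERY_NORM, MAX_DOCS)
        for _term, freq, doc_freq, boost in terms
    )
    return raw * (matched / total_clauses)


# coord(5/25): five of the 25 query terms occur in the document's abstract.
score = document_score(MATCHED_TERMS, FIELD_NORM, matched=5, total_clauses=25)
print(round(score, 4))
```

The same functions can be applied to the other four entries by swapping in their term frequencies, field norms, and coord values. Note that idf enters each term score twice, once through queryWeight and once through fieldWeight, which is why rare terms such as "arabic" and "morphological" dominate the rankings.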