Document (#40932)

Author
Calvanese, D.
Kalayci, T.E.
Montali, M.
Santoso, A.
Title
OBDA for log extraction in process mining
Source
Reasoning Web: Semantic Interoperability on the Web, 13th International Summer School 2017, London, UK, July 7-11, 2017, Tutorial Lectures. Eds.: Ianni, G. et al.
Imprint
Cham : Springer International Publishing
Year
2017
Pages
pp. 292-345
Series
Lecture Notes in Computer Science; 10370 (Information Systems and Applications, incl. Internet/Web, and HCI)
Abstract
Process mining is an emerging area that synergically combines model-based and data-oriented analysis techniques to obtain useful insights on how business processes are executed within an organization. Through process mining, decision makers can discover process models from data, compare expected and actual behaviors, and enrich models with key information about their actual execution. To be applicable, process mining techniques require the input data to be explicitly structured in the form of an event log, which lists when and by whom different case objects (i.e., process instances) have been subject to the execution of tasks. Unfortunately, in many real world set-ups, such event logs are not explicitly given, but are instead implicitly represented in legacy information systems. To apply process mining in this widespread setting, there is a pressing need for techniques able to support various process stakeholders in data preparation and log extraction from legacy information systems. The purpose of this paper is to single out this challenging, open issue, and didactically introduce how techniques from intelligent data management, and in particular ontology-based data access, provide a viable solution with a solid theoretical basis.

Similar documents (content)

  1. Barrio, P.; Gravano, L.: Sampling strategies for information extraction over the deep web (2017) 0.30
    0.2962447 = sum of:
      0.2962447 = product of:
        0.82290196 = sum of:
          0.009693266 = weight(abstract_txt:this in 3412) [ClassicSimilarity], result of:
            0.009693266 = score(doc=3412,freq=3.0), product of:
              0.042409286 = queryWeight, product of:
                1.0042732 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.017500425 = queryNorm
              0.2285647 = fieldWeight in 3412, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.015985584 = weight(abstract_txt:information in 3412) [ClassicSimilarity], result of:
            0.015985584 = score(doc=3412,freq=8.0), product of:
              0.04268844 = queryWeight, product of:
                1.007573 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.017500425 = queryNorm
              0.37447104 = fieldWeight in 3412, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.008409822 = weight(abstract_txt:from in 3412) [ClassicSimilarity], result of:
            0.008409822 = score(doc=3412,freq=1.0), product of:
              0.055638976 = queryWeight, product of:
                1.1502995 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.017500425 = queryNorm
              0.15114984 = fieldWeight in 3412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.18908513 = weight(abstract_txt:extraction in 3412) [ClassicSimilarity], result of:
            0.18908513 = score(doc=3412,freq=9.0), product of:
              0.18614365 = queryWeight, product of:
                1.7179079 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.017500425 = queryNorm
              1.0158021 = fieldWeight in 3412, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.20942739 = weight(abstract_txt:execution in 3412) [ClassicSimilarity], result of:
            0.20942739 = score(doc=3412,freq=2.0), product of:
              0.32898027 = queryWeight, product of:
                2.2838137 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.017500425 = queryNorm
              0.6365956 = fieldWeight in 3412, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.069812484 = weight(abstract_txt:techniques in 3412) [ClassicSimilarity], result of:
            0.069812484 = score(doc=3412,freq=2.0), product of:
              0.19927198 = queryWeight, product of:
                2.5137024 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.017500425 = queryNorm
              0.35033768 = fieldWeight in 3412, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.04183955 = weight(abstract_txt:data in 3412) [ClassicSimilarity], result of:
            0.04183955 = score(doc=3412,freq=2.0), product of:
              0.16214837 = queryWeight, product of:
                2.7771075 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017500425 = queryNorm
              0.2580325 = fieldWeight in 3412, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.1563427 = weight(abstract_txt:mining in 3412) [ClassicSimilarity], result of:
            0.1563427 = score(doc=3412,freq=1.0), product of:
              0.46293774 = queryWeight, product of:
                4.2835817 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017500425 = queryNorm
              0.33771864 = fieldWeight in 3412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.12230602 = weight(abstract_txt:process in 3412) [ClassicSimilarity], result of:
            0.12230602 = score(doc=3412,freq=3.0), product of:
              0.31873932 = queryWeight, product of:
                4.4959717 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.017500425 = queryNorm
              0.383718 = fieldWeight in 3412, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
        0.36 = coord(9/25)
    
  2. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.24
    0.24378006 = sum of:
      0.24378006 = product of:
        1.0157503 = sum of:
          0.014416837 = weight(abstract_txt:from in 2899) [ClassicSimilarity], result of:
            0.014416837 = score(doc=2899,freq=1.0), product of:
              0.055638976 = queryWeight, product of:
                1.1502995 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.017500425 = queryNorm
              0.259114 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.10804863 = weight(abstract_txt:extraction in 2899) [ClassicSimilarity], result of:
            0.10804863 = score(doc=2899,freq=1.0), product of:
              0.18614365 = queryWeight, product of:
                1.7179079 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.017500425 = queryNorm
              0.58045834 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.084625505 = weight(abstract_txt:techniques in 2899) [ClassicSimilarity], result of:
            0.084625505 = score(doc=2899,freq=1.0), product of:
              0.19927198 = queryWeight, product of:
                2.5137024 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.017500425 = queryNorm
              0.42467338 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.10143438 = weight(abstract_txt:data in 2899) [ClassicSimilarity], result of:
            0.10143438 = score(doc=2899,freq=4.0), product of:
              0.16214837 = queryWeight, product of:
                2.7771075 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017500425 = queryNorm
              0.62556523 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.53603214 = weight(abstract_txt:mining in 2899) [ClassicSimilarity], result of:
            0.53603214 = score(doc=2899,freq=4.0), product of:
              0.46293774 = queryWeight, product of:
                4.2835817 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017500425 = queryNorm
              1.1578925 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.17119275 = weight(abstract_txt:process in 2899) [ClassicSimilarity], result of:
            0.17119275 = score(doc=2899,freq=2.0), product of:
              0.31873932 = queryWeight, product of:
                4.4959717 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.017500425 = queryNorm
              0.5370933 = fieldWeight in 2899, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
        0.24 = coord(6/25)
    
  3. Benoit, G.: Data mining (2002) 0.23
    0.23278436 = sum of:
      0.23278436 = product of:
        0.72745115 = sum of:
          0.006395897 = weight(abstract_txt:this in 4296) [ClassicSimilarity], result of:
            0.006395897 = score(doc=4296,freq=1.0), product of:
              0.042409286 = queryWeight, product of:
                1.0042732 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.017500425 = queryNorm
              0.1508136 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.0091346195 = weight(abstract_txt:information in 4296) [ClassicSimilarity], result of:
            0.0091346195 = score(doc=4296,freq=2.0), product of:
              0.04268844 = queryWeight, product of:
                1.007573 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.017500425 = queryNorm
              0.21398345 = fieldWeight in 4296, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.01664713 = weight(abstract_txt:from in 4296) [ClassicSimilarity], result of:
            0.01664713 = score(doc=4296,freq=3.0), product of:
              0.055638976 = queryWeight, product of:
                1.1502995 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.017500425 = queryNorm
              0.29919907 = fieldWeight in 4296, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.030347623 = weight(abstract_txt:models in 4296) [ClassicSimilarity], result of:
            0.030347623 = score(doc=4296,freq=1.0), product of:
              0.10461148 = queryWeight, product of:
                1.2878504 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.017500425 = queryNorm
              0.2900984 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.101869226 = weight(abstract_txt:extraction in 4296) [ClassicSimilarity], result of:
            0.101869226 = score(doc=4296,freq=2.0), product of:
              0.18614365 = queryWeight, product of:
                1.7179079 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.017500425 = queryNorm
              0.54726136 = fieldWeight in 4296, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.082820825 = weight(abstract_txt:data in 4296) [ClassicSimilarity], result of:
            0.082820825 = score(doc=4296,freq=6.0), product of:
              0.16214837 = queryWeight, product of:
                2.7771075 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017500425 = queryNorm
              0.5107719 = fieldWeight in 4296, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.39953476 = weight(abstract_txt:mining in 4296) [ClassicSimilarity], result of:
            0.39953476 = score(doc=4296,freq=5.0), product of:
              0.46293774 = queryWeight, product of:
                4.2835817 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017500425 = queryNorm
              0.8630421 = fieldWeight in 4296, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.080701046 = weight(abstract_txt:process in 4296) [ClassicSimilarity], result of:
            0.080701046 = score(doc=4296,freq=1.0), product of:
              0.31873932 = queryWeight, product of:
                4.4959717 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.017500425 = queryNorm
              0.25318822 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
        0.32 = coord(8/25)
    
  4. Saz, J.T.: Perspectivas en recuperacion y explotacion de informacion electronica : el 'data mining' (1997) 0.20
    0.2020638 = sum of:
      0.2020638 = product of:
        1.2628988 = sum of:
          0.14104252 = weight(abstract_txt:techniques in 3723) [ClassicSimilarity], result of:
            0.14104252 = score(doc=3723,freq=1.0), product of:
              0.19927198 = queryWeight, product of:
                2.5137024 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.017500425 = queryNorm
              0.707789 = fieldWeight in 3723, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.15625 = fieldNorm(doc=3723)
          0.14640792 = weight(abstract_txt:data in 3723) [ClassicSimilarity], result of:
            0.14640792 = score(doc=3723,freq=3.0), product of:
              0.16214837 = queryWeight, product of:
                2.7771075 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017500425 = queryNorm
              0.9029256 = fieldWeight in 3723, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.15625 = fieldNorm(doc=3723)
          0.7736957 = weight(abstract_txt:mining in 3723) [ClassicSimilarity], result of:
            0.7736957 = score(doc=3723,freq=3.0), product of:
              0.46293774 = queryWeight, product of:
                4.2835817 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017500425 = queryNorm
              1.6712738 = fieldWeight in 3723, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.15625 = fieldNorm(doc=3723)
          0.20175262 = weight(abstract_txt:process in 3723) [ClassicSimilarity], result of:
            0.20175262 = score(doc=3723,freq=1.0), product of:
              0.31873932 = queryWeight, product of:
                4.4959717 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.017500425 = queryNorm
              0.6329706 = fieldWeight in 3723, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.15625 = fieldNorm(doc=3723)
        0.16 = coord(4/25)
    
  5. Garcia Marco, F.J.: El factor humano en los sistemas de información (2003) 0.20
    0.2020638 = sum of:
      0.2020638 = product of:
        1.2628988 = sum of:
          0.14104252 = weight(abstract_txt:techniques in 929) [ClassicSimilarity], result of:
            0.14104252 = score(doc=929,freq=1.0), product of:
              0.19927198 = queryWeight, product of:
                2.5137024 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.017500425 = queryNorm
              0.707789 = fieldWeight in 929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.15625 = fieldNorm(doc=929)
          0.14640792 = weight(abstract_txt:data in 929) [ClassicSimilarity], result of:
            0.14640792 = score(doc=929,freq=3.0), product of:
              0.16214837 = queryWeight, product of:
                2.7771075 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017500425 = queryNorm
              0.9029256 = fieldWeight in 929, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.15625 = fieldNorm(doc=929)
          0.7736957 = weight(abstract_txt:mining in 929) [ClassicSimilarity], result of:
            0.7736957 = score(doc=929,freq=3.0), product of:
              0.46293774 = queryWeight, product of:
                4.2835817 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017500425 = queryNorm
              1.6712738 = fieldWeight in 929, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.15625 = fieldNorm(doc=929)
          0.20175262 = weight(abstract_txt:process in 929) [ClassicSimilarity], result of:
            0.20175262 = score(doc=929,freq=1.0), product of:
              0.31873932 = queryWeight, product of:
                4.4959717 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.017500425 = queryNorm
              0.6329706 = fieldWeight in 929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.15625 = fieldNorm(doc=929)
        0.16 = coord(4/25)
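
The score trees above all follow the same arithmetic, so it may help to see one of them recomputed. The sketch below rebuilds the 0.2020638 score of entry 4 (doc 3723) from the frequencies, document frequencies, boosts, fieldNorm and queryNorm printed in its tree. The formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are inferred from the numbers shown and are assumed to match what the catalog's ClassicSimilarity scorer does. The function and variable names are illustrative only.

```python
import math

# Constants copied from the score trees above.
MAX_DOCS = 44218
QUERY_NORM = 0.017500425


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, as inferred from the printed idf values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


# Entry 4 (doc 3723): term -> (freq, docFreq, boost), read off its tree.
terms_doc_3723 = {
    "techniques": (1.0, 1295, 2.5137024),
    "data": (3.0, 4274, 2.7771075),
    "mining": (3.0, 249, 4.2835817),
    "process": (1.0, 2091, 4.4959717),
}
FIELD_NORM = 0.15625
COORD = 4 / 25  # 4 of the 25 query terms matched

raw_sum = sum(
    term_score(freq, doc_freq, boost, FIELD_NORM)
    for freq, doc_freq, boost in terms_doc_3723.values()
)
total = COORD * raw_sum

print(f"sum of term scores: {raw_sum:.7f}")
print(f"final score:        {total:.7f}")
```

The result should agree with the listed values only up to rounding, since the catalog output appears to use single-precision floats while this sketch uses Python's double precision.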