Search (6 results, page 1 of 1)

  • × author_ss:"MacFarlane, A."
  • × type_ss:"a"
  1. Lu, W.; MacFarlane, A.; Venuti, F.: Okapi-based XML indexing (2009) 0.02
    0.018958557 = product of:
      0.05687567 = sum of:
        0.05687567 = product of:
          0.11375134 = sum of:
            0.11375134 = weight(_text_:indexing in 3629) [ClassicSimilarity], result of:
              0.11375134 = score(doc=3629,freq=16.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.59810436 = fieldWeight in 3629, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3629)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - Being an important data exchange and information storage standard, XML has generated a great deal of interest and particular attention has been paid to the issue of XML indexing. Clear use cases for structured search in XML have been established. However, most of the research in the area is either based on relational database systems or specialized semi-structured data management systems. This paper aims to propose a method for XML indexing based on the information retrieval (IR) system Okapi. Design/methodology/approach - First, the paper reviews the structure of inverted files and gives an overview of the issues of why this indexing mechanism cannot properly support XML retrieval, using the underlying data structures of Okapi as an example. Then the paper explores a revised method implemented on Okapi using path indexing structures. The paper evaluates these index structures through the metrics of indexing run time, path search run time and space costs using the INEX and Reuters RCV1 collections. Findings - Initial results on the INEX collections show that there is a substantial overhead in space costs for the method, but this increase does not affect run time adversely. Indexing results on differently sized Reuters RCV1 sub-collections show that the increase in space costs with increasing collection size is significant, but in terms of run time the increase is linear. Path search results show sub-millisecond run times, demonstrating minimal overhead for XML search. Practical implications - Overall, the results show the method implemented to support XML search in a traditional IR system such as Okapi is viable. Originality/value - The paper provides useful information on a method for XML indexing based on the IR system Okapi.
  2. MacFarlane, A.; McCann, J.A.; Robertson, S.E.: Parallel methods for the generation of partitioned inverted files (2005) 0.02
    0.016086869 = product of:
      0.048260607 = sum of:
        0.048260607 = product of:
          0.09652121 = sum of:
            0.09652121 = weight(_text_:indexing in 651) [ClassicSimilarity], result of:
              0.09652121 = score(doc=651,freq=8.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.5075084 = fieldWeight in 651, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=651)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The generation of inverted indexes is one of the most computationally intensive activities for information retrieval systems: indexing large multi-gigabyte text databases can take many hours or even days to complete. We examine the generation of partitioned inverted files in order to speed up the process of indexing. Two types of index partitions are investigated: TermId and DocId. Design/methodology/approach - We use standard measures used in parallel computing such as speedup and efficiency to examine the computing results and also the space costs of our trial indexing experiments. Findings - The results from runs on both partitioning methods are compared and contrasted, concluding that DocId is the more efficient method. Practical implications - The practical implications are that the DocId partitioning method would in most circumstances be used for distributing inverted file data in a parallel computer, particularly if indexing speed is the primary consideration. Originality/value - The paper is of value to database administrators who manage large-scale text collections, and who need to use parallel computing to implement their text retrieval services.
  3. MacFarlane, A.: Knowledge organisation and its role in multimedia information retrieval (2016) 0.01
    0.011375135 = product of:
      0.034125403 = sum of:
        0.034125403 = product of:
          0.068250805 = sum of:
            0.068250805 = weight(_text_:indexing in 2911) [ClassicSimilarity], result of:
              0.068250805 = score(doc=2911,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3588626 = fieldWeight in 2911, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2911)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Various kinds of knowledge organisation, such as thesauri, are routinely used to label or tag multimedia content such as images and music and to support information retrieval, i.e. user search for such content. In this paper, we outline why this is the case, in particular focusing on the semantic gap between content and concept based multimedia retrieval. We survey some indexing vocabularies used for multimedia retrieval, and argue that techniques such as thesauri will be needed for the foreseeable future in order to support users in their need for multimedia content. In particular, we argue that artificial intelligence techniques are not mature enough to solve the problem of indexing multimedia conceptually and will not be able to replace human indexers for the foreseeable future.
  4. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01
    0.008975455 = product of:
      0.026926363 = sum of:
        0.026926363 = product of:
          0.053852726 = sum of:
            0.053852726 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.053852726 = score(doc=5108,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    20. 1.2007 18:30:22
  5. Konkova, E.; Göker, A.; Butterworth, R.; MacFarlane, A.: Social tagging: exploring the image, the tags, and the game (2014) 0.01
    0.0080434345 = product of:
      0.024130303 = sum of:
        0.024130303 = product of:
          0.048260607 = sum of:
            0.048260607 = weight(_text_:indexing in 1370) [ClassicSimilarity], result of:
              0.048260607 = score(doc=1370,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.2537542 = fieldWeight in 1370, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1370)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Large image collections on the Web need to be organized for effective retrieval. Metadata has a key role in image retrieval, but relying on professionally assigned tags is not a viable option. Current content-based image retrieval systems have not demonstrated sufficient utility on large-scale image sources on the web, and are usually used as a supplement to existing text-based image retrieval systems. We present two social tagging alternatives in the form of photo-sharing networks and image labeling games. Here we analyze these applications to evaluate their usefulness from the semantic point of view, investigating the management of social tagging for indexing. The findings of the study have shown that social tagging can generate a sizeable number of tags that can be classified as interpretive for an image, and that tagging behaviour has a manageable and adjustable nature depending on tagging guidelines.
  6. Inskip, C.; MacFarlane, A.; Rafferty, P.: Organising music for movies (2010) 0.01
    0.0067028617 = product of:
      0.020108584 = sum of:
        0.020108584 = product of:
          0.04021717 = sum of:
            0.04021717 = weight(_text_:indexing in 3941) [ClassicSimilarity], result of:
              0.04021717 = score(doc=3941,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.21146181 = fieldWeight in 3941, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3941)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this paper is to examine and discuss the classification of commercial popular music when large digital collections are organised for use in films. Design/methodology/approach - A range of systems are investigated and their organization is discussed, focusing on an analysis of the metadata used by the systems and choices given to the end-user to construct a query. The indexing of the music is compared with a check-list of music facets which has been derived from recent musicological literature on semiotic analysis of popular music. These facets include aspects of communication, cultural and musical expression, codes and competences. Findings - In addition to bibliographic detail, descriptive metadata are used to organise music in these systems. Genre, subject and mood are used widely; some musical facets also appear. The extent to which attempts are being made to reflect these facets in the organization of these systems is discussed. A number of recommendations are made which may help to improve this process. Originality/value - The paper discusses an area of creative music search which has not previously been investigated in any depth and makes recommendations based on findings and the literature which may be used in the development of commercial systems as well as making a contribution to the literature.
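
The score breakdowns listed under each result follow the same arithmetic, which can be sketched in a few lines. This is a minimal illustration, not the search engine's own code: the function names are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed because they reproduce the tf and idf values shown in the listing. The two coord factors (1/2 and 1/3) are taken directly from the breakdowns.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces idf=3.8278677 for docFreq=2614, maxDocs=44218.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm,
                  coords=(1 / 2, 1 / 3)):
    """Recompute one result's score from the values shown in its breakdown."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                    # e.g. 4.0 for freq=16.0
    query_weight = idf * query_norm         # "queryWeight" line
    field_weight = tf * idf * field_norm    # "fieldWeight" line
    score = query_weight * field_weight     # "weight(_text_:...)" line
    for coord in coords:                    # nested coord(1/2) and coord(1/3)
        score *= coord
    return score


if __name__ == "__main__":
    # Result 1 (doc=3629): freq=16.0, fieldNorm=0.0390625
    print(classic_score(16.0, 2614, 44218, 0.049684696, 0.0390625))
    # Result 4 (doc=5108): term "22", freq=2.0, docFreq=3622, fieldNorm=0.0625
    print(classic_score(2.0, 3622, 44218, 0.049684696, 0.0625))
```

Up to floating-point rounding, the two printed values should agree with the listed scores 0.018958557 and 0.008975455. The breakdown also shows why result 4 ranks where it does: its match is on the term "22" in the Date field rather than on "indexing".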