Search (9 results, page 1 of 1)

  • × author_ss:"MacFarlane, A."
  1. MacFarlane, A.; McCann, J.A.; Robertson, S.E.: Parallel methods for the generation of partitioned inverted files (2005) 0.01
    0.010478289 = product of:
      0.041913155 = sum of:
        0.041913155 = weight(_text_:services in 651) [ClassicSimilarity], result of:
          0.041913155 = score(doc=651,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.2433798 = fieldWeight in 651, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046875 = fieldNorm(doc=651)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The generation of inverted indexes is one of the most computationally intensive activities for information retrieval systems: indexing large multi-gigabyte text databases can take many hours or even days to complete. We examine the generation of partitioned inverted files in order to speed up the process of indexing. Two types of index partitions are investigated: TermId and DocId. Design/methodology/approach - We use standard measures used in parallel computing such as speedup and efficiency to examine the computing results and also the space costs of our trial indexing experiments. Findings - The results from runs on both partitioning methods are compared and contrasted, concluding that DocId is the more efficient method. Practical implications - The practical implications are that the DocId partitioning method would in most circumstances be used for distributing inverted file data in a parallel computer, particularly if indexing speed is the primary consideration. Originality/value - The paper is of value to database administrators who manage large-scale text collections, and who need to use parallel computing to implement their text retrieval services.
  2. MacFarlane, A.: Evaluation of web search for the information practitioner (2007) 0.01
    0.010478289 = product of:
      0.041913155 = sum of:
        0.041913155 = weight(_text_:services in 817) [ClassicSimilarity], result of:
          0.041913155 = score(doc=817,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.2433798 = fieldWeight in 817, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046875 = fieldNorm(doc=817)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The aim of the paper is to put forward a structured mechanism for web search evaluation. The paper seeks to point to useful scientific research and show how information practitioners can use these methods in evaluation of search on the web for their users. Design/methodology/approach - The paper puts forward an approach which utilizes traditional laboratory-based evaluation measures such as average precision/precision at N documents, augmented with diagnostic measures such as link broken, etc., which are used to show why precision measures are depressed as well as the quality of the search engines crawling mechanism. Findings - The paper shows how to use diagnostic measures in conjunction with precision in order to evaluate web search. Practical implications - The methodology presented in this paper will be useful to any information professional who regularly uses web search as part of their information seeking and needs to evaluate web search services. Originality/value - The paper argues that the use of diagnostic measures is essential in web search, as precision measures on their own do not allow a searcher to understand why search results differ between search engines.
  3. MacFarlane, A.; McCann, J.A.; Robertson, S.E.: Parallel methods for the update of partitioned inverted files (2007) 0.01
    0.008731907 = product of:
      0.03492763 = sum of:
        0.03492763 = weight(_text_:services in 819) [ClassicSimilarity], result of:
          0.03492763 = score(doc=819,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.2028165 = fieldWeight in 819, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.0390625 = fieldNorm(doc=819)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - An issue that tends to be ignored in information retrieval is the issue of updating inverted files. This is largely because inverted files were devised to provide fast query service, and much work has been done with the emphasis strongly on queries. This paper aims to study the effect of using parallel methods for the update of inverted files in order to reduce costs, by looking at two types of partitioning for inverted files: document identifier and term identifier. Design/methodology/approach - Raw update service and update with query service are studied with these partitioning schemes using an incremental update strategy. The paper uses standard measures used in parallel computing such as speedup to examine the computing results and also the costs of reorganising indexes while servicing transactions. Findings - Empirical results show that for both transaction processing and index reorganisation the document identifier method is superior. However, there is evidence that the term identifier partitioning method could be useful in a concurrent transaction processing context. Practical implications - There is an increasing need to service updates, which is now becoming a requirement of inverted files (for dynamic collections such as the web), demonstrating that a shift in requirements of inverted file maintenance is needed from the past. Originality/value - The paper is of value to database administrators who manage large-scale and dynamic text collections, and who need to use parallel computing to implement their text retrieval services.
  4. MacFarlane, A.; Missaoui, S.; Frankowska-Takhari, S.: On machine learning and knowledge organization in multimedia information retrieval (2020) 0.01
    0.008731907 = product of:
      0.03492763 = sum of:
        0.03492763 = weight(_text_:services in 5732) [ClassicSimilarity], result of:
          0.03492763 = score(doc=5732,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.2028165 = fieldWeight in 5732, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5732)
      0.25 = coord(1/4)
    
    Abstract
    Recent technological developments have increased the use of machine learning to solve many problems, including many in information retrieval. Multimedia information retrieval as a problem represents a significant challenge to machine learning as a technological solution, but some problems can still be addressed by using appropriate AI techniques. We review the technological developments and provide a perspective on the use of machine learning in conjunction with knowledge organization to address multimedia IR needs. The semantic gap in multimedia IR remains a significant problem in the field, and solutions to them are many years off. However, new technological developments allow the use of knowledge organization and machine learning in multimedia search systems and services. Specifically, we argue that, the improvement of detection of some classes of lowlevel features in images music and video can be used in conjunction with knowledge organization to tag or label multimedia content for better retrieval performance. We provide an overview of the use of knowledge organization schemes in machine learning and make recommendations to information professionals on the use of this technology with knowledge organization techniques to solve multimedia IR problems. We introduce a five-step process model that extracts features from multimedia objects (Step 1) from both knowledge organization (Step 1a) and machine learning (Step 1b), merging them together (Step 2) to create an index of those multimedia objects (Step 3). We also overview further steps in creating an application to utilize the multimedia objects (Step 4) and maintaining and updating the database of features on those objects (Step 5).
  5. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01
    0.0063552503 = product of:
      0.025421001 = sum of:
        0.025421001 = product of:
          0.050842002 = sum of:
            0.050842002 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.050842002 = score(doc=5108,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    20. 1.2007 18:30:22
  6. MacFarlane, A.; Tuson, A.: Local search : a guide for the information retrieval practitioner (2009) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 2457) [ClassicSimilarity], result of:
              0.035327382 = score(doc=2457,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 2457, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2457)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 45(2009) no.1, S.159-174
  7. Konkova, E.; Göker, A.; Butterworth, R.; MacFarlane, A.: Social tagging: exploring the image, the tags, and the game (2014) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 1370) [ClassicSimilarity], result of:
              0.035327382 = score(doc=1370,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 1370, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1370)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Large image collections on the Web need to be organized for effective retrieval. Metadata has a key role in image retrieval but rely on professionally assigned tags which is not a viable option. Current content-based image retrieval systems have not demonstrated sufficient utility on large-scale image sources on the web, and are usually used as a supplement to existing text-based image retrieval systems. We present two social tagging alternatives in the form of photo-sharing networks and image labeling games. Here we analyze these applications to evaluate their usefulness from the semantic point of view, investigating the management of social tagging for indexing. The findings of the study have shown that social tagging can generate a sizeable number of tags that can be classified as in terpretive for an image, and that tagging behaviour has a manageable and adjustable nature depending on tagging guidelines.
  8. Inskip, C.; Butterworth, R.; MacFarlane, A.: ¬A study of the information needs of the users of a folk music library and the implications for the design of a digital library system (2008) 0.00
    0.0036799356 = product of:
      0.014719742 = sum of:
        0.014719742 = product of:
          0.029439485 = sum of:
            0.029439485 = weight(_text_:management in 2053) [ClassicSimilarity], result of:
              0.029439485 = score(doc=2053,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.18620178 = fieldWeight in 2053, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2053)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 44(2008) no.2, S.647-662
  9. Lu, W.; MacFarlane, A.; Venuti, F.: Okapi-based XML indexing (2009) 0.00
    0.0036799356 = product of:
      0.014719742 = sum of:
        0.014719742 = product of:
          0.029439485 = sum of:
            0.029439485 = weight(_text_:management in 3629) [ClassicSimilarity], result of:
              0.029439485 = score(doc=3629,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.18620178 = fieldWeight in 3629, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3629)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - Being an important data exchange and information storage standard, XML has generated a great deal of interest and particular attention has been paid to the issue of XML indexing. Clear use cases for structured search in XML have been established. However, most of the research in the area is either based on relational database systems or specialized semi-structured data management systems. This paper aims to propose a method for XML indexing based on the information retrieval (IR) system Okapi. Design/methodology/approach - First, the paper reviews the structure of inverted files and gives an overview of the issues of why this indexing mechanism cannot properly support XML retrieval, using the underlying data structures of Okapi as an example. Then the paper explores a revised method implemented on Okapi using path indexing structures. The paper evaluates these index structures through the metrics of indexing run time, path search run time and space costs using the INEX and Reuters RVC1 collections. Findings - Initial results on the INEX collections show that there is a substantial overhead in space costs for the method, but this increase does not affect run time adversely. Indexing results on differing sized Reuters RVC1 sub-collections show that the increase in space costs with increasing the size of a collection is significant, but in terms of run time the increase is linear. Path search results show sub-millisecond run times, demonstrating minimal overhead for XML search. Practical implications - Overall, the results show the method implemented to support XML search in a traditional IR system such as Okapi is viable. Originality/value - The paper provides useful information on a method for XML indexing based on the IR system Okapi.