Search (9 results, page 1 of 1)

  • author_ss:"Willett, P."
  1. Artymiuk, P.J.; Spriggs, R.V.; Willett, P.: Graph theoretic methods for the analysis of structural relationships in biological macromolecules (2005) 0.03
    0.026304178 = product of:
      0.10521671 = sum of:
        0.10521671 = sum of:
          0.068480164 = weight(_text_:methods in 5258) [ClassicSimilarity], result of:
            0.068480164 = score(doc=5258,freq=4.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.37691376 = fieldWeight in 5258, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.046875 = fieldNorm(doc=5258)
          0.03673655 = weight(_text_:22 in 5258) [ClassicSimilarity], result of:
            0.03673655 = score(doc=5258,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.23214069 = fieldWeight in 5258, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=5258)
      0.25 = coord(1/4)
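The indented breakdown above is Lucene "explain" output for the ClassicSimilarity scorer. As a sketch of how the top-level score is assembled from the values shown (queryWeight = idf x queryNorm, fieldWeight = tf x idf x fieldNorm with tf = sqrt(freq), then coord(1/4) because one of four query clauses matched):

```python
import math

def term_score(freq, idf, query_norm, field_norm):
    # ClassicSimilarity per-term score:
    #   queryWeight = idf * queryNorm
    #   fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    query_weight = idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

query_norm = 0.045191016
w_methods = term_score(4.0, 4.0204134, query_norm, 0.046875)  # term "methods"
w_22 = term_score(2.0, 3.5018296, query_norm, 0.046875)       # term "22"
score = 0.25 * (w_methods + w_22)  # coord(1/4) scaling, as in the breakdown
```

`score` reproduces the 0.026304178 at the top of the tree.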
    
    Abstract
    Subgraph isomorphism and maximum common subgraph isomorphism algorithms from graph theory provide an effective and efficient way of identifying structural relationships between biological macromolecules. They thus provide a natural complement to the pattern matching algorithms that are used in bioinformatics to identify sequence relationships. Examples are provided of the use of graph theory to analyze proteins for which three-dimensional crystallographic or NMR structures are available, focusing on the use of the Bron-Kerbosch clique detection algorithm to identify common folding motifs and of the Ullmann subgraph isomorphism algorithm to identify patterns of amino acid residues. Our methods are also applicable to other types of biological macromolecule, such as carbohydrate and nucleic acid structures.
    Date
    22. 7.2006 14:40:10
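To give a flavour of the clique-detection step named in the abstract, here is a minimal sketch of the Bron-Kerbosch maximal-clique algorithm on a toy graph (the node labels are illustrative, not the authors' protein representation):

```python
def bron_kerbosch(r, p, x, adj, cliques):
    # r: growing clique; p: candidate vertices; x: already-expanded vertices
    if not p and not x:
        cliques.append(frozenset(r))  # r cannot be extended: it is maximal
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, cliques)
        p.remove(v)
        x.add(v)

# Toy graph: triangle a-b-c plus a pendant edge c-d
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
cliques = []
bron_kerbosch(set(), set(adj), set(), adj, cliques)
# The two maximal cliques are {a, b, c} and {c, d}
```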
  2. Griffiths, A.; Robinson, L.A.; Willett, P.: Hierarchic agglomerative clustering methods for automatic document classification (1984) 0.02
    0.016140928 = product of:
      0.064563714 = sum of:
        0.064563714 = product of:
          0.12912743 = sum of:
            0.12912743 = weight(_text_:methods in 2414) [ClassicSimilarity], result of:
              0.12912743 = score(doc=2414,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.71071535 = fieldWeight in 2414, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.125 = fieldNorm(doc=2414)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
  3. Ekmekcioglu, F.C.; Robertson, A.M.; Willett, P.: Effectiveness of query expansion in ranked-output document retrieval systems (1992) 0.01
    0.013978455 = product of:
      0.05591382 = sum of:
        0.05591382 = product of:
          0.11182764 = sum of:
            0.11182764 = weight(_text_:methods in 5689) [ClassicSimilarity], result of:
              0.11182764 = score(doc=5689,freq=6.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.6154976 = fieldWeight in 5689, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5689)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Reports an evaluation of three methods for the expansion of natural-language queries in ranked-output retrieval systems. The methods are based on term co-occurrence data, on Soundex codes, and on a string similarity measure. Searches for 110 queries in a database of 26,280 titles and abstracts suggest that there is no significant difference in retrieval effectiveness between any of these methods and unexpanded searches.
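Soundex, one of the three expansion methods above, maps a word to a letter-plus-three-digits phonetic key so that variant spellings expand to the same code. A sketch of the classic algorithm (the paper's exact variant may differ):

```python
def soundex(word):
    # Classic Soundex: keep the first letter, code the rest by sound class,
    # collapse adjacent equal codes, drop vowels; H/W do not separate codes
    codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
             **dict.fromkeys("DT", "3"), "L": "4",
             **dict.fromkeys("MN", "5"), "R": "6"}
    word = word.upper()
    result = word[0]
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W leave the previous code in force
            prev = code
    return (result + "000")[:4]  # pad/truncate to letter + three digits

soundex("Robert"), soundex("Rupert")  # both map to "R163"
```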
  4. Robertson, A.M.; Willett, P.: Retrieval techniques for historical English text : searching the sixteenth and seventeenth century titles in the Catalogue of Canterbury Cathedral Library using spelling-correction methods (1992) 0.01
    0.012231149 = product of:
      0.048924595 = sum of:
        0.048924595 = product of:
          0.09784919 = sum of:
            0.09784919 = weight(_text_:methods in 4209) [ClassicSimilarity], result of:
              0.09784919 = score(doc=4209,freq=6.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.5385604 = fieldWeight in 4209, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4209)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    A range of techniques has been developed for the correction of misspellings in machine-readable texts. Discusses the use of such techniques for the identification of words in the sixteenth- and seventeenth-century titles from the Catalogue of Canterbury Cathedral Library that are most similar to query words in modern English. The experiments used digram matching, non-phonetic coding, and dynamic programming methods for spelling correction. These allow very high recall searches to be carried out, although the latter methods are very demanding of computer resources.
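Digram matching, the first of the methods above, scores a historical spelling against a modern query word by the overlap of their two-letter substrings. A minimal sketch, using the Dice coefficient as the overlap measure (one common choice, assumed here):

```python
def digrams(word):
    # The set of adjacent two-letter substrings of a word
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)}

def digram_similarity(a, b):
    # Dice coefficient on the two digram sets: 2|A n B| / (|A| + |B|)
    da, db = digrams(a), digrams(b)
    if not da or not db:
        return 0.0
    return 2 * len(da & db) / (len(da) + len(db))

# e.g. an early-modern spelling scored against its modern form
digram_similarity("musick", "music")  # shares 4 of its 5 digrams
```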
  5. Al-Hawamdeh, S.; Smith, G.; Willett, P.; Vere, R. de: Using nearest-neighbour searching techniques to access full-text documents (1991) 0.01
    0.008070464 = product of:
      0.032281857 = sum of:
        0.032281857 = product of:
          0.064563714 = sum of:
            0.064563714 = weight(_text_:methods in 2300) [ClassicSimilarity], result of:
              0.064563714 = score(doc=2300,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.35535768 = fieldWeight in 2300, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2300)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Summarises the results to date of a continuing programme of research at Sheffield Univ. to investigate the use of nearest-neighbour retrieval algorithms for full-text searching. Given a natural-language query statement, the research methods result in a ranking of the paragraphs comprising a full-text document in order of decreasing similarity with the query, where the similarity for each paragraph is determined by the number of keyword stems that it has in common with the query.
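The paragraph-ranking idea can be sketched as follows; the suffix-stripping "stemmer" here is a crude stand-in for whatever stemming the Sheffield system actually used:

```python
import re

def stems(text):
    # Crude stemmer: lowercase, strip a few common suffixes (illustrative only)
    words = re.findall(r"[a-z]+", text.lower())
    return {re.sub(r"(ing|ed|es|s)$", "", w) for w in words}

def rank_paragraphs(query, paragraphs):
    # Similarity = number of keyword stems a paragraph shares with the query
    q = stems(query)
    scored = [(len(q & stems(p)), p) for p in paragraphs]
    return sorted(scored, key=lambda t: t[0], reverse=True)

paras = ["Clustering of chemical structures.",
         "Searching full text documents by paragraph."]
rank_paragraphs("full-text searching", paras)  # ranks the second paragraph first
```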
  6. Robertson, A.M.; Willett, P.: Generation of equifrequent groups of words using a genetic algorithm (1994) 0.01
    0.0070616566 = product of:
      0.028246626 = sum of:
        0.028246626 = product of:
          0.056493253 = sum of:
            0.056493253 = weight(_text_:methods in 8158) [ClassicSimilarity], result of:
              0.056493253 = score(doc=8158,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.31093797 = fieldWeight in 8158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=8158)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Genetic algorithms are a class of non-deterministic algorithms that derive from Darwinian evolution and that provide good, though not necessarily optimal, solutions to combinatorial problems. We describe their application to the identification of characteristics that occur approximately equifrequently in a database, using two different methods for the creation of the chromosome data structures that lie at the heart of a genetic algorithm. Experiments with files of English and Turkish text suggest that the genetic algorithm developed here can produce results superior to those produced by existing non-deterministic algorithms; however, the results are inferior to those produced by an existing deterministic algorithm
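To make the idea concrete, here is a toy genetic algorithm that evolves cut points splitting a list of word frequencies into k contiguous groups of roughly equal total frequency. The chromosome encoding and the crossover/mutation operators are simplified assumptions for illustration, not the paper's two chromosome-creation methods:

```python
import random

def group_weights(freqs, cuts):
    # cuts are sorted indices splitting freqs into contiguous groups
    bounds = (0,) + cuts + (len(freqs),)
    return [sum(freqs[a:b]) for a, b in zip(bounds, bounds[1:])]

def fitness(freqs, cuts):
    # Higher is better: negative squared deviation from equifrequency
    w = group_weights(freqs, cuts)
    mean = sum(w) / len(w)
    return -sum((x - mean) ** 2 for x in w)

def evolve(freqs, k, pop_size=30, gens=150, seed=1):
    rng = random.Random(seed)
    n = len(freqs)
    pop = [tuple(sorted(rng.sample(range(1, n), k - 1))) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda c: fitness(freqs, c), reverse=True)
        parents = pop[: pop_size // 2]      # elitist selection: keep the top half
        pop = parents[:]
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = set(a) | set(b)          # crossover: pool the parents' cuts
            if rng.random() < 0.3:          # mutation: offer a fresh random cut
                pool.add(rng.randrange(1, n))
            pop.append(tuple(sorted(rng.sample(sorted(pool), k - 1))))
    return max(pop, key=lambda c: fitness(freqs, c))

best = evolve([1] * 12, k=3)  # split 12 equal-frequency words into 3 groups
```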
  7. Ellis, D.; Furner-Hines, J.; Willett, P.: Measuring the degree of similarity between objects in text retrieval systems (1993) 0.01
    0.0060528484 = product of:
      0.024211394 = sum of:
        0.024211394 = product of:
          0.048422787 = sum of:
            0.048422787 = weight(_text_:methods in 6716) [ClassicSimilarity], result of:
              0.048422787 = score(doc=6716,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.26651827 = fieldWeight in 6716, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6716)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Describes the use of a variety of similarity coefficients in the measurement of the degree of similarity between objects that contain textual information, such as documents, paragraphs, index terms or queries. The work is intended as a preliminary to future investigation of the calculations involved in measuring the degree of similarity between structured objects that may be represented by graph theoretic forms. Discusses the role of similarity coefficients in text retrieval in terms of: document and query similarity; document and document similarity; cocitation analysis; term and term similarity; and the similarity between sets of judgements, such as relevance judgements. Describes several methods for expressing the formulae used to define similarity coefficients and compares their attributes. Concludes with details of the characteristics of similarity coefficients: equivalence and monotonicity; consideration of negative matches; geometric analyses; and the meaning of correlation coefficients
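For binary (presence/absence) term representations, three coefficients of the kind the paper discusses can be sketched on sets of index terms; the example document and query terms below are illustrative:

```python
import math

def jaccard(a, b):
    # Shared terms as a fraction of all distinct terms
    return len(a & b) / len(a | b)

def dice(a, b):
    # Shared terms, weighted against the two set sizes
    return 2 * len(a & b) / (len(a) + len(b))

def cosine(a, b):
    # Binary cosine: shared terms over the geometric mean of the sizes
    return len(a & b) / math.sqrt(len(a) * len(b))

doc = {"graph", "theory", "protein", "structure"}
query = {"graph", "structure", "retrieval"}
# Two terms in common out of five distinct terms overall
```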
  8. Furner, J.; Willett, P.: A survey of hypertext-based public-access point-of-information systems in UK libraries (1995) 0.01
    0.0060528484 = product of:
      0.024211394 = sum of:
        0.024211394 = product of:
          0.048422787 = sum of:
            0.048422787 = weight(_text_:methods in 2044) [ClassicSimilarity], result of:
              0.048422787 = score(doc=2044,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.26651827 = fieldWeight in 2044, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2044)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    We have recently completed a survey of the operational use of hypertext-based information systems in academic, public and special libraries in the UK. A literature search, questionnaire and both telephone and face-to-face interviews demonstrate that the principal application of hypertext systems is the implementation of public-access point-of-information systems, which provide guidance to the users of local information resources. In this paper, we describe the principal issues relating to the design and usage of these systems that were raised in the interviews and that we experienced when using the systems for ourselves. We then present a set of technical recommendations with the intention of helping the developers of future systems, with special attention being given to the need to develop effective methods for system evaluation
  9. Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.01
    0.0050440403 = product of:
      0.020176161 = sum of:
        0.020176161 = product of:
          0.040352322 = sum of:
            0.040352322 = weight(_text_:methods in 4214) [ClassicSimilarity], result of:
              0.040352322 = score(doc=4214,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.22209854 = fieldWeight in 4214, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4214)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    An important stage in the process of retrieval of objects from a hypertext database is the creation of a set of internodal links that are intended to represent the relationships existing between objects; this operation is often undertaken manually, just as index terms are often manually assigned to documents in a conventional retrieval system. In an earlier article (1994), the results were published of a study in which several different sets of links were inserted, each by a different person, between the paragraphs of each of a number of full-text documents. These results showed little similarity between the link-sets, a finding that was comparable with those of studies of inter-indexer consistency, which suggest that there is generally only a low level of agreement between the sets of index terms assigned to a document by different indexers. In this article, a description is provided of an investigation into the nature of the relationship existing between (i) the levels of inter-linker consistency obtaining among the group of hypertext databases used in our earlier experiments, and (ii) the levels of effectiveness of a number of searches carried out in those databases. An account is given of the implementation of the searches and of the methods used in the calculation of numerical values expressing their effectiveness. Analysis of the results of a comparison between recorded levels of consistency and those of effectiveness does not allow us to draw conclusions about the consistency - effectiveness relationship that are equivalent to those drawn in comparable studies of inter-indexer consistency
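The abstract does not reproduce the study's effectiveness formulae; as a generic sketch, retrieval effectiveness of this kind is conventionally expressed through precision and recall over a set of relevance judgements (the paragraph identifiers below are hypothetical):

```python
def precision_recall(retrieved, relevant):
    # precision = relevant items retrieved / items retrieved
    # recall    = relevant items retrieved / relevant items in total
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# e.g. a search retrieves paragraphs 1-4; paragraphs 2, 4 and 5 are relevant
p, r = precision_recall([1, 2, 3, 4], [2, 4, 5])
```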