Search (476 results, page 2 of 24)

  • language_ss:"e"
  • type_ss:"a"
  • type_ss:"el"
  1. Roy, W.; Gray, C.: Preparing existing metadata for repository batch import : a recipe for a fickle food (2018) 0.02
    0.02067415 = product of:
      0.0413483 = sum of:
        0.0413483 = sum of:
          0.010148063 = weight(_text_:a in 4550) [ClassicSimilarity], result of:
            0.010148063 = score(doc=4550,freq=18.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.19109234 = fieldWeight in 4550, product of:
                4.2426405 = tf(freq=18.0), with freq of:
                  18.0 = termFreq=18.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4550)
          0.03120024 = weight(_text_:22 in 4550) [ClassicSimilarity], result of:
            0.03120024 = score(doc=4550,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.19345059 = fieldWeight in 4550, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4550)
      0.5 = coord(1/2)
    
    Abstract
    In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range of methods have been proposed for harvesting publications metadata en masse, but many technological solutions can easily become detached from a workflow that is both reproducible for support staff and applicable to a range of situations. Many repositories offer the capacity for batch upload via CSV, so our method provides a template Python script that leverages the Habanero library for populating CSV files with existing metadata retrieved from the CrossRef API. In our case, we have combined this with useful metadata contained in a TSV file downloaded from Web of Science in order to enrich our metadata as well. The appeal of this 'low-maintenance' method is that it provides more robust options for gathering metadata semi-automatically, and only requires the user's ability to access Web of Science and the Python program, while still remaining flexible enough for local customizations.
    Date
    10.11.2018 16:27:22
    Type
    a
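     The workflow sketched in this abstract (pulling Crossref metadata into a CSV ready for batch import) can be illustrated in a few lines of Python using the Habanero library named above. This is a minimal sketch, not the authors' template script; the CSV columns and the example DOI are invented placeholders.
       # Sketch: look up Crossref metadata for a list of DOIs via Habanero and
       # write a CSV for repository batch import. DOI and column names are
       # placeholders, not the authors' actual template.
       import csv
       from habanero import Crossref

       dois = ["10.1000/example-doi"]   # placeholder list of DOIs to enrich

       cr = Crossref()
       rows = []
       for doi in dois:
           msg = cr.works(ids=doi)["message"]          # one Crossref record per DOI
           authors = "; ".join(
               f"{a.get('family', '')}, {a.get('given', '')}" for a in msg.get("author", [])
           )
           rows.append({
               "doi": doi,
               "title": (msg.get("title") or [""])[0],
               "authors": authors,
               "journal": (msg.get("container-title") or [""])[0],
               "year": msg.get("issued", {}).get("date-parts", [[None]])[0][0],
           })

       with open("batch_import.csv", "w", newline="", encoding="utf-8") as f:
           writer = csv.DictWriter(f, fieldnames=rows[0].keys())
           writer.writeheader()
           writer.writerows(rows)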
  2. Boldi, P.; Santini, M.; Vigna, S.: PageRank as a function of the damping factor (2005) 0.02
    0.020383961 = product of:
      0.040767923 = sum of:
        0.040767923 = sum of:
          0.009567685 = weight(_text_:a in 2564) [ClassicSimilarity], result of:
            0.009567685 = score(doc=2564,freq=16.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.18016359 = fieldWeight in 2564, product of:
                4.0 = tf(freq=16.0), with freq of:
                  16.0 = termFreq=16.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2564)
          0.03120024 = weight(_text_:22 in 2564) [ClassicSimilarity], result of:
            0.03120024 = score(doc=2564,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.19345059 = fieldWeight in 2564, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2564)
      0.5 = coord(1/2)
    
    Abstract
     PageRank is defined as the stationary state of a Markov chain. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor alpha that spreads uniformly part of the rank. The choice of alpha is eminently empirical, and in most cases the original suggestion alpha=0.85 by Brin and Page is still used. Recently, however, the behaviour of PageRank with respect to changes in alpha was discovered to be useful in link-spam detection. Moreover, an analytical justification of the value chosen for alpha is still missing. In this paper, we give the first mathematical analysis of PageRank when alpha changes. In particular, we show that, contrary to popular belief, for real-world graphs values of alpha close to 1 do not give a more meaningful ranking. Then, we give closed-form formulae for PageRank derivatives of any order, and an extension of the Power Method that approximates them with convergence O(t^k alpha^t) for the k-th derivative. Finally, we show a tight connection between iterated computation and analytical behaviour by proving that the k-th iteration of the Power Method gives exactly the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank.
    Date
    16. 1.2016 10:22:28
    Type
    a
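     To experiment with the dependence of PageRank on the damping factor, the following is a minimal textbook Power Method implementation in Python; it is not the authors' code, and the three-node graph is invented.
       # Sketch: PageRank by the Power Method for a chosen damping factor alpha.
       import numpy as np

       def pagerank(adj, alpha=0.85, tol=1e-10, max_iter=1000):
           """adj[i, j] = 1 if page i links to page j."""
           n = adj.shape[0]
           out_deg = adj.sum(axis=1)
           # Row-stochastic transition matrix; dangling nodes jump uniformly.
           P = np.where(out_deg[:, None] > 0,
                        adj / np.maximum(out_deg, 1)[:, None],
                        1.0 / n)
           r = np.full(n, 1.0 / n)
           for _ in range(max_iter):
               r_next = alpha * r @ P + (1 - alpha) / n
               if np.abs(r_next - r).sum() < tol:
                   break
               r = r_next
           return r

       adj = np.array([[0, 1, 1],
                       [1, 0, 0],
                       [0, 1, 0]], dtype=float)
       for alpha in (0.5, 0.85, 0.99):
           print(alpha, pagerank(adj, alpha).round(4))
     Pushing alpha toward 1 slows the iteration's convergence and makes the scores increasingly sensitive to the link structure, which is the regime the paper analyses.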
  3. Dowding, H.; Gengenbach, M.; Graham, B.; Meister, S.; Moran, J.; Peltzman, S.; Seifert, J.; Waugh, D.: OSS4EVA: using open-source tools to fulfill digital preservation requirements (2016) 0.02
    0.01974305 = product of:
      0.0394861 = sum of:
        0.0394861 = sum of:
          0.008285859 = weight(_text_:a in 3200) [ClassicSimilarity], result of:
            0.008285859 = score(doc=3200,freq=12.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.15602624 = fieldWeight in 3200, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3200)
          0.03120024 = weight(_text_:22 in 3200) [ClassicSimilarity], result of:
            0.03120024 = score(doc=3200,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.19345059 = fieldWeight in 3200, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3200)
      0.5 = coord(1/2)
    
    Abstract
     This paper builds on the findings of a workshop held at the 2015 International Conference on Digital Preservation (iPRES), entitled "Using Open-Source Tools to Fulfill Digital Preservation Requirements" (OSS4PRES hereafter). This day-long workshop brought together participants from across the library and archives community, including practitioners, proprietary vendors, and representatives from open-source projects. The resulting conversations were surprisingly revealing: while OSS' significance within the preservation landscape was made clear, participants noted that there are a number of roadblocks that discourage or altogether prevent its use in many organizations. Overcoming these challenges will be necessary to further widespread, sustainable OSS adoption within the digital preservation community. This article will mine the rich discussions that took place at OSS4PRES to (1) summarize the workshop's key themes and major points of debate, (2) provide a comprehensive analysis of the opportunities, gaps, and challenges that using OSS entails at a philosophical, institutional, and individual level, and (3) offer a tangible set of recommendations for future work designed to broaden community engagement and enhance the sustainability of open source initiatives, drawing on both participants' experience and additional research.
    Date
    28.10.2016 18:22:33
    Type
    a
  4. Monireh, E.; Sarker, M.K.; Bianchi, F.; Hitzler, P.; Doran, D.; Xie, N.: Reasoning over RDF knowledge bases using deep learning (2018) 0.02
    0.018982807 = product of:
      0.037965614 = sum of:
        0.037965614 = sum of:
          0.006765375 = weight(_text_:a in 4553) [ClassicSimilarity], result of:
            0.006765375 = score(doc=4553,freq=8.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.12739488 = fieldWeight in 4553, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4553)
          0.03120024 = weight(_text_:22 in 4553) [ClassicSimilarity], result of:
            0.03120024 = score(doc=4553,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.19345059 = fieldWeight in 4553, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4553)
      0.5 = coord(1/2)
    
    Abstract
    Semantic Web knowledge representation standards, and in particular RDF and OWL, often come endowed with a formal semantics which is considered to be of fundamental importance for the field. Reasoning, i.e., the drawing of logical inferences from knowledge expressed in such standards, is traditionally based on logical deductive methods and algorithms which can be proven to be sound and complete and terminating, i.e. correct in a very strong sense. For various reasons, though, in particular the scalability issues arising from the ever increasing amounts of Semantic Web data available and the inability of deductive algorithms to deal with noise in the data, it has been argued that alternative means of reasoning should be investigated which bear high promise for high scalability and better robustness. From this perspective, deductive algorithms can be considered the gold standard regarding correctness against which alternative methods need to be tested. In this paper, we show that it is possible to train a Deep Learning system on RDF knowledge graphs, such that it is able to perform reasoning over new RDF knowledge graphs, with high precision and recall compared to the deductive gold standard.
    Date
    16.11.2018 14:22:01
    Type
    a
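     The paper's deep learning pipeline is not reproduced here, but a common first step for any neural approach over RDF is to turn the graph's triples into integer-indexed data. A minimal sketch with rdflib, assuming a local Turtle file (the file name is a placeholder):
       # Sketch: parse an RDF graph and map its triples to integer indices,
       # the usual input format for neural models over knowledge graphs.
       from rdflib import Graph

       g = Graph()
       g.parse("knowledge_base.ttl", format="turtle")   # placeholder RDF file

       entities, relations = {}, {}
       def idx(table, key):
           return table.setdefault(key, len(table))

       triples = [
           (idx(entities, str(s)), idx(relations, str(p)), idx(entities, str(o)))
           for s, p, o in g
       ]
       print(len(entities), "entities,", len(relations), "relations,", len(triples), "triples")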
  5. Baker, T.: ¬A grammar of Dublin Core (2000) 0.02
    0.018531231 = product of:
      0.037062462 = sum of:
        0.037062462 = sum of:
          0.012102271 = weight(_text_:a in 1236) [ClassicSimilarity], result of:
            0.012102271 = score(doc=1236,freq=40.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.22789092 = fieldWeight in 1236, product of:
                6.3245554 = tf(freq=40.0), with freq of:
                  40.0 = termFreq=40.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.03125 = fieldNorm(doc=1236)
          0.02496019 = weight(_text_:22 in 1236) [ClassicSimilarity], result of:
            0.02496019 = score(doc=1236,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.15476047 = fieldWeight in 1236, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=1236)
      0.5 = coord(1/2)
    
    Abstract
    Dublin Core is often presented as a modern form of catalog card -- a set of elements (and now qualifiers) that describe resources in a complete package. Sometimes it is proposed as an exchange format for sharing records among multiple collections. The founding principle that "every element is optional and repeatable" reinforces the notion that a Dublin Core description is to be taken as a whole. This paper, in contrast, is based on a much different premise: Dublin Core is a language. More precisely, it is a small language for making a particular class of statements about resources. Like natural languages, it has a vocabulary of word-like terms, the two classes of which -- elements and qualifiers -- function within statements like nouns and adjectives; and it has a syntax for arranging elements and qualifiers into statements according to a simple pattern. Whenever tourists order a meal or ask directions in an unfamiliar language, considerate native speakers will spontaneously limit themselves to basic words and simple sentence patterns along the lines of "I am so-and-so" or "This is such-and-such". Linguists call this pidginization. In such situations, a small phrase book or translated menu can be most helpful. By analogy, today's Web has been called an Internet Commons where users and information providers from a wide range of scientific, commercial, and social domains present their information in a variety of incompatible data models and description languages. In this context, Dublin Core presents itself as a metadata pidgin for digital tourists who must find their way in this linguistically diverse landscape. Its vocabulary is small enough to learn quickly, and its basic pattern is easily grasped. It is well-suited to serve as an auxiliary language for digital libraries. This grammar starts by defining terms. It then follows a 200-year-old tradition of English grammar teaching by focusing on the structure of single statements. It concludes by looking at the growing dictionary of Dublin Core vocabulary terms -- its registry, and at how statements can be used to build the metadata equivalent of paragraphs and compositions -- the application profile.
    Date
    26.12.2011 14:01:22
    Type
    a
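     As a toy illustration of the "grammar" described above, a Dublin Core description can be written down as a set of statements, each pairing an element (noun-like) and an optional qualifier (adjective-like) with a value. The record below is invented:
       # Sketch: a Dublin Core description as element/qualifier/value statements.
       statements = [
           ("Title",       None,       "A grammar of Dublin Core"),
           ("Creator",     None,       "Baker, T."),
           ("Date",        "Issued",   "2000"),
           ("Description", "Abstract", "Dublin Core considered as a small language."),
           ("Identifier",  "URI",      "http://example.org/dlib/baker"),
       ]

       for element, qualifier, value in statements:
           label = f"{element}.{qualifier}" if qualifier else element
           print(f"{label}: {value}")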
  6. Somers, J.: Torching the modern-day library of Alexandria : somewhere at Google there is a database containing 25 million books and nobody is allowed to read them. (2017) 0.02
    0.017720532 = product of:
      0.035441063 = sum of:
        0.035441063 = sum of:
          0.010480874 = weight(_text_:a in 3608) [ClassicSimilarity], result of:
            0.010480874 = score(doc=3608,freq=30.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.19735932 = fieldWeight in 3608, product of:
                5.477226 = tf(freq=30.0), with freq of:
                  30.0 = termFreq=30.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.03125 = fieldNorm(doc=3608)
          0.02496019 = weight(_text_:22 in 3608) [ClassicSimilarity], result of:
            0.02496019 = score(doc=3608,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.15476047 = fieldWeight in 3608, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=3608)
      0.5 = coord(1/2)
    
    Abstract
     You were going to get one-click access to the full text of nearly every book that's ever been published. Books still in print you'd have to pay for, but everything else - a collection slated to grow larger than the holdings at the Library of Congress, Harvard, the University of Michigan, at any of the great national libraries of Europe - would have been available for free at terminals that were going to be placed in every local library that wanted one. At the terminal you were going to be able to search tens of millions of books and read every page of any book you found. You'd be able to highlight passages and make annotations and share them; for the first time, you'd be able to pinpoint an idea somewhere inside the vastness of the printed record, and send somebody straight to it with a link. Books would become as instantly available, searchable, copy-pasteable - as alive in the digital world - as web pages. It was to be the realization of a long-held dream. "The universal library has been talked about for millennia," Richard Ovenden, the head of Oxford's Bodleian Libraries, has said. "It was possible to think in the Renaissance that you might be able to amass the whole of published knowledge in a single room or a single institution." In the spring of 2011, it seemed we'd amassed it in a terminal small enough to fit on a desk. "This is a watershed event and can serve as a catalyst for the reinvention of education, research, and intellectual life," one eager observer wrote at the time. On March 22 of that year, however, the legal agreement that would have unlocked a century's worth of books and peppered the country with access terminals to a universal library was rejected under Rule 23(e)(2) of the Federal Rules of Civil Procedure by the U.S. District Court for the Southern District of New York. When the library at Alexandria burned it was said to be an "international catastrophe." When the most significant humanities project of our time was dismantled in court, the scholars, archivists, and librarians who'd had a hand in its undoing breathed a sigh of relief, for they believed, at the time, that they had narrowly averted disaster.
    Type
    a
  7. Bradford, R.B.: Relationship discovery in large text collections using Latent Semantic Indexing (2006) 0.01
    0.01482369 = product of:
      0.02964738 = sum of:
        0.02964738 = sum of:
          0.0046871896 = weight(_text_:a in 1163) [ClassicSimilarity], result of:
            0.0046871896 = score(doc=1163,freq=6.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.088261776 = fieldWeight in 1163, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.03125 = fieldNorm(doc=1163)
          0.02496019 = weight(_text_:22 in 1163) [ClassicSimilarity], result of:
            0.02496019 = score(doc=1163,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.15476047 = fieldWeight in 1163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=1163)
      0.5 = coord(1/2)
    
    Abstract
    This paper addresses the problem of information discovery in large collections of text. For users, one of the key problems in working with such collections is determining where to focus their attention. In selecting documents for examination, users must be able to formulate reasonably precise queries. Queries that are too broad will greatly reduce the efficiency of information discovery efforts by overwhelming the users with peripheral information. In order to formulate efficient queries, a mechanism is needed to automatically alert users regarding potentially interesting information contained within the collection. This paper presents the results of an experiment designed to test one approach to generation of such alerts. The technique of latent semantic indexing (LSI) is used to identify relationships among entities of interest. Entity extraction software is used to pre-process the text of the collection so that the LSI space contains representation vectors for named entities in addition to those for individual terms. In the LSI space, the cosine of the angle between the representation vectors for two entities captures important information regarding the degree of association of those two entities. For appropriate choices of entities, determining the entity pairs with the highest mutual cosine values yields valuable information regarding the contents of the text collection. The test database used for the experiment consists of 150,000 news articles. The proposed approach for alert generation is tested using a counterterrorism analysis example. The approach is shown to have significant potential for aiding users in rapidly focusing on information of potential importance in large text collections. The approach also has value in identifying possible use of aliases.
    Source
    Proceedings of the Fourth Workshop on Link Analysis, Counterterrorism, and Security, SIAM Data Mining Conference, Bethesda, MD, 20-22 April, 2006. [http://www.siam.org/meetings/sdm06/workproceed/Link%20Analysis/15.pdf]
    Type
    a
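     The core LSI step described above (projecting a term-document matrix into a low-dimensional space and comparing vectors by cosine similarity) can be sketched with scikit-learn. The toy corpus is invented, and the paper's actual experiment additionally inserts representation vectors for extracted named entities:
       # Sketch: TF-IDF -> truncated SVD -> cosine similarity in the LSI space.
       from sklearn.feature_extraction.text import TfidfVectorizer
       from sklearn.decomposition import TruncatedSVD
       from sklearn.metrics.pairwise import cosine_similarity

       docs = [
           "bank officials met with the finance minister",
           "the finance ministry announced new bank rules",
           "the river bank flooded after heavy rain",
           "heavy rain caused the river to flood the valley",
       ]

       tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
       lsi = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
       print(cosine_similarity(lsi).round(2))   # high off-diagonal values = related documents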
  8. Lavoie, B.; Connaway, L.S.; Dempsey, L.: Anatomy of aggregate collections : the example of Google print for libraries (2005) 0.01
    0.0132904 = product of:
      0.0265808 = sum of:
        0.0265808 = sum of:
          0.007860656 = weight(_text_:a in 1184) [ClassicSimilarity], result of:
            0.007860656 = score(doc=1184,freq=30.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.1480195 = fieldWeight in 1184, product of:
                5.477226 = tf(freq=30.0), with freq of:
                  30.0 = termFreq=30.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0234375 = fieldNorm(doc=1184)
          0.018720143 = weight(_text_:22 in 1184) [ClassicSimilarity], result of:
            0.018720143 = score(doc=1184,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.116070345 = fieldWeight in 1184, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0234375 = fieldNorm(doc=1184)
      0.5 = coord(1/2)
    
    Abstract
    Google's December 2004 announcement of its intention to collaborate with five major research libraries - Harvard University, the University of Michigan, Stanford University, the University of Oxford, and the New York Public Library - to digitize and surface their print book collections in the Google searching universe has, predictably, stirred conflicting opinion, with some viewing the project as a welcome opportunity to enhance the visibility of library collections in new environments, and others wary of Google's prospective role as gateway to these collections. The project has been vigorously debated on discussion lists and blogs, with the participating libraries commonly referred to as "the Google 5". One point most observers seem to concede is that the questions raised by this initiative are both timely and significant. The Google Print Library Project (GPLP) has galvanized a long overdue, multi-faceted discussion about library print book collections. The print book is core to library identity and practice, but in an era of zero-sum budgeting, it is almost inevitable that print book budgets will decline as budgets for serials, digital resources, and other materials expand. As libraries re-allocate resources to accommodate changing patterns of user needs, print book budgets may be adversely impacted. Of course, the degree of impact will depend on a library's perceived mission. A public library may expect books to justify their shelf-space, with de-accession the consequence of minimal use. A national library, on the other hand, has a responsibility to the scholarly and cultural record and may seek to collect comprehensively within particular areas, with the attendant obligation to secure the long-term retention of its print book collections. The combination of limited budgets, changing user needs, and differences in library collection strategies underscores the need to think about a collective, or system-wide, print book collection - in particular, how can an inter-institutional system be organized to achieve goals that would be difficult, and/or prohibitively expensive, for any one library to undertake individually [4]? Mass digitization programs like GPLP cast new light on these and other issues surrounding the future of library print book collections, but at this early stage, it is light that illuminates only dimly. It will be some time before GPLP's implications for libraries and library print book collections can be fully appreciated and evaluated. But the strong interest and lively debate generated by this initiative suggest that some preliminary analysis - premature though it may be - would be useful, if only to undertake a rough mapping of the terrain over which GPLP potentially will extend. At the least, some early perspective helps shape interesting questions for the future, when the boundaries of GPLP become settled, workflows for producing and managing the digitized materials become systematized, and usage patterns within the GPLP framework begin to emerge.
     This article offers some perspectives on GPLP in light of what is known about library print book collections in general, and those of the Google 5 in particular, from information in OCLC's WorldCat bibliographic database and holdings file. Questions addressed include:
     * Coverage: What proportion of the system-wide print book collection will GPLP potentially cover? What is the degree of holdings overlap across the print book collections of the five participating libraries?
     * Language: What is the distribution of languages associated with the print books held by the GPLP libraries? Which languages are predominant?
     * Copyright: What proportion of the GPLP libraries' print book holdings are out of copyright?
     * Works: How many distinct works are represented in the holdings of the GPLP libraries? How does a focus on works impact coverage and holdings overlap?
     * Convergence: What are the effects on coverage of using a different set of five libraries? What are the effects of adding the holdings of additional libraries to those of the GPLP libraries, and how do these effects vary by library type?
     These questions certainly do not exhaust the analytical possibilities presented by GPLP. More in-depth analysis might look at Google 5 coverage in particular subject areas; it also would be interesting to see how many books covered by the GPLP have already been digitized in other contexts. However, these questions are left to future studies. The purpose here is to explore a few basic questions raised by GPLP, and in doing so, provide an empirical context for the debate that is sure to continue for some time to come. A secondary objective is to lay some groundwork for a general set of questions that could be used to explore the implications of any mass digitization initiative. A suggested list of questions is provided in the conclusion of the article.
    Date
    26.12.2011 14:08:22
    Type
    a
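     The coverage and overlap questions listed above reduce to set arithmetic over holdings identifiers. A toy sketch with invented identifiers, not WorldCat data:
       # Sketch: coverage of a system-wide collection and overlap with a digitized set.
       holdings = {
           "Library A": {1, 2, 3, 4, 5},
           "Library B": {3, 4, 5, 6},
           "Library C": {5, 6, 7},
       }

       system_wide = set().union(*holdings.values())
       digitized = holdings["Library A"] | holdings["Library B"]   # stand-in "Google 5"

       print("coverage of system-wide collection:", len(digitized) / len(system_wide))
       for name, held in holdings.items():
           print(name, "overlap with digitized set:", round(len(held & digitized) / len(held), 2))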
  9. DeSilva, J.M.; Traniello, J.F.A.; Claxton, A.G.; Fannin, L.D.: When and why did human brains decrease in size? : a new change-point analysis and insights from brain evolution in ants (2021) 0.01
    0.012230377 = product of:
      0.024460753 = sum of:
        0.024460753 = sum of:
          0.005740611 = weight(_text_:a in 405) [ClassicSimilarity], result of:
            0.005740611 = score(doc=405,freq=16.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.10809815 = fieldWeight in 405, product of:
                4.0 = tf(freq=16.0), with freq of:
                  16.0 = termFreq=16.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0234375 = fieldNorm(doc=405)
          0.018720143 = weight(_text_:22 in 405) [ClassicSimilarity], result of:
            0.018720143 = score(doc=405,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.116070345 = fieldWeight in 405, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0234375 = fieldNorm(doc=405)
      0.5 = coord(1/2)
    
    Abstract
    Human brain size nearly quadrupled in the six million years since Homo last shared a common ancestor with chimpanzees, but human brains are thought to have decreased in volume since the end of the last Ice Age. The timing and reason for this decrease is enigmatic. Here we use change-point analysis to estimate the timing of changes in the rate of hominin brain evolution. We find that hominin brains experienced positive rate changes at 2.1 and 1.5 million years ago, coincident with the early evolution of Homo and technological innovations evident in the archeological record. But we also find that human brain size reduction was surprisingly recent, occurring in the last 3,000 years. Our dating does not support hypotheses concerning brain size reduction as a by-product of body size reduction, a result of a shift to an agricultural diet, or a consequence of self-domestication. We suggest our analysis supports the hypothesis that the recent decrease in brain size may instead result from the externalization of knowledge and advantages of group-level decision-making due in part to the advent of social systems of distributed cognition and the storage and sharing of information. Humans live in social groups in which multiple brains contribute to the emergence of collective intelligence. Although difficult to study in the deep history of Homo, the impacts of group size, social organization, collective intelligence and other potential selective forces on brain evolution can be elucidated using ants as models. The remarkable ecological diversity of ants and their species richness encompasses forms convergent in aspects of human sociality, including large group size, agrarian life histories, division of labor, and collective cognition. Ants provide a wide range of social systems to generate and test hypotheses concerning brain size enlargement or reduction and aid in interpreting patterns of brain evolution identified in humans. Although humans and ants represent very different routes in social and cognitive evolution, the insights ants offer can broadly inform us of the selective forces that influence brain size.
    Source
    Frontiers in ecology and evolution, 22 October 2021 [https://www.frontiersin.org/articles/10.3389/fevo.2021.742639/full]
    Type
    a
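     As a purely illustrative aside, the change-point idea named in the abstract can be sketched on synthetic data by picking the split that minimizes the combined error of two separate linear fits. This is neither the authors' model nor their dataset:
       # Sketch: brute-force single change-point search on a simulated series.
       import numpy as np

       rng = np.random.default_rng(0)
       t = np.linspace(-10, 0, 200)     # time axis (arbitrary units before present)
       y = np.where(t < -3, 1300 + 10 * (t + 10), 1370 - 40 * (t + 3))
       y = y + rng.normal(0, 10, t.size)

       def sse_of_fit(x, z):
           coef = np.polyfit(x, z, 1)
           return float(np.sum((np.polyval(coef, x) - z) ** 2))

       best = min(range(10, t.size - 10),
                  key=lambda i: sse_of_fit(t[:i], y[:i]) + sse_of_fit(t[i:], y[i:]))
       print("estimated change point at t =", round(t[best], 2))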
  10. Denton, W.: On dentographs, a new method of visualizing library collections (2012) 0.00
    0.003827074 = product of:
      0.007654148 = sum of:
        0.007654148 = product of:
          0.015308296 = sum of:
            0.015308296 = weight(_text_:a in 580) [ClassicSimilarity], result of:
              0.015308296 = score(doc=580,freq=16.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.28826174 = fieldWeight in 580, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=580)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A dentograph is a visualization of a library's collection built on the idea that a classification scheme is a mathematical function mapping one set of things (books or the universe of knowledge) onto another (a set of numbers and letters). Dentographs can visualize aspects of just one collection or can be used to compare two or more collections. This article describes how to build them, with examples and code using Ruby and R, and discusses some problems and future directions.
    Type
    a
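     A rough Python analogue of the idea (the article itself uses Ruby and R, and its construction may differ in detail): map each call number onto a two-dimensional grid and plot per-cell counts as an image. The call numbers are invented.
       # Sketch: a dentograph-style heatmap of a tiny, made-up collection.
       import numpy as np
       import matplotlib.pyplot as plt

       call_numbers = ["QA76.9", "QA76.7", "QA276.1", "Z699.4", "Z699.5", "TK5105.8"]

       letters = sorted({cn[0] for cn in call_numbers})     # first classification letter
       grid = np.zeros((len(letters), 10))
       for cn in call_numbers:
           row = letters.index(cn[0])
           col = int("".join(ch for ch in cn if ch.isdigit())[0])   # leading digit as a crude bucket
           grid[row, col] += 1

       plt.imshow(grid, aspect="auto", cmap="Greys")
       plt.yticks(range(len(letters)), letters)
       plt.xlabel("leading digit of class number")
       plt.ylabel("classification letter")
       plt.title("toy dentograph-style view of a collection")
       plt.savefig("dentograph_sketch.png")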
  11. Page, A.: ¬The search is over : the search-engine secrets of the pros (1996) 0.00
    0.0037819599 = product of:
      0.0075639198 = sum of:
        0.0075639198 = product of:
          0.0151278395 = sum of:
            0.0151278395 = weight(_text_:a in 5670) [ClassicSimilarity], result of:
              0.0151278395 = score(doc=5670,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.28486365 = fieldWeight in 5670, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5670)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Covers 8 of the most popular search engines. Gives a summary of each and has a nice table of features that also briefly lists the pros and cons. Includes a short explanation of Boolean operators, too.
    Type
    a
  12. Bartol, W.; Pióro, K.; Rosselló, F.: On the coverings by tolerance classes (2003) 0.00
    0.0037439493 = product of:
      0.0074878987 = sum of:
        0.0074878987 = product of:
          0.014975797 = sum of:
            0.014975797 = weight(_text_:a in 4842) [ClassicSimilarity], result of:
              0.014975797 = score(doc=4842,freq=20.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.28200063 = fieldWeight in 4842, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4842)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     A tolerance is a reflexive and symmetric, but not necessarily transitive, binary relation. Contrary to what happens with equivalence relations, when dealing with tolerances one must distinguish between blocks (maximal subsets where the tolerance is a total relation) and classes (the class of an element is the set of those elements tolerable with it). Both blocks and classes of a tolerance on a set define coverings of this set, but not every covering of a set is defined in this way. The characterization of those coverings that are families of blocks of some tolerance has been known for more than a decade now. In this paper we give a characterization of those coverings of a finite set that are families of classes of some tolerance.
    Type
    a
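     A small computational companion to the abstract, with an invented tolerance on a four-element set; note that the classes and the blocks it prints do not coincide:
       # Sketch: classes and blocks of a tolerance (reflexive, symmetric relation).
       from itertools import combinations

       X = {1, 2, 3, 4}
       tol = {(1, 2), (2, 3), (3, 4)}
       tol = tol | {(b, a) for a, b in tol} | {(x, x) for x in X}   # close under symmetry/reflexivity

       def cls(x):
           """Class of x: all elements tolerable with x."""
           return {y for y in X if (x, y) in tol}

       def blocks():
           """Maximal subsets on which the tolerance is total."""
           total = [set(S) for r in range(1, len(X) + 1)
                    for S in combinations(X, r)
                    if all((a, b) in tol for a in S for b in S)]
           return [S for S in total if not any(S < T for T in total)]

       print({x: cls(x) for x in X})   # classes
       print(blocks())                 # blocks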
  13. Pankowski, T.: Ontological databases with faceted queries (2022) 0.00
    0.0034867944 = product of:
      0.006973589 = sum of:
        0.006973589 = product of:
          0.013947178 = sum of:
            0.013947178 = weight(_text_:a in 666) [ClassicSimilarity], result of:
              0.013947178 = score(doc=666,freq=34.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.2626313 = fieldWeight in 666, product of:
                  5.8309517 = tf(freq=34.0), with freq of:
                    34.0 = termFreq=34.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=666)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     The success of the use of ontology-based systems depends on efficient and user-friendly methods of formulating queries against the ontology. We propose a method to query a class of ontologies, called facet ontologies (fac-ontologies), using a faceted human-oriented approach. A fac-ontology has two important features: (a) a hierarchical view of it can be defined as a nested facet over this ontology and the view can be used as a faceted interface to create queries and to explore the ontology; (b) the ontology can be converted into an ontological database, the ABox of which is stored in a database, and the faceted queries are evaluated against this database. We show that the proposed faceted interface makes it possible to formulate queries that are semantically equivalent to $\mathcal{SROIQ}^{Fac}$, a limited version of the $\mathcal{SROIQ}$ description logic. The TBox of a fac-ontology is divided into a set of rules defining intensional predicates and a set of constraint rules to be satisfied by the database. We identify a class of so-called reflexive weak cycles in a set of constraint rules and propose a method to deal with them in the chase procedure. The considerations are illustrated with solutions implemented in the DAFO system (data access based on faceted queries over ontologies).
    Type
    a
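     As a much-simplified illustration of faceted querying (not the DAFO system itself), ABox-style facts can be stored as rows and a faceted query evaluated as a conjunction of attribute filters. Data, attribute names, and query are invented:
       # Sketch: a faceted query as a conjunction of attribute filters.
       rows = [
           {"type": "Book",    "subject": "Ontology",  "year": 2021},
           {"type": "Book",    "subject": "Databases", "year": 2019},
           {"type": "Article", "subject": "Ontology",  "year": 2022},
       ]

       def faceted_query(rows, **facets):
           """Return the rows matching every selected facet value."""
           return [r for r in rows if all(r.get(k) == v for k, v in facets.items())]

       print(faceted_query(rows, type="Book", subject="Ontology"))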
  14. Burnard, L.: Text encoding for interchange : a new consortium (2000) 0.00
    0.00334869 = product of:
      0.00669738 = sum of:
        0.00669738 = product of:
          0.01339476 = sum of:
            0.01339476 = weight(_text_:a in 406) [ClassicSimilarity], result of:
              0.01339476 = score(doc=406,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.25222903 = fieldWeight in 406, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=406)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  15. Miller, P.: Towards a typology for portals (2003) 0.00
    0.00334869 = product of:
      0.00669738 = sum of:
        0.00669738 = product of:
          0.01339476 = sum of:
            0.01339476 = weight(_text_:a in 4087) [ClassicSimilarity], result of:
              0.01339476 = score(doc=4087,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.25222903 = fieldWeight in 4087, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4087)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  16. Pinto, F.; Fraser, M.: Access management, the key to a Portal (2003) 0.00
    0.00334869 = product of:
      0.00669738 = sum of:
        0.00669738 = product of:
          0.01339476 = sum of:
            0.01339476 = weight(_text_:a in 4111) [ClassicSimilarity], result of:
              0.01339476 = score(doc=4111,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.25222903 = fieldWeight in 4111, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4111)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  17. Academic publishing : No peeking (2014) 0.00
    0.00334869 = product of:
      0.00669738 = sum of:
        0.00669738 = product of:
          0.01339476 = sum of:
            0.01339476 = weight(_text_:a in 805) [ClassicSimilarity], result of:
              0.01339476 = score(doc=805,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.25222903 = fieldWeight in 805, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=805)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A publishing giant goes after the authors of its journals' papers
    Type
    a
  18. Lindholm, J.; Schönthal, T.; Jansson, K.: Experiences of harvesting Web resources in engineering using automatic classification (2003) 0.00
    0.0033143433 = product of:
      0.0066286866 = sum of:
        0.0066286866 = product of:
          0.013257373 = sum of:
            0.013257373 = weight(_text_:a in 4088) [ClassicSimilarity], result of:
              0.013257373 = score(doc=4088,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.24964198 = fieldWeight in 4088, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4088)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Authors describe the background and the work involved in setting up Engine-e, a Web index that uses automatic classification as a means for the selection of resources in Engineering. Considerations in offering a robot-generated Web index as a successor to a manually indexed quality-controlled subject gateway are also discussed.
    Type
    a
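     An illustrative sketch of robot-generated subject classification using a TF-IDF plus linear-classifier pipeline; the training snippets and class labels are invented, and Engine-e's own classifier and class scheme are not reproduced here:
       # Sketch: classify short resource descriptions into engineering subject classes.
       from sklearn.feature_extraction.text import TfidfVectorizer
       from sklearn.linear_model import LogisticRegression
       from sklearn.pipeline import make_pipeline

       texts = [
           "finite element analysis of steel beam structures",
           "bridge load testing and structural safety",
           "digital signal processing for wireless receivers",
           "antenna design and radio frequency circuits",
       ]
       labels = ["civil", "civil", "electrical", "electrical"]

       clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
       clf.fit(texts, labels)
       print(clf.predict(["reinforced concrete column design"]))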
  19. Harnett, K.: Machine learning confronts the elephant in the room : a visual prank exposes an Achilles' heel of computer vision systems: Unlike humans, they can't do a double take (2018) 0.00
    0.00324456 = product of:
      0.00648912 = sum of:
        0.00648912 = product of:
          0.01297824 = sum of:
            0.01297824 = weight(_text_:a in 4449) [ClassicSimilarity], result of:
              0.01297824 = score(doc=4449,freq=46.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.24438578 = fieldWeight in 4449, product of:
                  6.78233 = tf(freq=46.0), with freq of:
                    46.0 = termFreq=46.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4449)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     In a new study, computer scientists found that artificial intelligence systems fail a vision test a child could accomplish with ease. "It's a clever and important study that reminds us that 'deep learning' isn't really that deep," said Gary Marcus, a neuroscientist at New York University who was not affiliated with the work. The result takes place in the field of computer vision, where artificial intelligence systems attempt to detect and categorize objects. They might try to find all the pedestrians in a street scene, or just distinguish a bird from a bicycle (which is a notoriously difficult task). The stakes are high: As computers take over critical tasks like automated surveillance and autonomous driving, we'll want their visual processing to be at least as good as the human eyes they're replacing. It won't be easy. The new work accentuates the sophistication of human vision - and the challenge of building systems that mimic it. In the study, the researchers presented a computer vision system with a living room scene. The system processed it well. It correctly identified a chair, a person, books on a shelf. Then the researchers introduced an anomalous object into the scene - an image of an elephant. The elephant's mere presence caused the system to forget itself: Suddenly it started calling a chair a couch and the elephant a chair, while turning completely blind to other objects it had previously seen. Researchers are still trying to understand exactly why computer vision systems get tripped up so easily, but they have a good guess. It has to do with an ability humans have that AI lacks: the ability to understand when a scene is confusing and thus go back for a second glance.
    Type
    a
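     A minimal sketch of the kind of experiment described: run a pretrained detector over a scene, then over the same scene with an out-of-context patch pasted in, and compare the outputs. The image file names are placeholders, and this is not the study's own code or model:
       # Sketch: compare detections on a baseline scene and a perturbed scene.
       import torch
       from torchvision.io import read_image
       from torchvision.models.detection import fasterrcnn_resnet50_fpn

       model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()   # torchvision >= 0.13

       def detect(path, threshold=0.7):
           img = read_image(path).float() / 255.0        # [3, H, W] scaled to [0, 1]
           with torch.no_grad():
               out = model([img])[0]
           keep = out["scores"] > threshold
           return list(zip(out["labels"][keep].tolist(), out["scores"][keep].tolist()))

       print(detect("living_room.jpg"))                  # baseline scene
       print(detect("living_room_with_elephant.jpg"))    # same scene plus an anomalous patch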
  20. Kirriemuir, J.; Brickley, D.; Welsh, S.; Knight, J.; Hamilton, M.: Cross-searching subject gateways : the query routing and forward knowledge approach (1998) 0.00
    0.0031642143 = product of:
      0.0063284286 = sum of:
        0.0063284286 = product of:
          0.012656857 = sum of:
            0.012656857 = weight(_text_:a in 1252) [ClassicSimilarity], result of:
              0.012656857 = score(doc=1252,freq=28.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.23833402 = fieldWeight in 1252, product of:
                  5.2915025 = tf(freq=28.0), with freq of:
                    28.0 = termFreq=28.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1252)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A subject gateway, in the context of network-based resource access, can be defined as some facility that allows easier access to network-based resources in a defined subject area. The simplest types of subject gateways are sets of Web pages containing lists of links to resources. Some gateways index their lists of links and provide a simple search facility. More advanced gateways offer a much enhanced service via a system consisting of a resource database and various indexes, which can be searched and/or browsed through a Web-based interface. Each entry in the database contains information about a network-based resource, such as a Web page, Web site, mailing list or document. Entries are usually created by a cataloguer manually identifying a suitable resource, describing the resource using a template, and submitting the template to the database for indexing. Subject gateways are also known as subject-based information gateways (SBIGs), subject-based gateways, subject index gateways, virtual libraries, clearing houses, subject trees, pathfinders and other variations thereof. This paper describes the characteristics of some of the subject gateways currently accessible through the Web, and compares them to automatic "vacuum cleaner" type search engines, such as AltaVista. The application of WHOIS++, centroids, query routing, and forward knowledge to searching several of these subject gateways simultaneously is outlined. The paper concludes with looking at some of the issues facing subject gateway development in the near future. The paper touches on many of the issues mentioned in a previous paper in D-Lib Magazine, especially regarding resource-discovery related initiatives and services.
    Type
    a
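     The forward-knowledge idea can be illustrated in a few lines: each gateway publishes a centroid (here simply a set of index terms) and the cross-searcher routes a query only to gateways whose centroid shares a term with it. Gateway names and terms are invented:
       # Sketch: centroid-based query routing across subject gateways.
       centroids = {
           "EngineeringGateway": {"bridge", "materials", "welding", "cad"},
           "MedicineGateway":    {"clinical", "oncology", "anatomy"},
           "HistoryGateway":     {"medieval", "archives", "manuscripts"},
       }

       def route(query):
           terms = set(query.lower().split())
           return [name for name, centroid in centroids.items() if terms & centroid]

       print(route("welding techniques for bridge repair"))   # -> ['EngineeringGateway']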
