Search (1314 results, page 1 of 66)

  • year_i:[2000 TO 2010}
  1. Cannane, A.; Williams, H.E.: General-purpose compression for efficient retrieval (2001) 0.14
    0.13827354 = product of:
      0.27654707 = sum of:
        0.27654707 = product of:
          0.55309415 = sum of:
            0.55309415 = weight(_text_:compression in 5705) [ClassicSimilarity], result of:
              0.55309415 = score(doc=5705,freq=20.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.5334243 = fieldWeight in 5705, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5705)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Compression of databases not only reduces space requirements but can also reduce overall retrieval times. In text databases, compression of documents based on semistatic modeling with words has been shown to be both practical and fast. Similarly, for specific applications, such as databases of integers or scientific databases, specially designed semistatic compression schemes work well. We propose a scheme for general-purpose compression that can be applied to all types of data stored in large collections. We describe our approach, which we call RAY, in detail, and show experimentally the compression available, compression and decompression costs, and performance as a stream and random-access technique. We show that, in many cases, RAY achieves better compression than an efficient Huffman scheme and popular adaptive compression techniques, and that it can be used as an efficient general-purpose compression scheme.
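    Semistatic modeling with words, which the abstract takes as its starting point, means fixing a model in a first pass over the collection and encoding against that fixed model in a second pass, so compressed records stay independently decodable. As a point of reference only (this is not the RAY algorithm, and all names are illustrative), a minimal semistatic word-based Huffman coder can be sketched as:

      import heapq
      from collections import Counter

      def build_word_huffman(tokens):
          """First pass of a semistatic scheme: derive a fixed code from word frequencies."""
          freq = Counter(tokens)
          if len(freq) == 1:                     # degenerate single-symbol case
              return {next(iter(freq)): "0"}
          heap = [(f, i, {w: ""}) for i, (w, f) in enumerate(freq.items())]
          heapq.heapify(heap)
          n = len(heap)
          while len(heap) > 1:
              f1, _, c1 = heapq.heappop(heap)
              f2, _, c2 = heapq.heappop(heap)
              merged = {w: "0" + code for w, code in c1.items()}
              merged.update({w: "1" + code for w, code in c2.items()})
              heapq.heappush(heap, (f1 + f2, n, merged))
              n += 1
          return heap[0][2]

      def encode(tokens, codebook):
          """Second pass: emit the bit string for the whole text against the fixed model."""
          return "".join(codebook[t] for t in tokens)

      text = "to be or not to be".split()
      bits = encode(text, build_word_huffman(text))
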
  2. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.09835884 = sum of:
      0.07831657 = product of:
        0.23494971 = sum of:
          0.23494971 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.23494971 = score(doc=562,freq=2.0), product of:
              0.41804656 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.049309507 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020042272 = product of:
        0.040084545 = sum of:
          0.040084545 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.040084545 = score(doc=562,freq=2.0), product of:
              0.1726735 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049309507 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  3. Zajic, D.; Dorr, B.J.; Lin, J.; Schwartz, R.: Multi-candidate reduction : sentence compression as a tool for document summarization tasks (2007) 0.09
    0.08835813 = product of:
      0.17671625 = sum of:
        0.17671625 = product of:
          0.3534325 = sum of:
            0.3534325 = weight(_text_:compression in 944) [ClassicSimilarity], result of:
              0.3534325 = score(doc=944,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.97987294 = fieldWeight in 944, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=944)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization: a "parse-and-trim" approach and a statistical noisy-channel approach. We introduce the multi-candidate reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates are then selected for inclusion in the final summary based on a combination of static and dynamic features. Evaluations demonstrate that sentence compression is a valuable component of a larger multi-document summarization framework.
  4. Zajic, D.M.; Dorr, B.J.; Lin, J.: Single-document and multi-document summarization techniques for email threads using sentence compression (2008) 0.09
    0.08835813 = product of:
      0.17671625 = sum of:
        0.17671625 = product of:
          0.3534325 = sum of:
            0.3534325 = weight(_text_:compression in 2105) [ClassicSimilarity], result of:
              0.3534325 = score(doc=2105,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.97987294 = fieldWeight in 2105, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2105)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We present two approaches to email thread summarization: collective message summarization (CMS) applies a multi-document summarization approach, while individual message summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate these ideas on the Enron email collection - a very challenging corpus because of the highly technical language. Experimental results point to two findings: that CMS represents a better approach to email thread summarization, and that current sentence compression techniques do not improve summarization performance in this genre.
  5. Nomoto, T.: Discriminative sentence compression with conditional random fields (2007) 0.09
    0.08745186 = product of:
      0.17490372 = sum of:
        0.17490372 = product of:
          0.34980744 = sum of:
            0.34980744 = weight(_text_:compression in 945) [ClassicSimilarity], result of:
              0.34980744 = score(doc=945,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.96982265 = fieldWeight in 945, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.046875 = fieldNorm(doc=945)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The paper focuses on a particular approach to automatic sentence compression which makes use of a discriminative sequence classifier known as Conditional Random Fields (CRF). We devise several features for CRF that allow it to incorporate information on nonlinear relations among words. Along with that, we address the issue of data paucity by collecting data from RSS feeds available on the Internet, and turning them into training data for use with CRF, drawing on techniques from biology and information retrieval. We also discuss a recursive application of CRF on the syntactic structure of a sentence as a way of improving the readability of the compression it generates. Experiments found that our approach works reasonably well compared to the state-of-the-art system [Knight, K., & Marcu, D. (2002). Summarization beyond sentence extraction: A probabilistic approach to sentence compression. Artificial Intelligence 139, 91-107.].
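    In this discriminative setting, compression is cast as per-token sequence labeling: the CRF tags each word keep or drop, and the output sentence is the subsequence of kept words. A sketch of the kind of token features such a labeler consumes (the feature set below is illustrative, not the one devised in the paper):

      def token_features(tokens, pos_tags, i):
          """Illustrative per-token features for a keep/drop sequence labeler."""
          return {
              "word.lower": tokens[i].lower(),
              "pos": pos_tags[i],
              "is_capitalized": tokens[i][0].isupper(),
              "rel_position": i / len(tokens),                       # position in the sentence
              "prev_pos": pos_tags[i - 1] if i > 0 else "BOS",
              "next_pos": pos_tags[i + 1] if i < len(tokens) - 1 else "EOS",
          }

      # The trained model assigns a keep/drop label sequence; the compression is the
      # concatenation of the tokens labeled "keep".
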
  6. Moffat, A.; Isal, R.Y.K.: Word-based text compression using the Burrows-Wheeler transform (2005) 0.09
    0.08745186 = product of:
      0.17490372 = sum of:
        0.17490372 = product of:
          0.34980744 = sum of:
            0.34980744 = weight(_text_:compression in 1044) [ClassicSimilarity], result of:
              0.34980744 = score(doc=1044,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.96982265 = fieldWeight in 1044, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1044)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Block-sorting is an innovative compression mechanism introduced in 1994 by Burrows and Wheeler. It involves three steps: permuting the input one block at a time through the use of the Burrows-Wheeler transform (bwt); applying a move-to-front (mtf) transform to each of the permuted blocks; and then entropy coding the output with a Huffman or arithmetic coder. Until now, block-sorting implementations have assumed that the input message is a sequence of characters. In this paper we extend the block-sorting mechanism to word-based models. We also consider other recency transformations, and are able to show improved compression results compared to mtf and uniform arithmetic coding. For large files of text, the combination of word-based modeling, bwt, and mtf-like transformations allows excellent compression effectiveness to be attained within reasonable resource costs.
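    The three-step pipeline described above can be sketched directly; the toy version below runs on characters (the paper's contribution is to run the same pipeline over word tokens), and the BWT is deliberately naive for clarity:

      def bwt(s, eof="\0"):
          """Burrows-Wheeler transform: last column of the sorted rotations (naive O(n^2 log n))."""
          s = s + eof
          rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
          return "".join(rot[-1] for rot in rotations)

      def mtf(seq, alphabet):
          """Move-to-front: recently seen symbols receive small indices."""
          table = list(alphabet)
          out = []
          for c in seq:
              i = table.index(c)
              out.append(i)
              table.insert(0, table.pop(i))      # move the symbol to the front
          return out

      permuted = bwt("banana")
      ranks = mtf(permuted, sorted(set(permuted)))
      # 'ranks' is heavily skewed toward small values and is handed to an entropy
      # coder (Huffman or arithmetic) in the third step.
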
  7. Adiego, J.; Navarro, G.; Fuente, P. de la: Lempel-Ziv compression of highly structured documents (2007) 0.07
    0.07287655 = product of:
      0.1457531 = sum of:
        0.1457531 = product of:
          0.2915062 = sum of:
            0.2915062 = weight(_text_:compression in 4993) [ClassicSimilarity], result of:
              0.2915062 = score(doc=4993,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.8081856 = fieldWeight in 4993, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4993)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The authors describe Lempel-Ziv to Compress Structure (LZCS), a novel Lempel-Ziv approach suitable for compressing structured documents. LZCS takes advantage of repeated substructures that may appear in the documents, by replacing them with a backward reference to their previous occurrence. The result of the LZCS transformation is still a valid structured document, which is human-readable and can be transmitted over ASCII channels. Moreover, LZCS-transformed documents are easy to search, display, access at random, and navigate. In a second stage, the transformed documents can be further compressed using any semistatic technique, so that it is still possible to do all those operations efficiently; or with any adaptive technique to boost compression. LZCS is especially efficient in the compression of collections of highly structured data, such as extensible markup language (XML) forms, invoices, e-commerce, and Web-service exchange documents. The comparison with other structure-aware and standard compressors shows that LZCS is a competitive choice for these types of documents, whereas the others are not well suited to support navigation or random access. When joined to an adaptive compressor, LZCS obtains by far the best compression ratios.
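    The core LZCS move, replacing a repeated substructure with a reference to its first occurrence, can be illustrated on a small XML tree; the ref element and id numbering below are our own notation for the sketch, not the LZCS output format:

      import xml.etree.ElementTree as ET

      def compress_repeats(root):
          """Replace repeated subtrees with a back-reference to their first occurrence."""
          seen = {}                      # canonical serialization -> id of first occurrence
          next_id = 0
          for parent in list(root.iter()):
              for i, child in enumerate(list(parent)):
                  key = ET.tostring(child)
                  if key in seen:
                      ref = ET.Element("ref", {"to": str(seen[key])})   # back-reference
                      parent.remove(child)
                      parent.insert(i, ref)
                  else:
                      seen[key] = next_id
                      child.set("id", str(next_id))
                      next_id += 1
          return root

      doc = ET.fromstring("<orders><item><sku>A1</sku></item><item><sku>A1</sku></item></orders>")
      print(ET.tostring(compress_repeats(doc)).decode())
      # <orders><item id="0"><sku id="1">A1</sku></item><ref to="0" /></orders>
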
  8. Wan, R.; Moffat, A.: Block merging for off-line compression (2007) 0.07
    0.072144106 = product of:
      0.14428821 = sum of:
        0.14428821 = product of:
          0.28857642 = sum of:
            0.28857642 = weight(_text_:compression in 81) [ClassicSimilarity], result of:
              0.28857642 = score(doc=81,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.8000629 = fieldWeight in 81, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=81)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    To bound memory consumption, most compression systems provide a facility that controls the amount of data that may be processed at once - usually as a block size, but sometimes as a direct megabyte limit. In this work we consider the Re-Pair mechanism of Larsson and Moffat (2000), which processes large messages as disjoint blocks to limit memory consumption. We show that the blocks emitted by Re-Pair can be postprocessed to yield further savings, and describe techniques that allow files of 500 MB or more to be compressed in a holistic manner using less than that much main memory. The block merging process we describe has the additional advantage of allowing new text to be appended to the end of the compressed file.
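    Re-Pair itself (Larsson and Moffat, 2000) repeatedly replaces the most frequent adjacent symbol pair with a fresh nonterminal until no pair occurs twice; the block merging contributed by this paper post-processes the per-block dictionaries and is not shown. A deliberately naive sketch of the basic Re-Pair loop:

      from collections import Counter

      def repair(seq):
          """Naive Re-Pair: replace the most frequent adjacent pair until all pairs are unique."""
          seq = list(seq)
          rules = {}                 # nonterminal -> (left symbol, right symbol)
          next_symbol = 256          # assumes the input symbols are characters / byte values
          while len(seq) > 1:
              pair, count = Counter(zip(seq, seq[1:])).most_common(1)[0]
              if count < 2:
                  break
              rules[next_symbol] = pair
              out, i = [], 0
              while i < len(seq):
                  if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                      out.append(next_symbol)    # replace the pair with the new nonterminal
                      i += 2
                  else:
                      out.append(seq[i])
                      i += 1
              seq, next_symbol = out, next_symbol + 1
          return seq, rules

      compressed, grammar = repair("abracadabra abracadabra")
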
  9. Liu, L.-J.; Shen, X.-B.; Zou, X.-C.: ¬An improved fast encoding algorithm for vector quantization (2004) 0.06
    0.06311295 = product of:
      0.1262259 = sum of:
        0.1262259 = product of:
          0.2524518 = sum of:
            0.2524518 = weight(_text_:compression in 2067) [ClassicSimilarity], result of:
              0.2524518 = score(doc=2067,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.69990927 = fieldWeight in 2067, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2067)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In the current information age, people have to access information of many kinds. With the popularization of the Internet across all information fields and the development of communication technology, ever more information has to be processed at high speed. Data compression is one of the core techniques of information processing and image distribution; its objective is to reduce the data rate for transmission and storage. Vector quantization (VQ) is a very powerful method for data compression. One of the key problems of the basic VQ method, i.e., the full search algorithm, is that it is computationally intensive and ill-suited to real-time processing. Many fast encoding algorithms have been developed for this reason. In this paper, we present a half-L2-norm pyramid data structure and a new method of searching and processing codewords that significantly speeds up the search, especially for high-dimensional vectors and codebooks of large size; reduces the actual memory requirement, which is preferable in hardware implementations, e.g., SoC (system-on-chip); and produces the same encoded image quality as the full search algorithm. Simulation results show that the proposed method outperforms several existing fast encoding algorithms.
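    The baseline being accelerated, full-search vector quantization, maps every input vector to the codeword nearest in (squared) Euclidean distance; the half-L2-norm pyramid prunes exactly this search. A minimal full-search sketch with NumPy (sizes arbitrary):

      import numpy as np

      def vq_encode_full_search(vectors, codebook):
          """Full-search VQ: index of the nearest codeword (squared L2) for each input vector."""
          # (num_vectors, 1, dim) - (1, codebook_size, dim) -> pairwise squared distances
          d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
          return d2.argmin(axis=1)

      rng = np.random.default_rng(0)
      codebook = rng.normal(size=(256, 16))     # 256 codewords of dimension 16
      blocks = rng.normal(size=(1000, 16))      # e.g. flattened 4x4 image blocks
      indices = vq_encode_full_search(blocks, codebook)   # the compressed representation
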
  10. Over, P.; Dang, H.; Harman, D.: DUC in context (2007) 0.06
    0.05830124 = product of:
      0.11660248 = sum of:
        0.11660248 = product of:
          0.23320496 = sum of:
            0.23320496 = weight(_text_:compression in 934) [ClassicSimilarity], result of:
              0.23320496 = score(doc=934,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.64654845 = fieldWeight in 934, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0625 = fieldNorm(doc=934)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Recent years have seen increased interest in text summarization with emphasis on evaluation of prototype systems. Many factors can affect the design of such evaluations, requiring choices among competing alternatives. This paper examines several major themes running through three evaluations: SUMMAC, NTCIR, and DUC, with a concentration on DUC. The themes are extrinsic and intrinsic evaluation, evaluation procedures and methods, generic versus focused summaries, single- and multi-document summaries, length and compression issues, extracts versus abstracts, and issues with genre.
  11. Kantor, P.B.: Information theory (2009) 0.06
    0.05830124 = product of:
      0.11660248 = sum of:
        0.11660248 = product of:
          0.23320496 = sum of:
            0.23320496 = weight(_text_:compression in 3815) [ClassicSimilarity], result of:
              0.23320496 = score(doc=3815,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.64654845 = fieldWeight in 3815, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3815)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Information theory "measures quantity of information" and is that branch of applied mathematics that deals with the efficient transmission of messages in an encoded language. It is fundamental to modern methods of telecommunication, image compression, and security. Its relation to library and information science is less direct. More relevant to the LIS conception of "quantity of information" are economic concepts related to the expected value of a decision, and the influence of imperfect information on that expected value.
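    The "quantity of information" referred to here is Shannon's entropy, which lower-bounds the average number of bits per symbol that any lossless code can achieve for a source with symbol probabilities p_i:

      H(X) = -\sum_i p_i \log_2 p_i

    An optimal prefix code (e.g., Huffman) attains an expected code length L with H(X) \le L < H(X) + 1. For example, a fair coin has H = 1 bit per toss, while a coin with p = 0.9 has H \approx 0.47 bits, which is why skewed sources compress well.
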
  12. Schrodt, R.: Tiefen und Untiefen im wissenschaftlichen Sprachgebrauch (2008) 0.05
    0.052211046 = product of:
      0.10442209 = sum of:
        0.10442209 = product of:
          0.31326628 = sum of:
            0.31326628 = weight(_text_:3a in 140) [ClassicSimilarity], result of:
              0.31326628 = score(doc=140,freq=2.0), product of:
                0.41804656 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.049309507 = queryNorm
                0.7493574 = fieldWeight in 140, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0625 = fieldNorm(doc=140)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    See also: https://studylibde.com/doc/13053640/richard-schrodt. See also: http://www.univie.ac.at/Germanistik/schrodt/vorlesung/wissenschaftssprache.doc.
  13. Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.05
    0.0515315 = product of:
      0.103063 = sum of:
        0.103063 = product of:
          0.206126 = sum of:
            0.206126 = weight(_text_:compression in 979) [ClassicSimilarity], result of:
              0.206126 = score(doc=979,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.5714735 = fieldWeight in 979, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=979)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Compressing an inverted file can greatly improve query performance of an information retrieval system (IRS) by reducing disk I/Os. We observe that a good document identifier assignment (DIA) can make the document identifiers in the posting lists more clustered, and result in better compression as well as shorter query processing time. In this paper, we tackle the NP-complete problem of finding an optimal DIA to minimize the average query processing time in an IRS when the probability distribution of query terms is given. We indicate that the greedy nearest neighbor (Greedy-NN) algorithm can provide excellent performance for this problem. However, the Greedy-NN algorithm is inappropriate if used in large-scale IRSs, due to its high complexity O(N² × n), where N denotes the number of documents and n denotes the number of distinct terms. In real-world IRSs, the distribution of query terms is skewed. Based on this fact, we propose a fast O(N × n) heuristic, called partition-based document identifier assignment (PBDIA) algorithm, which can efficiently assign consecutive document identifiers to those documents containing frequently used query terms, and improve compression efficiency of the posting lists for those terms. This can result in reduced query processing time. The experimental results show that the PBDIA algorithm can yield a competitive performance versus the Greedy-NN for the DIA problem, and that this optimization problem has significant advantages for both long queries and parallel information retrieval (IR).
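    The compression mechanism at stake is gap coding: posting lists store the differences (d-gaps) between consecutive document identifiers, and variable-length codes spend fewer bytes on small gaps, so a DIA that clusters the identifiers of documents sharing frequent terms shrinks the index. A small illustration with variable-byte coding (our choice of gap code for the example; the paper does not depend on a particular one):

      def vbyte_encode(gaps):
          """Variable-byte code: 7 payload bits per byte; the high bit marks a value's last byte."""
          out = bytearray()
          for g in gaps:
              while g >= 128:
                  out.append(g & 0x7F)
                  g >>= 7
              out.append(g | 0x80)              # terminator byte
          return bytes(out)

      def posting_list_size(doc_ids):
          gaps = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
          return len(vbyte_encode(gaps))

      scattered = [5, 1000, 40000, 900000]      # identifiers spread across the collection
      clustered = [5, 9, 14, 21]                # the same postings after a clustering DIA
      assert posting_list_size(clustered) < posting_list_size(scattered)
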
  14. Yeh, J.-Y.; Ke, H.-R.; Yang, W.-P.; Meng, I.-H.: Text summarization using a trainable summarizer and latent semantic analysis (2005) 0.05
    0.0515315 = product of:
      0.103063 = sum of:
        0.103063 = product of:
          0.206126 = sum of:
            0.206126 = weight(_text_:compression in 1003) [ClassicSimilarity], result of:
              0.206126 = score(doc=1003,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.5714735 = fieldWeight in 1003, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1003)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper proposes two approaches to address text summarization: modified corpus-based approach (MCBA) and LSA-based T.R.M. approach (LSA + T.R.M.). The first is a trainable summarizer, which takes into account several features, including position, positive keyword, negative keyword, centrality, and the resemblance to the title, to generate summaries. Two new ideas are exploited: (1) sentence positions are ranked to emphasize the significances of different sentence positions, and (2) the score function is trained by the genetic algorithm (GA) to obtain a suitable combination of feature weights. The second uses latent semantic analysis (LSA) to derive the semantic matrix of a document or a corpus and uses semantic sentence representation to construct a semantic text relationship map. We evaluate LSA + T.R.M. both with single documents and at the corpus level to investigate the competence of LSA in text summarization. The two novel approaches were measured at several compression rates on a data corpus composed of 100 political articles. When the compression rate was 30%, the average f-measures achieved were 49% for MCBA, 52% for MCBA + GA, and 44% and 40% for LSA + T.R.M. at the single-document and corpus levels, respectively.
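    The LSA component of the second approach rests on a truncated singular value decomposition of the term-by-sentence matrix; sentences are then compared in the reduced semantic space when the text relationship map is built. A compact sketch of that projection step (toy data; names are ours):

      import numpy as np

      def semantic_sentence_vectors(term_sentence_matrix, k):
          """Project sentences into a k-dimensional latent semantic space via truncated SVD."""
          U, s, Vt = np.linalg.svd(term_sentence_matrix, full_matrices=False)
          return (np.diag(s[:k]) @ Vt[:k]).T     # one k-dimensional row per sentence

      # Toy term-by-sentence matrix: rows = terms, columns = sentences (e.g. tf weights).
      A = np.array([[2.0, 0.0, 1.0],
                    [1.0, 1.0, 0.0],
                    [0.0, 3.0, 1.0]])
      S = semantic_sentence_vectors(A, k=2)
      # Cosine similarities between the rows of S drive the semantic text relationship map.
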
  15. Innovations and advanced techniques in systems, computing sciences and software engineering (2008) 0.05
    0.0515315 = product of:
      0.103063 = sum of:
        0.103063 = product of:
          0.206126 = sum of:
            0.206126 = weight(_text_:compression in 4319) [ClassicSimilarity], result of:
              0.206126 = score(doc=4319,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.5714735 = fieldWeight in 4319, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4319)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
    Contents: Image and Pattern Recognition: Compression, Image processing, Signal Processing Architectures, Signal Processing for Communication, Signal Processing Implementation, Speech Compression, and Video Coding Architectures. Languages and Systems: Algorithms, Databases, Embedded Systems and Applications, File Systems and I/O, Geographical Information Systems, Kernel and OS Structures, Knowledge Based Systems, Modeling and Simulation, Object Based Software Engineering, Programming Languages, and Programming Models and tools. Parallel Processing: Distributed Scheduling, Multiprocessing, Real-time Systems, Simulation Modeling and Development, and Web Applications. New trends in computing: Computers for People of Special Needs, Fuzzy Inference, Human Computer Interaction, Incremental Learning, Internet-based Computing Models, Machine Intelligence, Natural Language Processing, Neural Networks, and Online Decision Support System.
  16. Bookstein, A.; Raita, T.: Discovering term occurrence structure in text (2001) 0.05
    0.05101359 = product of:
      0.10202718 = sum of:
        0.10202718 = product of:
          0.20405436 = sum of:
            0.20405436 = weight(_text_:compression in 5751) [ClassicSimilarity], result of:
              0.20405436 = score(doc=5751,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.5657299 = fieldWeight in 5751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5751)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article examines some consequences for information control of the tendency of occurrences of content-bearing terms to appear together, or clump. Properties of previously defined clumping measures are reviewed and extended, and the significance of these measures for devising retrieval strategies is discussed. A new type of clumping measure, which extends the earlier measures by permitting gaps within a clump, is defined, and several variants examined. Experiments are carried out that indicate the relation between the new measure and one of the earlier measures, as well as the ability of the two types of measure to predict compression efficiency.
  17. Vetere, G.; Lenzerini, M.: Models for semantic interoperability in service-oriented architectures (2005) 0.05
    0.045684673 = product of:
      0.091369346 = sum of:
        0.091369346 = product of:
          0.27410802 = sum of:
            0.27410802 = weight(_text_:3a in 306) [ClassicSimilarity], result of:
              0.27410802 = score(doc=306,freq=2.0), product of:
                0.41804656 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.049309507 = queryNorm
                0.65568775 = fieldWeight in 306, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=306)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    Cf.: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5386707.
  18. Castelli, V.: Progressive search and retrieval from image databases (2002) 0.04
    0.041225202 = product of:
      0.082450405 = sum of:
        0.082450405 = product of:
          0.16490081 = sum of:
            0.16490081 = weight(_text_:compression in 4253) [ClassicSimilarity], result of:
              0.16490081 = score(doc=4253,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.4571788 = fieldWeight in 4253, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4253)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this chapter we describe methodologies for representing information in digital libraries in order to support efficient search and content-based retrieval, and we focus our attention on image repositories. We identify different abstraction levels at which content can be specified. We discuss how simple objects can be defined as connected regions that are homogeneous with respect to pixel-level, feature-level, semantic-level, and metadata-level characteristics. We describe how information can be efficiently represented at these different levels, how a user can specify content, and what mechanisms can be used to perform the search. We present techniques for combining image representation, in particular image compression, with image-processing operators designed for content-based searches. On the one hand this approach makes it possible to extract and index content at database ingestion time even when the data volume is large. On the other hand it allows the system to retrieve simple objects for which definitions are provided at query time and that have not been pre-extracted and preindexed. Simple objects, however, are not sufficient to describe the richness of image content. We rely on the concept of composite object, a collection of simple objects satisfying a set of relations, to specify complex queries. We describe algorithms for retrieving composite objects from image databases. This article is organized as follows. The next section contains an introduction to digital libraries and a description of four application areas. The aspects of compression that are relevant to progressive retrieval are discussed in the section that follows. The subsequent section provides a brief introduction to content-based searches, the standard approach to query specification in multimedia databases. Our definition of content in terms of simple and composite objects is contained in the following section. Simple objects can be defined at multiple abstraction levels; fundamental concepts, content-extraction methodologies including progressive techniques, and indexing methods are discussed in the next section. Simple objects can also be defined simultaneously at multiple abstraction levels, and aggregated to form composite objects. The semantics of both types of objects and the techniques required to search for them are the subject of the section after that. The final section contains the conclusions.
  19. Mas, S.; Marleau, Y.: Proposition of a faceted classification model to support corporate information organization and digital records management (2009) 0.04
    0.039158285 = product of:
      0.07831657 = sum of:
        0.07831657 = product of:
          0.23494971 = sum of:
            0.23494971 = weight(_text_:3a in 2918) [ClassicSimilarity], result of:
              0.23494971 = score(doc=2918,freq=2.0), product of:
                0.41804656 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.049309507 = queryNorm
                0.56201804 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2918)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Footnote
    Cf.: http://ieeexplore.ieee.org/iel5/4755313/4755314/04755480.pdf?arnumber=4755480.
  20. RAK-NBM : Interpretationshilfe zu NBM 3b,3 (2000) 0.04
    0.03779207 = product of:
      0.07558414 = sum of:
        0.07558414 = product of:
          0.15116829 = sum of:
            0.15116829 = weight(_text_:22 in 4362) [ClassicSimilarity], result of:
              0.15116829 = score(doc=4362,freq=4.0), product of:
                0.1726735 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049309507 = queryNorm
                0.8754574 = fieldWeight in 4362, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=4362)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2000 19:22:27

Types

  • a 1104
  • m 145
  • el 63
  • s 50
  • b 26
  • x 13
  • i 8
  • n 2
  • r 1
