Search (7 results, page 1 of 1)

Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.02
```
0.016840585 = product of:
  0.025260875 = sum of:
    0.006896985 = weight(_text_:a in 690) [ClassicSimilarity], result of:
      0.006896985 = score(doc=690,freq=6.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.13239266 = fieldWeight in 690, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=690)
    0.01836389 = product of:
      0.03672778 = sum of:
        0.03672778 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
          0.03672778 = score(doc=690,freq=2.0), product of:
            0.15821345 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045180224 = queryNorm
            0.23214069 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
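The breakdown above follows the classic Lucene TF-IDF scoring scheme, and its arithmetic can be checked by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and are not part of the Lucene API; queryNorm, fieldNorm and the coord factors are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain output for doc 690
MAX_DOCS = 44218
QUERY_NORM = 0.045180224
FIELD_NORM = 0.046875

score_a = classic_term_score(6.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 sub-clauses matched
score_22 = classic_term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# 2 of the 3 top-level clauses matched, hence coord(2/3)
total = (score_a + score_22) * (2.0 / 3.0)

print(f"a: {score_a:.9f}  22 (after coord): {score_22:.9f}  total: {total:.9f}")
```

The total should land close to the reported 0.016840585. Small differences in the last digits are expected, because Lucene computes these values in single precision while Python uses doubles.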
Abstract

We describe the latent semantic indexing subspace signature model (LSISSM) for semantic content representation of unstructured text. Grounded on singular value decomposition, the model represents terms and documents by the distribution signatures of their statistical contribution across the top-ranking latent concept dimensions. LSISSM matches term signatures with document signatures according to their mapping coherence between latent semantic indexing (LSI) term subspace and LSI document subspace. LSISSM does feature reduction and finds a low-rank approximation of scalable and sparse term-document matrices. Experiments demonstrate that this approach significantly improves the performance of major clustering algorithms such as standard K-means and self-organizing maps compared with the vector space model and the traditional LSI model. The unique contribution ranking mechanism in LSISSM also improves the initialization of standard K-means compared with random seeding procedure, which sometimes causes low efficiency and effectiveness of clustering. A two-stage initialization strategy based on LSISSM significantly reduces the running time of standard K-means procedures.

Date

23. 3.2013 13:22:36

Type

a
Harris, C.; Allen, R.B.; Plaisant, C.; Shneiderman, B.: Temporal visualization for legal case histories : from interpersonal communication to online information process (1999) 0.00
```
0.003754243 = product of:
  0.011262729 = sum of:
    0.011262729 = weight(_text_:a in 6715) [ClassicSimilarity], result of:
      0.011262729 = score(doc=6715,freq=16.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.2161963 = fieldWeight in 6715, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=6715)
  0.33333334 = coord(1/3)
```
Abstract

This paper discusses visualization of legal information using a tool for temporal information called LifeLines. The direct and indirect histories of cases can become very complex. We explored ways that LifeLines could aid in viewing the links between the original case and the direct and indirect histories. The Apple Computer, Inc. v. Microsoft Corporation and Hewlett Packard Company case is used to illustrate the prototype. For example, if users want to find out how the rulings or statutes changed throughout this case, they could retrieve this information within a single display. Using the timeline, users could also choose at which point in time they would like to begin viewing the case. LifeLines support various views of a case's history. For instance, users can view the trial history of a case, the references involved in a case, and citations made to a case. The paper describes improvements to LifeLines that could help in providing a more

Type

a
Allen, R.B.: Navigating and searching in digital library catalogs (1994) 0.00
```
0.0034978096 = product of:
  0.010493428 = sum of:
    0.010493428 = weight(_text_:a in 2414) [ClassicSimilarity], result of:
      0.010493428 = score(doc=2414,freq=20.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.20142901 = fieldWeight in 2414, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2414)
  0.33333334 = coord(1/3)
```
Abstract

Two interfaces are described for navigating large collections of document and book records. An Online Public Access Catalog interface uses a classification hierarchy to facilitate browsing and searching. The system has been implemented and currently runs with over 50,000 book records. Interface widgets allow the hierarchy to be displayed and traversed easily. For example, the Book Shelf dynamically updates itself to reflect searches and attribute selections. A second interface, not yet fully implemented, allows access to the ACM Computing Reviews classification. By browsing a graphic structure such as a classification hierarchy or term network, the user can select or negate terms to incrementally enlarge or refine the query. A number of systems have been proposed that utilise this type of interface: Allen [1] allows users to traverse sections of a classification hierarchy that are adjacent to documents retrieved by a search; Doyle [6] discusses a graph-based interactive browsing environment; Croft [4] extends Doyle's term-based graph with vertices and edges representing individual documents and their degrees of similarity to each other; Frei and Jauslin [7] use tree structures to represent both system command menus and document indexing structures; and Godin [10] and Pedersen [16] model a collection's conceptual structure with term-document lattices.

Type

a
Allen, R.B.; Wu, Y.: Metrics for the scope of a collection (2005) 0.00
```
0.00325127 = product of:
  0.009753809 = sum of:
    0.009753809 = weight(_text_:a in 4570) [ClassicSimilarity], result of:
      0.009753809 = score(doc=4570,freq=12.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.18723148 = fieldWeight in 4570, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=4570)
  0.33333334 = coord(1/3)
```
Abstract

Some collections cover many topics, while others are narrowly focused on a limited number of topics. We introduce the concept of the "scope" of a collection of documents and we compare two ways of measuring it. These measures are based on the distances between documents. The first uses the overlap of words between pairs of documents. The second measure uses a novel method that calculates the semantic relatedness to pairs of words from the documents. Those values are combined to obtain an overall distance between the documents. The main validation for the measures compared Web pages categorized by Yahoo. Sets of pages sampled from broad categories were determined to have a higher scope than sets derived from subcategories. The measure was significant and confirmed the expected difference in scope. Finally, we discuss other measures related to scope.

Type

a

Allen, R.B.; Obry, P.; Littman, M.: ¬An interface for navigating clustered document sets returned by queries (1993) 0.00
```
0.002654651 = product of:
  0.007963953 = sum of:
    0.007963953 = weight(_text_:a in 2315) [ClassicSimilarity], result of:
      0.007963953 = score(doc=2315,freq=2.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.15287387 = fieldWeight in 2315, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.09375 = fieldNorm(doc=2315)
  0.33333334 = coord(1/3)
```
Type

a

Allen, R.B.: ¬Two digital library interfaces that exploit hierarchical structure (1995) 0.00
```
0.0022989952 = product of:
  0.006896985 = sum of:
    0.006896985 = weight(_text_:a in 2416) [ClassicSimilarity], result of:
      0.006896985 = score(doc=2416,freq=6.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.13239266 = fieldWeight in 2416, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=2416)
  0.33333334 = coord(1/3)
```
Abstract

Two library classification system interfaces have been implemented for navigating and searching large collections of document and book records. One interface allows the user to browse book records organized by the DDC hierarchy. A Book Shelf display reflects the facet position in the classification hierarchy during browsing, and it dynamically updates to reflect search hits and attribute selections. The other interface provides access to records describing computer science documents classified by the ACM Computing Reviews (CR) system. The CR classification system is a type of faceted classification in which documents can appear at several points in the hierarchy. These two interfaces demonstrate that classification structure can be effectively utilized for organizing digital libraries and, potentially, collections of Internet-wide information services.

Type

a
Allen, R.B.: Retrieval from facet spaces (1996) 0.00
```
0.0021899752 = product of:
  0.0065699257 = sum of:
    0.0065699257 = weight(_text_:a in 6028) [ClassicSimilarity], result of:
      0.0065699257 = score(doc=6028,freq=4.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.12611452 = fieldWeight in 6028, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=6028)
  0.33333334 = coord(1/3)
```
Abstract

The 'facet-space' approach for accessing document records organized by faceted classifications is described. The interface gives users detailed control over the facet display and it makes use of color to reduce the number of windows which need to be presented. The interface supports searching. A cluster analysis is described for organizing search return lists based on facet distances. The implementation is applied to 1381 summaries of computer science dissertations as organised by the ACM Computing Reviews classification system.

Type

a
