Search (27 results, page 1 of 2)

Zhang, J.; Zeng, M.L.: ¬A new similarity measure for subject hierarchical structures (2014) 0.03
```
0.026668733 = product of:
  0.0400031 = sum of:
    0.02309931 = weight(_text_:on in 1778) [ClassicSimilarity], result of:
      0.02309931 = score(doc=1778,freq=6.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.21044704 = fieldWeight in 1778, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1778)
    0.01690379 = product of:
      0.03380758 = sum of:
        0.03380758 = weight(_text_:22 in 1778) [ClassicSimilarity], result of:
          0.03380758 = score(doc=1778,freq=2.0), product of:
            0.1747608 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04990557 = queryNorm
            0.19345059 = fieldWeight in 1778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1778)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Purpose - The purpose of this paper is to introduce a new similarity method to gauge the differences between two subject hierarchical structures. Design/methodology/approach - In the proposed similarity measure, nodes on two hierarchical structures are projected onto a two-dimensional space, respectively, and both structural similarity and subject similarity of nodes are considered in the similarity between the two hierarchical structures. The extent to which the structural similarity impacts on the similarity can be controlled by adjusting a parameter. An experiment was conducted to evaluate soundness of the measure. Eight experts whose research interests were information retrieval and information organization participated in the study. Results from the new measure were compared with results from the experts. Findings - The evaluation shows strong correlations between the results from the new method and the results from the experts. It suggests that the similarity method achieved satisfactory results. Practical implications - Hierarchical structures that are found in subject directories, taxonomies, classification systems, and other classificatory structures play an extremely important role in information organization and information representation. Measuring the similarity between two subject hierarchical structures allows an accurate overarching understanding of the degree to which the two hierarchical structures are similar. Originality/value - Both structural similarity and subject similarity of nodes were considered in the proposed similarity method, and the extent to which the structural similarity impacts on the similarity can be adjusted. In addition, a new evaluation method for a hierarchical structure similarity was presented.

Date

8. 4.2015 16:22:13
Wolfram, D.; Zhang, J.: ¬The influence of indexing practices and weighting algorithms on document spaces (2008) 0.01
```
0.011928434 = product of:
  0.0357853 = sum of:
    0.0357853 = weight(_text_:on in 1963) [ClassicSimilarity], result of:
      0.0357853 = score(doc=1963,freq=10.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.32602316 = fieldWeight in 1963, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.046875 = fieldNorm(doc=1963)
  0.33333334 = coord(1/3)
```
Abstract

Index modeling and computer simulation techniques are used to examine the influence of indexing frequency distributions, indexing exhaustivity distributions, and three weighting methods on hypothetical document spaces in a vector-based information retrieval (IR) system. The way documents are indexed plays an important role in retrieval. The authors demonstrate the influence of different indexing characteristics on document space density (DSD) changes and document space discriminative capacity for IR. Document environments that contain a relatively higher percentage of infrequently occurring terms provide lower density outcomes than do environments where a higher percentage of frequently occurring terms exists. Different indexing exhaustivity levels, however, have little influence on the document space densities. A weighting algorithm that favors higher weights for infrequently occurring terms results in the lowest overall document space densities, which allows documents to be more readily differentiated from one another. This in turn can positively influence IR. The authors also discuss the influence on outcomes using two methods of normalization of term weights (i.e., means and ranges) for the different weighting methods.
Zhang, J.; An, L.; Tang, T.; Hong, Y.: Visual health subject directory analysis based on users' traversal activities (2009) 0.01
```
0.011928434 = product of:
  0.0357853 = sum of:
    0.0357853 = weight(_text_:on in 3112) [ClassicSimilarity], result of:
      0.0357853 = score(doc=3112,freq=10.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.32602316 = fieldWeight in 3112, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.046875 = fieldNorm(doc=3112)
  0.33333334 = coord(1/3)
```
Abstract

Concerns about health issues cover a wide spectrum. Consumer health information, which has become more available on the Internet, plays an extremely important role in addressing these concerns. A subject directory as an information organization and browsing mechanism is widely used in consumer health-related Websites. In this study we employed the information visualization technique Self-Organizing Map (SOM) in combination with a new U-matrix algorithm to analyze health subject clusters through a Web transaction log. An experimental study was conducted to test the proposed methods. The findings show that the clusters identified from the same cells based on path-length-1 outperformed both the clusters from the adjacent cells based on path-length-1 and the clusters from the same cells based on path-length-2 in the visual SOM display. The U-matrix method successfully distinguished the irrelevant subjects situated in the adjacent cells with different colors in the SOM display. The findings of this study lead to a better understanding of the health-related subject relationship from the users' traversal perspective.
Wolfram, D.; Wang, P.; Zhang, J.: Identifying Web search session patterns using cluster analysis : a comparison of three search environments (2009) 0.01
```
0.010669115 = product of:
  0.032007344 = sum of:
    0.032007344 = weight(_text_:on in 2796) [ClassicSimilarity], result of:
      0.032007344 = score(doc=2796,freq=8.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.29160398 = fieldWeight in 2796, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.046875 = fieldNorm(doc=2796)
  0.33333334 = coord(1/3)
```
Abstract

Session characteristics taken from large transaction logs of three Web search environments (academic Web site, public search engine, consumer health information portal) were modeled using cluster analysis to determine if coherent session groups emerged for each environment and whether the types of session groups are similar across the three environments. The analysis revealed three distinct clusters of session behaviors common to each environment: hit and run sessions on focused topics, relatively brief sessions on popular topics, and sustained sessions using obscure terms with greater query modification. The findings also revealed shifts in session characteristics over time for one of the datasets, away from hit and run sessions toward more popular search topics. A better understanding of session characteristics can help system designers to develop more responsive systems to support search features that cater to identifiable groups of searchers based on their search behaviors. For example, the system may identify struggling searchers based on session behaviors that match those identified in the current study to provide context sensitive help.
Zhang, J.; Korfhage, R.R.: ¬A distance and angle similarity measure method (1999) 0.01
```
0.010058938 = product of:
  0.030176813 = sum of:
    0.030176813 = weight(_text_:on in 3915) [ClassicSimilarity], result of:
      0.030176813 = score(doc=3915,freq=4.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.27492687 = fieldWeight in 3915, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0625 = fieldNorm(doc=3915)
  0.33333334 = coord(1/3)
```
Abstract

This article presents a distance and angle similarity measure. The integrated similarity measure takes the strenghts of both the distance and direction of measured documents into account. This article analyzes the features of the similarity measure by comparing it with the traditional distance-based similarity measure and the cosine measure, providing the iso-similarity contour, investigating the impacts of the parameters and variables on the new similarity measure. It also gives the further research issues on the topic
Zhang, J.; Dimitroff, A.: ¬The impact of webpage content characteristics on webpage visibility in search engine results : part I (2005) 0.01
```
0.010058938 = product of:
  0.030176813 = sum of:
    0.030176813 = weight(_text_:on in 1032) [ClassicSimilarity], result of:
      0.030176813 = score(doc=1032,freq=4.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.27492687 = fieldWeight in 1032, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0625 = fieldNorm(doc=1032)
  0.33333334 = coord(1/3)
```
Abstract

Content characteristics of a webpage include factors such as keyword position in a webpage, keyword duplication, layout, and their combination. These factors may impact webpage visibility in a search engine. Four hypotheses are presented relating to the impact of selected content characteristics on webpage visibility in search engine results lists. Webpage visibility can be improved by increasing the frequency of keywords in the title, in the full-text and in both the title and full-text.
Zhang, J.; Wolfram, D.; Wang, P.; Hong, Y.; Gillis, R.: Visualization of health-subject analysis based on query term co-occurrences (2008) 0.01
```
0.009940362 = product of:
  0.029821085 = sum of:
    0.029821085 = weight(_text_:on in 2376) [ClassicSimilarity], result of:
      0.029821085 = score(doc=2376,freq=10.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.271686 = fieldWeight in 2376, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2376)
  0.33333334 = coord(1/3)
```
Abstract

A multidimensional-scaling approach is used to analyze frequently used medical-topic terms in queries submitted to a Web-based consumer health information system. Based on a year-long transaction log file, five medical focus keywords (stomach, hip, stroke, depression, and cholesterol) and their co-occurring query terms are analyzed. An overlap-coefficient similarity measure and a conversion measure are used to calculate the proximity of terms to one another based on their co-occurrences in queries. The impact of the dimensionality of the visual configuration, the cutoff point of term co-occurrence for inclusion in the analysis, and the Minkowski metric power k on the stress value are discussed. A visual clustering of groups of terms based on the proximity within each focus-keyword group is also conducted. Term distributions within each visual configuration are characterized and are compared with formal medical vocabulary. This investigation reveals that there are significant differences between consumer health query-term usage and more formal medical terminology used by medical professionals when describing the same medical subject. Future directions are discussed.
Zhang, L.; Liu, Q.L.; Zhang, J.; Wang, H.F.; Pan, Y.; Yu, Y.: Semplore: an IR approach to scalable hybrid query of Semantic Web data (2007) 0.01
```
0.009940362 = product of:
  0.029821085 = sum of:
    0.029821085 = weight(_text_:on in 231) [ClassicSimilarity], result of:
      0.029821085 = score(doc=231,freq=10.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.271686 = fieldWeight in 231, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=231)
  0.33333334 = coord(1/3)
```
Abstract

As an extension to the current Web, Semantic Web will not only contain structured data with machine understandable semantics but also textual information. While structured queries can be used to find information more precisely on the Semantic Web, keyword searches are still needed to help exploit textual information. It thus becomes very important that we can combine precise structured queries with imprecise keyword searches to have a hybrid query capability. In addition, due to the huge volume of information on the Semantic Web, the hybrid query must be processed in a very scalable way. In this paper, we define such a hybrid query capability that combines unary tree-shaped structured queries with keyword searches. We show how existing information retrieval (IR) index structures and functions can be reused to index semantic web data and its textual information, and how the hybrid query is evaluated on the index structure using IR engines in an efficient and scalable manner. We implemented this IR approach in an engine called Semplore. Comprehensive experiments on its performance show that it is a promising approach. It leads us to believe that it may be possible to evolve current web search engines to query and search the Semantic Web. Finally, we briefy describe how Semplore is used for searching Wikipedia and an IBM customer's product information.

Source

Proceeding ISWC'07/ASWC'07 : Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference. Ed.: K. Aberer et al
Chen, C.; Ibekwe-SanJuan, F.; Pinho, R.; Zhang, J.: ¬The impact of the sloan digital sky survey on astronomical research : the role of culture, identity, and international collaboration (2008) 0.01
```
0.009239726 = product of:
  0.027719175 = sum of:
    0.027719175 = weight(_text_:on in 2275) [ClassicSimilarity], result of:
      0.027719175 = score(doc=2275,freq=6.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.25253648 = fieldWeight in 2275, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.046875 = fieldNorm(doc=2275)
  0.33333334 = coord(1/3)
```
Content

We investigate the influence of culture and identity (geographic location) on the constitution of a specific research field. Using as case study the Sloan Digital Sky Survey (SDSS) project in the Astronomy field, we analyzed texts from bibliographic records of publications along three cultural and geographic axes: US only publications, non-US publications and international collaboration. Using three text mining systems (CiteSpace, TermWatch and PEx), we were able to automatically identify the topics specific to each cultural and geographic region as well as isolate the core research topics common to all geographic zones. The results tended to show that US-only and non-US research in this field shared more commonalities with international collaboration than with one another, thus indicating that the former two (US-only and non-US) research focused on rather distinct topics.
Zhang, J.; Wolfram, D.; Wang, P.: Analysis of query keywords of sports-related queries using visualization and clustering (2009) 0.01
```
0.008890929 = product of:
  0.026672786 = sum of:
    0.026672786 = weight(_text_:on in 2947) [ClassicSimilarity], result of:
      0.026672786 = score(doc=2947,freq=8.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.24300331 = fieldWeight in 2947, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2947)
  0.33333334 = coord(1/3)
```
Abstract

The authors investigated 11 sports-related query keywords extracted from a public search engine query log to better understand sports-related information seeking on the Internet. After the query log contents were cleaned and query data were parsed, popular sports-related keywords were identified, along with frequently co-occurring query terms associated with the identified keywords. Relationships among each sports-related focus keyword and its related keywords were characterized and grouped using multidimensional scaling (MDS) in combination with traditional hierarchical clustering methods. The two approaches were synthesized in a visual context by highlighting the results of the hierarchical clustering analysis in the visual MDS configuration. Important events, people, subjects, merchandise, and so on related to a sport were illustrated, and relationships among the sports were analyzed. A small-scale comparative study of sports searches with and without term assistance was conducted. Searches that used search term assistance by relying on previous query term relationships outperformed the searches without the search term assistance. The findings of this study provide insights into sports information seeking behavior on the Internet. The developed method also may be applied to other query log subject areas.
Zhang, J.; Jastram, I.: ¬A study of the metadata creation behavior of different user groups on the Internet (2006) 0.01
```
0.008801571 = product of:
  0.026404712 = sum of:
    0.026404712 = weight(_text_:on in 982) [ClassicSimilarity], result of:
      0.026404712 = score(doc=982,freq=4.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.24056101 = fieldWeight in 982, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0546875 = fieldNorm(doc=982)
  0.33333334 = coord(1/3)
```
Abstract

Metadata is designed to improve information organization and information retrieval effectiveness and efficiency on the Internet. The way web publishers respond to metadata and the way they use it when publishing their web pages, however, is still a mystery. The authors of this paper aim to solve this mystery by defining different professional publisher groups, examining the behaviors of these user groups, and identifying the characteristics of their metadata use. This study will enhance the current understanding of metadata application behavior and provide evidence useful to researchers, web publishers, and search engine designers.
Zhang, J.; Dimitroff, A.: ¬The impact of metadata implementation on webpage visibility in search engine results : part II (2005) 0.01
```
0.008801571 = product of:
  0.026404712 = sum of:
    0.026404712 = weight(_text_:on in 1027) [ClassicSimilarity], result of:
      0.026404712 = score(doc=1027,freq=4.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.24056101 = fieldWeight in 1027, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1027)
  0.33333334 = coord(1/3)
```
Abstract

This paper discusses the impact of metadata implementation in a webpage on its visibility performance in a search engine results list. Influential internal and external factors of metadata implementation were identified. How these factors affect webpage visibility in a search engine results list was examined in an experimental study. Findings suggest that metadata is a good mechanism to improve webpage visibility, the metadata subject field plays a more important role than any other metadata field and keywords extracted from the webpage itself, particularly title or full-text, are most effective. To maximize the effects, these keywords should come from both title and full-text.
Zhang, J.; Dimitroff, A.: ¬The impact of metadata implementation on webpage visibility in search engine results : part II (2005) 0.01
```
0.008801571 = product of:
  0.026404712 = sum of:
    0.026404712 = weight(_text_:on in 1033) [ClassicSimilarity], result of:
      0.026404712 = score(doc=1033,freq=4.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.24056101 = fieldWeight in 1033, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1033)
  0.33333334 = coord(1/3)
```
Abstract

This paper discusses the impact of metadata implementation in a webpage on its visibility performance in a search engine results list. Influential internal and external factors of metadata implementation were identified. How these factors affect webpage visibility in a search engine results list was examined in an experimental study. Findings suggest that metadata is a good mechanism to improve webpage visibility, the metadata subject field plays a more important role than any other metadata field and keywords extracted from the webpage itself, particularly title or full-text, are most effective. To maximize the effects, these keywords should come from both title and full-text.
Zhang, J.; Wolfram, D.: Visualization of term discrimination analysis (2001) 0.01
```
0.0076997704 = product of:
  0.02309931 = sum of:
    0.02309931 = weight(_text_:on in 5210) [ClassicSimilarity], result of:
      0.02309931 = score(doc=5210,freq=6.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.21044704 = fieldWeight in 5210, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5210)
  0.33333334 = coord(1/3)
```
Abstract

Zang and Wolfram compute the discrimination value for terms as the difference between the centroid value of all terms in the corpus and that value without the term in question, and suggest selection be made by comparing density changes with a visualization tool. The Distance Angle Retrieval Environment (DARE) visually projects a document or term space by presenting distance similarity on the X axis and angular similarity on the Y axis. Thus a document icon appearing close to the X axis would be relevant to reference points in terms of a distance similarity measure, while those close to the Y axis are relevant to reference points in terms of an angle based measure. Using 450 Associated Press news reports indexed by 44 distinct terms, the removal of the term ``Yeltsin'' causes the cluster to fall on the Y axis indicating a good discriminator. For an angular measure, cosine say, movement along the X axis to the left will signal good discrimination, as movement to the right will signal poor discrimination. A term density space could also be used. Most terms are shown to be indifferent discriminators. Different measures result in different choices as good and poor discriminators, as does the use of a term space rather than a document space. The visualization approach is clearly feasible, and provides some additional insights not found in the computation of a discrimination value.
Wolfram, D.; Zhang, J.: ¬An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.01
```
0.0076997704 = product of:
  0.02309931 = sum of:
    0.02309931 = weight(_text_:on in 5238) [ClassicSimilarity], result of:
      0.02309931 = score(doc=5238,freq=6.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.21044704 = fieldWeight in 5238, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5238)
  0.33333334 = coord(1/3)
```
Abstract

Wolfram and Zhang are interested in the effect of different indexing exhaustivity, by which they mean the number of terms chosen, and of different index term distributions and different term weighting methods on the resulting document cluster organization. The Distance Angle Retrieval Environment, DARE, which provides a two dimensional display of retrieved documents was used to represent the document clusters based upon a document's distance from the searcher's main interest, and on the angle formed by the document, a point representing a minor interest, and the point representing the main interest. If the centroid and the origin of the document space are assigned as major and minor points the average distance between documents and the centroid can be measured providing an indication of cluster organization. in the form of a size normalized similarity measure. Using 500 records from NTIS and nine models created by intersecting low, observed, and high exhaustivity levels (based upon a negative binomial distribution) with shallow, observed, and steep term distributions (based upon a Zipf distribution) simulation runs were preformed using inverse document frequency, inter-document term frequency, and inverse document frequency based upon both inter and intra-document frequencies. Low exhaustivity and shallow distributions result in a more dense document space and less effective retrieval. High exhaustivity and steeper distributions result in a more diffuse space.
Zhang, J.: Archival context, digital content, and the ethics of digital archival representation : the ethics of identification in digital library metadata (2012) 0.01
```
0.0076997704 = product of:
  0.02309931 = sum of:
    0.02309931 = weight(_text_:on in 419) [ClassicSimilarity], result of:
      0.02309931 = score(doc=419,freq=6.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.21044704 = fieldWeight in 419, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=419)
  0.33333334 = coord(1/3)
```
Abstract

The findings of a recent study on digital archival representation raise some ethical concerns about how digital archival materials are organized, described, and made available for use on the Web. Archivists have a fundamental obligation to preserve and protect the authenticity and integrity of records in their holdings and, at the same time, have the responsibility to promote the use of records as a fundamental purpose of the keeping of archives (SAA 2005 Code of Ethics for Archivists V & VI). Is it an ethical practice that digital content in digital archives is deeply embedded in its contextual structure and generally underrepresented in digital archival systems? Similarly, is it ethical for archivists to detach digital items from their archival context in order to make them more "digital friendly" and more accessible to meet needs of some users? Do archivists have an obligation to bring the two representation systems together so that the context and content of digital archives can be better represented and archival materials "can be located and used by anyone, for any purpose, while still remaining authentic evidence of the work and life of the creator"? (Millar 2010, 157) This paper discusses the findings of the study and their ethical implications relating to digital archival description and representation.

Content

Beitrag aus einem Themenheft zu den Proceedings of the 2nd Milwaukee Conference on Ethics in Information Organization, June 15-16, 2012, School of Information Studies, University of Wisconsin-Milwaukee. Hope A. Olson, Conference Chair. Vgl.: http://www.ergon-verlag.de/isko_ko/downloads/ko_39_2012_5_d.pdf.
Zhang, J.; Chen, Y.; Zhao, Y.; Wolfram, D.; Ma, F.: Public health and social media : a study of Zika virus-related posts on Yahoo! Answers (2020) 0.01
```
0.0076997704 = product of:
  0.02309931 = sum of:
    0.02309931 = weight(_text_:on in 5672) [ClassicSimilarity], result of:
      0.02309931 = score(doc=5672,freq=6.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.21044704 = fieldWeight in 5672, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5672)
  0.33333334 = coord(1/3)
```
Abstract

This study investigates the content of questions and responses about the Zika virus on Yahoo! Answers as a recent example of how public concerns regarding an international health issue are reflected in social media. We investigate the contents of posts about the Zika virus on Yahoo! Answers, identify and reveal subject patterns about the Zika virus, and analyze the temporal changes of the revealed subject topics over 4 defined periods of the Zika virus outbreak. Multidimensional scaling analysis, temporal analysis, and inferential statistical analysis approaches were used in the study. A resulting 2-layer Zika virus schema, and term connections and relationships are presented. The results indicate that consumers' concerns changed over the 4 defined periods. Consumers paid more attention to the basic information about the Zika virus, and the prevention and protection from the Zika virus at the beginning of the outbreak of the Zika virus. During the later periods, consumers became more interested in the role that the government and health organizations played in the public health emergency.
Zhang, J.; Nguyen, T.: WebStar: a visualization model for hyperlink structures (2005) 0.01
```
0.0075442037 = product of:
  0.02263261 = sum of:
    0.02263261 = weight(_text_:on in 1056) [ClassicSimilarity], result of:
      0.02263261 = score(doc=1056,freq=4.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.20619515 = fieldWeight in 1056, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.046875 = fieldNorm(doc=1056)
  0.33333334 = coord(1/3)
```
Abstract

The authors introduce an information visualization model, WebStar, for hyperlink-based information systems. Hyperlinks within a hyperlink-based document can be visualized in a two-dimensional visual space. All links are projected within a display sphere in the visual space. The relationship between a specified central document and its hyperlinked documents is visually presented in the visual space. In addition, users are able to define a group of subjects and to observe relevance between each subject and all hyperlinked documents via movement of that subject around the display sphere center. WebStar allows users to dynamically change an interest center during navigation. A retrieval mechanism is developed to control retrieved results in the visual space. Impact of movement of a subject on the visual document distribution is analyzed. An ambiguity problem caused by projection is discussed. Potential applications of this visualization model in information retrieval are included. Future research directions on the topic are addressed.

Zhang, J.; Korfhage, R.R.: DARE: Distance and Angle Retrieval Environment : A tale of the two measures (1999) 0.01

0.007112743 = product of:
  0.021338228 = sum of:
    0.021338228 = weight(_text_:on in 3916) [ClassicSimilarity], result of:
      0.021338228 = score(doc=3916,freq=2.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.19440265 = fieldWeight in 3916, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0625 = fieldNorm(doc=3916)
  0.33333334 = coord(1/3)

Abstract: This article presents a visualization tool for information retrieval. Some retrieval evaluation models are interpreted in the two-dimensional space comprising direction and distance. The two different similarity measures-angle and distance-are displayed in the visual space. A new retrieval means based on the visual retrieval tool, the controlling bar, is developed for a search

Zhang, J.; Zhao, Y.: ¬A user term visualization analysis based on a social question and answer log (2013) 0.01
```
0.0062868367 = product of:
  0.01886051 = sum of:
    0.01886051 = weight(_text_:on in 2715) [ClassicSimilarity], result of:
      0.01886051 = score(doc=2715,freq=4.0), product of:
        0.109763056 = queryWeight, product of:
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.04990557 = queryNorm
        0.1718293 = fieldWeight in 2715, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.199415 = idf(docFreq=13325, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2715)
  0.33333334 = coord(1/3)
```
Abstract

The authors of this paper investigate terms of consumers' diabetes based on a log from the Yahoo!Answers social question and answers (Q&A) forum, ascertain characteristics and relationships among terms related to diabetes from the consumers' perspective, and reveal users' diabetes information seeking patterns. In this study, the log analysis method, data coding method, and visualization multiple-dimensional scaling analysis method were used for analysis. The visual analyses were conducted at two levels: terms analysis within a category and category analysis among the categories in the schema. The findings show that the average number of words per question was 128.63, the average number of sentences per question was 8.23, the average number of words per response was 254.83, and the average number of sentences per response was 16.01. There were 12 categories (Cause & Pathophysiology, Sign & Symptom, Diagnosis & Test, Organ & Body Part, Complication & Related Disease, Medication, Treatment, Education & Info Resource, Affect, Social & Culture, Lifestyle, and Nutrient) in the diabetes related schema which emerged from the data coding analysis. The analyses at the two levels show that terms and categories were clustered and patterns were revealed. Future research directions are also included.

Search (27 results, page 1 of 2)

Authors

Years

Types

Themes