Search (10 results, page 1 of 1)

Allan, J.; Callan, J.P.; Croft, W.B.; Ballesteros, L.; Broglio, J.; Xu, J.; Shu, H.: INQUERY at TREC-5 (1997) 0.03

0.034582928 = product of:
  0.069165856 = sum of:
    0.069165856 = sum of:
      0.006765375 = weight(_text_:a in 3103) [ClassicSimilarity], result of:
        0.006765375 = score(doc=3103,freq=2.0), product of:
          0.053105544 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.046056706 = queryNorm
          0.12739488 = fieldWeight in 3103, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.078125 = fieldNorm(doc=3103)
      0.06240048 = weight(_text_:22 in 3103) [ClassicSimilarity], result of:
        0.06240048 = score(doc=3103,freq=2.0), product of:
          0.16128273 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046056706 = queryNorm
          0.38690117 = fieldWeight in 3103, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.078125 = fieldNorm(doc=3103)
  0.5 = coord(1/2)

Date: 27. 2.1999 20:55:22
Type: a
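
The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity definitions `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`; the function names `idf` and `term_score` are illustrative and not part of any library. The `queryNorm`, `fieldNorm`, and `coord` values are copied from the explain output rather than derived. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explain output
QUERY_NORM = 0.046056706  # queryNorm from the explain output (taken as given)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# First hit (doc 3103): two matching terms, then coord(1/2).
score_a = term_score(freq=2.0, doc_freq=37942, field_norm=0.078125)
score_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.078125)
first_hit = (score_a + score_22) * 0.5

# Second hit (doc 38): one matching term, with coord(1/2) applied twice.
second_hit = term_score(freq=18.0, doc_freq=37942, field_norm=0.046875) * 0.5 * 0.5

print(f"doc 3103: {first_hit:.9f}")
print(f"doc 38:   {second_hit:.10f}")
```

The term `22` dominates the first hit because it is much rarer than `a` (docFreq 3622 versus 37942), and its idf enters the product twice, once through `queryWeight` and once through `fieldWeight`.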

Xu, J.; Croft, W.B.: Topic-based language models for distributed retrieval (2000) 0.00
```
0.0030444188 = product of:
  0.0060888375 = sum of:
    0.0060888375 = product of:
      0.012177675 = sum of:
        0.012177675 = weight(_text_:a in 38) [ClassicSimilarity], result of:
          0.012177675 = score(doc=38,freq=18.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.22931081 = fieldWeight in 38, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=38)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Effective retrieval in a distributed environment is an important but difficult problem. Lack of effectiveness appears to have two major causes. First, existing collection selection algorithms do not work well on heterogeneous collections. Second, relevant documents are scattered over many collections and searching a few collections misses many relevant documents. We propose a topic-oriented approach to distributed retrieval. With this approach, we structure the document set of a distributed retrieval environment around a set of topics. Retrieval for a query involves first selecting the right topics for the query and then dispatching the search process to collections that contain such topics. The content of a topic is characterized by a language model. In environments where the labeling of documents by topics is unavailable, document clustering is employed for topic identification. Based on these ideas, three methods are proposed to suit different environments. We show that all three methods improve effectiveness of distributed retrieval.

Type: a

Xu, J.; Weischedel, R.: Empirical studies on the impact of lexical resources on CLIR performance (2005) 0.00
```
0.0026849252 = product of:
  0.0053698504 = sum of:
    0.0053698504 = product of:
      0.010739701 = sum of:
        0.010739701 = weight(_text_:a in 1020) [ClassicSimilarity], result of:
          0.010739701 = score(doc=1020,freq=14.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20223314 = fieldWeight in 1020, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=1020)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In this paper, we compile and review several experiments measuring cross-lingual information retrieval (CLIR) performance as a function of the following resources: bilingual term lists, parallel corpora, machine translation (MT), and stemmers. Our CLIR system uses a simple probabilistic language model; the studies used TREC test corpora over Chinese, Spanish and Arabic. Our findings include: One can achieve an acceptable CLIR performance using only a bilingual term list (70-80% on Chinese and Arabic corpora). However, if a bilingual term list and parallel corpora are available, CLIR performance can rival monolingual performance. If no parallel corpus is available, pseudo-parallel texts produced by an MT system can partially overcome the lack of parallel text. While stemming is useful normally, with a very large parallel corpus for Arabic-English, stemming hurt performance in our empirical studies with Arabic, a highly inflected language.

Type: a

Xu, J.; Weischedel, R.; Licuanan, A.: Evaluation of an extraction-based approach to answering definitional questions (2004) 0.00

0.0023919214 = product of:
  0.0047838427 = sum of:
    0.0047838427 = product of:
      0.009567685 = sum of:
        0.009567685 = weight(_text_:a in 4107) [ClassicSimilarity], result of:
          0.009567685 = score(doc=4107,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.18016359 = fieldWeight in 4107, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=4107)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Wang, P.; Ma, Y.; Xie, H.; Wang, H.; Lu, J.; Xu, J.: "There is a gorilla holding a key on the book cover" : young children's known picture book search strategies (2022) 0.00
```
0.0023919214 = product of:
  0.0047838427 = sum of:
    0.0047838427 = product of:
      0.009567685 = sum of:
        0.009567685 = weight(_text_:a in 443) [ClassicSimilarity], result of:
          0.009567685 = score(doc=443,freq=16.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.18016359 = fieldWeight in 443, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=443)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

No information search system can support young children's known picture book search needs, since information is not organized according to their cognitive abilities and needs. Therefore, this study explored young children's known picture book search strategies and extracted picture book search elements by simulating a search scenario and playing a picture book search game. The study found 29 elements children used to search for known picture books. These elements are classified into three dimensions: The first dimension is the concept category of an element. The second dimension is an element's status in the story. The third dimension indicates where an element appears in a picture book. Additionally, the study revealed young children's general search strategy: Children first use auditory elements that they hear from adults during reading. After receiving error returns, they add visual elements that they see by themselves in picture books. The findings not only help to understand young children's known-item search and reformulation strategies during searching but also provide theoretical support for the development of a picture book information organization schema in the search system.

Type: a

Xu, J.: Author credit-assignment schemas : a comparison and analysis (2016) 0.00
```
0.0020714647 = product of:
  0.0041429293 = sum of:
    0.0041429293 = product of:
      0.008285859 = sum of:
        0.008285859 = weight(_text_:a in 3056) [ClassicSimilarity], result of:
          0.008285859 = score(doc=3056,freq=12.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.15602624 = fieldWeight in 3056, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3056)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Credit assignment to multiple authors of a publication is a challenging task owing to the conventions followed within different areas of research. In this study, we present a review of different author credit-assignment schemas, which are designed mainly based on author position and the total number of coauthors on the publication. We implemented, tested, and classified 15 author credit-assignment schemas into 3 types: linear, curve, and "other" assignment schemas. Further investigation and analysis revealed that most of the methods provide reasonable credit-assignment results, even though the credit-assignment distribution approaches are quite different among different types. The evaluation of each schema based on PubMed articles published in 2013 shows that there exist positive correlations among different schemas and that the similarity of credit-assignment distributions can be derived from the similar design principles that stress the number of coauthors or the author position, or consider both. We provide a summary about the features of each credit-assignment schema to facilitate the selection of the appropriate one, depending on the different conditions required to meet diverse needs.

Type: a

Schroeder, J.; Xu, J.; Chen, H.; Chau, M.: Automated criminal link analysis based on domain knowledge (2007) 0.00
```
0.0020296127 = product of:
  0.0040592253 = sum of:
    0.0040592253 = product of:
      0.008118451 = sum of:
        0.008118451 = weight(_text_:a in 275) [ClassicSimilarity], result of:
          0.008118451 = score(doc=275,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.15287387 = fieldWeight in 275, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=275)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Link (association) analysis has been used in the criminal justice domain to search large datasets for associations between crime entities in order to facilitate crime investigations. However, link analysis still faces many challenging problems, such as information overload, high search complexity, and heavy reliance on domain knowledge. To address these challenges, this article proposes several techniques for automated, effective, and efficient link analysis. These techniques include the co-occurrence analysis, the shortest path algorithm, and a heuristic approach to identifying associations and determining their importance. We developed a prototype system called CrimeLink Explorer based on the proposed techniques. Results of a user study with 10 crime investigators from the Tucson Police Department showed that our system could help subjects conduct link analysis more efficiently than traditional single-level link analysis tools. Moreover, subjects believed that association paths found based on the heuristic approach were more accurate than those found based solely on the co-occurrence analysis and that the automated link analysis system would be of great help in crime investigations.

Type: a

Zhang, C.; Bu, Y.; Ding, Y.; Xu, J.: Understanding scientific collaboration : homophily, transitivity, and preferential attachment (2018) 0.00
```
0.0020296127 = product of:
  0.0040592253 = sum of:
    0.0040592253 = product of:
      0.008118451 = sum of:
        0.008118451 = weight(_text_:a in 4011) [ClassicSimilarity], result of:
          0.008118451 = score(doc=4011,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.15287387 = fieldWeight in 4011, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=4011)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Scientific collaboration is essential in solving problems and breeding innovation. Coauthor network analysis has been utilized to study scholars' collaborations for a long time, but these studies have not simultaneously taken different collaboration features into consideration. In this paper, we present a systematic approach to analyze the differences in possibilities that two authors will cooperate as seen from the effects of homophily, transitivity, and preferential attachment. Exponential random graph models (ERGMs) are applied in this research. We find that different types of publications one author has written play diverse roles in his/her collaborations. An author's tendency to form new collaborations with her/his coauthors' collaborators is strong, where the more coauthors one author had before, the more new collaborators he/she will attract. We demonstrate that considering the authors' attributes and homophily effects as well as the transitivity and preferential attachment effects of the coauthorship network in which they are embedded helps us gain a comprehensive understanding of scientific collaboration.

Type: a

Bu, Y.; Ding, Y.; Xu, J.; Liang, X.; Gao, G.; Zhao, Y.: Understanding success through the diversity of collaborators and the milestone of career (2018) 0.00
```
0.0018909799 = product of:
  0.0037819599 = sum of:
    0.0037819599 = product of:
      0.0075639198 = sum of:
        0.0075639198 = weight(_text_:a in 4012) [ClassicSimilarity], result of:
          0.0075639198 = score(doc=4012,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.14243183 = fieldWeight in 4012, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4012)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Scientific collaboration is vital to many fields, and it is common to see scholars seek out experienced researchers or experts in a domain with whom they can share knowledge, experience, and resources. To explore the diversity of research collaborations, this article performs a temporal analysis on the scientific careers of researchers in the field of computer science. Specifically, we analyze collaborators using 2 indicators: the research topic diversity, measured by the Author-Conference-Topic model and cosine, and the impact diversity, measured by the normalized standard deviation of h-indices. We find that the collaborators of high-impact researchers tend to study diverse research topics and have diverse h-indices. Moreover, by setting PhD graduation as an important milestone in researchers' careers, we examine several indicators related to scientific collaboration and their effects on a career. The results show that collaborating with authoritative authors plays an important role prior to a researcher's PhD graduation, but working with non-authoritative authors carries more weight after PhD graduation.

Type: a

Liu, M.; Bu, Y.; Chen, C.; Xu, J.; Li, D.; Leng, Y.; Freeman, R.B.; Meyer, E.T.; Yoon, W.; Sung, M.; Jeong, M.; Lee, J.; Kang, J.; Min, C.; Zhai, Y.; Song, M.; Ding, Y.: Pandemics are catalysts of scientific novelty : evidence from COVID-19 (2022) 0.00
```
0.0018909799 = product of:
  0.0037819599 = sum of:
    0.0037819599 = product of:
      0.0075639198 = sum of:
        0.0075639198 = weight(_text_:a in 633) [ClassicSimilarity], result of:
          0.0075639198 = score(doc=633,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.14243183 = fieldWeight in 633, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=633)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Scientific novelty drives the efforts to invent new vaccines and solutions during the pandemic. First-time collaboration and international collaboration are two pivotal channels to expand teams' search activities for a broader scope of resources required to address the global challenge, which might facilitate the generation of novel ideas. Our analysis of 98,981 coronavirus papers suggests that scientific novelty, measured by the BioBERT model pretrained on 29 million PubMed articles, and first-time collaboration both increased after the outbreak of COVID-19, whereas international collaboration witnessed a sudden decrease. During COVID-19, papers with more first-time collaboration were found to be more novel, and international collaboration did not hamper novelty as it had done in normal periods. The findings suggest the necessity of reaching out for distant resources and the importance of maintaining a collaborative scientific community beyond nationalism during a pandemic.

Type: a
