-
Wordhoard (n.d.)
0.02
0.016259817 = product of:
0.032519635 = sum of:
0.032519635 = product of:
0.06503927 = sum of:
0.06503927 = weight(_text_:n in 3922) [ClassicSimilarity], result of:
0.06503927 = score(doc=3922,freq=2.0), product of:
0.19504215 = queryWeight, product of:
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.045236014 = queryNorm
0.33346266 = fieldWeight in 3922, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.0546875 = fieldNorm(doc=3922)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- WordHoard defines a multiword unit as a special type of collocate in which the component words comprise a meaningful phrase. For example, "Knight of the Round Table" is a meaningful multiword unit or phrase. WordHoard uses the notion of a pseudo-bigram to generalize the computation of bigram (two-word) statistical measures to phrases (n-grams) longer than two words, and to allow comparisons of these measures for phrases with different word counts. WordHoard applies the LocalMaxs algorithm of Silva et al. to the pseudo-bigrams to identify potential compositional phrases that "stand out" in a text. WordHoard can also filter two- and three-word phrases using the word class filters suggested by Justeson and Katz.
-
WordHoard: finding multiword units (20??)
0.02
0.016259817 = product of:
0.032519635 = sum of:
0.032519635 = product of:
0.06503927 = sum of:
0.06503927 = weight(_text_:n in 1123) [ClassicSimilarity], result of:
0.06503927 = score(doc=1123,freq=2.0), product of:
0.19504215 = queryWeight, product of:
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.045236014 = queryNorm
0.33346266 = fieldWeight in 1123, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.0546875 = fieldNorm(doc=1123)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Abstract
- WordHoard defines a multiword unit as a special type of collocate in which the component words comprise a meaningful phrase. For example, "Knight of the Round Table" is a meaningful multiword unit or phrase. WordHoard uses the notion of a pseudo-bigram to generalize the computation of bigram (two-word) statistical measures to phrases (n-grams) longer than two words, and to allow comparisons of these measures for phrases with different word counts. WordHoard applies the LocalMaxs algorithm of Silva et al. to the pseudo-bigrams to identify potential compositional phrases that "stand out" in a text. WordHoard can also filter two- and three-word phrases using the word class filters suggested by Justeson and Katz.
-
Aizawa, A.; Kohlhase, M.: Mathematical information retrieval (2021)
0.02
0.016259817 = product of:
0.032519635 = sum of:
0.032519635 = product of:
0.06503927 = sum of:
0.06503927 = weight(_text_:n in 667) [ClassicSimilarity], result of:
0.06503927 = score(doc=667,freq=2.0), product of:
0.19504215 = queryWeight, product of:
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.045236014 = queryNorm
0.33346266 = fieldWeight in 667, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.0546875 = fieldNorm(doc=667)
0.5 = coord(1/2)
0.5 = coord(1/2)
- Source
- Evaluating information retrieval and access tasks. Eds.: Sakai, T., Oard, D., Kando, N. [https://doi.org/10.1007/978-981-15-5554-1_12]
-
Liu, P.J.; Saleh, M.; Pot, E.; Goodrich, B.; Sepassi, R.; Kaiser, L.; Shazeer, N.: Generating Wikipedia by summarizing long sequences (2018)
0.02
0.016259817 = product of:
0.032519635 = sum of:
0.032519635 = product of:
0.06503927 = sum of:
0.06503927 = weight(_text_:n in 773) [ClassicSimilarity], result of:
0.06503927 = score(doc=773,freq=2.0), product of:
0.19504215 = queryWeight, product of:
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.045236014 = queryNorm
0.33346266 = fieldWeight in 773, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.0546875 = fieldNorm(doc=773)
0.5 = coord(1/2)
0.5 = coord(1/2)
-
Perovsek, M.; Kranjc, J.; Erjavec, T.; Cestnik, B.; Lavrac, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016)
0.01
0.013936987 = product of:
0.027873974 = sum of:
0.027873974 = product of:
0.05574795 = sum of:
0.05574795 = weight(_text_:n in 2697) [ClassicSimilarity], result of:
0.05574795 = score(doc=2697,freq=2.0), product of:
0.19504215 = queryWeight, product of:
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.045236014 = queryNorm
0.28582513 = fieldWeight in 2697, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.046875 = fieldNorm(doc=2697)
0.5 = coord(1/2)
0.5 = coord(1/2)
-
Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; Agarwal, S.; Herbert-Voss, A.; Krueger, G.; Henighan, T.; Child, R.; Ramesh, A.; Ziegler, D.M.; Wu, J.; Winter, C.; Hesse, C.; Chen, M.; Sigler, E.; Litwin, M.; Gray, S.; Chess, B.; Clark, J.; Berner, C.; McCandlish, S.; Radford, A.; Sutskever, I.; Amodei, D.: Language models are few-shot learners (2020)
0.01
0.009291325 = product of:
0.01858265 = sum of:
0.01858265 = product of:
0.0371653 = sum of:
0.0371653 = weight(_text_:n in 872) [ClassicSimilarity], result of:
0.0371653 = score(doc=872,freq=2.0), product of:
0.19504215 = queryWeight, product of:
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.045236014 = queryNorm
0.19055009 = fieldWeight in 872, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.3116565 = idf(docFreq=1611, maxDocs=44218)
0.03125 = fieldNorm(doc=872)
0.5 = coord(1/2)
0.5 = coord(1/2)
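The score explanations in this listing all follow the same arithmetic, so a short sketch may help in reading them. The Python below is not Lucene code; it only reproduces the printed numbers. The function name and parameters are hypothetical. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions chosen because they match the values shown (1.4142135 and 4.3116565). queryNorm, fieldNorm, and the coord factors are taken from the listing as given rather than derived.

```python
import math


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute one score explanation tree from its leaf values.

    Hypothetical helper for illustration; the formulas are assumptions
    that match the numbers printed in the listing.
    """
    tf = math.sqrt(freq)                               # 1.4142135 for freq=2.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # 4.3116565 in the listing
    query_weight = idf * query_norm                    # "queryWeight, product of:"
    field_weight = tf * idf * field_norm               # "fieldWeight in <doc>, product of:"
    score = query_weight * field_weight                # "score(doc=..., freq=2.0)"
    for coord in coords:                               # each "0.5 = coord(1/2)" line
        score *= coord
    return score


# Leaf values copied from the listing; only fieldNorm differs between records.
shared = dict(freq=2.0, doc_freq=1611, max_docs=44218,
              query_norm=0.045236014, coords=(0.5, 0.5))

for doc_id, field_norm in [(3922, 0.0546875), (2697, 0.046875), (872, 0.03125)]:
    print(doc_id, explain_score(field_norm=field_norm, **shared))
```

Because the query term, document frequency, and term frequency are identical for every record, the ranking in this listing is driven entirely by fieldNorm, which the listing prints as a per-document value.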