research-article

Ranking related news predictions

Authors:
Nattiya Kanhabua, Norwegian University of Science and Technology, Trondheim, Norway
Roi Blanco, Yahoo! Research, Barcelona, Spain
Michael Matthews, Yahoo! Research, Barcelona, Spain

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, July 2011, Pages 755–764
https://doi.org/10.1145/2009916.2010018

Published: 24 July 2011


ABSTRACT

We estimate that nearly one third of news articles contain references to future events. While this information can prove crucial to understanding news stories and how events will develop for a given topic, there is currently no easy way to access it. We propose a new task to address the problem of retrieving and ranking sentences that contain mentions of future events, which we call ranking related news predictions. In this paper, we formally define this task and propose a learning-to-rank approach based on four classes of features: term similarity, entity-based similarity, topic similarity, and temporal similarity. Through extensive evaluations using a corpus of 1.8 million news articles and 6,000 manually judged relevance pairs, we show that our approach is able to retrieve a significant number of relevant predictions related to a given topic.
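
The following is a minimal sketch, not the authors' implementation, of how one score per feature class named in the abstract might be combined to rank candidate prediction sentences. The abstract does not specify the individual features or the learning algorithm, so everything here is an assumption made for illustration: the `Item` schema, the Jaccard, cosine, and inverse-distance similarity functions, and the hand-set `weights`, which stand in for weights that a learning-to-rank method would fit from judged query/prediction pairs.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Item:
    """A query or a candidate prediction sentence (hypothetical schema)."""
    text: str
    entities: frozenset
    topics: tuple  # topic distribution, e.g. from a topic model
    year: int      # query time, or the future year a prediction mentions


def jaccard(a, b):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def features(query, pred):
    """One illustrative score per feature class named in the abstract."""
    return (
        jaccard(query.text.lower().split(), pred.text.lower().split()),  # term
        jaccard(query.entities, pred.entities),                          # entity
        cosine(query.topics, pred.topics),                               # topic
        1.0 / (1.0 + abs(query.year - pred.year)),                       # temporal
    )


def rank(query, preds, weights):
    """Sort predictions by a weighted sum of their feature scores."""
    def score(p):
        return sum(w * f for w, f in zip(weights, features(query, p)))
    return sorted(preds, key=score, reverse=True)


# Toy data, invented for this example only.
query = Item("climate change policy", frozenset({"EU"}), (0.9, 0.1), 2011)
candidates = [
    Item("The league will expand by 2030", frozenset({"NBA"}), (0.1, 0.9), 2030),
    Item("EU climate policy will tighten by 2012", frozenset({"EU"}), (0.8, 0.2), 2012),
]
weights = (1.0, 1.0, 1.0, 1.0)  # placeholder; would be learned from judged pairs
ranked = rank(query, candidates, weights)
```

On this toy data the sentence about EU climate policy ranks first, since it overlaps with the query on terms, entities, and topics and refers to a nearer future date.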


Index Terms

Ranking related news predictions
1. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results
    2. Retrieval models and ranking


Published in
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916
General Chairs: Wei-Ying Ma (Microsoft Research Asia, China), Jian-Yun Nie (University of Montreal, Canada)
Program Chairs: Ricardo Baeza-Yates (Yahoo! Research, Spain), Tat-Seng Chua (National University of Singapore), W. Bruce Croft (University of Massachusetts, Amherst, USA)
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
future events
news predictions
sentence retrieval and ranking
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%