DOI: 10.1145/2009916.2010018
research-article

Ranking related news predictions

Published: 24 July 2011

ABSTRACT

We estimate that nearly one third of news articles contain references to future events. While this information can prove crucial to understanding news stories and how events will develop for a given topic, there is currently no easy way to access it. We propose a new task, which we call ranking related news predictions, to address the problem of retrieving and ranking sentences that contain mentions of future events. In this paper, we formally define this task and propose a learning-to-rank approach based on four classes of features: term similarity, entity-based similarity, topic similarity, and temporal similarity. Through extensive evaluations using a corpus of 1.8 million news articles and 6,000 manually judged relevance pairs, we show that our approach is able to retrieve a significant number of relevant predictions related to a given topic.
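As a rough illustration of how the four feature classes named in the abstract might feed a ranker, the sketch below scores candidate prediction sentences against a query with a fixed linear combination. Everything in it is an assumption made for illustration: the dictionary layout, the helper names (`cosine`, `jaccard`, `features`, `rank`), the specific similarity formulas, and the hand-set weights. In a learning-to-rank setting the weights would be learned from judged relevance pairs; this is not the paper's actual feature set or model.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap between two collections treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def features(query, pred):
    """One hypothetical feature per class: term, entity, topic, temporal."""
    term = cosine(
        Counter(query["text"].lower().split()),
        Counter(pred["text"].lower().split()),
    )
    entity = jaccard(query["entities"], pred["entities"])
    # Topic distributions are assumed to be dicts of topic id -> weight.
    topic = cosine(query["topics"], pred["topics"])
    # Assumed temporal feature: closer years score higher.
    temporal = 1.0 / (1.0 + abs(pred["year"] - query["year"]))
    return [term, entity, topic, temporal]


def rank(query, preds, weights):
    """Sort predictions by a linear score; weights stand in for a learned model."""
    def score(pred):
        return sum(w * f for w, f in zip(weights, features(query, pred)))

    return sorted(preds, key=score, reverse=True)


query = {
    "text": "climate change policy",
    "entities": {"EU"},
    "topics": {0: 0.9, 1: 0.1},
    "year": 2011,
}
preds = [
    {
        "id": "b",
        "text": "The football league will expand in 2012",
        "entities": {"FIFA"},
        "topics": {0: 0.1, 1: 0.9},
        "year": 2012,
    },
    {
        "id": "a",
        "text": "EU climate policy will tighten by 2020",
        "entities": {"EU"},
        "topics": {0: 0.8, 1: 0.2},
        "year": 2020,
    },
]

ranked = rank(query, preds, weights=[1.0, 1.0, 1.0, 1.0])
print([p["id"] for p in ranked])
```

With equal weights, the prediction that shares terms, an entity, and a topic profile with the query outranks the one that is only close in time, which shows why the features are combined rather than used in isolation.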

      • Published in

        ACM Conferences
        SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
        July 2011
        1374 pages
        ISBN:9781450307574
        DOI:10.1145/2009916

        Copyright © 2011 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 792 of 3,983 submissions, 20%
