Michael J. Paul
 
Assistant Professor
Department of Information Science
University of Colorado Boulder
Office: TLC 119
Email:
Google Scholar | GitHub | LinkedIn

I am a founding faculty member in the new Information Science department in the College of Media, Communication, and Information at CU-Boulder (see: CU vs. UC). I received a Ph.D. in Computer Science from Johns Hopkins University in 2015, and a B.S. in Computer Science from the University of Illinois at Urbana-Champaign in 2009. See my CV.

My research is at the intersection of text analysis and health/social science. On the methodological side, I research machine learning and natural language processing, and in particular I develop methods in topic modeling, which I use to discover patterns in large text datasets. On the applied side, I study social media to learn about human behavior, especially in the context of public health.
I am no longer in academia as of May 2020. I am not currently responding to reviewer requests.

2020
Mozhi Zhang, Yoshinari Fujinuma, Michael J. Paul, Jordan Boyd-Graber. Why overfitting isn't always bad: retrofitting cross-lingual word embeddings to dictionaries. Association for Computational Linguistics (ACL). July 2020. [paper]
[code]
Ashlynn R. Daughton, Rumi Chunara, Michael J. Paul. Comparison of social media, syndromic surveillance, and microbiologic acute respiratory infection data: observational study. JMIR Public Health and Surveillance 6(2):e14986. [link]
Xiaolei Huang, Linzi Xing, Franck Dernoncourt, Michael J. Paul. Multilingual Twitter corpus and baselines for evaluating demographic bias in hate speech recognition. International Conference on Language Resources and Evaluation (LREC). May 2020. [paper]
[data]
[code]
Shudong Hao and Michael J. Paul. An empirical study on crosslingual transfer in probabilistic topic models. Computational Linguistics 46(1):95-134. [link]
Kai Larsen, Eric Hekler, Michael J. Paul, Bryan Gibson. Improving usability of social and behavioral sciences' evidence: a call to action for a national infrastructure project for mining our knowledge. Communications of the Association for Information Systems 46(1). January 2020. [link]
[preprint]
Hande Batan, Dianna Radpour, Ariane Kehlbacher, Judith Klein-Seetharaman, Michael J. Paul. Natural vs. artificially sweet tweets: characterizing discussions of non-nutritive sweeteners on Twitter. AAAI International Workshop on Health Intelligence (W3PHIAI), New York, New York. February 2020. [paper]

2019
Davy Weissenbacher, Abeed Sarker, Arjun Magge, Ashlynn Daughton, Karen O'Connor, Michael J. Paul, Graciela Gonzalez-Hernandez. Overview of the Fourth Social Media Mining for Health (SMM4H) Shared Tasks at ACL 2019. ACL Workshop SMM4H: The 4th Social Media Mining for Health Applications Workshop and Shared Task, Florence, Italy. August 2019. [paper]
Linzi Xing, Michael J. Paul, Giuseppe Carenini. Evaluating topic quality with posterior variability. 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, November 2019. [23.8% acceptance] [paper]
Avijit Thawani, Michael J. Paul, Urmimala Sarkar, Byron C. Wallace. Are online reviews of physicians biased against female providers? Machine Learning for Healthcare, Ann Arbor, Michigan, August 2019. [paper]
Xiaolei Huang and Michael J. Paul. Neural temporality adaptation for document classification: Diachronic word embeddings and domain adaptation models. Association for Computational Linguistics (ACL), Florence, Italy, July 2019. [paper]
[code]
Yoshinari Fujinuma, Jordan Boyd-Graber, Michael J. Paul. A resource-free evaluation metric for cross-lingual word embeddings based on graph modularity. Association for Computational Linguistics (ACL), Florence, Italy, July 2019. [paper]
[code]
Dasha Pruss, Yoshinari Fujinuma, Ashlynn R. Daughton, Michael J. Paul, Brad Arnot, Danielle Albers Szafir, Jordan Boyd-Graber. Zika discourse in the Americas: a multilingual topic analysis of Twitter. PLOS ONE 14(5):e0216922. [link]
[data]
Ashlynn R. Daughton and Michael J. Paul. Identifying protective health behaviors on Twitter: Observational study of travel advisories and Zika virus. Journal of Medical Internet Research (JMIR) 21(5):e13090. [link]
Xiaolei Huang and Michael J. Paul. Neural user factor adaptation for text classification: Learning to generalize across author demographics. Conference on Lexical and Computational Semantics (*SEM), Minneapolis, Minnesota. June 2019. [paper]
[code]
[data]
Shudong Hao and Michael J. Paul. Analyzing Bayesian crosslingual transfer in topic models. North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, Minnesota. June 2019. [paper]
Xiaolei Huang, Michael C. Smith, Amelia M. Jamison, David A. Broniatowski, Mark Dredze, Sandra C. Quinn, Justin Cai, Michael J. Paul. Can online self-reports assist in real-time identification of influenza vaccination uptake? A cross-sectional study of influenza vaccine-related tweets in the US, 2013-2017. BMJ Open 9(1):e024018. [link]
[data]
Ashlynn R. Daughton and Michael J. Paul. Constructing accurate confidence intervals when aggregating social media data for public health monitoring. AAAI International Workshop on Health Intelligence (W3PHIAI), Honolulu, Hawaii. January 2019. [paper]

2018
Davy Weissenbacher, Abeed Sarker, Michael J. Paul, Graciela Gonzalez-Hernandez. Overview of the Third Social Media Mining for Health (SMM4H) Shared Tasks at EMNLP 2018. EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop and Shared Task, Brussels, Belgium. October 2018. [paper]
Sachin Muralidhara and Michael J. Paul. #Healthy selfies: exploration of health topics on Instagram. JMIR Public Health and Surveillance 4(2):e10150. [link]
Alexis S. Hammond, Michael J. Paul, Joseph Hobelmann, Animesh R. Koratana, Mark Dredze, Margaret S. Chisolm. Perceived attitudes about substance use in anonymous social media posts near college campuses: observational study. JMIR Mental Health 5(3):e52. [link]
Shudong Hao and Michael J. Paul. Learning multilingual topics from incomparable corpora. International Conference on Computational Linguistics (COLING), Santa Fe, New Mexico. August 2018. [37% acceptance] [paper]
Ashlynn R. Daughton, Michael J. Paul, Rumi Chunara. What do people tweet when they're sick? A preliminary comparison of symptom reports and Twitter timelines. ICWSM Workshop on Social Media and Health, Stanford, California. June 2018. [non-archival presentation] [paper]
Xiaolei Huang and Michael J. Paul. Examining temporality in document classification. Association for Computational Linguistics (ACL), Melbourne, Australia. July 2018. [25% acceptance] [paper]
[slides]
[code]
Meredith C. Meacham, Michael J. Paul, Danielle E. Ramo. Understanding emerging forms of cannabis use through an online community: an analysis of relative post volume and subjective highness ratings. Drug and Alcohol Dependence 188: 364-369. [link]
Shudong Hao, Jordan Boyd-Graber, Michael J. Paul. Lessons from the Bible on modern topics: low-resource multilingual topic model evaluation. North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), New Orleans, Louisiana. June 2018. [32% acceptance] [paper]
Linzi Xing and Michael J. Paul. Diagnosing and improving topic models by analyzing posterior variability. AAAI Conference on Artificial Intelligence (AAAI), New Orleans, Louisiana. February 2018. [25% acceptance] [paper]

2017
Michael J. Paul and Mark Dredze. Social monitoring for public health. Synthesis Lectures on Information Concepts, Retrieval, and Services 9(5): 1-185. September 2017. ISBN: 9781681730950. [link]
[preprint]
Ashlynn R. Daughton, Dasha Pruss, Brad Arnot, Danielle Albers Szafir, Michael J. Paul. Characteristics of Zika behavior discourse on Twitter. AMIA Workshop on Social Media Mining for Health Applications, Washington, DC. November 2017. [paper]
Linzi Xing and Michael J. Paul. Incorporating metadata into content-based user embeddings. EMNLP Workshop on Noisy User-generated Text (W-NUT), Copenhagen, Denmark. September 2017. [paper]
Xiaolei Huang, Linzi Xing, Jed R. Brubaker, Michael J. Paul. Exploring timelines of confirmed suicide incidents through social media. IEEE International Conference on Healthcare Informatics (ICHI), Park City, Utah. August 2017. [paper]
[poster]
Michael J. Paul. Feature selection as causal inference: experiments with text classification. Conference on Computational Natural Language Learning (CoNLL), Vancouver, Canada. August 2017. [18.7% acceptance] [paper]
[code]
[poster]
Dasha Pruss, Ashlynn Daughton, Brad Arnot, Danielle Szafir, Michael Paul. Content analysis of Zika related tweets. Annual Meeting of the American Public Health Association (APHA), Atlanta, Georgia. November 2017. [abstract accepted for poster presentation]
Meredith Meacham, Michael Paul, Danielle Ramo. Temporal trends and subjective highness for alternative cannabis product use reported in an online cannabis community. Lisbon Addictions 2017: Second European Conference on Addictive Behaviours and Dependencies, Lisbon, Portugal. October 2017. [abstract accepted for presentation]
Brian C. Keegan, Jofish Kaye, Patricia Cavazos-Rehg, Munmun de Choudhury, Anh Ngoc Nguyen, Michael J. Paul, Saiph Savage. CHI-nnabis: Implications of marijuana legalization for and from human-computer interaction. CHI Conference Extended Abstracts on Human Factors in Computing Systems (CHI), Denver. May 2017. [panel presentation] [paper]
Xiaolei Huang, Michael C. Smith, Michael J. Paul, Dmytro Ryzhkov, Sandra C. Quinn, David A. Broniatowski, Mark Dredze. Examining patterns of influenza vaccination in social media. AAAI Joint Workshop on Health Intelligence (W3PHIAI), San Francisco. February 2017. [paper]
Byron C. Wallace and Michael J. Paul. "Jerk" or "Judgemental"? Patient perceptions of male versus female physicians in online reviews. AAAI Joint Workshop on Health Intelligence (W3PHIAI), San Francisco. February 2017. [paper]

2016
Kevin Stowe, Michael Paul, Martha Palmer, Leysia Palen, Ken Anderson. Identifying and categorizing disaster-related tweets. EMNLP Workshop on Natural Language Processing for Social Media (SocialNLP), Austin, Texas. November 2016. [paper]
[code]
Ye Zhang, Erin Willis, Michael J. Paul, Noemie Elhadad, Byron C. Wallace. Characterizing the (perceived) newsworthiness of health science articles: a data-driven approach. JMIR Medical Informatics 4(3):e27. [article]
Michael J. Paul, Margaret S. Chisolm, Matthew W. Johnson, Ryan G. Vandrey, Mark Dredze. Assessing the validity of online drug forums as a source for estimating demographic and temporal trends in drug use. Journal of Addiction Medicine 10(5): 324-330. [preprint]
[article]
David A. Broniatowski, Mark Dredze, Karen M. Hilyard, Maeghan Dessecker, Sandra Crouse Quinn, Amelia Jamison, Michael J. Paul, Michael C. Smith. Both mirror and complement: a comparison of social media data and survey data about flu vaccination. Abstract accepted to American Public Health Association, Denver, Colorado. October 2016. [poster presentation]
Michael J. Paul, Ryen W. White, Eric Horvitz. Search and breast cancer: on episodic shifts of attention over life histories of an illness. ACM Transactions on the Web (TWEB) 10(2). [article]
[preprint]
Michael J. Paul. Interpretable machine learning: lessons from topic modeling. CHI Workshop on Human-Centered Machine Learning. San Jose, California. May 2016. [panel presentation; 46% acceptance] [paper]
[slides]
[video]
Atul Nakhasi, Ralph J. Passarella, Sarah G. Bell, Michael J. Paul, Mark Dredze, Peter J. Pronovost. The potential of Twitter as a data source for patient safety. Journal of Patient Safety. doi: 10.1097/PTS.0000000000000253 [article]
[preprint]
Animesh Koratana, Mark Dredze, Margaret S. Chisolm, Matthew W. Johnson, Michael J. Paul. Studying anonymous health issues and substance use on college campuses with Yik Yak. AAAI Workshop on the World Wide Web and Public Health Intelligence, Phoenix, Arizona. February 2016. [paper]
[slides]
Michael C. Smith, David A. Broniatowski, Michael J. Paul, Mark Dredze. Towards real-time measurement of public epidemic awareness: monitoring influenza awareness through Twitter. AAAI Spring Symposium on Observational Studies through Social Media and Other Human-Generated Content, Stanford, California. March 2016.
  • Also presented at the 3rd International Conference on Digital Disease Detection (DDD). See slides and video.
[paper]
[slides]
Adrian Benton, Michael J. Paul, Braden Hancock, Mark Dredze. Collective supervision of topic models for predicting surveys with social media. AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona. February 2016. [26% acceptance] [paper]
[code]
[data]
[slides]
Michael J. Paul, Abeed Sarker, John S. Brownstein, Azadeh Nikfarjam, Matthew Scotch, Karen L. Smith, Graciela Gonzalez. Social media mining for public health monitoring and surveillance. Pacific Symposium on Biocomputing (PSB), Big Island of Hawaii. January 2016. [paper]

2015
Benjamin M. Althouse, Samuel V. Scarpino, Lauren Ancel Meyers, John W. Ayers, Marisa Bargsten, Joan Baumbach, John S. Brownstein, Lauren Castro, Hannah Clapham, Derek A.T. Cummings, Sara Del Valle, Stephen Eubank, Geoffrey Fairchild, Lyn Finelli, Nicholas Generous, Dylan George, David R. Harper, Laurent Hebert-Dufresne, Michael A. Johansson, Kevin Konty, Marc Lipsitch, Gabriel Milinovich, Joseph D. Miller, Elaine O. Nsoesie, Donald R. Olson, Michael J. Paul, Philip M. Polgreen, Reid Priedhorsky, Jonathan M. Read, Isabel Rodriguez-Barraquer, Derek J. Smith, Christian Stefansen, David L. Swerdlow, Deborah Thompson, Alessandro Vespignani, Amy Wesolowski. Enhancing disease surveillance with novel data streams: challenges and opportunities. EPJ Data Science 4:47. [article]
Mauricio Santillana, Andre T. Nguyen, Mark Dredze, Michael J. Paul, Elaine O. Nsoesie, John S. Brownstein. Combining search, social media, and traditional data sources to improve influenza surveillance. PLOS Computational Biology 11(10): e1004513. [article]
Michael J. Paul. Topic modeling with structured priors for text-driven science. PhD thesis, Johns Hopkins University. [pdf]
[slides]
[code]
David A. Broniatowski, Mark Dredze, Michael J. Paul, Andrea Dugas. Using social media to perform local influenza surveillance in an inner-city hospital: a retrospective observational study. JMIR Public Health and Surveillance 1(1): e5. [article]
Michael J. Paul and Mark Dredze. Sprite: Generalizing topic models with structured priors. Transactions of the Association for Computational Linguistics (TACL) 3: 43-57.
  • More details can be found in Chapter 5 of my dissertation.
[paper]
[code]
[data]
[poster]
Michael Smith, David Broniatowski, Michael Paul, Mark Dredze. Tracking public awareness of influenza through Twitter. 3rd International Conference on Digital Disease Detection (DDD), Florence, Italy. May 2015. [rapid fire talk] [slides]
[video]
Michael J. Paul, Ryen W. White, Eric Horvitz. Diagnoses, decisions, and outcomes: Web search as medical decision support for cancer. 24th International World Wide Web Conference (WWW 2015), Florence, Italy. May 2015. [14.1% acceptance] [paper]
[slides]
Shiliang Wang, Michael J. Paul, Mark Dredze. Social media as a sensor of air quality and public response in China. Journal of Medical Internet Research 17(3): e22.
  • Presented at the AAAI Workshop on the World Wide Web and Public Health Intelligence, January 2015.
  • Press: CBS News, TakePart
[article]
[slides]
[data]
Byron Wallace, Michael J. Paul, Noemie Elhadad. What predicts media coverage of health science articles? AAAI Workshop on the World Wide Web and Public Health Intelligence, Austin, Texas. January 2015. [paper]
[slides]
Michael J. Paul, Mark Dredze, David A. Broniatowski, Nicholas Generous. Worldwide influenza surveillance through Twitter. AAAI Workshop on the World Wide Web and Public Health Intelligence, Austin, Texas. January 2015. [paper]
[slides]

2014
Michael J. Paul, Ryen W. White, Eric Horvitz. Search and breast cancer: On disruptive shifts of attention over life histories of an illness. Microsoft Research Technical Report: MSR-TR-2014-144. [paper]
Michael J. Paul, Mark Dredze, David Broniatowski. Twitter improves influenza forecasting. PLOS Currents Outbreaks. October 2014. doi: 10.1371/currents.outbreaks.90b9ed0f59bae4ccaa683a39865d9117.
  • Presented at the AAAI Workshop on the World Wide Web and Public Health Intelligence, July 2014.
  • Press: Slate, Fast Company
[article]
[slides]
David A. Broniatowski, Michael J. Paul, Mark Dredze. Twitter: Big data opportunities (letter). Science 345(6193): 148. [article]
[pdf]
Michael J. Paul and Mark Dredze. Discovering health topics in social media using topic models. PLOS ONE 9(8): e103408. [article]
[pdf]
[data]
Ahmed Abbasi, Donald Adjeroh, Mark Dredze, Michael J. Paul, Fatemeh Mariam Zahedi, Huimin Zhao, Nitin Walia, Hemant Jain, Patrick Sanvanson, Reza Shaker, Marco D. Huesch, Richard Beal, Wanhong Zheng, Marie Abate, Arun Ross. Social media analytics for smart health. IEEE Intelligent Systems 29(2):60-80. Mar-Apr 2014.
  • Our article in this collection: M. Dredze and M.J. Paul, "Natural language processing for health and social media"
[article]
[preprint]
Byron C. Wallace, Michael J. Paul, Urmimala Sarkar, Thomas A. Trikalinos, Mark Dredze. A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews. Journal of the American Medical Informatics Association (JAMIA) 21(6), 1098-1103. [article]
[preprint]
[data]
Shiliang Wang, Michael J. Paul, Mark Dredze. Exploring health topics in Chinese social media: an analysis of Sina Weibo. AAAI Workshop on the World Wide Web and Public Health Intelligence, Quebec City. July 2014. [paper]
[slides]
Mark Dredze, Renyuan Cheng, Michael Paul, David Broniatowski. HealthTweets.org: a platform for public health surveillance using Twitter. AAAI Workshop on the World Wide Web and Public Health Intelligence, Quebec City. July 2014. [paper]
[slides]
[website]

2013
David A. Broniatowski, Michael J. Paul, Mark Dredze. National and local influenza surveillance through Twitter: An analysis of the 2012-2013 influenza epidemic. PLOS ONE 8(12): e83672. [article]
[pdf]
Michael Paul, Eric Horvitz, Ryen White. Understanding cancer patients through search engine query logs. 2nd International Conference on Digital Disease Detection (DDD), San Francisco. September 2013. [rapid fire talk] [slides]
[video]
Michael J. Paul, Byron C. Wallace, Mark Dredze. What affects patient (dis)satisfaction? Analyzing online doctor ratings with a joint topic-sentiment model. AAAI Workshop on Expanding the Boundaries of Health Informatics Using AI (HIAI), Bellevue, WA. July 2013. [paper]
[data]
[slides]
Mark Dredze, Michael J. Paul, Shane Bergsma, Hieu Tran. Carmen: a Twitter geolocation system with applications to public health. AAAI Workshop on Expanding the Boundaries of Health Informatics Using AI (HIAI), Bellevue, WA. July 2013. [paper]
[code]
[slides]
Alex Lamb, Michael J. Paul, Mark Dredze. Separating fact from fear: Tracking flu infections on Twitter. 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013), Atlanta. June 2013. [paper]
[data]
[slides]
[video]
Michael J. Paul and Mark Dredze. Drug extraction from the Web: Summarizing drug experiences with multi-dimensional topic models. 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013), Atlanta. June 2013. [paper]
[code]
[slides]
[video]

2012
Michael J. Paul and Mark Dredze. Factorial LDA: Sparse multi-dimensional models of text. Advances in Neural Information Processing Systems (NIPS 2012), Lake Tahoe, Nevada. December 2012. [25% acceptance] [paper]
[code]
Michael J. Paul and Mark Dredze. Experimenting with drugs (and topic models): Multi-dimensional exploration of recreational drug discussions. In the AAAI 2012 Fall Symposium on Information Retrieval and Knowledge Discovery in Biomedical Text, Arlington, VA. November 2012. [full paper] [paper]
[slides]
Alex Lamb, Michael J. Paul, Mark Dredze. Investigating Twitter as a source for studying behavioral responses to epidemics. In the AAAI 2012 Fall Symposium on Information Retrieval and Knowledge Discovery in Biomedical Text, Arlington, VA. November 2012. [paper]
Atul Nakhasi, Ralph J. Passarella, Sarah G. Bell, Michael J. Paul, Mark Dredze, Peter J. Pronovost. Malpractice and malcontent: Analyzing medical complaints in Twitter. In the AAAI 2012 Fall Symposium on Information Retrieval and Knowledge Discovery in Biomedical Text, Arlington, VA. November 2012. [paper]
Ralph J. Passarella, Atul Nakhasi, Sarah G. Bell, Michael J. Paul, Peter J. Pronovost, Mark Dredze. Twitter as a source for learning about patient safety events. In the AMIA 2012 Annual Symposium (American Medical Informatics Association), Chicago, IL. November 2012. [oral presentation]
Michael J. Paul. Mixed membership Markov models for unsupervised conversation modeling. In the 2012 Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2012), Jeju Island, Korea. July 2012. [25% acceptance] [paper]
[code]
[slides]
Michael J. Paul and Jason Eisner. Implicitly intersecting weighted automata using dual decomposition. In the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2012), Montreal, Canada. June 2012. [paper]
[poster]
William M. Darling, Michael J. Paul and Fei Song. Unsupervised part-of-speech tagging in noisy and esoteric domains with a syntactic-semantic Bayesian HMM. In the EACL 2012 Workshop on Semantic Analysis in Social Media, Avignon, France. April 2012. [paper]

2011
Michael J. Paul and Mark Dredze. You are what you tweet: Analyzing Twitter for public health. In the proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM 2011), Barcelona, Spain. July 2011. [24% acceptance] [paper]
[slides]
[video]
Michael J. Paul and Mark Dredze. A model for mining public health topics from Twitter. Technical Report. Johns Hopkins University. 2011. [paper]
Delip Rao, Michael Paul, Clayton Fink, David Yarowsky, Timothy Oates, Glen Coppersmith. Hierarchical Bayesian models for latent attribute detection in social media. In the proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM 2011), Barcelona, Spain. July 2011. [short paper] [paper]
Roxana Girju and Michael J. Paul. Modeling reciprocity in social interactions with probabilistic latent space models. Natural Language Engineering 17(1), pages 1-36. Cambridge University Press 2011. [paper]
[article]
[data]

2010
Michael J. Paul, ChengXiang Zhai and Roxana Girju. Summarizing contrastive viewpoints in opinionated text. In the proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), pages 65-75, MIT, Cambridge, Massachusetts. October 2010. [25% acceptance] [paper]
[slides]
[data]
Michael Paul and Roxana Girju. Comparative scientific research analysis with a language-independent cross-collection model. In the proceedings of XXVI Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN 2010), Valencia, Spain. September 2010. [paper]
Michael Paul and Roxana Girju. A two-dimensional topic-aspect model for discovering multi-faceted topics. In the proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI-10), pages 545-550, Atlanta, Georgia. July 2010. [26.9% acceptance] [paper]
[slides]
[code]

2009
Michael Paul. Cross-collection topic models: Automatically comparing and contrasting text. Undergraduate thesis, advised by Roxana Girju. Department of Computer Science, University of Illinois at Urbana-Champaign. 2009. [paper]
[slides]
Michael Paul and Roxana Girju. Topic modeling of research fields: an interdisciplinary perspective. In the proceedings of Recent Advances in Natural Language Processing (RANLP 2009), Borovets, Bulgaria. September 2009. [paper]
Michael Paul and Roxana Girju. Cross-cultural analysis of blogs and forums with mixed-collection topic models. In the proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), pages 1408-1417, Singapore. August 2009. [paper]
[code]
[data]
Michael Paul, Roxana Girju, Chen Li. Mining the Web for reciprocal relationships. In the proceedings of the 13th Conference on Computational Natural Language Learning (CoNLL 2009), Boulder, Colorado. June 2009. [paper]
[data]

2008
Michael Paul and Roxana Girju. AIRTA: An automatic interdisciplinary research topic advisor. [extended abstract] NSF-sponsored Symposium on Semantic Knowledge Discovery, Organization and Use - Demo session, New York University. November 2008. [paper]
[poster]