Michael Paul - Twitter Flu Annotations Dataset

Twitter Flu Annotations

Here you can download the annotated data we used to train the influenza classifiers described in the paper cited below. This is only the annotated training data (about 10K tweets), not the full set of Twitter data.

Also note that this dataset does not contain the content of the tweets. It includes the tweet IDs, which you can use to download the tweet content through the Twitter API.
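
As a rough illustration of how the tweet IDs could be prepared for download, here is a minimal Python sketch. It makes several assumptions that are not stated on this page: that each line of the annotation file begins with a numeric tweet ID, that the lookup endpoint accepts up to 100 IDs per request, and that the endpoint URL shown is the one you have access to. The names extract_tweet_ids, batch_ids, and build_lookup_url are hypothetical helpers. Check the current Twitter API documentation before relying on any of these details.

```python
from typing import Iterable, Iterator, List

# Assumption: the lookup endpoint accepts at most this many IDs per request.
MAX_IDS_PER_REQUEST = 100

# Assumption: URL of a tweet lookup endpoint; verify against the current API docs.
LOOKUP_URL = "https://api.twitter.com/2/tweets"


def extract_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Collect tweet IDs, assuming the ID is the first field on each line."""
    ids = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and lines whose first field is not purely numeric.
        if fields and fields[0].isdigit():
            ids.append(fields[0])
    return ids


def batch_ids(ids: List[str], size: int = MAX_IDS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_lookup_url(batch: List[str]) -> str:
    """Build one lookup request URL for a batch of IDs (no request is sent here)."""
    return f"{LOOKUP_URL}?ids={','.join(batch)}"
```

To use it, read the lines of the downloaded annotation file, pass them to extract_tweet_ids, and send one authenticated request per URL produced from batch_ids. Tweets that have since been deleted or made private will not be returned, so expect to recover fewer tweets than there are IDs.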

References

Alex Lamb, Michael J. Paul, Mark Dredze. Separating Fact from Fear: Tracking Flu Infections on Twitter. To appear at the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013), Atlanta. June 2013.