Recent full paper acceptances at ACM Hypertext and IEEE BigData
Visualization of our diversity/novelty microblog corpus qrels (presented at AIRS 2013)
The proceedings of our SIGIR workshop on Temporal, Social and Spatially-aware Information Access (TAIA2014) are online.
I am an Assistant Professor in the Web Information Systems group at Delft University of Technology.
Between 2011 and 2012 I worked as a postdoc in the same group, conducting research within the scope of the ImREAL project.
I received my PhD from the University of Twente, where I worked in the Human Media Interaction group under the supervision of Franciska de Jong and Djoerd Hiemstra. The Otto-von-Guericke University of Magdeburg in Germany was my home during my undergraduate years as a student in computer science.
My research interests include query performance prediction (the topic of my PhD thesis [abstract]), IR for specific user groups, personalization and false memories in search.
TU Delft - EWI/ST/WIS
PO Box 5031
2600 GA Delft
The Netherlands
Office: HB 08.100
Email: c.hauff[at]tudelft.nl or claudia.hauff[at]gmail.com
Data Sets
A list of publicly available data sets derived from our research:
- MediaEval 2013 Placing Task data
- Author Verification data based on Wikipedia Talkpages (SIGIR 2014)
- Lecturer of the 2nd year BSc course Big Data Processing (Nov. 2013) [lecture slides]
- Lecturer (50%) of the 1st year BSc course Web- and Database Technology (Nov. 2013) [lecture slides]
- Lecturer of the MSc course Information Retrieval (Feb. 2012) [lecture slides]
If you are interested in a Master's thesis project in information retrieval, have a look at the ongoing benchmark campaigns, which are often a good starting point for finding a topic:
We also have industry contacts, if you are interested in working at a company during your thesis.
Students I have supervised and am currently supervising are researching a range of topics related to information retrieval including mobile re-finding behavior, author identification, spelling correction, big data architectures and query log analysis.
- Temporal distribution of microblog qrels in a diversity/novelty setup.
- Visualization of the accuracy of Flickr geotags (SIGIR 2013 work)
- A visualization of an automatic retrieval system evaluation technique and its application to more than 20 TREC tasks.
- A visualization that shows the unique contributions of TREC runs to the relevance assessment pool.
- A visualization of query difficulty.
Publications [DBLP] [Google Scholar]
Martha Larson, Pascal Kelm, Adam Rae, Claudia Hauff, Bart Thomee et al., The Benchmark as a Research Catalyst: Charting the Progress of Geo-prediction for Social Multimedia, book chapter in Multimodal Location Estimation of Videos and Images (Springer Publishing), 2014 [link]
Ke Tao, Claudia Hauff, Geert-Jan Houben, Fabian Abel, and Guido Wachsmuth, Facilitating Twitter Data Analytics: Platform, Language, and Functionality, accepted as a full paper at IEEE BigData 2014
Jie Yang, Claudia Hauff, Alessandro Bozzon, and Geert-Jan Houben, Answering the Right Question: on the Editing of Questions in Collaborative Question Answering Systems, accepted as a full paper at ACM Hypertext 2014. [pdf] [slides]
Claudia Hauff, Bart Thomee, and Michele Trevisiol, Working Notes for the Placing Task at MediaEval, In MediaEval 2013 Workshop, 2013 (task organizers) [pdf]
The Placing Task 2013 data can be downloaded here.
Ke Tao, Claudia Hauff and Geert-Jan Houben, Building a Microblog Corpus for Search Result Diversification, accepted as a full paper at AIRS 2013
Gudrun Wesiak, Adam Moore, Christina M. Steiner, Claudia Hauff, Conor Gaffney, Declan Dagger, Dietrich Albert, Fionn Kelly, Gary Donohoe, Gordon Power and Owen Conlan, Affective Metacognitive Scaffolding and User Model Augmentation for Experiential Training Simulators: A Follow-up Study, EC-TEL 2013, pp. 396-409, 2013
Adam Moore, Gudrun Wesiak, Christina M. Steiner, Claudia Hauff, Declan Dagger, Gary Donohoe and Owen Conlan, Utilizing social networks for user model priming: user attitudes, UMAP Workshops: Late Breaking Results [pdf]
Christophe Deloo and Claudia Hauff, Exploiting Semantic Relatedness Measures for Multi-label Classifier Evaluation, accepted as research contribution at the Dutch-Belgian IR Workshop 2013 [pdf] [proceedings link]
Ke Tao, Fabian Abel, Claudia Hauff, Geert-Jan Houben and Ujwal Gadiraju, Groundhog Day: Near-Duplicate Detection on Twitter, WWW 2013, pp. 1273-1284, 2013 [slides]
Claudia Hauff and Gerald Friedland, Brave New Task: User Account Matching, MediaEval: Benchmarking Initiative for Multimedia Evaluation, 2012 [pdf] [slides]
Fabian Abel, Claudia Hauff, Geert-Jan Houben and Ke Tao, Leveraging User Modeling on the Social Web with Linked Data, ICWE 2012, pp. 378-385, 2012 [slides (Ke Tao)]
Ke Tao, Fabian Abel, Claudia Hauff and Geert-Jan Houben, Twinder: a search engine for Twitter streams, ICWE 2012, pp. 153-168, 2012 [slides (Ke Tao)]
Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ke Tao and Richard Stronkman, Twitcident: Fighting Fire with Information from Social Web Streams, WWW '12 Companion, pp. 305-308, 2012 [link to paper] [link to website]
Ke Tao, Fabian Abel, Claudia Hauff and Geert-Jan Houben, What makes a tweet relevant for a topic?, WWW 2012 workshop "Making Sense of Microposts" [proceedings]
Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ke Tao and Richard Stronkman, Semantics + Filtering + Search = Twitcident: Exploring Information in Social Web Streams, ACM Hypertext, pp. 285-294, 2012 [link]
Ke Tao, Fabian Abel and Claudia Hauff, WISTUD at TREC 2011: Microblog Track [pdf]
Claudia Hauff and Geert-Jan Houben, Simulating Memory Recall in Personal Search, EPS 2011 (Evaluating Personal Search Workshop), 2011 [link to proceedings]
Fabian Abel, Ilknur Celik, Claudia Hauff, Laura Hollink and Geert-Jan Houben, U-Sem: Semantic Enrichment, User Modeling and Mining Usage Data on the Social Web, 1st International Workshop on Usage Analysis and the Web of Data (USEWOD2011), short paper, 2011 [pdf]
Dolf Trieschnigg and Claudia Hauff, Classic Children's Literature - Difficult to Read?, ECIR 2011, pp. 691-694, 2011 [link]
Djoerd Hiemstra and Claudia Hauff, MapReduce for experimental search, TREC 2010 [link]
Claudia Hauff and Dolf Trieschnigg, Enhancing Access to Classic Children's Literature, BooksOnline 2010 workshop (co-located with CIKM) [pdf] [link]
The proposed project was awarded a seed fund, sponsored by Microsoft Research. [Workshop report]
Claudia Hauff, Leif Azzopardi and Diane Kelly, A Comparison of User and System Performance Predictions, CIKM 2010, pp. 979-988, 2010 [link]
Djoerd Hiemstra and Claudia Hauff, MapReduce for information retrieval evaluation: "Let's quickly test this on 12 TB of data", CLEF 2010, pp. 64-69, 2010 [pdf]
Guido Zuccon, Leif Azzopardi, Claudia Hauff and C.J. van Rijsbergen, Estimating Interference in the QPRP for Subtopic Retrieval, SIGIR 2010, pp. 741-742, 2010 [link]
Claudia Hauff, Djoerd Hiemstra, Franciska de Jong and Leif Azzopardi, Relying on topic subsets for system ranking estimation, CIKM 2009, pp. 1859-1862 [link]
Ricardo Baeza-Yates, Vanessa Murdock and Claudia Hauff, Efficiency trade-offs in two-tier web search systems, SIGIR 2009, pp. 163-170, 2009 [link]
D. Nguyen, A. Overwijk, C. Hauff, R.B. Trieschnigg, D. Hiemstra, F.M.G. de Jong, WikiTranslate: Query Translation for Cross-lingual Information Retrieval using only Wikipedia, LNCS - CLEF 2008 [pdf] [link]
Claudia Hauff, Query Difficulty for Digital Libraries, presented at the ECDL 2008 Doctoral Consortium, published in the Fall 2009 issue of the TCDL Bulletin (Volume 5, Issue 2) [pdf]
Claudia Hauff, Djoerd Hiemstra and Franciska de Jong, A Survey of Pre-Retrieval Query Performance Predictors, CIKM 2008, pp. 1419-1420, 2008 [link]
Claudia Hauff, Vanessa Murdock and Ricardo Baeza-Yates, Improved Query Difficulty Prediction for the Web, CIKM 2008, pp. 439-448, 2008 [link]
R. Aly, C. Hauff, W. Heeren, D. Hiemstra, F. de Jong, R. Ordelman, T. Verschoor and A. de Vries, The Lowlands team at TRECVID 2007, TRECVID 2007 [pdf]
Djoerd Hiemstra, Claudia Hauff, Franciska de Jong and Wessel Kraaij, SIGIR's 30th anniversary: an analysis of trends in IR research and the topology of its community, ACM SIGIR Forum, Vol. 41, No. 2 [link]
Claudia Hauff, Robin Aly and Djoerd Hiemstra, The Effectiveness of Concept Based Search for Video Retrieval, WIR 2007 [pdf]
Claudia Hauff, Dolf Trieschnigg and Henning Rode, University of Twente at GeoCLEF 2006: geofiltered document retrieval, CLEF 2006, LNCS 4730, pp. 958-961, 2007 [link]
Claudia Hauff and Andreas Nürnberger, Utilizing scale-free networks to support the search for scientific publications, Proc. of the Dutch Belgian Workshop in Information Retrieval (DIR'06), 2006 [pdf]
Claudia Hauff and Andreas Nürnberger, On the use of scale-free networks for information network modelling, Proc. of 1st European Symposium on Nature-inspired Smart Information Systems, 2005 [pdf]
Claudia Hauff and Leif Azzopardi, Age dependent document priors in link structure analysis, 27th European Conference on IR Research (ECIR'05), 2005 [pdf]