Title: Examining Additivity and Weak Baselines
Date: July 27, 2017 (Thursday)
Time: 4:30 p.m. - 5:30 p.m.
Venue: Room 121, 1/F, Ho Sin-hang Engineering Building,
The Chinese University of Hong Kong,
Shatin, N.T.
Speaker: Prof. Mark SANDERSON
School of Computer Science and Information Technology
RMIT University



I will present a study of which baseline to use when testing a new information retrieval (IR) technique. In contrast to past work, we show that measuring a statistically significant improvement over a weak baseline is not a good predictor of whether a similar improvement will be measured on a strong baseline. Indeed, strong baselines are sometimes made worse when a new technique is applied. We investigate whether comparing against a range of weaker baselines can increase confidence that an observed improvement will carry over to a stronger baseline. Our results indicate that this is not the case: at best, testing against a range of baselines means that an experimenter can be more confident that the new technique is unlikely to significantly harm a strong baseline. Examining recent work, we present evidence that the IR community continues to test against weak baselines. This is unfortunate, as in light of our experiments we conclude that the only way to be confident that a new technique is a contribution is to compare it against nothing less than the state of the art.



Mark Sanderson is Professor of Information Retrieval at RMIT University and Director of the ISE Enabling Capability Platform. Prof. Sanderson is head of the RMIT Information Retrieval (IR) group, which is regarded as the leading IR group in Australia. He is co-editor of Foundations and Trends in Information Retrieval, which currently has the highest impact rating of any IR journal. He is also an associate editor of IEEE TKDE and of ACM TWeb. Prof. Sanderson was co-PC chair of ACM SIGIR in 2009 and 2012, and general chair of the conference in 2004. He is also a visiting professor at NII in Tokyo.


