Foundations of IR Evaluation 2: A thousand and one ways of doing evaluation wrong
Abstract: We will devote this second ESSIR talk on evaluation to underlining two things. First, that evaluation may be the most crucial aspect of high-quality research in Information Access, and that it pervades every stage of the research cycle rather than being a final step to assess how well we solved our target problem. Second, that proper evaluation (the kind that leads to valid, relevant, unbiased and generalizable results) can be very challenging. In fact, proper evaluation methodologies are still an open issue for many Information Access topics. While the first ESSIR talk on evaluation will have an affirmative nature, covering the essential evaluation tools and good practices, this second talk will have a more interrogative edge: it will focus on all the things that can go wrong, how they may lead to biased or flawed results, and how they can occasionally even drive an entire research community down unproductive or misleading roads.
Short Bio: Julio Gonzalo is a full professor at UNED (Universidad Nacional de Educación a Distancia, Spain), where he leads the Research Group in Natural Language Processing and Information Retrieval (nlp.uned.es). His current research interests are Evaluation Metrics and Methodologies for Information Access, Semantic Textual Similarity, and Information Access Technologies for Social Media. A list of his publications can be found at https://scholar.google.com/citations?user=opFCmpYAAAAJ