My research deals with Information Retrieval and Natural Language Processing, and more specifically with Information Retrieval and Extraction in large collections of documents (Web pages, digital libraries), sentiment analysis and query-oriented recommender systems. I have been President of the Technical Committee of the ISTEX initiative since 2016, head of OpenEdition Lab since 2011 and head of the DIMAG team in LSIS since 2013.
Between 2000 and 2011, I proposed several methods for classifying texts (unsupervised decision trees) and for segmenting them (weighted lexical chains) in order to improve information retrieval.
During the ANR CAAS project (2010-2013), we proposed a method for contextualizing and expanding queries, and then methods for filtering Web pages targeting specific entities (by means of a new diachronic probabilistic model). The Google Digital Humanities Award I received with Marin Dacos (OpenEdition.org) in 2011 allowed us to begin a concrete collaboration with OpenEdition on applying NLP approaches to improve navigation and search in a digital library for the Social Sciences & Humanities. Since then, we have proposed approaches for automatically creating links between journal articles, books and blogs by analyzing shared references and citations, and we have developed a recommender system that integrates information retrieval, automatic classification and sentiment analysis of book reviews.
For these different activities, we place great emphasis on participating in international challenges such as CLEF Social Book Search for query-based recommendation, SemEval for sentiment analysis, TREC and CLEF for question answering, and TREC Entity, KBA and Medical for information retrieval and filtering. Between 2010 and 2014, I was one of the organizers of the Tweet Contextualization track at CLEF, along with the IRIT and LIMSI labs.