Team leader: Bertrand CUISSART
The team takes the entire data processing chain as its object of study and focuses on exercising fine-grained control over this process. In addition, we explore the ability to integrate external constraints, whether legal, ethical or physical. Our work revolves around three themes: control over the data, managing the search for the optimal model, and the intelligibility of the resulting model.
Keywords: constraint solving, optimization, graph problems, data mining, natural language processing.
Since learning from data is typically modeled as an optimization problem, its resolution requires access to a reliable representation of the target reality. CoDaG is interested in cases where this reference is problematic by focusing on three research topics: unsupervised evaluation, the measurement of inter-rater agreement, and the use of ontologies to assess data quality.
Constraint programming, optimization and data mining are at the heart of the data processing chain. On the one hand, constraint and combinatorial optimization methods are used to declare and solve data mining tasks. On the other hand, data can provide information that improves a constraint solving process. As a result, the development of methods that hybridize these three domains is attracting increasing interest in the communities concerned.
CoDaG is interested in interactively taking user preferences into account in the data analysis process. This interaction can go in two directions. From the system towards the user, a user acquires new knowledge from the results returned by the system. From the user towards the system, the system learns a user’s preferences based on a compromise between criteria derived from their feedback. The form that user feedback can take and how exactly it is exploited remain open subjects of study.
The following list should not be considered exhaustive but rather shows the applications to which CoDaG members return often:
- Biological and chemical data processing
- Analysis of sports data
- Digital humanities
ALEC Céline – Associate professor at the University of Caen Normandie
BRETTO Alain – Full professor at the University of Caen Normandie
CREMILLEUX Bruno – Full professor at the University of Caen Normandie
CUISSART Bertrand – Associate professor at the University of Caen Normandie
LAMOTTE Jean-Luc – Full professor at the University of Caen Normandie
MATHET Yann – Associate professor (HDR) at the University of Caen Normandie
OUALI Abdelkader – Associate professor at the University of Caen Normandie
REYNAUD Justine – Associate professor at the University of Caen Normandie
RIOULT François – Associate professor (HDR) at the University of Caen Normandie
VAGINAY Athénaïs – Associate professor at the University of Caen Normandie
WIDLÖCHER Antoine – Associate professor at the University of Caen Normandie
ZIMMERMANN Albrecht – Associate professor at the University of Caen Normandie
BRUTUS Philippe – Associate researcher
BENGUIGUI Nicolas – Associate researcher
KASTNER Lise – Ph.D. student
LEHEMBRE Etienne – Ph.D. student
LEJMI Maroua – Ph.D. student
LIBREAU Clément – Ph.D. student
LOUDNI Samir – Associate researcher
MORTELIER Alexis – Ph.D. student
MORADI Neda – Ph.D. student
SAHBI Aya Nour-Elimane – Ph.D. student
SOUPLY Marc – Ph.D. student
PANDORA (ANR IA, 2025-2029)
NEO-REEDUC (Normandy region – FEDER, 2023-2025)
Paprica (PHC Utique, 2022-2023)
CodeGNN (ANR IA, 2022-2026)
InvolvD (ANR IA, 2021-2025)
Herelles (ANR IA, 2020-2025)
Orange Labs, RMAN SYNC, Roullier (financial support for theses and internships: 2018-2023)
Schism (Normandy region, 2021-2022)
INCA (Normandy region, 2019-2022)
RHuNes (CNRS + Maupertuis programme, 2021)
AIMS (FEDER, 2017-2020)
CPER Numnie (2016-2020), with the Hultech team
AGAC (Normandy region, 2017-2019), with the Image team
PepTraq (Normandy region, 2017-2019), with the MAD team
REUs (FUI, 2016-2019)
CIFRE convention with Huawei (2016-2019)
Imprimerie Nationale (2018)
Nareca (ANR Contint, 2013-2018)
QCM-BioChem, a follow-up of Decade (CNRS MASTODONS, 2017-2018)
Minomics – Mining Omics data for chemistry (Normandy region, 2015-2017)
Prefute (CNRS PEPS, 2015-2016)
Hybride (ANR blanche, 2011-2016)
Adn’Tox (FEDER, 2012-2015)
The team's work attracts strong interest from application domains. This makes it possible to develop long-term interdisciplinary collaborations, which in turn inspire innovation. See the “Projects” item and our publications.
New project: NEOREEDUC – 2023–2025. Région Normandie Collaboratif Innovation project (22E05784) with the company NeoXperience (
Thematic school “Complex sports data” (06/26/23-06/29/23)
The DSChem project is a workshop at GDR MaDICS.
Workshop “Machine Learning and Data Mining for Sports Analytics” accepted at ECML/PKDD 2022.
Several CODAG members are involved in ECML/PKDD 2022: Journal Track co-chair, Workshop & Tutorial co-chair, PhD Forum co-chair, area chair, program committee members.
François Rioult was invited to participate in the Radio Phénix program “C’est pas faux” to discuss the scientific value of digital data.
Launch of the AMPERE project (2022/2025)
The thematic school “BigSportsData: Analysis of massive sports data”, organized by François Rioult and Albrecht Zimmermann, took place from 27/06 to 30/06.
The CODAG team welcomes 6 new PhD students this year: Djawad Bekkoucha (HAISCODE thesis), Steve Gendarme (CIFRE thesis), Maroua Lejmi (co-supervision thesis), Lise Kastner (thesis on the AMPERE project), Neda Moradi (co-supervision thesis), Aya Nour Elimane Sahbi (ministerial thesis)
The CODAG team welcomes two new members: Jean-Luc Lamotte, University Professor in Computer Science, and Nicolas Benguigui, University Professor in STAPS.
We are co-organizers of the 8th Machine Learning and Data Mining for Sports Analytics workshop (MLSA 21).
The RHuNes project is accepted!
February 3rd 2021: kick-off of the ANR InvolvD project.
Our proposal for the “Big Sports Data” summer school (postponed to 2022) has been accepted by CNRS.
2021: we had a paper accepted at DAMI.
Welcome to Soufia Bennai (ATER), Hajar Rehioui-Karine (postdoctoral researcher), Maksim Koptelov (postdoctoral researcher), Aymeric Beauchamp, Chaima Boughanmi, Triss Jacquiot, Etienne Lehembre (internships).
November 2020: ANR Herelles project is launched.
David Batista Soares defended his Ph.D. thesis on November 12th 2020. It is entitled “How do the nature and the structure of information affect the optimal pricing algorithm to guarantee market efficiency and minimize fundamental prices volatility?”.
Maksim Koptelov defended his Ph.D. thesis on September 30th 2020. It is entitled “Link prediction in bipartite multi-layer networks, with an application to drug-target interaction prediction”.
The ANR InvolvD (2021-2025), ANR Herelles (2020-2024) and RIN Schism (2020-2022) projects are accepted!
We are publicity and public relations co-chair at ECML/PKDD 2020, poster chair at IDA 2020, and co-organizers of the 2nd Workshop on Evaluation and Experimental Design in Data Mining and Machine Learning and of the 7th Machine Learning and Data Mining for Sports Analytics workshop (MLSA 20) at ECML/PKDD 2020.
We are invited to a Dagstuhl seminar and to the SML 2020 workshop.
We have set up collaborations and contracts with the companies Orange Labs and Rman Sync.
Tenured associate professor position in computer science (Computational sciences and data science for digital humanities), University of Caen Normandie. The position is now closed.
January 2020: we join the executive committee (deputy head) of the GDR MaDICS.
2020: we had papers accepted at the ECML/PKDD, DSAA (video) and SAC (video) conferences, and in the journals AIJ, Wiley Interdiscip. Rev. Data Min. Knowl. Discov., Discrete Mathematics, and Linear and Multilinear Algebra.
2020: welcome to Hayfa Azibi (PhD student), Mina Rafla (PhD student), Justine Reynaud (Associate Professor), Marc Souply (PhD student).
Tenured associate professor position in data science (constraints, data mining), University of Caen Normandie. The position is now closed.
Rafic Nader defended his Ph.D. thesis on June 28th 2019. It is entitled “A study concerning the positive semi-definite property for similarity matrices and for doubly stochastic matrices with some applications”.
Anthony Palmieri defended his Ph.D. thesis on May 15th 2019. It is entitled “Nouvelles Techniques pour les Constraint Games”.
Noureddine Aribi (University of Oran) was an invited professor at the University of Caen in April 2019. We worked on unsupervised declarative approaches.
March 2019: short visit by Marc Plantevit (LIRIS, Lyon). We discussed augmented graphs and biological networks.
The RIN INCA (2019-2022) project is accepted.
CPER Numnie supports our work (engineer position) on text mining techniques to discover relations in texts.
We are publicity and public relations co-chair at ECML/PKDD 2019, and co-organizers of the 1st Workshop on Evaluation and Experimental Design in Data Mining and Machine Learning at SDM 2019 and of the 6th Machine Learning and Data Mining for Sports Analytics workshop (MLSA 19) at ECML/PKDD 2019.
2019: we received the best paper award at AI TEST 2019 and had papers accepted at the IJCAI, ICTAI, KES and IEA/AIE conferences, and in the journals Linear Algebra and its Applications, TCS, and Constraints.
2019: welcome to Anaëlle Baledent (PhD student, jointly with the Hultech team), Nida Meddouri (ATER), Abdelkader Ouali (Associate Professor).
June 2018: short visit by Ian Davidson, University of California, Davis, US. We discussed several declarative approaches for pattern sets.
May 2018: the CNRS Mastodons project QCM-BioChem (Quality in Consensualizing and Mining biological and chemical datasets) is launched.
Mohamad Badaoui defended his Ph.D. thesis on March 30th 2018. It is entitled “G-graphs and Expander graphs”.
2018: we have set up collaborations and contracts with the companies Imprimerie Nationale and Roullier.
We are co-organizers of the 5th Machine Learning and Data Mining for Sports Analytics workshop (MLSA 18), and we take part in the organization of SML 2018.
2018: we had papers accepted at the CP, ECML/PKDD, KDD, PAKDD, IDA, ICTAI and CICLing conferences, and in DAMI and the Journal of Medicinal Chemistry.
2018: welcome to Wiem Belhedi (post-doc), David Condaminet (engineer), Arnold Hien (PhD student), Ludovic Jean-Baptiste (engineer), Nhat Vinh VO (post-doc).
François Rioult defended his Habilitation thesis on December 7th 2017. It is entitled “Fouille de données : motifs minimaux, redescription d’espace et analyse du (e-)sport”.
September 2017: AGAC project is launched.
Bamba Kane defended his Ph.D. thesis on September 6th 2017. It is entitled “Extraction et sélection de motifs émergents minimaux : application à la chémoinformatique”.
Abdelkader Ouali defended his Ph.D. thesis on July 3rd 2017. It is entitled “Méthodes hybrides parallèles pour la résolution de problèmes d’optimisation combinatoire : application au clustering sous contraintes”.
May 2017: the CNRS Mastodons project Decade is launched. Collaborative research on knowledge discovery and decision support for therapeutic chemistry.
May 2017: the FEDER project AIMS “Automated Integrated Monitoring System” is launched.
We are co-organizers of the 4th Machine Learning and Data Mining for Sports Analytics workshop (MLSA 17).
2017: we had papers accepted at the PAKDD, UAI, IJCAI, ICTAI and ICIP conferences, and in the journals AIJ, Autom. Softw. Eng., Constraints, and Machine Learning; we also co-edited a DAMI special issue on sports analytics.
2017: welcome to Pegah Alizadeh (post-doc), Emna Hachicha (post-doc), Maksim Koptelov (PhD student).
November 2016: project FUI REUs is launched.
Samir Loudni defended his Habilitation thesis on October 5th 2016. It is entitled “Contributions à la résolution des WCSP et approches déclaratives pour la fouille de données”.
September 2016: we started a collaboration with Huawei.
2016: CPER/Numnie supports our work on text annotation (engineer) and sports analytics (post-doc).
We are tutorial/workshops co-chair at ECML/PKDD 2016.
We gave a tutorial on Preference-based Pattern Mining at ECML/PKDD 2016, ICFCA 2017 and BDA 2017.
2016: we had papers accepted at the CP, CPAIOR, IJCAI, IDA and Interspeech conferences, and in the journals AIJ and Statistical Analysis and Data Mining.
2016: welcome to David Batista Soares (PhD student), Anthony Palmieri (PhD student).
May 2015: project CNRS PEPS Préfute is launched.
Guozhu Dong, head of the Data Mining Research Lab., Wright State University, Dayton, US, was an invited professor at the University of Caen in May 2015.
April 2015: Minomics project is launched.
2015: we had papers accepted at the AIME, CP, ICTAI, PAKDD, IDA, SAC and DocEng conferences, and in J. Biomedical Semantics, J. of Chemical Information and Modeling, Constraints, Electronic Notes in Discrete Mathematics, JIIS, TCS, Computer Vision and Image Understanding, Discrete Applied Mathematics, and Computational Linguistics. Best paper at COSI 2015.
2015: welcome to Gaël Lejeune (post-doc), Valentin Lemière (PhD student), Rafic Nader (PhD student), Albrecht Zimmermann (associate professor).
Research activity report and research program, 2015 (in French).