Home - Pierre Epron

About me

My name is Pierre Epron, and I am 29 years old. I am currently in the final semester of a Master's degree in Natural Language Processing (NLP) in Nancy. I have had the privilege of working on projects such as medical report analysis, which honed my skills and my understanding of linguistic data. I am currently an intern at LORIA, where I focus on leveraging Large Language Models (LLMs) to improve irony classification. Eager to go deeper into NLP research, I am actively seeking opportunities to pursue a PhD in the field.

Contact

NLP Master Projects

Irony Categories

Context

A 6-month project carried out as part of the end-of-Master's NLP internship
At LORIA, Orpailleur, Synalp
Between 03/2024 - 09/2024
Supervised by Gaël Guibon and Miguel Couceiro
github

Brief

Large Language Models (LLMs) are trained on massive amounts of data (Kaplan et al., 2020). However, these data are not filtered and include User Generated Content (UGC), which is prone to contain ironic statements. This leads LLMs to incorporate irony-related bias that can stem from positive sentiment words, interjections or topical words (Maladry et al., 2023a). Bias hinders the exploitation of the emergent abilities of LLMs (Srivastava et al., 2022), especially for irony detection and explanation, which prevents their use for UGC disambiguation, humorous content generation, or more personified conversational agents. Currently, irony detection is mainly treated as a binary classification task (Van Hee et al., 2018; Maladry et al., 2023b), and the strategies underlying the expression of irony have yet to be studied, even though these strategies are important for distinguishing harmful content that uses irony from genuine opinion expression, or even from simply humorous content. An alternative to these strategies comes from the irony categories of SemEval-2018 Task 3 subtask B (Van Hee et al., 2018), which focuses on a multiclass classification task that separates irony created by a verbal polarity clash, other verbal irony, and situational irony. In this internship proposal, we want to focus on exploiting these 3 classes along with their explanation to better identify and control the irony detection’s binary classification task. This comes with many challenges, as state-of-the-art models such as LLMs are difficult to control.

Irony Detection

Context

A 6-month project carried out during the first semester of the second year of the NLP Master's
At Université de Lorraine, IDMC
Between 09/2023 - 02/2024 (Half time)
Supervised by Gaël Guibon and Miguel Couceiro
github
report

Brief

Irony is a complex linguistic phenomenon with varying interpretations that poses a challenge for both humans and automated systems. Recognizing irony is crucial, especially in the context of harmful behavior on social media (Reyes et al., 2012; Estrella et al., 2023; Frenda et al., 2023). Leveraging recent advancements and rich irony data, this project aims to enhance irony detection using state-of-the-art language models. It compares the performance of traditional language models with that of Large Language Models (LLMs) on two irony datasets, one focused on the different perspectives of the annotators and the other on the different types of irony.

JusTAL

Context

A 1-year project carried out during the first year of the NLP Master's
At Université de Lorraine, IDMC
Between 09/2022 - 07/2023 (Half time)
Supervised by Samuel Ferey
github
report

Brief

Constitutional judges play a vital role in modern democracies, particularly in the interpretation and implementation of equality ideals. Since the 1970s, the French Constitutional Council has consistently employed the concept of equality in various forms to evaluate the constitutionality of legislation. Current estimates suggest that approximately half of the Council’s judgments invoke the principle of equality. The Isovote project seeks to investigate the Council’s evolving philosophy by examining a previously underutilized source: the reports of the Council’s debates. Through this research, the objective is to gain insights into how the Council has shaped its philosophy over time. This tutored project is a component of the Isovote project and aims to analyze a specific corpus to better understand the decision-making process of the Constitutional Council (CC). The objective is to characterize, explain, and understand the opinions of each councilor, thereby shedding light on the contributions of the CC to the evolution of the French social state and on its assessment of legislative texts pertaining to the fight against inequalities.

Paperjam

Context

A 6-month project carried out during the first semester of the first year of the NLP Master's
At Université de Lorraine, IDMC
Between 09/2022 - 02/2023 (Half time)
Supervised by Maxime Amblard and Miguel Couceiro
github
report

Brief

Identifying scientific articles related to a specific feature is crucial for most researchers and students. The field of Natural Language Processing (NLP) has long been interested in this task, and different Information Extraction (IE) pipelines have been developed for this purpose. Most of them adopt a similar approach. They first extract mentions associated with specific entities such as tasks, materials, metrics, and methods. Then, they group these mentions into coreference clusters. Lastly, they establish relationships between these clusters. This article aims to show some of the unresolved issues with that approach by focusing on an existing pipeline: SciREX. To achieve this, it was first necessary to reproduce its methodology, to which a few minor modifications were made.

AI Simplon Projects

PET reports

Context

A 1-month project carried out during my apprenticeship at NancyClotep.
At CHRU Nancy
Between 09/2020 - 09/2021
github
report

Brief

As part of my training as an Artificial Intelligence (AI) developer, I had the opportunity to do my professionalisation contract at the Nancy Brabois Regional University Hospital (CHRU) and at NancyClotep, an interest group in direct contact with the CHRU's nuclear medicine department. The objective was to create a database of Positron Emission Tomography (PET) images. The main aim of this database was to structure the images and associate key information with them (pathology, anatomy, etc.). This would subsequently enable different sets of images to be composed and used in computer vision research. To do this, Professor Karcher asked us to extract as much relevant information as possible from the medical reports associated with each examination, and therefore with each image. Our main approach was to fine-tune the token classification models provided by the spaCy library.

Education

Master NLP

Between 09/2022 - 09/2024
At Université de Lorraine, IDMC
link

(RNCP title) Developer in Artificial Intelligence

Between 01/2020 - 09/2021
At Simplon Nancy, CHRU Nancy, NancyClotep
link

Literary Baccalaureate

Between 09/2011 - 09/2012
At Lycée Jeanne D'Arc Nancy

Former jobs

Civic Service: Digital Support

Between 08/2019 - 01/2020
At Pôle emploi Cristallerie Nancy

Technician Assistant

Between 08/2016 - 05/2017
At INRAE Champenoux

Interests

NLP

Computational Linguistics
Neural Networks
LLMs
Knowledge Graphs
NEAT

Computer Science

NEAT (NeuroEvolution of Augmenting Topologies)
Video game development (Unity C#)
Web (Vue, React, Flask, Django)

Others

Role-playing games (game master for 10 years)
Literature (Hamilton, Bordage, Cixin, Tchaikovsky, Hobb ...)
Travel (Australia, Japan, Brazil, Congo ...)