When carrying out studies for our clients, Erdyn’s consultants have to analyse and synthesise information found, in particular, in the scientific literature. In general, the purpose of reading a scientific publication is to clarify questions and answer a specific need for information, but given the considerable wealth of knowledge spread across the many scientific texts we use, extracting it quickly becomes a laborious and resource-intensive process. As innovation is in our DNA, we have launched a promising research and development project that will ultimately allow our consultants to focus on the finer points of their analyses and studies, and thus deliver the quality of Erdyn’s expertise in the shortest possible time. The tools developed within this project will relieve the consultants of part of the indispensable but time-consuming task of identifying content of interest.
The collaboration between Erdyn and Sorbonne University (LIMICS) on this high-stakes project takes the form of a CIFRE thesis, which began in September 2020. The thesis explores a disruptive approach, complementary to the ontology creation currently offered by most software vendors: natural language processing (NLP) based on deep learning.
In NLP, the sub-field of artificial intelligence that deals with natural language, Question Answering (Q/A) systems have the potential to make the exploitation and consumption of scientific content faster and more efficient. The major constraint on their use is that their performance drops considerably as soon as the data domain changes, that is, when they are applied to texts from a field other than the one they were trained on. By using transfer learning techniques, we aim to adapt Q/A systems to Erdyn’s dynamic, multidisciplinary activity, and thus continue to support our clients' innovation efforts with increased relevance and efficiency in a constantly changing world.