List of Publications:

The optimal treatment for patients with severe coronavirus disease 2019 (COVID-19) and hyper-inflammation remains debated. A cohort study was designed to evaluate whether a therapeutic algorithm using steroids with or without an interleukin-1 antagonist (anakinra) could prevent death or invasive ventilation. Patients with >= 5 days since symptom onset and hyper-inflammation (CRP >= 50 mg/L) who required 3–5 L/min oxygen received methylprednisolone alone. Patients needing >= 6 L/min received methylprednisolone plus daily subcutaneous anakinra, either frontline or in case of clinical deterioration on corticosteroids alone. Death rate and death or intensive care unit (ICU) invasive ventilation rate at Day 15, with Odds Ratio (OR) and 95%.

Recombinant Inbred Lines (RILs) are obtained through successive generations of inbreeding. In 1931 Haldane and Waddington published a landmark paper in which they provided the probabilities of achieving any combination of alleles in 2-way RILs for 2 and 3 loci. In the case of sibling RILs, where sisters and brothers are crossed at each generation, there has been no progress in treating 4 or more loci, a limitation we overcome here without much increase in complexity. In the general situation of L loci, the task is to determine 2^L probabilities, but we find that it is necessary to first calculate the 4^L “identical by descent” (IBD) probabilities that a RIL inherits its DNA at each locus from one of the four originating chromosomes. We show that these 4^L probabilities satisfy a system of linear equations that follow from self-consistency. In the absence of genetic interference (crossovers arising independently), the associated matrix can be written explicitly in terms of the recombination rates between the different loci. We provide the matrices for L up to 4 and also include a computer program to automatically generate the matrices for higher values of L. Furthermore, our framework can be generalized to recombination rates that differ between female and male meiosis, which allows us to show that the Haldane and Waddington 2-locus formula remains valid in that more subtle case if the meiotic recombination rate is taken as the average of the female and male rates. Once the 4^L IBD probabilities are determined, the 2^L probabilities of RIL genotypes are obtained via summations of these quantities. Finally, our computer program makes it possible to determine the probabilities of all the multilocus genotypes produced in such sibling-based RILs for L <= 10, a huge leap beyond the L = 3 restriction of Haldane and Waddington.
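
As an illustration of the 2-locus case mentioned above, the following minimal Python sketch evaluates the classical Haldane and Waddington sib-mating result R = 4r / (1 + 6r), where r is the per-meiosis recombination rate and R is the proportion of recombinant RILs. It assumes autosomal loci and, following the statement above, uses the average of the female and male rates when they differ. The function names are hypothetical and are not taken from the authors' program, which handles the general multilocus case.

```python
def sib_ril_recombinant_fraction(r_female, r_male=None):
    """Proportion R of recombinant 2-locus RILs under sib mating.

    Uses the Haldane-Waddington formula R = 4r / (1 + 6r), with r taken
    as the average of the female and male recombination rates.
    """
    if r_male is None:
        r_male = r_female  # same rate in both sexes
    r = 0.5 * (r_female + r_male)
    if not 0.0 <= r <= 0.5:
        raise ValueError("average recombination rate must lie in [0, 0.5]")
    return 4.0 * r / (1.0 + 6.0 * r)


def two_locus_ril_probabilities(r_female, r_male=None):
    """Probabilities of the 2^2 = 4 fixed genotypes of a 2-way RIL.

    Keys give the parental origin (A or B) at locus 1 and locus 2.
    By symmetry the two parental types are equally likely, as are the
    two recombinant types.
    """
    big_r = sib_ril_recombinant_fraction(r_female, r_male)
    return {
        "AA": (1.0 - big_r) / 2.0,
        "BB": (1.0 - big_r) / 2.0,
        "AB": big_r / 2.0,
        "BA": big_r / 2.0,
    }


if __name__ == "__main__":
    for r in (0.0, 0.01, 0.1, 0.5):
        print(r, sib_ril_recombinant_fraction(r))
```

Unlinked loci (r = 0.5) give R = 0.5, so all four genotypes are equally likely, while tightly linked loci give R close to 4r, reflecting the extra meioses accumulated during inbreeding.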

Single nucleotide polymorphisms (SNPs) are widely used for detecting quantitative trait loci (QTL) or for searching for causal variants of diseases. Nevertheless, structural variations such as copy-number variants (CNVs) represent a large part of natural genetic diversity and contribute significantly to trait variation. Numerous methods and software tools based on different technologies (amplicons, CGH, tiling or SNP arrays, or sequencing) have already been developed to detect CNVs, but they bypass a wealth of information such as genotyping data from segregating populations, produced, e.g., for QTL mapping. Here, we propose an original method to both detect and genetically map CNVs using mapping panels. Specifically, we exploit the apparent heterozygous state of duplicated loci: peaks in appropriately defined genome-wide allelic profiles provide highly specific signatures that identify the nature and position of the CNVs. Our original method and software can automatically detect and map up to 33 different predefined types of CNVs based on segregation data only. We validate this approach on simulated and experimental biparental mapping panels in two maize populations and one wheat population. Most of the events found correspond to having just one extra copy in one of the parental lines, but the corresponding allelic value can be that of either parent. We also find cases with two or more additional copies, especially in wheat, where these copies locate to homeologues. More generally, our computational tool can be used to give additional value, at no cost, to many datasets produced over the past decade from genetic mapping panels.
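
The idea of exploiting apparent heterozygosity can be illustrated with a deliberately simplified Python sketch. It assumes genotype calls coded as "A", "B", "H" (heterozygous) or "-" (missing) for each line of a panel of inbred lines, and it flags markers whose heterozygous call rate exceeds a threshold. The coding scheme, the function names and the threshold value are all assumptions made for illustration; this is not the authors' method, which relies on genome-wide allelic profiles and distinguishes up to 33 CNV types.

```python
def heterozygous_fraction(calls):
    """Fraction of non-missing calls at one marker that are 'H'."""
    observed = [c for c in calls if c in ("A", "B", "H")]
    if not observed:
        return 0.0  # no usable calls at this marker
    return sum(c == "H" for c in observed) / len(observed)


def flag_candidate_markers(panel, threshold=0.2):
    """Return markers whose heterozygous call rate exceeds the threshold.

    panel: dict mapping marker name -> list of calls, one per line.
    threshold: hypothetical cut-off; in a real panel it would depend on
    the residual heterozygosity expected for the population type.
    """
    return [
        marker
        for marker, calls in panel.items()
        if heterozygous_fraction(calls) > threshold
    ]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_panel = {
        "m1": ["A", "B", "A", "B", "A", "B", "A", "B"],
        "m2": ["H", "H", "A", "H", "B", "H", "H", "-"],
        "m3": ["A", "A", "B", "H", "B", "B", "A", "A"],
    }
    print(flag_candidate_markers(toy_panel))
```

In this toy panel only the second marker shows an excess of heterozygous calls; a real analysis would additionally use the map positions of the flagged markers to locate the extra copies.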

In this paper, we consider modelling interactions between a set of variables in the context of high-dimensional time series. We suggest two approaches. The first is similar to the neighborhood lasso, with the lasso model replaced by a support vector machine (SVM). The second is a restricted Bayesian network adapted for time series. We show the efficiency of our approaches through simulations using linear data sets, nonlinear data sets and a mixture of both.

Different types of Bayesian networks may be used for supervised classification. We combine such approaches with feature selection and discretization, and we show that this combination gives rise to powerful classifiers. A large selection of data sets from the UCI machine learning repository is used in our experiments, and an application to epilepsy type prediction based on PET scan data confirms the efficiency of our approach.