Description

Actions de Recherches Concertées 2011 - 2016
"Semiparametric Inference for Survival and Cure Models"

The goals of the research

When modeling time-to-event data (e.g., the time to death of a patient due to a certain disease), we typically assume that all subjects are at risk and will experience the event of interest if followed long enough. However, a typical feature of many medical applications is the possibility of "cure", in the sense that some of the subjects will never experience the event. Cure models are survival models that allow for a cured proportion of individuals. Moreover, measuring times to a certain event in practice naturally induces right censoring, meaning that one may only observe lower bounds for these quantities. For instance, a patient might leave the study, might still be alive at the end of the study, or might eventually die from another cause.
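
One common way to formalize these two notions, given here only as an illustrative sketch and not necessarily the formulation adopted in the project, is the mixture cure model combined with random right censoring. The symbols T (event time, set to infinity for a cured subject), C (censoring time), p (probability of being uncured) and S_u (survival function of the uncured subjects) are notation assumed for this example.

```latex
% Population survival function under a mixture cure model (assumed notation)
S_{\mathrm{pop}}(t) = P(T > t) = (1 - p) + p\, S_u(t),
\qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = 1 - p .

% Observed data under right censoring: only a lower bound on T is seen when \Delta = 0
Y = \min(T, C), \qquad \Delta = \mathbf{1}\{T \le C\} .
```

Under this formulation the population survival function levels off at the cured proportion 1 - p instead of decreasing to zero, and every cured subject is necessarily observed as censored.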
 
Combining censoring with the possibility of cure raises identifiability problems. Indeed, even when the follow-up period is long, it is hard to distinguish a censored individual in the uncured group from a cured individual. On the other hand, modeling such data is of great importance since they are encountered in many applications: beyond medicine, they appear in a wide variety of fields, such as sociology, economics, insurance, ecology and the applied sciences.
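
The difficulty can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not part of the project description: exponential event times for uncured subjects, uniform censoring times over the follow-up window, and a hand-written Kaplan-Meier estimator. The function names `simulate` and `kaplan_meier_plateau` are hypothetical.

```python
import random


def simulate(n, cure_prob, event_rate, followup, seed=0):
    """Simulate right-censored data with a cured fraction.

    Returns a list of (observed_time, delta, cured) tuples. The `cured`
    flag is known only because the data are simulated; in practice it is
    never observed.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        cured = rng.random() < cure_prob
        # Assumption: cured subjects never experience the event.
        event_time = float("inf") if cured else rng.expovariate(event_rate)
        # Assumption: censoring is uniform over the follow-up window.
        censor_time = rng.uniform(0.0, followup)
        observed = min(event_time, censor_time)
        delta = 1 if event_time <= censor_time else 0
        rows.append((observed, delta, cured))
    return rows


def kaplan_meier_plateau(rows):
    """Return the Kaplan-Meier survival estimate after the last observation."""
    # Sort by time; if times tie, events come before censored observations.
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    n = len(ordered)
    surv = 1.0
    for i, (_, delta, _) in enumerate(ordered):
        at_risk = n - i
        if delta == 1:
            surv *= 1.0 - 1.0 / at_risk
    return surv


rows = simulate(n=5000, cure_prob=0.3, event_rate=1.0, followup=10.0)
censored = [r for r in rows if r[1] == 0]
n_cured_censored = sum(1 for r in censored if r[2])
n_uncured_censored = len(censored) - n_cured_censored
plateau = kaplan_meier_plateau(rows)

print("censored and cured:  ", n_cured_censored)
print("censored and uncured:", n_uncured_censored)
print("Kaplan-Meier plateau:", round(plateau, 3))
```

In this sketch the censored group mixes cured and uncured subjects, yet the observed pair (time, indicator) carries no label telling them apart. With a follow-up that is long relative to the event times, the height of the Kaplan-Meier plateau still gives a rough idea of the cured proportion, which is the kind of assumption an identifiability discussion has to make explicit.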

In this project, we propose and study relationships between a survival time of the above type and certain explanatory variables (for example, the age of the patient on entering the study). More precisely, three attractive broad classes of regression models will be tackled, as well as the interactions between them. They concern very well-known and widely used ways to describe relations between random variables, namely quantile regression (e.g., study of the linear conditional median and quartiles), nonparametric location-scale models (e.g., study of the nonparametric conditional mean or truncated mean) and frailty models (e.g., extending Cox proportional hazards models to random effects).
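
For orientation, the three classes can be written in their usual textbook forms. These are generic formulations in assumed notation (T the survival time, X the covariates, Z an unobserved random effect), not the specific models developed in the project, and they ignore censoring and cure.

```latex
% Linear quantile regression: the conditional quantile of order \tau is linear in X
Q_\tau(T \mid X) = X^\top \beta(\tau), \qquad 0 < \tau < 1 .

% Nonparametric location-scale model: m and \sigma are unspecified smooth functions,
% and the error \varepsilon is independent of X
T = m(X) + \sigma(X)\,\varepsilon .

% Frailty model: Cox proportional hazards with a multiplicative random effect Z > 0
\lambda(t \mid X, Z) = Z\, \lambda_0(t) \exp(\beta^\top X) .
```

Taking \tau = 1/2 in the first display gives the linear conditional median, and setting Z = 1 in the third recovers the ordinary Cox proportional hazards model.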

As a first step, we will discuss the assumptions needed to identify the quantities appearing in the different models, in accordance with the complex data structure imposed by both the censoring and cure mechanisms. Next, depending on the case, specific methods will be developed to make inference on these quantities. While a number of completely parametric regression techniques have recently been adapted to treat this type of data, dealing with nonparametric components remains challenging in most situations. The introduction of some basic nonparametric estimators and, furthermore, their extension to the more complex models above could therefore have a great impact in practice as well as in statistical theory. Consequently, each aspect of the new methodologies will be investigated: simulations will be used to study the finite sample behavior of the different procedures, whereas asymptotic theory will be developed to obtain their large sample properties. Moreover, particular attention will be given to the analysis of real data sets, providing interpretable results for each new methodology and therefore making it more accessible to practitioners.