Time: 15:00 CET
Place: Online (Zoom)
If you are interested in attending the PhD defence of Amina Mollaysa, please send an email with the title "Thesis defense Mollaysa - Zoom" to email@example.com to receive the Zoom link.
Structural and Functional Regularization of Deep Learning Models
We investigate various regularization approaches for deep learning models. Because deep learning models can learn very complex functions using enormous numbers of parameters, appropriate regularization becomes all the more crucial. In this work, we are particularly interested in the setting where we have access to domain knowledge that can be used to design constraints on the learned model and thereby improve its generalization performance.
Throughout the thesis, we consider two types of domain knowledge: information about features’ intrinsic properties, which we refer to as feature side-information, and black-box software that can measure the properties of any instance sampled from the underlying domain.
In the first half of the thesis, we focus on regularizing predictive models using feature side-information. Feature side-information is most often ignored or used only for feature selection prior to model fitting. In this work, we propose to incorporate it into the learning of the predictive models, under the assumption that similar features should have a similar effect on the learned model. We present a regularizer that forces the learned model to be invariant/symmetric to relative changes in the values of similar features, where feature similarity is defined from the feature side-information. We give two ways to approximate the value of the regularizer: an analytical one, which boils down to imposing a Laplacian regularizer on the Jacobian of the learned model with respect to the input features, and a stochastic one, which relies on data augmentation. Experiments on a number of benchmark datasets show significant predictive performance gains over a number of baselines as a result of exploiting the side-information.
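As a rough illustration of the analytical approximation, the sketch below computes a Laplacian penalty on the Jacobian of a toy model. It is not the thesis implementation: the penalty form trace(J L J^T), the finite-difference Jacobian, the similarity matrix S, and all function names are assumptions made for this example.

```python
import math

def graph_laplacian(S):
    """Return L = D - S for a symmetric feature-similarity matrix S."""
    d = len(S)
    return [[(sum(S[i]) if i == j else 0.0) - S[i][j] for j in range(d)]
            for i in range(d)]

def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x: J[k][i] = d f_k / d x_i."""
    n_out = len(f(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for k in range(n_out):
            J[k][i] = (f_plus[k] - f_minus[k]) / (2 * eps)
    return J

def laplacian_penalty(J, L):
    """trace(J L J^T), i.e. 0.5 * sum_ij S_ij * ||J[:, i] - J[:, j]||^2."""
    d = len(L)
    return sum(row[i] * L[i][j] * row[j]
               for row in J for i in range(d) for j in range(d))

def make_model(w):
    """Hypothetical one-output model f(x) = tanh(w . x)."""
    return lambda x: [math.tanh(sum(wi * xi for wi, xi in zip(w, x)))]

# Assumed side-information: features 0 and 1 are similar, feature 2 is unrelated.
S = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
L = graph_laplacian(S)
x0 = [0.0, 0.0, 0.0]

# Similar features treated alike: the penalty is close to 0.
penalty_sym = laplacian_penalty(jacobian(make_model([1.0, 1.0, -2.0]), x0), L)
# Similar features treated in opposite ways: the penalty is close to 4.
penalty_asym = laplacian_penalty(jacobian(make_model([1.0, -1.0, -2.0]), x0), L)
print(penalty_sym, penalty_asym)
```

In training, a term of this kind would be added to the loss and averaged over data points, so the model is pushed to respond in the same way to features that the side-information marks as similar.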
In the second half of the thesis, we focus on tackling the inverse problem: generating discrete structures that exhibit a fixed set of properties, given black-box software that can evaluate the properties of any discrete structure from the underlying domain. Even though unconditional generation of discrete structures has been tackled very successfully, conditional generation remains challenging due to the discrete nature of the data. Existing methods are mostly limited to conditioning on binary classes. Conditioning on continuous properties is usually formulated as property optimization, where the algorithms look for discrete structures whose properties are enhanced. We investigate the use of conditional generative models that directly attack this inverse problem by modeling the distribution of discrete structures given the properties of interest.
Prof. Stephane Marchand-Maillet, thesis director
Prof. Alexandros Kalousis, thesis co-director
Prof. David Duvenaud, University of Toronto
Dr. Jennifer N. Wei, Google Brain research