Current Projects

  • Learning generative models for molecules

    Drug discovery is a well-known and challenging problem. Typically, one needs to navigate a vast chemical space of up to \( 10^{60} \) small organic molecules to find a potential drug candidate with the desired properties. Such a trial-and-error process is often cumbersome and error-prone. However, molecule-related data is abundant, and data-driven approaches are a promising way to expedite the molecule design stage, which is a pivotal part of chemical development. Machine learning has revolutionized many fields, including computer vision and natural language processing. Despite this success, machine learning-based algorithms are proficient only at solving the forward problem, in which a specific input structure has a unique associated property that we want to predict. In contrast, generating a new molecule that meets specific property requirements is an inverse problem, in which there is a one-to-many mapping from a required property to structurally diverse molecules. This mapping is highly challenging and is a fundamental aspect of molecule-related inverse problems. Furthermore, molecules are naturally represented as graphs, and direct optimization over this discrete representation to find a molecule with the target property is a hard problem. In this project, we aim to leverage generative models, a class of methods that has proven effective both for this problem and in the more general setting of inverse modeling. The project has two overarching objectives: to develop single-step generation of a molecule given some expected properties, and to perform style transfer. The latter objective seeks to turn a given molecule with known properties into a new molecule that exhibits some desired characteristics by slightly adjusting its structure. 
Previous studies solved these problems with a two-stage process that includes molecule property optimization in the second stage. However, this additional process is computationally prohibitive and requires retraining for different target values at inference time, which poses significant challenges. Another limitation of such ...
  • Automated Bridge Defect Recognition

    Infrastructure assets, such as bridges, need to be inspected regularly. Our objective is to reduce the need for human involvement, minimize risks to health and safety, decrease the impact of subjective engineering assessments, digitize asset management, and promote sustainable inspection practices. By achieving these goals, we aim to develop optimal maintenance strategies for infrastructure assets, leading to better long-term outcomes. The project proposes an innovative solution to bridge inspection and condition assessment that combines UAV (drone) flights with automated defect detection using AI. This approach represents a globally emerging trend in infrastructure management. The developed platform will allow inspections and condition assessments to take place directly on the bridge's digital twin, which will increase efficiency on both the inspection and maintenance sides. Targeted interventions can then be made, leading to prolonged life spans and enhanced sustainability. To achieve these benefits, the project will use raw images captured during UAV flights and automate the defect recognition process without any manual intervention. The project is funded by Innosuisse and is a collaboration of our team with the innovative company LeanBI and OST. The role of our group within the project is twofold: (1) Develop self-supervised learning models able to exploit the images acquired by the UAVs, reducing the annotation effort required for new downstream tasks (new bridge types, new defect types). The goal is to be able to efficiently train new segmentation models with a small number of annotated images. This would facilitate the development of new models for new types of infrastructure or defects. To this end, our research focuses on the latest methods in contrastive learning applied to image segmentation and on data augmentation. (2) Develop automatic out-of-distribution (OOD) detection methods that work with image segmentation models. 
Here, we first want to develop methods able to detect whether the analyzed sample (a bridge) is too different from the training samples of the model currently in use and, second, to adapt the models to the change.
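As a rough illustration of the detection step described above (flagging an image that differs too much from the training data), the sketch below scores an image by the mean of the highest softmax probability over its pixels and flags it when that confidence falls below a threshold. This is a common baseline, not necessarily the method developed in the project; the function names, the toy logits and the threshold of 0.7 are all assumptions made for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_max_softmax(pixel_logits):
    """Average, over pixels, of the highest class probability."""
    return sum(max(softmax(p)) for p in pixel_logits) / len(pixel_logits)

def is_out_of_distribution(pixel_logits, threshold=0.7):
    """Flag the image when the model is, on average, unsure about its pixels.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out in-distribution images.
    """
    return mean_max_softmax(pixel_logits) < threshold

# Toy per-pixel logits for 3 classes (hypothetical values).
confident_image = [[6.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 7.0]]
uncertain_image = [[0.1, 0.0, 0.2], [0.3, 0.2, 0.1], [0.0, 0.1, 0.0]]

print(is_out_of_distribution(confident_image))
print(is_out_of_distribution(uncertain_image))
```

A real segmentation model would supply one logit vector per pixel; the score could then also be computed per region to localise the unfamiliar parts of a bridge.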
  • MIGRATE – A Multidisciplinary and InteGRated Approach for geoThermal Exploration

    To mitigate climate change, our society must reduce the carbon footprint of fossil fuel energy and favor green energy solutions instead. Geothermal energy is a resource that is available in abundance. However, the development of this sector is hindered by insufficient subsurface information, which results in high exploration risks. To overcome these obstacles and enrich the existing knowledge of the subsurface, MIGRATE will develop machine learning methods to build new, innovative, and affordable exploration methods that find geothermal sources with lower drilling costs than previous methods. The MIGRATE project is a collaboration of our group with the Crustal Deformation and Fluid Flow research group of the University of Geneva, led by Matteo Lupi, and the team of Domenico Montanari from the CNR research institute. The project uses surface wave ambient noise tomography (SWANT), applied to data acquired by dense nodal networks, to capture the subsurface velocity structure of the upper crust. The aim of this approach is to improve the resolution at depth while reducing acquisition logistics and costs. The project will develop new analysis methods that remove the human bias currently present in SWANT, increase reproducibility and reliability, and identify potential geothermal targets. To give a concrete example: one step that needs to be performed within SWANT is picking a dispersion curve (the black dashed line in the figure on the right) in the group-velocity dispersion diagram. This curve is usually picked by domain experts, so the result suffers from human bias. One step towards automating SWANT with machine learning methods will be the automation of dispersion curve picking. The end goal is to obtain velocity maps from surface wave ambient noise data with automated machine learning methods that exploit uncertainty quantification and are sample-efficient. 
If successful, this approach may have implications for the understanding of geological processes such as seismogenic domains, volcanic systems, and ore deposits, and may promote the further development of other renewable solutions such as heat storage and geological carbon storage. The developed techniques will be used to investigate ...
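To make the dispersion curve picking task concrete, here is a minimal sketch of a hand-written heuristic: for each period it picks the group velocity with the highest energy, restricted to a small window around the previous pick so that the curve stays continuous. This is only an illustration of the task that the project wants to automate with machine learning, not the project's method; the function name, the `max_jump` parameter, the velocity grid and the energy values are all hypothetical.

```python
def pick_dispersion_curve(energy, velocities, max_jump=1):
    """Pick one group velocity per period from a dispersion diagram.

    energy[i][j] is the normalised energy at period i and velocity j.
    The first period takes the maximum of its whole row; each later
    period takes the maximum within `max_jump` velocity bins of the
    previous pick, which keeps the picked curve continuous.
    """
    picks = []
    prev = None
    for row in energy:
        if prev is None:
            candidates = range(len(row))
        else:
            lo = max(0, prev - max_jump)
            hi = min(len(row), prev + max_jump + 1)
            candidates = range(lo, hi)
        prev = max(candidates, key=lambda j: row[j])
        picks.append(velocities[prev])
    return picks

velocities = [2.0, 2.5, 3.0, 3.5]  # km/s, hypothetical grid
energy = [
    [0.1, 0.9, 0.3, 0.1],  # shortest period
    [0.1, 0.4, 0.8, 0.2],
    [0.9, 0.1, 0.5, 0.7],  # spurious peak at 2.0 km/s
]
curve = pick_dispersion_curve(energy, velocities)
print(curve)  # [2.5, 3.0, 3.5]
```

The continuity window is what lets the heuristic skip the spurious peak in the last row. Choices such as the window size are exactly the kind of expert judgment that a learned picker would replace.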
  • Interpretable Condition Monitoring for Complex Engineering Systems

    Just as it is important for us to monitor our own body in order to maintain good health, we should have an accurate understanding of the state of engineering systems in order to prevent failures. In this project, entitled "Interpretable Condition Monitoring for Complex Engineering Systems", we investigate machine learning methods to build a condition monitoring framework and apply it to real engineering systems such as artificial satellites and unmanned aerial vehicles. An important research question of the project is how to make condition monitoring methods interpretable for human operators. The lack of interpretability of machine learning models often hinders the effective integration of learning-based methods into real engineering systems. This is particularly evident in condition monitoring applications, where human operators must manually investigate the alarms raised by the monitoring system for reasons such as system safety. We address this issue through the idea of grey-box (hybrid) modeling, in which data-driven machine learning models and theory-driven expert models are combined. Expert models of engineering systems are interpretable by design but often cannot accurately predict the state of the system in real-world situations. Machine learning models can adapt to real-world situations based on data but lack interpretability. We take the best of the two regimes to build data-driven condition monitoring methods that are reasonably interpretable. The project is funded by the Swiss National Science Foundation under the Strategic Japanese-Swiss Science and Technology Program. It is an international collaborative project with a counterpart in Japan; we are working with the Artificial Intelligence Lab of the Research Center for Advanced Science and Technology, the University of Tokyo. Relevant publications from the DMML group: Naoya Takeishi, Alexandros Kalousis: Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling. 
Advances in Neural Information Processing Systems 34, 2021. Naoya Takeishi, Alexandros Kalousis: Deep Grey-Box Modeling With Adaptive Data-Driven Models Toward Trustworthy Estimation of Theory-Driven Models. arXiv:2210.13103, to appear in AISTATS 2023.
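The grey-box idea described in this project can be illustrated with a deliberately small toy example. It is not the method of the publications listed above, which use deep generative models. Here the theory-driven part is Hooke's law with a known stiffness, and the data-driven part is a single coefficient fitted by least squares to the residual that the expert model leaves unexplained. The spring system, the cubic correction term and all names and values are assumptions chosen for the sketch.

```python
def expert_model(x, k=2.0):
    """Theory-driven part: Hooke's law, F = k * x (k assumed known)."""
    return k * x

def fit_correction(xs, ys):
    """Data-driven part: least-squares fit of c in residual ~ c * x**3."""
    residuals = [y - expert_model(x) for x, y in zip(xs, ys)]
    num = sum(r * x**3 for r, x in zip(residuals, xs))
    den = sum(x**6 for x in xs)
    return num / den

def grey_box_predict(x, c):
    """Return the prediction together with its two components."""
    physics = expert_model(x)
    correction = c * x**3
    return physics + correction, physics, correction

# Synthetic measurements from a hypothetical stiffening spring.
xs = [-1.0, -0.5, 0.5, 1.0, 1.5]
ys = [2.0 * x + 0.3 * x**3 for x in xs]

c = fit_correction(xs, ys)
total, physics, correction = grey_box_predict(2.0, c)
print(round(c, 6), round(total, 6), physics, round(correction, 6))
```

Because the prediction is returned alongside its physics and correction components, an operator can see how much of an output comes from the expert model and how much from the learned term, which is the kind of interpretability that a grey-box model aims for.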
  • EO4EU: AI-augmented ecosystem for Earth Observation data accessibility with Extended reality User Interfaces for Service and data exploitation

    A vast amount of Earth Observation (EO) data is produced daily and made available through online services and repositories. Contemporary and historical data can be retrieved and used to power existing applications, to foster innovation and ultimately to improve EU citizens’ lives. However, only a small audience follows this activity, leaving huge volumes of valuable information unexploited. EO4EU aims to provide innovative tools, methodologies and approaches that help a wide spectrum of users, from domain experts and professionals to ordinary citizens, to benefit from EO data. EO4EU strives to deliver AI-based dynamic data mapping and labelling, adding FAIRness to the system and data. EO4EU introduces an ecosystem for holistic management of EO data, bridging the gap between domain experts and end users, and bringing to the foreground technological advances to address the market straightness towards a wider usage of EO data. EO4EU aims to boost the Earth Observation data market by providing digestible data information modeling for a wide range of EO data, through dynamic data annotation and state-of-the-art serverless processing that leverages important European Cloud & HPC infrastructures. The role of our group within EO4EU is linked to the machine learning tasks of the project, which can be summarized in two main directions: (1) The study and application of self-supervised learning models that will help us exploit the vast volume of unlabelled EO data, in order to minimize the annotation effort required in downstream supervised tasks. In this way, new use cases will be able to train classifiers efficiently with a significantly smaller budget of annotated/labelled data. This would make it easier for the wider public and for interested non-expert institutions and individuals to access and efficiently use these data. To achieve this, we explore the latest advancements in self-supervised contrastive learning. 
(2) The development of tailored Learned Compression models suited to the EO ecosystem. Based on our previous expertise in developing Learned Compression models, we will develop models that increase compression efficiency compared to standard, ...
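For the self-supervised contrastive learning direction mentioned in this project, the sketch below shows one widely used contrastive objective, the InfoNCE loss, in plain Python. The text does not say which objective the project uses, so the choice of InfoNCE is an assumption, as are the temperature value and the toy two-dimensional embeddings. Each anchor embedding should be most similar to the embedding of the other augmented view of the same image (same index) and dissimilar to the views of other images.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] and positives[i] are embeddings of two views of sample i;
    the positives of all other samples act as negatives for anchor i.
    """
    losses = []
    for i, a in enumerate(anchors):
        logits = [cosine(a, p) / temperature for p in positives]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Hypothetical embeddings of two images under two augmentations.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b_aligned = [[0.9, 0.1], [0.1, 0.9]]  # views of the same image agree
view_b_swapped = [[0.1, 0.9], [0.9, 0.1]]  # views are mismatched

print(info_nce(view_a, view_b_aligned) < info_nce(view_a, view_b_swapped))
```

Minimising this loss on unlabelled images pushes an encoder to produce the aligned configuration, after which a classifier can be trained on top of the embeddings with far fewer labels.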