Christian Fabian is a Ph.D. candidate at the Self-Organizing Systems Lab at the Technical University of Darmstadt and a Research Assistant at hessian.AI.
He completed a master’s degree in economics in 2020 and a master’s degree in mathematics in 2021. He has been with hessian.AI in Darmstadt since October 2021 and has been conducting research at the Self-Organizing Systems Lab under the supervision of Prof. Dr. Heinz Koeppl since March 2022.
In his research, Fabian focuses on “Reinforcement Learning for Mean Field Games.” The goal is to predict the collective behavior of very large groups of individuals using machine learning methods. This often involves hundreds of thousands or millions of actors.
Examples of such scenarios include the spread of a pandemic in the population, financial markets, biological swarm systems, and social networks. Models need to be developed that work for such large numbers of participants.
Standard reinforcement learning procedures reach their limits in such multi-agent scenarios. Fabian therefore specializes in so-called mean-field approximations, which provide a simplified description of these complex interactions.
Specifically, Fabian uses reinforcement learning in conjunction with so-called graphons. A graphon is a tool used in mathematics to examine and understand large, dense networks, such as social networks.
Instead of considering individual connections, such as who is friends with whom, a graphon provides a simplified overview. This allows Fabian to model how individuals are influenced by their network neighbors and how they spread information, even with a high number of participants.
In the case of a pandemic, for example, the method could map personal contacts through which infections spread. The model can then learn what protective measures individuals take depending on the infection status of their neighbors and make predictions for further development.
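The pandemic example can be sketched as a toy mean-field model. The sketch below is purely illustrative: the graphon, the infection and recovery rates, and the SIS-style dynamics are all made up for demonstration and are not taken from Fabian’s publications.

```python
# Illustrative sketch: a mean-field SIS-style epidemic on a graphon.
# All functions and parameter values here are hypothetical.

def graphon(x, y):
    """Example graphon: connection density between positions x and y in [0, 1]."""
    return 1.0 - max(x, y)

def sis_step(infected, beta=0.8, gamma=0.25):
    """One mean-field update: each position is infected via its graphon-weighted
    neighbourhood average and recovers at rate gamma."""
    n = len(infected)
    xs = [i / n for i in range(n)]
    new = []
    for i, x in enumerate(xs):
        # graphon-weighted average infection level of x's neighbourhood
        neigh = sum(graphon(x, y) * infected[j] for j, y in enumerate(xs)) / n
        s = 1.0 - infected[i]                       # susceptible fraction at x
        new.append(infected[i] + beta * s * neigh - gamma * infected[i])
    return new

n = 50
state = [0.1] * n          # 10% initially infected at every position
for _ in range(100):
    state = sis_step(state)
# Positions with denser neighbourhoods (small x under this graphon)
# end up with higher infection levels than sparsely connected ones.
```

The point of the graphon here is that the update never needs the individual contact list of each agent, only the averaged neighbourhood structure, which is what makes very large populations tractable.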
A current challenge Fabian is working on is making the graphon models more realistic. The structures used so far do not resemble real social networks closely enough to enable precise predictions.
In his publications, he has already taken steps in this direction. The next goal is to develop learning methods that can depict real structures of social networks. Another goal is to combine current models with Inverse Reinforcement Learning, for example, to explain the behavior of large groups of people in cognitive science.
hessian.AI provides a conducive environment for this research direction. Regular exchanges, conferences, and shared resources such as computing power enable the transfer of ideas between different areas of AI.
His area benefits from encountering real problems while delivering scalable algorithms that others can build on – a collaborative process that hessian.AI deliberately facilitates.
Potential application areas include, in addition to estimating the impact of public health measures on the dynamics of pandemics, predicting market behavior and systemic risks in the financial world, or controlling drone swarms.
Such models could help policymakers and businesses make informed decisions in the future – such as containing pandemics or regulating financial markets. AI systems that accurately model collective behavior have the potential to provide significant societal benefits.
Darmstadt, 18 October 2023. A team from TU Darmstadt has received an Amazon University Collaboration Award for its research in the field of machine learning. It is the second research award from the US company to go to TU Darmstadt in two years. The team will collaborate with AI researchers from Amazon Alexa in Berlin as part of the funding.
Many people have probably come into contact with chatbots and voice models in recent months, whether at work, at university or in the classroom. All of these models build on research in areas such as machine learning and natural language processing (NLP).
However, the rapid progress in natural language processing also leads to a paradoxical situation: even top scientists often lose track of the current state of research, because the scientific literature is simply growing far too fast. One reason is that research itself increasingly relies on machine learning algorithms and a flood of ever new language models, which further accelerate scientific output. Against this backdrop, even AI professionals find it difficult to keep up with the latest developments – and the plethora of scientific papers.
A team from TU Darmstadt, whose research project has been funded by Amazon since this year through an Amazon University Collaboration Award, is trying to solve this problem. The project aims to create a “virtual research assistant” that quickly and reliably helps researchers close their own knowledge gaps by answering their questions. The assistant will pick the right content out of the mass of accessible scientific literature and deliver it in dialogue with the user as a natural-language answer.
It is particularly important to the AI researchers to guarantee the accuracy of the information given. Many current chatbots repeatedly fail to give factually correct answers. For the virtual assistant, the researchers are working on a combination of large language models with so-called symbolic reasoning, which should prevent the generation of false information. This should also promote greater transparency in how the assistant arrives at its results.
The research project entitled “Modeling Task-oriented Dialogues Grounded in Scientific Literature” is being conducted at the Ubiquitous Knowledge Processing (UKP) Lab at TU Darmstadt in cooperation with Amazon Alexa. The first phase is planned for two years (2023-2025) with a budget in the low six-figure range. The financial support will also be used to fund a PhD student position. The project is led by Prof. Dr. Iryna Gurevych, head of the UKP Lab and founding member of hessian.ai.
The findings from the project could not only make the work of future AI researchers easier – they could also find application in other areas with rapidly growing knowledge bases, for example in medical research, where sifting through existing literature likewise takes a lot of time. The team expects this to have long-term, transformative effects for researchers and professionals.
The Amazon University Collaboration Award for the machine learning project is the second grant from the US company to go to TU Darmstadt within two years: Professor Jan Peters’ robotics research has already been funded with an Amazon Research Award since 2022.
Jannis Brugger studied natural sciences and computer science in Koblenz and Mainz, where he began to work on artificial intelligence.
He is currently doing his PhD at hessian.AI as part of the 3AI (The Third Wave of AI) project.
Jannis Brugger’s speciality is equation discovery, which involves deriving mathematical formulae from data sets.
Brugger gives a simple example: “If we have a data set with two bodies of mass, for example, we can try to derive the laws of gravity from it.”
Other fields of application are materials science, where better batteries could be developed, for example, or biochemistry, where formulas could be found that describe how molecules assemble.
Equation Discovery could therefore become a standard tool for scientists in the future, analysing data from thousands of experiments and generating a set of formulae that scientists can then study.
What is special about Brugger’s research is the neuro-symbolic focus: in his work, he combines neural networks with rules that describe the formation of formulae, i.e. a grammar.
The neural network analyses the data and searches within the grammar for new formulas that fit well with the patterns in the data.
One advantage of this method is that the researcher can interpret and transform the formulas found – and the network only outputs well-formed formulas.
This distinguishes his approach from other methods that, for example, rely exclusively on transformer models such as those on which ChatGPT is based.
The use of such models is promising, but without a grammar nothing prevents the AI model from outputting invalid syntax, such as several addition signs in a row. The grammar, on the other hand, defines a clear search space for the model to explore.
In addition, the grammar theoretically allows the integration of domain knowledge – that is, the knowledge that scientists have already gathered about a research area. This narrows the search space and can thus speed up the discovery of new, useful formulas.
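How a grammar constrains the search can be sketched with a toy example. The sketch below is purely illustrative and is not Brugger’s system: a tiny expression grammar generates only well-formed formulas, which are then scored against a data set whose underlying law is y = 2x.

```python
import random

# Illustrative grammar-guided equation discovery. The grammar guarantees
# that every sampled candidate is a syntactically valid formula.
GRAMMAR = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["x"], ["c"]],
}

def sample_formula(symbol="E", depth=0, max_depth=3, rng=random):
    """Expand the grammar randomly, forcing terminal rules past max_depth."""
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        rules = [r for r in rules if all(t not in GRAMMAR for t in r)]
    rule = rng.choice(rules)
    parts = []
    for tok in rule:
        if tok in GRAMMAR:
            parts.append(sample_formula(tok, depth + 1, max_depth, rng))
        else:
            parts.append(tok if tok != "c" else str(rng.randint(1, 5)))
    return "(" + " ".join(parts) + ")" if len(rule) > 1 else parts[0]

def fit_error(formula, data):
    """Mean squared error of a formula string on (x, y) pairs."""
    return sum((eval(formula, {"x": x}) - y) ** 2 for x, y in data) / len(data)

# Target law: y = 2 * x -- the kind of relationship we hope to rediscover.
data = [(x, 2 * x) for x in range(1, 6)]
rng = random.Random(0)
candidates = [sample_formula(rng=rng) for _ in range(500)]
best = min(candidates, key=lambda f: fit_error(f, data))
```

Every candidate is guaranteed to parse, which is exactly the property the text contrasts with unconstrained transformer decoding; domain knowledge would be integrated by adding or removing grammar rules.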
He says his position at hessian.AI allows him to interact with various experts from other fields in his work; the 3AI project now has more than a dozen PhD students working in different research groups.
The different groups often have a unique perspective on similar problems, he says; for example, there is overlap between equation discovery, program synthesis and proof finding.
Brugger sees a major challenge in his work in dealing with “noisy data”, i.e. data from the real world that contain measurement inaccuracies or come from different sources.
Successful equation discovery is therefore still largely limited to “perfect data”, and further research is needed to realise the promise of the method.
If successful, Equation Discovery could change science – and thus our society – forever.
Dr Martin Mundt studied physics and worked on neuroinspired models in his master’s thesis. He received his doctorate in computer science from Goethe University Frankfurt in 2021.
He then moved to the AIML Lab at TU Darmstadt as a postdoc and has been Junior Research Group Leader of the Open World Lifelong Learning (OWLL) group at TU Darmstadt and hessian.AI since 2022.
Mundt and his OWLL group are working on the question of how AI systems can learn throughout their lives. Today’s AI methods train AI models on fixed data sets, but the real world does not deliver data that always stays the same, says the researcher.
In the field of lifelong learning, Mundt is therefore looking for approaches that can learn continuously from the constantly changing data of a real application.
The problem: Modern machine learning methods are very unstructured and the systems learn individual elements of the data independently. Humans, on the other hand, learn in a structured way, says Mundt, for example first the easier concepts of a new language and then the harder ones that build on them.
However, this approach does not work easily with current methods: when an AI model learns a new concept, it usually forgets large parts of what it learned before – a phenomenon known in the field as “catastrophic forgetting”.
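Catastrophic forgetting can be demonstrated with a deliberately tiny example, which is not related to Mundt’s actual methods: a one-parameter model is trained on one task and then on a second, after which its error on the first task shoots up.

```python
# Toy demonstration of catastrophic forgetting with a one-parameter model y = w * x.

def train(w, data, lr=0.05, steps=200):
    """Plain gradient descent on squared error for y = w * x."""
    for _ in range(steps):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x   # gradient of (w*x - y)^2
    return w

def error(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

task_a = [(x, 2.0 * x) for x in (-1.0, 0.5, 1.0)]    # task A: y = 2x
task_b = [(x, -1.0 * x) for x in (-1.0, 0.5, 1.0)]   # task B: y = -x

w = train(0.0, task_a)
err_a_before = error(w, task_a)   # near zero: task A is learned
w = train(w, task_b)              # sequential training on task B...
err_a_after = error(w, task_a)    # ...overwrites task A entirely
```

Because the single parameter is simply pulled toward the new task, nothing preserves the old solution; lifelong-learning methods add exactly the structure this toy model lacks.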
Mundt is therefore looking for a holistic approach to lifelong learning that allows AI systems to learn in a structured way without forgetting, and furthermore to always decide whether new data really contain new concepts.
“That is the fundamental idea of my research: how can I learn continuously in a structured way and make robust decisions at the same time?” says Mundt. For this, he says, the current frontier of machine learning must be pushed and also combined with symbolic systems.
Mundt sees the biggest challenge in the field of lifelong learning in finding a suitable approach that does not tackle the problem too broadly – i.e. does not require general intelligence right away – but also does not focus only on one problem, which is then no longer compatible with other systems.
That is why transparency in the research community is also important, he says: work in his field must clearly define which problems it solves and where it stops working.
To support this process, Mundt has published a comprehensive overview of the last 30 years of AI research that highlights the different aspects of lifelong learning and proposes a transparency map on which researchers can show the different dimensions of their work and improve comparability.
Martin Mundt is also a board member of the non-profit organisation Continual AI, where he supports, among other things, the development of Avalanche, an end-to-end library for continuous learning.
hessian.AI is a central enabler for his research, says Mundt, unique in its organisation of researchers and incredibly helpful in the exchange with these experts from different AI fields. This exchange and the shared resources, such as access to data centres, are a basic prerequisite for his research.
His research highlights the limitations of current AI systems, which require large amounts of data, energy and other resources, and could lead to approaches that move from static to lifelong learning – producing AI systems that are not tuned to specific benchmarks but are directly relevant to the application.
AI systems that recognise whether they already know something could also help reduce current problems such as hallucinations in language models.
Finally, continuous learning systems can also enable a participatory process, says Mundt, more specifically the inclusion of different population groups in the learning and updating process of AI systems.
Prof. Dr Thorsten Papenbrock is a Qualification Professor at Philipps-Universität Marburg and heads the Big Data Analytics group there.
Papenbrock studied IT Systems Engineering at the Hasso Plattner Institute, where he completed his doctorate in 2017 and subsequently worked as a senior researcher. In 2021 he followed a call to Marburg.
Prof. Dr Thorsten Papenbrock and his Big Data Analytics group are researching “up and down the whole Big Data stack”, as he says. In addition to classic topics such as data cleansing, data integration or distributed computing, his focus is currently on data analytics with an emphasis on data profiling. The goal: to improve databases and data management.
For example, he is researching methods to find structural metadata that describe “what the data looks like, how it works, how it is connected”. In data profiling, he looks for rules that the data follow. These often require very complex algorithms, “but that’s the beauty of it, we’re looking for the challenge,” says the researcher.
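One classic kind of rule sought in data profiling is the functional dependency. The sketch below is a deliberately naive illustration with made-up address data, not one of Papenbrock’s algorithms: it checks which single-column dependencies A → B hold in a small table, i.e. whether rows that agree on A also agree on B.

```python
from itertools import permutations

def holds(table, lhs, rhs):
    """Check the functional dependency lhs -> rhs on a list of row dicts."""
    seen = {}
    for row in table:
        key = row[lhs]
        if key in seen and seen[key] != row[rhs]:
            return False                 # two rows agree on lhs but not rhs
        seen[key] = row[rhs]
    return True

def profile(table, columns):
    """Return all single-column functional dependencies that hold."""
    return [(a, b) for a, b in permutations(columns, 2) if holds(table, a, b)]

table = [
    {"zip": "64289", "city": "Darmstadt", "street": "Hochschulstr."},
    {"zip": "64289", "city": "Darmstadt", "street": "Alarich-Weiss-Str."},
    {"zip": "35037", "city": "Marburg",   "street": "Biegenstr."},
]
fds = profile(table, ["zip", "city", "street"])   # e.g. zip -> city holds
```

Real profiling algorithms must scale this pairwise check to many columns and millions of rows, which is where the complexity Papenbrock mentions comes from.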
“In terms of content, we are anchoring ourselves in green IT,” says Papenbrock. He sees computer science as a discipline that can help speed up processes. His research can, for example, make data warehousing – a major source of emissions, he says – more efficient. In concrete terms, the aim is to process large amounts of data more quickly, i.e. to speed up the profiling, cleansing and integration of data and thus make these steps more energy-efficient.
Papenbrock is therefore working with companies such as BMW and Rolls-Royce or in public projects of the BMWi or the NFDI to improve such data processes.
In the BMW Analytics project, for example, he is researching how predictive maintenance of motorbikes is possible based on unreliable sensor data. The AKITA project is about developing more efficient aircraft turbines that emit less CO₂ and noise. To this end, the researchers are developing systems to detect anomalies in engine test data. This should shorten Rolls-Royce’s test cycles and thus accelerate the development of better turbines.
In the Precision LDS project funded by the BMWi, AI methods are to be developed to support the surface coating process for offshore wind turbines. The coating is intended to make the plants resistant to salt water and is often checked manually for defects. Papenbrock’s research is intended to make this process more efficient: “Then no one will have to go over five kilometres of coating surface once with a test strip, but instead you will have an AI algorithm that says: look again at such and such a spot”.
In addition to the sometimes enormous complexity of the algorithms, Papenbrock cites the exchange of data between industry and research or between research and the public as a major challenge. Here, good networking, as offered by hessian.ai, is an advantage.
The centre helps him to network with the Hessian AI research and data science scene and to make his work visible. In addition, researchers can access computing resources that are scarce at universities.
But he also sees hessian.AI as an opportunity to contribute his own research topics and help shape the centre, because his work could make an important contribution to AI research, where the principle of “garbage in – garbage out” applies. With his methods, the often faulty or incomplete data sets can be processed better – and thus more efficiently. In this way, his work also directly promotes green IT.
According to Papenbrock, this involves the acceleration of processes and the intelligent control of systems such as heating and air conditioning systems. “Here I also see my task as communicating what is possible in the area of green IT and how it can be implemented,” says the data expert.
Prof. Dr. Visvanathan Ramesh is Professor of Software Engineering with a focus on “Bio-inspired Vision Systems” at the Goethe University Frankfurt and heads the “Center for Cognition and Computation” there.
After studying in India and the USA, he completed his doctorate in 1995 at the University of Washington on systematic methods for quantifying the limits of image processing systems. Ramesh then moved from academia to industry: from 1995 to 2011 he worked at Siemens Corporate Research in Princeton, where he rose from technical staff to Global Technology Field Leader. His focus was on real-time vision systems and machine vision.
In 2011, Ramesh followed the call to Frankfurt. His current research focus is transdisciplinary research in the field of systems science and the development of intelligence.
Over the past decade, the increasing availability of Big Data and AI tools has significantly accelerated the development and application of AI systems in the real world – and with it the need for methods to make such AI systems safe.
For Prof. Dr. Visvanathan Ramesh, the solution to this task lies in a holistic, transdisciplinary systems perspective that combines traditional model-based thinking with modern, data-driven machine learning.
In doing so, Ramesh builds on a thirty-year research base and develops scalable AI designs that map the context of the respective “world”, the tasks and the performance requirements to transparent, explainable and cognitive architectures for each application domain.
For example, Ramesh is researching how AI systems can be developed for real-world use in various industries. Statistical image-processing methods from deep learning are an important part of this – but more is needed.
“My research has always been about how to develop machine vision systems in a principled way, with a clear understanding of where the limitations of the system are,” says Ramesh. “Ultimately, the system is built for a specific purpose.”
This requires a holistic systems approach that models the world in which the system is intended to function – including the questions the system is intended to answer and the performance it is intended to deliver.
“Such systems also need to communicate when they stop working,” says the scientist. Only in this way are robust AI systems for real-world use possible.
In his research, Ramesh models application domains – or worlds – for different problems. One example: the automated inspection of a bridge by a drone looking for cracks in the structure.
“I have certain principles according to which I can design the system. In the bridge example, that means asking: what are the bridge’s properties, what will I see there, which camera sensors do I use, and so on,” says Ramesh. “I can pull this information from science, for example, and use it to construct a clear contextual model of this ‘world’.”
The model can then be used to infer, for example, which images the AI system will see, which are important and which can be ignored.
In doing so, the variables that are relevant to the problem are just as important as those that are not – so the system knows what to look out for. He and his team then build causal and probabilistic models. Modern deep-learning methods such as unsupervised learning can help with the latter, for example in simulations.
“We combine classical engineering methods with neural networks,” the scientist explains. The resulting systems are often also called hybrid AI systems.
Ramesh also exchanges ideas on this topic with other scientists at hessian.AI, such as Prof. Dr Kristian Kersting and Prof. Dr Constantin Rothkopf, both from TU Darmstadt.
Ramesh describes the development of intelligent architectures with an algorithmic core that enable continuous learning systems as a central challenge. Continuous learning is a topic that other hessian.AI researchers are also working on, including Martin Mundt, who heads the “Open World Lifelong Learning” group at the TU Darmstadt and hessian.AI and has a PhD from Ramesh’s “Center for Cognition and Computation”.
He also sees the integration of humans and machines as an important task in the long term: “AI and humans should resonate with each other, we should integrate naturally, complement each other and develop together,” says the scientist. AI must understand humans and humans must understand AI.
His work embeds AI systems in context so that they do their job robustly and predictably. In this way, they can be used safely in critical areas such as healthcare, and even increasingly scaled-up systems can remain verifiable in the future, Ramesh concludes.
Dr. Simone Schaub-Meyer is an expert in computer vision and conducts research at the interface of computer vision, computer graphics and machine learning.
Schaub-Meyer completed her doctorate at ETH Zurich in 2018 in collaboration with Disney Research Zurich. The scientist then conducted research on augmented reality technologies as a postdoc at the Media Technology Lab at ETH Zurich.
In 2020, she moved to the Visual Inference Lab at TU Darmstadt as a postdoc. Since 2021, she has been the Junior Research Group Leader of the “data-Efficient Video Analysis” (EVA) group there.
EVA was founded as a DEPTH research group as part of the hessian.AI cluster project “The Third Wave of Artificial Intelligence – 3AI”, funded by the Hessian Ministry of Science and the Arts.
Dr. Simone Schaub-Meyer is researching data-efficient, robust and controllable methods of video analysis, i.e. methods that extract data from videos and can then be used for video interpolation, for example.
Her methods should be efficient in two respects, explains Schaub-Meyer: they must be computationally efficient – because at a resolution of 4K, for example, huge amounts of data have to be processed – and they should understand video content with as few annotations as possible.
These annotations are usually created manually by humans to make the image content understandable for computers. For example, in an image showing a cat on a table, both objects are labelled “cat” and “table”. This method enabled large datasets like ImageNet and thus the triumph of supervised machine learning in computer vision.
In the meantime, however, self-supervised methods have become established, in which AI models are trained on billions of images without manually created labels and then fine-tuned to their respective field of application using specialised labelled data sets.
Schaub-Meyer is researching how such algorithms are used in video analysis and how they can learn with fewer labels.
Without labels, other signals are needed for learning. Schaub-Meyer is therefore researching methods with which temporal relationships in videos can be efficiently and robustly extracted, represented and used for various applications, such as representing movements in video analysis, synthesising new video images or segmenting and tracking objects in videos.
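As a point of reference for what learned interpolation methods improve on, here is the naive baseline, purely for illustration and with made-up frame values: blending two frames linearly, with no awareness of motion. Learned methods replace this blend with motion-aware warping derived from the temporal relationships described above.

```python
# Naive baseline for video frame interpolation: pixel-wise linear blending.
# Frames are represented as 2D lists of grey values in [0, 1].

def interpolate(frame_a, frame_b, t=0.5):
    """Synthesise an intermediate frame as (1 - t) * A + t * B per pixel."""
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]

# A bright bar moves from the bottom row to the top row between the frames:
frame_a = [[0.0, 0.0], [1.0, 1.0]]
frame_b = [[1.0, 1.0], [0.0, 0.0]]
mid = interpolate(frame_a, frame_b)
# Linear blending smears the bar across both rows instead of moving it,
# which is exactly the failure motion-aware methods address.
```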
She also wants to investigate such representations in the diffusion models that are currently in widespread use, such as Stable Diffusion. Such diffusion models have many advantages, but also problems, says Schaub-Meyer. Researchers now need to find out what such models understand, what biases exist in the network, what problems they can solve and where their limits lie.
Her EVA research group was founded by hessian.AI. Schaub-Meyer appreciates the interdisciplinary collaboration that the centre facilitates, the exchange with other researchers and the financial support.
The scientist sees a major challenge in developing models that “really solve the problem and don’t do something unpredictable”. To do this, she says, a better understanding of such large AI models must be developed, their interpretability improved and the models made more robust. In this way, trust in the models can also be strengthened – a central challenge if they are to be used in critical areas.
Roshni Kamath is a PhD student at Technische Universität Darmstadt. She completed her bachelor’s degree in information technology software engineering at the University of Mumbai and holds a master’s degree in Artificial Intelligence from Katholieke Universiteit (KU) Leuven.
Prior to joining the Open World Lifelong Learning research group at TU Darmstadt in June 2022, she worked as a software engineer, and was an AI researcher at Forschungszentrum Jülich.
Imagine you are learning to play a musical instrument. As you progress, you learn new songs and techniques. At the same time, you’re likely to remember and be able to play the pieces you’ve learned before.
This ability to accumulate and retain knowledge and skills over time is a fundamental aspect of human learning. To bring this same capability to AI systems, Kamath’s research focuses on “continuous learning in AI.”
Her goal is to better understand the mechanisms of AI models and optimize them to efficiently incorporate new information into existing knowledge, thereby improving their performance over time.
One aspect of her research addresses a major challenge in continuous learning: “catastrophic forgetting.” When a model trained on one task is presented with new data for a different task, it can forget what it has learned. This leads to a drop in performance. Kamath’s research focuses on understanding this phenomenon and finding ways to prevent it.
For example, when training a model with brain images from different hospitals, variations in resolution, metadata, and acquisition process can affect model performance and catastrophic forgetting. Kamath wants to go beyond simply measuring the accuracy of an AI model and focus on the multifaceted factors that can influence learning and forgetting, an approach that aims to resemble real life.
“The objective of my work is to create new techniques and establish links between different machine learning paradigms,” says Kamath. Her goal is to produce advanced AI systems that possess the ability to independently learn in an unbounded environment. These systems can learn continually, detect novel situations successfully, and decide on the data to use for training.
In this context, Kamath’s research involves teaching models about themselves, the data on which they were trained, and what they are capable of. This is part of creating a framework for models to prevent catastrophic forgetting in open-world lifelong learning scenarios.
For Kamath, this means balancing different approaches and taking a holistic view. Transfer learning, for example, enables AI models to leverage knowledge from previous problems as they face new challenges. By transferring knowledge between tasks, AI systems can adapt more efficiently to new data and situations without having to start from scratch.
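The transfer-learning idea can be sketched as follows, purely illustratively and not as Kamath’s method: a “frozen” feature extractor standing in for a pretrained model is reused on a new task, so only a small classifier head has to be fitted on the few new examples. All names and data here are hypothetical.

```python
# Illustrative transfer learning: reuse a fixed feature extractor,
# fit only a nearest-centroid "head" on the new task.

def pretrained_features(x):
    """Stand-in for a frozen, pretrained feature extractor."""
    return (x, x * x)

def fit_head(data):
    """Fit a nearest-centroid classifier on top of the frozen features."""
    sums = {}
    for x, label in data:
        feats = pretrained_features(x)
        cur, n = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = (tuple(c + f for c, f in zip(cur, feats)), n + 1)
    return {lab: tuple(c / n for c in cs) for lab, (cs, n) in sums.items()}

def predict(head, x):
    feats = pretrained_features(x)
    return min(head, key=lambda lab: sum((f - c) ** 2
                                         for f, c in zip(feats, head[lab])))

# New task with only a handful of labelled examples:
new_task = [(-2.0, "neg"), (-1.0, "neg"), (1.0, "pos"), (2.0, "pos")]
head = fit_head(new_task)
```

Because the extractor is never retrained, far less computation (and data) is needed per new task; the privacy concern in the next paragraph arises when the reused representation itself carries information from earlier tasks.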
According to Kamath, this can also raise concerns, for example about data privacy. She notes: “In a hospital setting, many patients are not comfortable with storing or sharing their data. Fine-tuning a model for a new task by transferring knowledge can risk inadvertently leaking information from previous tasks, even if it was anonymized or initially non-sensitive.”
Her research aims to find a balance, particularly given the impact of lifelong learning in AI systems on energy consumption. Improving the performance of models can lead to energy savings and more sustainable AI systems, particularly those involving large-scale machine learning models that require significant computational resources.
For example, transfer learning can reduce the amount of training time and computational resources required for each new task by using pre-trained models or shared knowledge between tasks, which can save energy. Kamath regularly attends meetups with the hessian.AI network, sharing ideas with other researchers and learning about their challenges. These interactions provide valuable insights into the AI community’s questions and allow her to improve and focus her research, furthering the development of continuous learning in AI.
Prof. Dr Hilde Kuehne conducts research at Goethe University Frankfurt in the Department of Computer Science and at the MIT-IBM Watson AI Lab. She studied Computational Visualistics at the University of Koblenz-Landau and did her PhD at the Karlsruhe Institute of Technology.
Robots that work collaboratively with humans and always have the right tools at hand – what a human is capable of, a machine can also do in theory. To do this, it must understand human actions and be able to analyse movements.
Kuehne teaches AI how to see. Action recognition began with the automatic classification and recognition of simple video sequences such as clapping or jumping. Today, Kuehne trains large neural networks that can automatically recognise and classify a wealth of human movements and activities. For her video database on human motion recognition, she was awarded the TPAMI Test-of-Time Award in 2021.
The challenge: video data is complex. The same action can involve different motion sequences and should still be reliably recognised by an AI. For Kuehne, the solution lies in multimodal learning, which incorporates text and audio alongside video during training. This additional data helps the AI become more autonomous – a process similar to human learning. The more human-like its understanding of the data becomes, the more realistic its informative value, which in turn improves motion detection in the real world.
Kuehne relates the multimodal data to each other in a coordinate system. Movements in videos, spoken language and texts are sequenced, coded and end up in a “multimodal embedding space”. In this space, the data are sorted according to their semantic proximity. In this way, with increasing data, a universe of information is created that recognises movements and activities in the real world better, faster and increasingly independently without manual input.
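The embedding-space idea can be illustrated with a toy retrieval example. The vectors below are made up and three-dimensional rather than coming from any real model: items from different modalities live in one shared space and are compared by cosine similarity, so a text query can find the semantically closest video clip.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-d embeddings of video clips in a shared multimodal space:
clips = {
    "clapping": (0.9, 0.1, 0.0),
    "jumping":  (0.1, 0.9, 0.1),
    "waving":   (0.0, 0.2, 0.9),
}
# Hypothetical embedding of the text query "a person clapping":
query = (0.8, 0.2, 0.1)
best = max(clips, key=lambda name: cosine(query, clips[name]))
```

In a trained system, the embeddings are produced by modality-specific encoders that are optimised so that semantically related video, audio and text land close together, which is what makes this kind of cross-modal lookup work without manual input.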
For AI to be able to perceive the world in a similar way to a human, it needs large amounts of data, but recording and classifying them is time-consuming and expensive. Current models are trained with about one million videos – too few for Kuehne. She sees a problem in scaling: “Data on the internet only represent a part of the real world, not the whole world.” Videos on YouTube or Instagram are often similar in their visual language and the neural network cannot exploit its potential.
Kuehne sees one solution in synthetic data, with which the amount of training data can be increased and varied almost without restriction. Videos from the internet, by contrast, often have an uneven gender distribution or do not sufficiently depict the movements and activities that occur in the real world.
In order to exploit the full potential of AI, Kuehne works together with scientists from other disciplines, such as linguistics. She is convinced: “Research is fundamentally becoming more collaborative. The days when every researcher had his or her own silo and did research exclusively in that silo are over.” That’s why she appreciates the hessian.AI network.
Kuehne integrates ideas from the vision, language, text and machine learning communities into her research: “The real value lies in the exchange between the different disciplines. For action recognition and multimodal learning, the connection of different modalities is indispensable: only in this way is it possible to integrate the different perspectives on the world into AI development.”
The researcher therefore combines language and visuals with different levels of information so that AI systems function better and are useful for users: for example, assistance systems that recognise falls and support people in everyday life, or monitoring systems at underground stations that recognise people on the tracks and stop train traffic in time. AI is supposed to be the eye that humans lack in some places.
Prof. Dr Dominik Heider is a professor at the Philipps University of Marburg and an expert in bioinformatics. After studying computer science with a minor in biology, Heider completed his doctorate in 2008 in the Department of Experimental Tumour Biology and the Institute of Computer Science at the University of Münster.
Heider then moved to the University of Duisburg-Essen as a postdoctoral researcher in the field of bioinformatics, where he habilitated in 2012. After a brief excursion into industry, he accepted a professorship at the Straubing campus of the Technical University of Munich in 2014. In 2016, he moved to Philipps-Universität Marburg, where he heads the interdisciplinary Heiderlab.
Prof. Dr Dominik Heider develops AI solutions for biomedical problems such as predicting drug resistance in pathogens or modelling diseases.
No easy undertaking: For one thing, AI methods can rarely be easily applied to existing biomedical data, and for another, computer science and biomedicine speak different languages.
“Clinical data are very heterogeneous, ranging from image data to sequence data of the microbiome, for example,” says Heider. An AI method therefore cannot simply be taken off the shelf and applied directly.
In addition, researchers are often dealing with small data sets from only a few patients, frequently with a large class imbalance: in the case of rare diseases, for example, a handful of patient samples stands against large amounts of data from healthy people. The data sets are also often incomplete; values that were not measured are missing, or there are measurement errors.
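Both problems, class imbalance and missing values, can be addressed in a single modelling pipeline. The following sketch uses synthetic stand-in data and standard scikit-learn components; it illustrates the general idea, not Heider's actual data or methods.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for a rare-disease data set: 200 healthy samples
# but only 10 patients, each with 5 measured parameters.
X_healthy = rng.normal(0.0, 1.0, size=(200, 5))
X_sick = rng.normal(1.5, 1.0, size=(10, 5))
X = np.vstack([X_healthy, X_sick])
y = np.array([0] * 200 + [1] * 10)

# Randomly delete ~10% of values to mimic unmeasured parameters.
mask = rng.random(X.shape) < 0.10
X[mask] = np.nan

# Pipeline: impute missing values, then weight classes inversely to
# their frequency so the 10 patients are not drowned out by the 200
# healthy samples.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    LogisticRegression(class_weight="balanced"),
)
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```

Without the `class_weight="balanced"` option, a classifier could reach 95% accuracy on such data simply by always predicting “healthy”, which is exactly the failure mode imbalance-aware methods are meant to avoid.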
“We have to take all these aspects into account if we want to apply AI methods to this data,” says the scientist. He and his team are therefore focusing on the development of AI methods tailored to the biomedical field.
AI offers the chance to bring together researchers from different fields and to get to know AI methods in other contexts. This helps introduce new methods into bioinformatics; conversely, bioinformatics brings real-world problems into fields that otherwise work with “toy examples”, says Heider.
Heider sees a major challenge in communication between computer science and biomedicine: “Computer scientists and medical doctors do not have a common language; they do not understand each other,” says the scientist. And even within medicine there are different languages.
A common language must therefore be developed at the interface, along with a basic understanding of biomedicine and its questions: which parameters in a medical data set are important, for example, and which are primarily measured.
Heider is involved in several projects funded by the EU, the Federal Ministry of Education and Research and the German Academic Exchange Service. These include the Deep-iAMR collaborative project, running until 2023, which researched AI methods that can predict and classify antibiotic resistance mechanisms in newly sequenced bacterial genomes and identify potential new antibiotic targets.
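To apply machine learning to newly sequenced genomes at all, the raw DNA sequence first has to become a fixed-length feature vector. A common representation for this is the k-mer count vector, sketched below; this is an illustrative example, not necessarily the featurisation used in Deep-iAMR.

```python
from collections import Counter
from itertools import product

def kmer_counts(sequence, k=3):
    # Turn a DNA sequence into a fixed-length feature vector by counting
    # every overlapping substring of length k (a "k-mer"). The vector has
    # one entry per possible k-mer over the alphabet A, C, G, T, so
    # sequences of any length map to vectors of the same size.
    alphabet = "ACGT"
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts.get("".join(p), 0) for p in product(alphabet, repeat=k)]

vec = kmer_counts("ACGTACGTGG", k=3)
print(len(vec))   # 4**3 = 64 possible 3-mers
print(sum(vec))   # 8 overlapping 3-mers in a 10-base sequence
```

Vectors like these can then be fed to standard classifiers that learn which sequence patterns are associated with a resistance phenotype.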
In the current “Deep Legion” project, Heider and other scientists are developing AI methods to identify virulence factors of the Legionnaires’ disease pathogen.
This research is enormously important because, according to WHO estimates, the number of annual deaths from resistant infections will rise to 10 million by 2050, says Heider. That is roughly equivalent to the number of cancer deaths per year.