Interview with Werner von Seelen
Benjamin Paaßen
November 19, 2015

This is a preprint of the publication [1], as provided by the author.

Werner von Seelen studied power systems engineering in Hannover, where he also completed his doctoral dissertation on the topic “Information Processing in Homogeneous Networks of Neural Models.” In 1972, he became a professor of biomathematics at the University of Mainz. From 1980 to 1983, he served as the president of the German Society for Cybernetics. In 1989, he helped to found the Institute for Neuroinformatics at the Ruhr University Bochum, where he held the Professorship of Neural Computation and Theoretical Biology until his retirement in 2001. Werner von Seelen was awarded the Karl-Küpfmüller-Ring by the Technical University of Darmstadt in 1995; he was one of only 11 researchers to be distinguished with this award since 1977. In 1998, he received the Karl Heinz Beckurts Award for his achievements in the field of neuroscience and its application to intelligent systems. His areas of expertise range from neuroscience to computer science, from the theory of dynamic systems to applications in pattern recognition, robotics, and driver-assistance systems. Currently, Werner von Seelen is working on the “Autonomous Learning” priority programme funded by the German Research Foundation (DFG), among other activities.

KI: In your view, what does “autonomy” mean as a characteristic of biological systems?

Biological systems exist in a world that, in principle, cannot be fully described by models, and they have to survive in it. Now comes the question: how can such a problem be solved? How can survival be ensured? The second issue is whether the structure I am using is a sensible one, one with which I am able to represent information. The third enormous problem is that we must have objective functions. This is the task of evolution.
When we consider all of these boundary conditions and constraints, the question of autonomy becomes more like: when is something still autonomous? Under biological boundary conditions, autonomy means the following: can I use the information that I have already acquired, about the part of the world that I have to analyse, in such a way that I am not only able to react quickly enough, but can also take the appropriate actions in order to make it through the situation? Here, autonomy is related to the question: am I able to behave in a way that ensures my survival? In other words, I am first able to perceive that part of the environment that is now relevant, and secondly, I can put the knowledge that I have acquired to use in order to appropriately represent the situation. Finally, I have the option to choose how I react.

KI: How would you apply the term autonomy to technical systems?

When we were talking about autonomy in the previous context, the formal, theoretical aspect of this process was slightly obscured. Autonomy requires two things: first, that you can organise yourself, meaning that you are able to develop structures for thought and for forming representations; and second, you must be able to continuously adapt during the course of your life, meaning that you change your system. If we now shift to the technical side, it would mean that self-organisation would be a general expression of autonomy. Now I can ask: what is self-organisation? This is an exceedingly difficult question because you must do two things: you need one system from among all possible systems, and you need an objective function that is quantifiable – not just “survive”. Most structures do not achieve this. This means that evolution first “searched for” islands in which self-organisation was even possible. The answer is the brain. In the brain, an example is the two-dimensional networks that extend through the entire cortex.
This means that the actual reason for these networks being this way is not some small technical informational advantage, but rather the ability to learn.

What do we need this for in engineering? We could come up with ideas for taking over more and more human functions that are characterized as taking place in a natural environment. On the other hand, we must also take into account that the costs associated with developing complex systems are always on the rise, with the caveat that such systems are also increasingly prone to mistakes. This is because the engineer’s standard design requires that all conditions must be considered. That means that self-organisation would be a potential answer to designing complex systems: if I were able to define a monotonically increasing or decreasing objective function and find a structure, then I would be able to organise the design of a system through learning techniques. Adaptation to new tasks by learning could also occur. Another area would be generating “intelligent” solutions – intelligent in the sense that they conserve resources. When it comes to learning per se, I must say here that learning techniques converge, but do not scale well. You need an enormous amount of data. The second difficult point is that if you want to adapt, you have to learn something new. Learning something differently takes tremendous effort. Now, as someone with a brain, you know from experience that this is obviously not necessary – you learn something new to supplement what you already know. In the brain, according to the current hypothesis, a variation of a task that has to be solved follows a different pathway through the brain. If you want to learn something new, then a new pathway is created. The old one remains. In a second step, a reconciliation takes place between the old and the new pathways.
In technical systems, this is a massive limitation that we must be aware of.

KI: When you look back, what would you consider the most important milestone in previous research on the path towards autonomous systems?

That the structures for learning were investigated – first of all neural networks. This was surely a breakthrough. These structures were the vehicle that allowed this topic to even be brought into the world. Without neural networks, it would not have happened, and with neural networks, we were able to use mathematics. Kolmogorov had already provided the key formal aspects, albeit in an entirely different context. The learning techniques developed subsequently were of course also a step forward because ultimately, yes, they had to be able to function. Then there was a massive mistake, which we all fell for at the time: we wanted to copy biology in the sense of improving the quality of technical systems, and we did it in such a way that we took functions out of the spectrum of possibilities that had a single objective function. Then we tried to optimize them with methods from biology. This generally missed the mark because one has to be clear about this: evolution evolved not only sub-operations, but also their cooperation. The moment you have a defined objective function and can provide a structure, I would say that you should stick with engineering methods. Generally speaking, these methods are more efficient when it comes to optimization. Biology changes structures, and variation in structures largely replaces optimization.

KI: That ties in with a current trend: the term “Deep Learning” is a hot topic in the field. Is the changing of structures something that interests you in the context of “Deep Learning”?

First, “Deep Learning” means learning in a hierarchical structure.
You can ask yourself: are neural networks the final product of the structures that develop in the brain? This is not the case. Hierarchies are potentially the next level of structure, meaning that the networks are integrated into the hierarchies. What is the essence of a hierarchy? A hierarchy is a relative ordering in which the position of a neuron determines its function. Hierarchies are ultimately pathways through a structure. This means that moving through the hierarchy in different ways will cause different functions to be performed. If you use hierarchies to learn a set of rules, the rules themselves are usually not completely unrelated. I can re-use parts of these pathways. It can be shown very generally that learning with hierarchies is much more efficient – that is, it requires less data processing.

KI: Where do you see the future of autonomous learning, and what opportunities for further development do you see in the coming years?

This question can be answered in two ways: what are some potential applications, speaking from today’s point of view? And what must still be developed? We already said earlier that self-organisation would relieve us from the enormous effort entailed in designing complex systems. It would free us from having to accept a perpetual reorganization in the adaptation of systems. And as a third point, we would have intelligent solutions. At the same time, this leads us straight to the next problem: what is knowledge? If we take knowledge in the biological sense, it first becomes apparent through behaviour. This means that we must be able to make the wealth of experiences that we have built up in the past usable in the current situation. Knowledge must result from a dynamic process of adaptation. We take knowledge from the past, but relate it to the current situation. It is an important question: can I use experience in a current situation if that situation occurs only once and does not allow for more statistics?
I think that this is the main difficulty: how do we organise knowledge that we in principle already possess in a goal-oriented way, while also adapting it to the current situation? Biology does it the following way: it tries to construct the world. What does the brain know? It arrives at a situation in one way or another. Now a danger appears and the brain is faced with the question: what do I do? The brain must therefore predict, and possibly also verify whether the prediction made sense. If the prediction was valid, then I have both learned and generalized at the same time. I have learned how to behave in this situation and I have also generalized this information – whenever this situation comes up, I will be able to call upon this knowledge.

KI: Would you characterize this organisation of knowledge as the central, unresolved topic of research in your field?

Yes. To this I would add: brains have evolved so that they can analyse the world – namely in an adequate way. Everything that serves this purpose is of evolutionary advantage. Being able to think is one of the most essential things. Let’s take thinking as problem-solving. What do I need to solve problems? I need knowledge, and I need a functional structure with which problem solving is possible. Of course, I must also be able to define an objective function, because I do not know what the real objective function is in a problem. Operating knowingly affords you better chances. To be able to apply thought to ambitious ways of looking at a problem, one needs knowledge – otherwise it won’t work. But knowledge alone does not suffice. The system must interact dynamically with the knowledge in order to zero in on the objective function.

KI: Let’s turn now to applications. The Google Car is frequently in the news, and currently two test tracks for autonomous vehicles are supposed to be opened in Germany.
You have researched extensively on autonomous vehicles and driver-assistance systems, especially with regard to perception. Which challenges remain for us in this field, and how close do these autonomous vehicles come to the vision of autonomous learning?

Autonomous vehicles are characterized by their ability to arrive at a predefined destination regardless of the traffic situation. There was a series of developments that tried solving the whole by splitting it into smaller tasks, such as perception, defining situations, predicting a trajectory, and so on. Today, such systems are able to do this in simple situations – assuming that everyone on the road follows the rules. Then there is no problem and every vehicle can drive on its own. Problems arise when a mistake is made, and mistakes are mostly caused by people. This means that the human driver is a part of the natural environment that cannot be modelled. You do not have enough time for the observation needed to even try. Imagine you are at an intersection. In addition to the car traffic, a tram goes through the intersection. There are also bike lanes and pedestrian crosswalks. Now, just think about how many situations you have to predict. It’s hopeless.

KI: Here again we find aspects that you have already mentioned: in traffic, it is also the case that you do not want to have a certain number of accidents before you are able to learn from them.

The strategy is not so much to analyse accidents but to avoid possibly dangerous situations. If one can extend the observer’s prediction horizon to about 10 seconds with reliable accuracy, and in parallel continuously organise “situations” related to possible behaviour, then one can simply count the behavioural options and try to influence them in order to avoid critical situations. Accidents have to be inhibited before something can happen. There is also another thing: the driver model. Which driver do you want to model?
The spectrum of driver behaviour is so broad that you cannot predict it. Instead, one should do the following, which we have already done in Bochum: you observe a good, sensible driver for a certain period of time and imitate him on the basis of his actions. This is not very complex in traffic. The driver can speed up or slow down (longitudinal acceleration) and steer (lateral acceleration), but not much more. Then you open up the system to learning. After a period of imitation, I can successfully learn to adapt to the individual driver. I would like to point out another difficulty: when you make a system learn, you must be able to quantify what it has learned. You have to be able to evaluate its quality. How do you do this? Because most systems are behavioural, you would always have to be generating new situations and observing when the system makes a mistake. This is nonsense, of course. A general question when it comes to learning systems is: what condition is it in? Can I rely on this system, and when does it fail? With what probability will it fail?

KI: It’s in the nature of exceptions that they are exceptional. That is: they are not represented in my training data. If I understand correctly, you do not think that the quantity of data should be increased so as to represent all cases?

This is the wrong idea. Thermodynamics shows us the following: all processes that are somehow attached to thermodynamics cannot be fully modelled. This is generally the case. In a natural environment, we do not have the possibility to measure it exhaustively. With the help of large volumes of data, we can drive up the quality. Learning with enormous amounts of data, however, actually skips over what is at the core of the problem. I have also done this, so I don’t exclude myself from this criticism.
But given my experience, I would say that learning is a structured process with an underlying dynamic: knowledge must be configured in such a way that not only can I access the right data, but I must also be able to do so quickly.

KI: Finally, is there another topic you would like to address?

There are many problems that can be associated with learning, such as problems of data mining. The question then becomes: what can I learn from large volumes of data? Am I successfully able to appreciate the underlying semantics with respect to a query? Can I formulate sensible systems that contain information of interest to me? The background to this is that we are able to learn from concrete events. But we are also able to learn even if we are lying on the couch and do not experience a concrete event. We are able to learn from information. Insofar as I am interested in large data sets, I am not interested in using them to track movements and generate profiles. I would also like to address the question “Can the brain recognize the brain?” The complexity is accepted as a constraint. I do not think that complexity should be a barrier. Complexity can also be put into hierarchies. If I want to explain how ion channels in nerve cells behave when I cross the street, this is a relatively absurd level of explanation. This does not mean that it is not the basis for explanation. But it makes about as much sense as reading a newspaper with a microscope. I think that this system is just very, very elegant and we have not yet understood its elegance. Yes, biology is complex, but we make things even more complicated by improperly posing questions. Is autonomous learning important? It is not only important for understanding biology, but also because it solves problems – technical problems. These problems are really pressing if we just think about system design.
But they are indeed profound when you consider that I can understand and imitate thought processes, and can build these “tools” into the system design. In this respect, autonomous learning is also an important prerequisite for better understanding intelligent processes. One can ask a basic question: what role should humans play in all of this? Of course it might happen that many of the functions humans perform become automated. But you shouldn’t jump to the wrong conclusion – namely, that humans will be replaced by technology. On the one hand, there are functions that technology can take over from humans. But we will never be able to take away a man’s or woman’s ability to feel for others in a tough situation, or to suffer. For this, an identical brain structure would be necessary. We must be able to both empathize and evaluate this in the same way.

KI: Thank you very much!

References

[1] Benjamin Paaßen, Interview with Werner von Seelen, KI - Künstliche Intelligenz, November 2015, Volume 29, Issue 4, pp. 445–448.