Multi-agent in action: Michael Küpper from Deutsche Bahn on putting railway back on the fast track

Jarek Jarzębowski - July 7, 2024

When thinking of the butterfly effect, surprising twists and turns of life may first come to mind. But there’s a down-to-earth example that you have probably experienced firsthand. It’s the railway, the main subject of Jarek Jarzębowski’s recent discussion with Michael Küpper. Think of how a disruption of train traffic in the north of Germany can heavily impact the schedule as far as the Swiss border. One little event can have a huge impact on the whole transportation system.

Michael and his team have found an AI remedy for that. In this conversation, he explains why classic AI methods do not solve railway-related issues and which solutions can actually make a difference. His findings could change the face of transportation as we know it today. Dive into Michael’s insights and track down the promising innovations in the railway sector and beyond it.

Key Takeaways from the Conversation

More than classical AI: the classical AI approach is crucial for predictive maintenance, helping to identify and address issues before they lead to significant disruptions. However, when it comes to increasing efficiency across tens of thousands of routes, traditional mathematical optimization and classical AI are not enough. More advanced AI solutions involving generative artificial intelligence and multi-agent systems are a must.

Multi-agent reinforcement learning enabling large-scale innovation: the transportation sector can benefit enormously from upgrading to multi-agent reinforcement learning (MARL). Unlike single-agent approaches, MARL treats each train as an independent decision-making unit. This allows for greater flexibility and coordination in constructing schedules, thereby optimizing overall network efficiency while ensuring that trains operate without interfering with each other.

Challenges in generalization and scalability: while AI models show promise in specific settings, there are ongoing challenges in generalizing these solutions to different networks and traffic scenarios and scaling them effectively. In the coming years, the critical aspects to develop will be the adaptability of AI systems to various operational contexts and consistent performance across larger networks.

Full automation as the future: the transportation sector has traditionally been slow to integrate new technologies, but it is now on a fast track to an automated future. Companies will aim towards increased automation across various functions, including scheduling, operational control, maintenance, and customer service. Fully automated systems may become the norm within the next decade, driven by continuous advancements in AI.

Conversation with Michael Küpper

Jarek Jarzębowski: Hello, Michael. Can you tell me a little bit more about yourself, your background, and your experience in data and logistics in general?

Michael Küpper: Well, I am a former particle physicist turned management consultant turned digitalization manager. 

I’ve been working with Deutsche Bahn, developing a digitalization project of the railway sector in Germany, for the past six and a half years. I’ve been in charge of building an automated AI-based capacity and traffic management system, which is the central brain of the future completely automated digitalized railway system we’re building here at Digitale Schiene Deutschland (DSD). 

Before that, I was a management consultant for over ten years, working in various industries. For about three years, I worked for the Boston Consulting Group, and then for seven years, I ran my own company on the East Coast of the US.

Jarek Jarzębowski: Can you tell me a bit more about DSD? What is it, how is it shaped, what will it do, what will it look like, and what tech is behind it?

Michael Küpper: Yes, DSD is a sector initiative that encompasses the entire railway sector in Germany, which is one of the biggest in the world and includes several hundred train operators. Of course, there are the big ones, some of which belong to Deutsche Bahn Holding, but also a lot of smaller, privately run operators. 

It also includes the infrastructure manager, which is DB InfraGO, formerly known as DB Netz. In the European Union, railway operations are separated from the infrastructure, just as there is a similar unbundling in electricity, gas, and telecommunication networks.

Inside Deutsche Bahn, Digitale Schiene Deutschland is organizationally located in the infrastructure management company because many of the technological innovations we are supposed to foster and produce either refer primarily to the infrastructure or need to be orchestrated by that neutral, regulated entity.

The goal of DSD is to significantly improve capacity, quality, and punctuality of the railway system by applying fundamentally new technologies, including technologies adopted from other sectors.

Jarek Jarzębowski: Can you tell me a little bit more about the current state of innovation and the use of AI in the industry? I mean, trains in general are pretty traditional or even old-fashioned in some ways. What is the actual use of technology, AI, and data from your perspective in the field?

Michael Küpper: There are multiple uses of AI in the industry at various stages of technological maturity. Let’s start with what one could almost call classical AI, the pattern recognition that has been around for a while. That is, of course, also being used in the railway sector, including at Deutsche Bahn. For example, detecting faults on trains directly or via patterns in sensor data from wheels, engines, the pantograph, and other components. This is used to detect flaws and then do reactive maintenance.

The next stage of AI involves making sense of all that data for predictive maintenance of vehicles and infrastructure. Various train operators and infrastructure managers worldwide are using, and further developing, such technologies.

Now, let’s move to more recent developments since about 2016-2017, when Google’s DeepMind had its breakthroughs with AlphaGo and AlphaGo Zero. We, and a few other teams around the world, are building on the AI concepts behind that—specifically deep reinforcement learning—to solve tricky automation problems in railway systems that have never been tackled on a large scale before. This is where the system my team has been developing for the past few years comes in, addressing the automation of planning and operational control.

Jarek Jarzębowski: Can you expand on the problem itself? Why is it such a big problem, and why has it never been tackled?

Michael Küpper: The problem is a gigantic optimization problem. In Germany, for instance, we have about 40,000 train runs per day on 33,000 kilometers of network. When you reach the scale of about 30-35 trains, traditional mathematical optimization methods become infeasible because they take too long to come to reasonable decisions: which train goes where, and what happens if a track becomes unusable or a vehicle gets stuck. You need to reroute trains, slow down some, accelerate others, all while respecting numerous constraints like electricity, profile gauge restrictions, passenger connections or similar dependencies among cargo trains. Today, this work is done by hundreds of dispatchers and signallers. This is where AI comes into play. This is not just pattern recognition or learning from past decisions; it’s creating something new, a combination of decision-making systems and generative AI.
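
To get a feel for why exhaustive approaches stop being practical at a few dozen trains, here is a minimal sketch. It rests on a strong simplifying assumption that is ours, not Deutsche Bahn's: the only decision is the order in which trains pass a single shared track section, so the number of candidate solutions is simply n factorial. The function name `orderings` is hypothetical, and real scheduling has many more decision variables and constraints.

```python
import math


def orderings(num_trains: int) -> int:
    """Number of distinct orders in which the trains can pass one shared section.

    Toy assumption: ordering is the only decision, so the count is n!.
    """
    return math.factorial(num_trains)


# Print how quickly the search space grows with the number of trains.
for n in (5, 10, 20, 35):
    print(f"{n:>2} trains -> {orderings(n):.3e} possible orderings")
```

Even in this toy setting, 35 trains already give a search space on the order of 10^40 orderings, which hints at why enumerating or exactly optimizing over all options quickly becomes too slow for live operations.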

Jarek Jarzębowski: Can you tell us a bit more about the solution that you are working on? What is the technology behind it, how are you approaching it, and what is the state of progress in tackling the problem?

Michael Küpper: Only a handful of teams in the world are working on this problem because it is very railway-specific. The market is limited, and the work requires top-notch experts and extensive cloud computing resources for reinforcement learning training. Most of these teams follow a single-agent reinforcement learning approach, where a few decision-making units decide on an abstract level about train order or specific meta settings of a schedule. These meta-decisions are then translated into an actual executable schedule for the railway system.

What’s unique about our team’s approach at DSD is the multi-agent reinforcement learning approach. In this model, every train becomes its own decision-making unit, allowing for maximum freedom in constructing schedules but requiring maximum coordination among the trains. This ensures they generate schedules without impeding each other’s paths.
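
The idea of "every train as its own decision-making unit" can be sketched in a few lines. The example below is a toy illustration and not the DSD system: the class `TrainAgent`, the function `step`, and the single one-directional line divided into numbered blocks are all hypothetical, and the learned policy is replaced by a hand-written placeholder rule (move forward if the next block is free, otherwise wait). It shows only the agent and environment structure, with no reinforcement learning training involved.

```python
from dataclasses import dataclass


@dataclass
class TrainAgent:
    name: str
    position: int  # index of the block the train currently occupies
    goal: int      # index of the destination block

    def act(self, occupied: set) -> int:
        """Placeholder policy: advance one block if it is free, otherwise wait."""
        if self.position == self.goal:
            return self.position
        next_block = self.position + 1
        return self.position if next_block in occupied else next_block


def step(trains: list) -> None:
    """Advance the toy line by one time step; each train decides for itself."""
    occupied = {t.position for t in trains}
    for train in trains:
        target = train.act(occupied)
        if target != train.position:
            # Free the old block and claim the new one before the next train decides.
            occupied.discard(train.position)
            occupied.add(target)
            train.position = target


# Three trains following each other on one line, each with its own destination.
trains = [TrainAgent("A", 2, 5), TrainAgent("B", 1, 4), TrainAgent("C", 0, 3)]
for _ in range(6):
    step(trains)
    # Coordination requirement: no two trains may ever share a block.
    assert len({t.position for t in trains}) == len(trains)

print([(t.name, t.position) for t in trains])
```

In a multi-agent reinforcement learning setup, the hand-written `act` rule would be replaced by a trained policy per train, and the hard part becomes exactly what the interview describes: letting the agents coordinate so that their individual choices add up to a conflict-free schedule.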

Jarek Jarzębowski: Can you share a bit about the outcomes that you have already achieved?

Michael Küpper: We have a prototype that can plan schedules for a couple of hundred trains on medium-sized networks of a few thousand route kilometers. The same prototype can also change live schedules of around 40 trains in a regional node, reacting spontaneously to disruptions in a previously created schedule.

Jarek Jarzębowski: How do you deal with the need for a lot of computing power for such a multi-agent approach?

Michael Küpper: Let me get back to that question after a short explanation of the background. The system must reschedule when disruptions occur, for example when a track becomes unusable due to an unauthorized person on it. Within seconds, the system comes up with a new schedule for dozens of trains, rerouting not only those trains directly affected by the blocked track, but optimizing the overall traffic holistically. Eventually, this will ensure that disruptions in one area, say North Germany, can be accounted for in real-time with all secondary and higher-order effects down to the Swiss border in the south.

Jarek Jarzębowski: It reminds me of the butterfly effect. If one change in the route due to a blocked track can change the routes of trains on the other side of the country, can you actually track what’s going on inside the algorithm? Or is it a black box where you input data and get the output without knowing what happened inside?

Michael Küpper: The optimization problem that the algorithm solves is highly complex, so there’s no linear connection between input and output. However, once the system is fully trained, the neural networks inside the machine are frozen, meaning the same starting situation, theoretically, should always produce the same reaction. By analyzing the parameters inside the neural network, we get hints about which factors were most influential in making specific decisions. This helps us explain the outcomes as much as possible.
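
The two points made here, frozen parameters giving repeatable decisions and parameters offering hints about influential factors, can be illustrated with a deliberately tiny stand-in. Everything in this sketch is an assumption made for illustration: the feature names, the weight values, and the use of a single linear scoring function instead of a deep neural network. Attribution in a real deep network is considerably harder than reading off weights.

```python
# Hypothetical, hand-picked weights standing in for a trained and frozen model.
FROZEN_WEIGHTS = {
    "delay_minutes": 0.8,
    "passengers_waiting": 0.05,
    "track_blocked": 3.0,
}


def priority_score(features: dict) -> float:
    """Score a situation with fixed weights; no randomness, no further learning."""
    return sum(FROZEN_WEIGHTS[name] * value for name, value in features.items())


def contributions(features: dict) -> dict:
    """Per-feature share of the score, a crude hint at what drove the decision."""
    return {name: FROZEN_WEIGHTS[name] * value for name, value in features.items()}


situation = {"delay_minutes": 5.0, "passengers_waiting": 40.0, "track_blocked": 1.0}

# Frozen parameters: the same starting situation always yields the same score.
assert priority_score(situation) == priority_score(situation)

parts = contributions(situation)
most_influential = max(parts, key=parts.get)
print(most_influential, parts)
```

Because the weights never change after training, repeated calls with the same input are identical, and inspecting the per-feature contributions gives a rough indication of which factor mattered most in this toy case.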

And now about the need for computing power: As described before, decisions in operations are made by a fully trained AI system with frozen neural networks. This does not require much computing power, relatively speaking. The major effort is spent during training. Admittedly, training is resource-intensive, and therefore we use powerful cloud services for it.

Jarek Jarzębowski: What are the biggest obstacles your team faces in implementing this solution, and what strategies are you using to overcome them?

Michael Küpper: There are a few challenges, and that’s an understatement. Technically, we face issues of generalization and scalability. We have good experience with both, but are still far from where we need to be. 

Scaling, so far, shows that processing time grows linearly with the number of trains. However, there’s no guarantee this remains true for up to 40,000 train runs. We believe it does, but the path might be bumpy.

The other challenge is generalization. We have good experience training the system on a specific network with specific trains and then modifying these parameters. The system still produces good results, but we don’t know yet how big these changes can be without losing quality in the schedules.

Jarek Jarzębowski: You mentioned aiming for this to work across all of Germany and potentially beyond. Does this mean such a solution could be usable in any railway system worldwide, or would it need significant changes to work in other systems?

Michael Küpper: Many of the basic concepts, AI modeling and neural network configurations we’ve developed are transferable to related problems in other transportation systems. It doesn’t even have to be a railway system; it could be a subway or streetcar system with similar characteristics. However, customization will be necessary due to differences in operational rules, vehicle characteristics, safety systems, and optimization goals.

Jarek Jarzębowski: Apart from this optimization challenge, do you see other significant challenges that AI might tackle in the near future in your field?

Michael Küpper: Yes, I think based on the progress we’ve made, it’s feasible and worthwhile to use similar approaches for related problems. For example, handling maintenance capacities in facilities connected to the rail network, organizing the driving and maintenance of a vehicle fleet, managing a staff of thousands of locomotive drivers and train conductors, or deploying a limited number of vehicles for maximum productive operations. These operational problems could also be tackled with a multi-agent reinforcement learning approach.

Jarek Jarzębowski: One of the most talked-about subsets of AI currently is large language models (LLMs). Do you think LLMs can be used in the railway industry, and if so, in what capacity?

Michael Küpper: Yes, teams in multiple organizations are already experimenting with using LLMs for various tasks inside the industry. This includes passenger information distribution, speech generation, customer service, and live video chatbots to replace or add to traditional staff at information booths. Another use is in systems engineering, where LLMs assist in crafting comprehensive requirements documents and generating test cases. These tasks may never be fully automated, and probably shouldn’t be, but LLMs can be very helpful in aiding systems engineers.

Jarek Jarzębowski: Since you have been working in the data science field for so long, do you have any advice for other teams in the railway industry or logistics on approaching the use of data science?

Michael Küpper: Based on my experience, I have two main recommendations: firstly, ensure a solid data foundation, and secondly, allocate enough time and budget. Many companies start using AI with high expectations, but neglect the importance of accurate infrastructure data. Data inaccuracies can hinder progress significantly. Moreover, these projects take time and require significant financial investment due to the need for highly paid experts and extensive computing resources. Management and stakeholders must understand that automation and digitalization in such complex industries cannot be rushed and require substantial and sustained investment.

Jarek Jarzębowski: How do you see the future of the industry in terms of technology use, especially AI, in the next five to ten years?

Michael Küpper: While predicting the future is always tricky, I see a continuous trend toward increased AI use across various aspects of the industry. In the next decade, I believe we will witness fully automated scheduling and operational control in transportation systems worldwide. AI will be sensibly employed to solve complex problems in maintenance, customer service, scheduling, and more.

Jarek Jarzębowski: Is there anything else you would like to share with our audience?

Michael Küpper: Rail companies are not pursuing digitalization and AI research just for its beauty. They do this to address some of the toughest challenges in their industry. They need to increase capacity on the existing rail network and improve quality, reliability, and efficiency. In Germany, trains are often full, and we expect significantly higher passenger numbers in the near future. Building new tracks is almost impossible: it is lengthy and prohibitively expensive. The aim of the German government is to double passenger numbers and to increase freight transport to a modal split of 25%. So we need to increase capacity by at least 30% on the existing network with innovations and new digital technologies. This requires automated and optimized driving, a safety system allowing shorter and flexible distances between trains, and intelligent traffic orchestration. Furthermore, predictive maintenance allows for more dynamic and accurate maintenance schedules, increasing vehicle usage and thereby capacity.

Jarek Jarzębowski: Thank you, Michael Küpper, for sharing these insights and the impressive work you and your team are doing at DSD. It has been a fascinating discussion, and I am sure our audience will appreciate the depth of information and your perspective on the future of the railway industry.

Michael Küpper: Thank you, Jarek. It was a pleasure to discuss these topics with you.

Michael Küpper’s Background

Michael Küpper joined Digitale Schiene Deutschland (DSD) at Deutsche Bahn in late 2017. As Product Manager, he built and, until 2023, led the scaled-agile team-of-teams that implements DSD’s Capacity & Traffic Management System (CTMS). He now serves as Stakeholder Manager to drive the strategic vision of CTMS and its enabling technological foundations within the railway sector at large. Michael holds a PhD in physics from The Weizmann Institute of Science in Israel and has over 10 years of experience as a strategy and management consultant. Throughout his career, he has introduced Artificial Intelligence (AI) in environments where AI had not been previously applied, from particle physics analysis to housing price prediction to rail traffic management.

Closing Thoughts

Classic AI may not be the answer to railway traffic management automation - but advanced AI tools can tackle such issues, potentially solving some of the biggest pain points of passengers, cargo customers, and operators alike. As the challenges of modern logistics continue to arise, AI combining decision-making systems and generative capabilities may emerge as the ultimate conductor.

The multi-agent reinforcement learning solution employed by Michael Küpper’s team for Deutsche Bahn could serve as a blueprint for other transportation companies aiming to improve capacity, reliability, and efficiency. Although there are some obstacles on the way, the prospects for successful innovation on a larger scale are promising. Let’s see who joins the multi-agent bandwagon!

About the author

Jarek Jarzębowski

People & Culture Lead

Jarek is an experienced People & Culture professional and tech enthusiast. He is a speaker at HR and tech conferences and a podcaster who shares a lot on LinkedIn. He loves working at the crossroads of humans, technology, and business, bringing the best of all worlds and combining them in a novel way.
At nexocode, he is responsible for leading People & Culture initiatives.

This article is a part of

AI Revolution Diaries