Expert Interview: Christian Rahmig of German Aerospace Center on railML and Digital Twins for the Railway Industry

October 12th, 2021 | by GEONATIVES

(10 min read)

Our series of expert interviews continues: this time, Christian Rahmig of the German Aerospace Center (DLR) shares his thoughts on data formats, digital twins and the railway industry. We are more than happy to have one of the most senior experts in this field post his insights on our platform.

Christian, thank you very much for taking the time to answer a few questions for the blog of our GEONATIVES Think Tank. We highly appreciate that you offered this interview in the early phase of our initiative. Would you mind introducing yourself to our readers?

My name is Christian Rahmig and I have been a Transportation Researcher for almost 20 years now. After studying Transportation Engineering at the Technical University of Dresden I joined the German Aerospace Center’s Institute of Transportation Systems (DLR TS) in 2007.

For many years, I worked as Scientific Staff with a focus on multi-sensor data fusion and train-borne localization. This was also the time when I came into contact with geodata and digital maps, which continue to fascinate me to this day. In 2010, I learned about railML, an XML-based data exchange format for railway applications. I liked its approach of community-driven, requirement-based development, and only one year later, I became the coordinator for the railML infrastructure scheme – an honorary office that I still hold today. In 2015, I became a business developer for the railway department at DLR TS. Following the latest restructuring, I kept my role as business developer, but moved to the Department of Information Acquisition and Data Modelling in 2021.

Apart from this “business” story, I am personally a big supporter of railway transport, because I think it has to play a central role in a future sustainable mobility concept. Realizing the consequences of climate change, more and more people have come to the conclusion that we have to change something in our way of life – and especially in the mobility domain, there is still a great deal of potential for change. By the way, I also like cycling… but now let’s focus on geodata 😊

You are involved in activities across our five pillars, most prominently as one of the “stakeholders” in “data formats”, “data processing” and “data lake”. Would you mind describing what roles you see yourself in?

I see myself in various roles, and these roles stem from my different backgrounds. At DLR, I work as a business developer, trying to initiate new projects and collaborations, especially in the railway domain: sensors, data management, algorithms. Based on this role, I want to contribute to GEONATIVES as a stakeholder who provides applications, projects and use cases that require geodata or make use of them to generate added value. I think that it is very important to promote such initiatives and make them known to many other stakeholders in the geodata domain, because the challenges we are facing are very similar and we can solve them much better when all the stakeholders in the different domains work together.

Secondly, there is my role as railML infrastructure scheme coordinator, which is focused on coordinating the development of the XML-based data exchange format railML in the field of infrastructure. This role directly connects with the pillar “data formats” and their harmonization and standardization. Finding a common language between the different applications, programs, stakeholders, and data models is the key to creating a generic and multi-purpose digital twin of geo-referenced transport infrastructure.

Currently we have the European Year of the Railway. What are the most promising trends regarding railway in society and its geodata? What are research and industry working on?

One of the biggest challenges in the railway sector is still the availability of geodata. They are needed for several purposes, ranging from digital maps for train-borne localization and European Rail Traffic Management System (ERTMS) integration, through geo-referenced monitoring of infrastructure condition and maintenance efforts, to such “simple” applications as passenger information. Although we need geodata more than ever, available data remains limited.

I believe that open data will play a central role in solving this challenge. There are already many examples where OpenStreetMap’s data treasure has been used, at least for an initial digital railway map or a prototype example data set. The times when people pointed to the poor quality or reliability of OSM data are over. In many cases, OSM data is considerably more up to date than traditionally collected data sets. Making use of these open geodata, by developing tools and processes that integrate them into various applications with a focus on data quality, is a central task for research and development, also in the railway data ecosystem.

The second main topic I see addresses standardization and digital twins as such: in comparison to other domains, the railway domain discovered quite late the importance of data model harmonization and standardization as a key factor for future-oriented railway system development. But now this topic is on the agenda, and luckily, it plays a central role in the European railway research funding framework, as it is integrated into the Shift2Rail action plan. Besides related projects, there are also initiatives that aim at unifying and standardizing railway data exchange based on synchronized data dictionaries. However, at the end of the day the question remains whether standardized models, formats and dictionaries are sufficient for obtaining a complete digital twin of the railway (infrastructure) domain.

A third development aspect that I welcome very much is the increasing involvement of small and medium-sized stakeholders. For example, let’s take the railway infrastructure managers: for many years, the term “infrastructure manager” was used in Germany synonymously with “Deutsche Bahn” (DB), but we have to accept that we have more than a hundred railway infrastructure managers in Germany. In that context, we need to think about system innovations and development in a holistic approach beyond DB boundaries, because future systems shall be interoperable between the different providers and stakeholders, resulting in a seamless experience for the end-user – the passenger or the freight shipping company. This holistic approach requires the involvement of national administration and regulation bodies, too. Fortunately, this realization is increasingly taking hold and is leading to new collaborations already in the research and design phase.

However, as a railway passenger I expect much more than what is currently available or possible. If we want the railway domain to become the most important mode of transport in a connected and green European mobility system of the future, we have to focus on very specific issues and solve them together. Ask yourself: how many clicks (or how much time) does it take for you to book a flight from Bordeaux to Prague, and how many clicks will you need to get a train ticket for the same connection? We can – and we have to – do better, also after the European Year of the Railway.

The German Aerospace Center works in the domains of aerospace, road and rail (among others). All of them are interested in collecting data, processing them and using them for simulation and operations. Do you think that these domains can learn from each other and use synergies in sensor systems, tool chains and data formats? Can you incorporate that into projects and bring these ideas closer to the relevant stakeholders?

Definitely, the domains can learn from each other and use synergies. Especially in terms of data models and data exchange, experiences can be shared and joint approaches can be followed. Therefore, I appreciate initiatives and projects that try to establish a dialogue through joint conferences, workshops etc. to exchange ideas and experiences from the various domains. For example, the railML conference is open to anyone and often invites people from other domains to share their knowledge and best practices. Nevertheless, specifying use cases as the basis for data model and data exchange requirements is as important as learning from the other domains. It takes both to develop solutions that suit the needs and are interoperable across the mobility sector in all its variety.

Do you think it makes sense to try to share data across domains, for example by building a digital twin covering the needs of different stakeholders? What could railML® contribute in that respect?

The question is tricky: some people may understand it as “Do you think that storing the same data twice, because they are used in two different domains, should be avoided?”. There is nothing wrong with sharing data even across domains if this sharing results in more benefits than costs. On the other hand, it makes no sense to invest too much effort in finding the “one and only” data model, aka digital twin, that suits all use cases and applications in all domains, just to avoid double storage of data. I prefer a use case-oriented strategy here, which means that data modeling approaches shall be driven by the domain requirements. If there is a big overlap between use cases or domains, it is very useful to harmonize data modeling and probably end up with one digital twin and a single data storage used for several applications. But in the case of strongly diverging use cases or domains, we should accept the situation and allow different (domain-driven) digital twins to exist in parallel.

You may argue that this second case is the situation of today, with lots of data silos that are not connected with each other. And this is where standardized data exchange formats like railML enter the scene: based on a commonly agreed and standardized glossary or data dictionary, they enable a connection of data models. Different digital twins may be (loosely) coupled via a data exchange format that uses a commonly agreed terminology. So, in the end, the result is not a single digital twin, but something like a “federated digital twin”.
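As a rough illustration of such loose coupling, the following minimal Python sketch assumes two independent domain models that exchange data only through a small shared dictionary of agreed terms. All names here (`SHARED_TERMS`, `InfraTrack`, `TimetableSection`, the two functions) are hypothetical and are not taken from the railML schema; the sketch only shows the principle of translating into and out of a common vocabulary.

```python
from dataclasses import dataclass

# Hypothetical shared dictionary: agreed term -> agreed meaning.
# This is an illustration, not the railML data dictionary.
SHARED_TERMS = {
    "track_id": "unique identifier of a track",
    "length_m": "track length in metres",
}


@dataclass
class InfraTrack:
    """Infrastructure manager's own model (hypothetical)."""
    ident: str
    length_km: float
    owner: str  # internal attribute, never exchanged


@dataclass
class TimetableSection:
    """Railway undertaking's own model (hypothetical)."""
    track_ref: str
    run_length_m: float


def export_infra(track: InfraTrack) -> dict:
    """Translate the infrastructure model into shared terms only."""
    return {"track_id": track.ident, "length_m": track.length_km * 1000.0}


def import_timetable(record: dict) -> TimetableSection:
    """Build the timetable model from a record in shared terms."""
    unknown = set(record) - set(SHARED_TERMS)
    if unknown:
        # Reject anything outside the agreed vocabulary.
        raise ValueError(f"terms not in shared dictionary: {sorted(unknown)}")
    return TimetableSection(
        track_ref=record["track_id"],
        run_length_m=record["length_m"],
    )


# The two models never reference each other directly; they meet only
# at the exchange record expressed in the shared vocabulary.
section = import_timetable(export_infra(InfraTrack("T1", 1.5, "IM-A")))
```

Each side keeps its own internal structure (units, extra attributes), and only the exchange record has to follow the common terminology.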

Coming now more to the format description railML®. Currently it differentiates between the entities Infrastructure, Interlocking, Rolling Stock and Timetable (we have described some of them in our railway simulation feature). So the focus is more on operation and maintenance. Why is railML® not providing an extension for supporting driving simulation, visualization and decision-making? Will it be added in the future?

Short answer: yes, if needed.

Long answer: railML is a community driven initiative. The exchange format (and the underlying data model) contains elements and attributes that fulfill use case requirements. We are not modelling for the sake of modelling, nor do we want to create a complete digital twin of the railway world usable for any kind of application. The focus of railML is on implementing joint user needs considering open discussion, standardization and best practice examples. So, if stakeholders would like to avoid developing proprietary interfaces for driving simulation, visualization and decision-making, they may easily approach the railML initiative and its community with their requirements. Collecting these requirements from different stakeholders, harmonizing them and “translating” them into a standardized data model – that is the objective of railML development.

Is it possible to define and document “quality” of the data in railML®? How important is (or would be) this aspect to the stakeholders?

Quality of data is important. However, railML is just a data format and data model. This data format/model is, in principle, independent of data quality requirements. Further, the use cases that have been implemented in the railML schema so far did not require an explicit modelling of quality parameters, e.g. an error ellipse for spatial accuracy. Therefore, the railML format and data model have little to offer on this so far. But the more use cases and applications in the safety-relevant domain are being set up, the more important questions about data quality and data reliability become. So, as said in the previous answer: if there is a need for explicit modelling of data quality parameters within the railML community, the railML data model and data format will react flexibly, focusing on creating a standardized and generic approach usable for various related applications and use cases.
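To make the idea of an explicit quality parameter more concrete, here is a minimal Python sketch of a geo-referenced position that optionally carries an error ellipse. It is purely an assumption for illustration: as stated above, railML does not define such elements, and the names `ErrorEllipse`, `GeoPosition` and `meets_accuracy` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorEllipse:
    """Hypothetical spatial accuracy description of a position."""
    semi_major_m: float     # length of the semi-major axis in metres
    semi_minor_m: float     # length of the semi-minor axis in metres
    orientation_deg: float  # azimuth of the semi-major axis in degrees


@dataclass(frozen=True)
class GeoPosition:
    """Hypothetical position with optional quality information."""
    lat: float
    lon: float
    accuracy: Optional[ErrorEllipse] = None  # None means quality unknown


def meets_accuracy(pos: GeoPosition, max_error_m: float) -> bool:
    """Conservative check: the largest axis must be within the limit.

    A position without quality information is treated as not meeting
    the requirement, since its accuracy cannot be demonstrated.
    """
    if pos.accuracy is None:
        return False
    return pos.accuracy.semi_major_m <= max_error_m


surveyed = GeoPosition(52.27, 10.52, ErrorEllipse(0.8, 0.3, 45.0))
unknown = GeoPosition(52.27, 10.52)
```

With such an optional attribute, an application with accuracy requirements could accept `surveyed` for a 1 m limit and reject `unknown`, while applications without such requirements could simply ignore the field.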

In a prior question we asked about sharing data across domains. Do you think that a digital twin should be focused on specific use cases as railML® is focused on the (broad) use case of operations and maintenance?

A digital twin should always be focused on use cases and their requirements. In order to reduce complexity of the digital twin, it is very useful to focus on specific use cases. This will shorten the time for designing and implementing such a digital twin.

However, there are two aspects that must be considered:

First, the use case has to be generic and independent of a single company or stakeholder. At railML, we use the digital twin to exchange data between different stakeholders and their tools and applications. Therefore, the use case comprises requirements from different sides, which enables a generic use case development. Use cases that are specific to a single action or tool will result in a very specific digital twin that, in the end, is usable only for a very narrow set of applications. So, keep use cases generic.

Secondly, future development will be driven by emerging digital twin approaches, not only in the railway domain. The biggest benefit of this development, which is at the core of so-called “Digitization”, “Industry 4.0” etc., is the cross-domain linking between digital twins. As digital twins become wider and more complex, it is a central task for the developer teams to ensure cross-referencing and model connections. The first and most important step for such a “digital twin framework” is a common understanding of terms and vocabulary. Designing such a joint dictionary is a task to be done by the whole “digital twin community”, leaving no one outside the discussion, as this would create a risk of incompatible digital twins. To conclude: digital twins must be compatible at least at defined connection points.

In Europe, China and Russia, for example, the railway domain in general and the rail network in particular are currently mostly dominated by national railway companies. In the US and Japan, the rail network is mostly privately owned. Sometimes infrastructure and rolling stock are legally separated, sometimes not. Would it make sense to create and use joint digital twins, or is competition good for the business?

As said before, it is important to have connected digital twins. We must avoid thinking (and modelling) in silos. Of course, rolling stock people have a different view on the railway infrastructure than the infrastructure managers, but in the end, they use the same infrastructure. For the digital twin, this can be similar: either infrastructure managers and railway undertakings agree on working with the same digital twin of railway infrastructure, or they agree on a joint vocabulary and on connection points before they start specializing their own digital twins. Both approaches have their advantages: specialized digital twins may be implemented faster and adapted better to specific requirements, while a joint digital twin may provide more options for cross-domain applications and services. The question of which option is the most promising will be answered by best practice implementations of prototype applications and use cases.

In the road domain there are a lot of stakeholders mapping the same environment again and again. Thus, we have a lot of data formats describing the same traffic area with different approaches. In contrast, the national railways seem to be able to agree on one format, railML®, despite the fact that they have to deal with various train protection and signaling systems. Is the railway domain better positioned for collaboration than the road domain?

No, not at all. The railway domain is very conservative and driven by national interests. For many years, the development of railway systems, technologies and even standards has been very much dominated by country-specific interests and requirements. Compared to the aerospace sector, whose focus was on international operation from the beginning, the railway sector is opening up very slowly towards multi-lateral developments and standardization. To give an example: the European Train Control System (ETCS) is for sure a big step in the right direction towards thinking about railways globally, but the ETCS specification itself is not the best example of a unified, standardized concept of future railway command and control systems. Why? Because it leaves too much room for interpretation.

In recent years, new approaches like the Conceptual Data Model (CDM) or RailTopoModel (RTM), developed by UIC, or the initiative EULYNX have tried to minimize this room for interpretation by setting up and specifying different parts of a joint railway digital twin. This is a good development for sure. However, the way in which these data models are developed still breathes the spirit of siloed interests and viewpoints: some models are the result of research projects of a small group of stakeholders, other models allow for contributions only from a specific type of stakeholder, e.g. only railways. When developing the data exchange format railML, we wanted to do better: the community is open to anyone, including industry, users, research, development etc. The model development is use case driven without losing the focus on finding a generic solution that suits all the different stakeholders. My hope is that with such an open and transparent development strategy, we may finally overcome the time of siloed thinking and siloed railway system development. A digital twin on the basis of a joint railway data ecosystem is a central pre-condition for strengthening the role of railways in future mobility systems.

Final question that we had also asked in our first interview: We started our initiative some months ago; what topics would you like to see covered in a blog like ours?

Considering the things that I said before, the most interesting topic for me is building up a community of “geonatives” on the basis of goal-oriented technical discussions; a community that is open for anyone incorporating players from industry as well as from users and developers; a community that consists of representatives from all the different silos of applications and use cases without limiting their thinking to these silos; and a community that is agile in discovering innovative methods for interaction, discussion and joint development of the various pieces for a unified digital twin.

Building up such a community requires motivation of people. From my work with railML I have learned that people are motivated better when the topic is very close to their specific examples, use cases, and applications. Therefore, I would like this blog to be filled with best practice examples, examples from all over the world. We may use these examples, these user stories to initiate the discussion, which keeps the blog alive.

And as a railway man, of course, I am interested in any kind of contribution originating from or addressing the railway domain! 😊

Thanks a lot, Christian!