AI IN AVIATION
Partial in, partial out
Why AI in aviation fails without connected data
By Filip Filipov, CEO at OAG
Data quality has dominated aviation's AI conversation for a while now. Yet the costlier problem sits one layer deeper: even clean data fails when it lives in silos.
As AI is about to reshape every corner of the aviation industry (from how travellers book trips to how airlines manage disruptions in real time), the conversation has rightly turned to data.
Specifically, to a question that will define which organisations lead and which fall behind:
Is the data feeding these AI systems good enough?
In a recent report, we tackled one dimension of that question: data quality.
We illustrated why flawed, biased, or outdated inputs are the silent killer of AI in a zero-defect industry like aviation. But quality is only half the equation.
There's a second failure mode, which is arguably less understood but potentially just as costly: Partial in, partial out.
You can have perfectly accurate schedule data (every departure time validated, every gate assignment correct) and still get a catastrophically wrong AI output. Why? Because accuracy alone doesn't guarantee completeness.
Think of it like a GPS working off a map that's missing half the streets. In this case:
Every road it does know is correctly placed.
Every turn it suggests is technically possible.
But the route it hands you will never be the best one, and sometimes it will send you the wrong way, into a dead end it simply doesn't know exists.
Consider two scenarios that airline professionals will likely recognise instantly:
Let’s start on the customer-facing side:
An agentic AI travel planner that is built outside the traditional GDS ecosystem and powered by publicly available schedule data recommends a routing from Munich to Bangkok via Istanbul with a 70-minute connection.
The schedule data is accurate. But the system has no access to validated MCT rules for Istanbul Airport, no awareness of visa transit requirements for German nationals connecting through Turkey, and no terminal transfer intelligence for the specific carrier combination.
A traditional OTA, plugged into curated data sources, would have filtered this itinerary out automatically. But the AI planner, working with incomplete inputs, presents it as the optimal option. The traveller discovers the problem at the gate: 70 minutes is nowhere near enough to make the connection, even with both flights running on time.
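The filtering step the AI planner skips can be sketched in a few lines. This is a hypothetical illustration, not a real booking-engine implementation: the MCT value, the visa-rule table, and all names are assumptions made up for the example, standing in for the curated reference data a validated system would consult alongside raw schedules.

```python
from dataclasses import dataclass

@dataclass
class Connection:
    airport: str
    connect_minutes: int
    pax_nationality: str

# Assumed reference data beyond raw schedules (values are illustrative only).
MCT_MINUTES = {"IST": 90}                      # validated minimum connect time
TRANSIT_VISA_REQUIRED = {("IST", "DE"): False} # transit visa rule per nationality

def is_bookable(c: Connection) -> bool:
    """A connection is only valid if it clears the validated MCT
    and the traveller can legally transit."""
    mct = MCT_MINUTES.get(c.airport)
    if mct is None:
        # No MCT data: accurate schedules alone cannot validate this leg.
        return False
    if c.connect_minutes < mct:
        return False
    # Default to True (visa required) when the rule is unknown: fail safe.
    if TRANSIT_VISA_REQUIRED.get((c.airport, c.pax_nationality), True):
        return False
    return True

print(is_bookable(Connection("IST", 70, "DE")))  # False: 70 min < 90 min MCT
```

The point of the sketch is the `None` branch: without the MCT and visa tables, the planner has no basis to reject the itinerary, so it presents it anyway.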
Now, to the operations side:
Imagine an airline's AI-driven disruption management tool that detects an incoming storm that will delay inbound aircraft at a major European hub. As a result, it begins rerouting passengers and rebooking connections.
The model has excellent weather data and accurate schedule feeds. But it lacks real-time crew duty-time data and doesn't account for the fact that three crew members on the rerouted aircraft are approaching their legal rest limits.
The AI's proposed recovery plan creates a secondary disruption (one that a human ops controller with the full picture would likely have caught immediately).
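The missing check is simple once the data is in reach. The sketch below is illustrative only: the 13-hour limit and all names are hypothetical stand-ins for the real duty-time rules and crew feeds an ops system would use, not actual regulatory values.

```python
from datetime import datetime, timedelta

# Hypothetical duty limit for illustration; real limits vary by regulation,
# crew composition, and rest history.
MAX_DUTY = timedelta(hours=13)

def reroute_is_legal(duty_start: datetime, planned_release: datetime) -> bool:
    """Reject any recovery plan that pushes crew past their duty limit."""
    return planned_release - duty_start <= MAX_DUTY

duty_start = datetime(2025, 3, 1, 6, 0)
# The proposed reroute releases the crew at 20:30, a 14.5-hour duty day.
planned_release = datetime(2025, 3, 1, 20, 30)
print(reroute_is_legal(duty_start, planned_release))  # False
```

Without the crew feed, the model never evaluates this constraint at all, which is exactly how a "clean" recovery plan manufactures a secondary disruption.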
In both cases, the data was clean. It was trustworthy. It just wasn't enough.
Aviation is a zero-defect industry. Twenty minutes off the mark, and you've got a missed connection, a stranded passenger, and a cascading delay that ripples across an airline's entire network.
In fact, missed connections are among the most stressful experiences in air travel and are a key driver of consistently low passenger satisfaction scores across the industry.
Missed connections like these are driven primarily by the data completeness problem.
As AI moves from advisory to autonomous (meaning from generating recommendations to executing decisions), the cost of incomplete data inputs grows exponentially.
The most data-hungry industry on earth?
Very few industries generate more data (or depend on it more critically) than aviation.
On the operations side, the numbers are staggering.
- A Boeing 787's engines generate roughly 1 terabyte of sensor data every 24 hours.
- An Airbus A350 carries over 50,000 sensors and produces 2.5 terabytes of data daily, according to Aerospace America.
- A typical GE jet engine alone collects information at 5,000 data points per second.
And these figures represent just one aircraft.
Scale this to the global fleet, and the picture becomes almost incomprehensible. According to Oliver Wyman, the world's commercial aircraft fleet will generate about 100 million terabytes of operational data in 2026 alone. This figure has grown rapidly over the past decade as newer, more sensor-dense aircraft have entered service.
Generating vast amounts of data is only half the challenge. The real problem is what happens to that data once it exists.
Aviation is an industry built on split-second coordination between at least six major stakeholder groups: airlines, airports, air traffic control, ground handling, maintenance teams, and regulators. Each operates its own systems, on its own update cycles, often with its own data formats. Schedules update in one system but lag in another. A gate change is communicated to passengers, but is not necessarily synced with the baggage handling system. A delay in one airport's database takes hours to propagate to partner airlines.
The result is structural fragmentation, and it shows. According to IATA, about two-thirds of airlines struggle with operational silos, and nearly half of all flight delays (47%) stem from poor coordination between functions like ground handling, maintenance, and flight ops.
These are data integration failures. And they don't stop at the hangar door.
The same fragmentation that plagues aviation's operational backend also runs through its commercial front end, where travellers search, compare, and book flights. Different systems, different formats, different update cycles. The pattern repeats, just with different players.
In a 2025 global survey by Sabre, 91% of travel agencies reported using four or more different booking systems. The three dominant distribution models (GDS, NDC, and direct airline APIs) each operate with their own data formats, pricing structures, and content frameworks. Even within the NDC standard, different airlines interpret and implement it inconsistently, making seamless integration an ongoing challenge.
Here is where the stakes have risen sharply in just the past two years.
The commercial front end of aviation is no longer a collection of search boxes and filter menus. It has become a conversational interface.
According to Phocuswright, more than half of all travellers already use ChatGPT or similar AI tools regularly for trip planning.
Meanwhile, the share of travellers who refuse to use AI for planning has shrunk to just 11% according to Skift and McKinsey.
AI is now a core part of how millions of people decide where to go, when to fly, and how much to pay.
This shift has profound implications for the underlying data layer.
AI agents planning (and soon booking) trips need access to comprehensive, real-time, standardised data across multiple domains such as schedules, availability, connection rules, and pricing. Pricing in particular is mission-critical. According to our own research into the current state of airline pricing, up to 25% of all airline fares were dynamically priced at the end of 2024, with most airlines stating the goal of eliminating static pricing entirely over time. In a world where fares are generated dynamically in response to demand signals, competitive moves, and individual traveller profiles, an AI agent working with stale or partial pricing data is fundamentally broken.
This is a gap the current AI agent debate in the airline context still largely misses. Much of the conversation focuses on model capabilities, reasoning, and user experience. Far less attention is paid to whether the data these agents are drawing on is comprehensive enough to support the decisions they are making on behalf of travellers. Dynamic pricing makes this gap very consequential: the fare an AI agent sees at 10:00 may be obsolete by 11:00.
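That staleness risk can be made concrete with a minimal freshness gate. Everything here is an assumption for illustration: the 15-minute TTL, the field names, and the idea that an agent should treat a cached fare as a quote to be re-priced rather than a bookable price.

```python
from datetime import datetime, timedelta, timezone

# Illustrative time-to-live; a real system would tune this per route and
# per airline based on observed repricing frequency.
FARE_TTL = timedelta(minutes=15)

def fare_is_actionable(fetched_at: datetime, now: datetime) -> bool:
    """A cached dynamic fare older than the TTL must be re-priced,
    not booked."""
    return now - fetched_at <= FARE_TTL

now = datetime(2025, 1, 1, 11, 0, tzinfo=timezone.utc)
fetched = datetime(2025, 1, 1, 10, 0, tzinfo=timezone.utc)  # fare seen at 10:00
print(fare_is_actionable(fetched, now))  # False: re-price before booking
```

An agent without even this crude gate will happily commit a traveller to a price that no longer exists.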
This brings us back to the core thesis of this analysis: in aviation, having accurate data from one source isn't enough. You need accurate data from many sources, integrated coherently, updated in real time, and accessible in formats that AI systems can actually consume.
This is precisely why the breadth of an aviation data provider's offering matters as much as the depth of any single dataset. The fewer the gaps, the less an AI system has to guess, and in AI terms, guessing means hallucinating.
When AI meets incomplete data (a real-world example)
Already today, incomplete data integration causes painful, entirely avoidable failures.
Consider a real-world example from Lufthansa Group that illustrates this with striking clarity.
Before every flight, cabin crews receive a hold item list, a document detailing known defects on the aircraft. At Lufthansa, this list could run to forty pages of legacy telex shorthand, often delivered just a few minutes before boarding. No time to read it all, let alone cross-reference entries.
Page one might note that the in-flight entertainment system in seats 7A through 7F is inoperative.
But buried around page twenty-two, a second entry indicates that a partial repair has been completed, stating that three of the six screens now work.
Mentally reconciling those two entries, from two different pages, against the clock, while simultaneously preparing the cabin and greeting passengers, simply doesn't happen.
So the passenger in 7D sits down, discovers a dead screen, and presses the call button. The crew member learns about the defect from the passenger, not the other way around. That reversal, where the customer knows more than the crew does, is when the service recovery window closes. A glance at the booking data would have shown a top-tier frequent flyer. A proactive gesture before pushback would have cost almost nothing.
The irony is that all of the data existed: the maintenance record, the partial repair status, the seat assignment, the loyalty tier. It just lived in separate systems and never reached the right person in a usable form.
Fortunately, Lufthansa partnered with zeroG to build an AI-driven solution that surfaces every known cabin defect on the crew's tablets as a visual seat map, links each affected seat to the passenger sitting in it, and puts compensation options one tap away. No forty-page printout. No buried contradictions.
And here's the telling detail: the hardest part wasn't building the AI model. It was curating the data, reconciling entries from maintenance systems riddled with contradictions, and building logic to resolve conflicting records across fragmented sources.
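The core of that reconciliation logic can be sketched in miniature. This is a hypothetical toy, not zeroG's implementation: the data shapes, the "later entries override earlier ones" rule, and all names are assumptions chosen to show why contradictory entries on pages one and twenty-two must be resolved by software, not by a crew member under time pressure.

```python
# Toy defect log: page 1 marks a whole row inoperative, page 22 records
# a partial repair. Seats 7D-7F remain broken.
defect_log = [
    {"page": 1,  "seats": ["7A", "7B", "7C", "7D", "7E", "7F"],
     "status": "IFE inoperative"},
    {"page": 22, "seats": ["7A", "7B", "7C"],
     "status": "IFE repaired"},
]
# Toy booking data from a separate system.
passengers = {"7D": {"tier": "top-tier frequent flyer"}}

def seat_status(log):
    """Resolve contradictions: later entries override earlier ones."""
    status = {}
    for entry in sorted(log, key=lambda e: e["page"]):
        for seat in entry["seats"]:
            status[seat] = entry["status"]
    return status

broken = {s for s, v in seat_status(defect_log).items()
          if "inoperative" in v}
# Join with booking data: crew sees 7D is broken AND high-value.
for seat in sorted(broken & passengers.keys()):
    print(seat, passengers[seat]["tier"])
```

The merge itself is trivial; the hard part, as the example above shows, is that the inputs live in different systems with conflicting, duplicated, and outdated entries.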
This is the data integration problem in miniature.
Scale this pattern across an airline's entire operation (think crew scheduling, turnaround coordination, disruption management, passenger rebooking) and the cost of incomplete data integration becomes enormous.
The proof beyond aviation
This pattern is not unique to the airline industry.
Across sectors, the companies that have gained the most from data-driven intelligence are those that have unified fragmented sources.
Target offers a striking example from retail, and Starbucks took the concept even further. Here's what each brand achieved and why it matters.
Target
The challenge
About a decade ago, the US retailer set out to identify expectant mothers among its customers, knowing that major life events like pregnancy often disrupt habitual shopping patterns. For any brand, that disruption is an opportunity: customers in these phases try new products and establish new behaviours, and pregnancy in particular triggers increased spending that can extend over several years as the child grows.
The breakthrough
Target didn't want to wait until babies were born, when competition for the new parents' wallets is fiercest. It wanted to be earlier. The breakthrough didn't come from a single dataset, because no single data source tells a brand when their customers are pregnant (and rightfully so). It came from unifying purchase history and demographic profiles into a single predictive model.
Multiple data sources
A statistician on Target's team discovered that when a customer suddenly started buying unscented lotion, specific vitamin supplements, and extra-large bags of cotton balls in a particular sequence, the probability of pregnancy was remarkably high. No single purchase was a signal. But the combination of twenty-five products, drawn from different data sources and linked to a single customer ID, allowed Target to predict not just pregnancy but approximate due dates – often before the customer had told family members, which, in all fairness, also raised significant data privacy concerns.
The outcome
The initiative is estimated to have contributed hundreds of millions of dollars per year in additional revenue to the U.S. retailer. It worked not because Target had better data than its competitors, but because it connected data that others left in silos.
Starbucks
Single intelligence layer
When the coffee chain launched its AI platform Deep Brew in 2019, the company unified transaction history from its mobile app, real-time weather data, local event schedules, time-of-day patterns, and individual loyalty profiles into a single intelligence layer.
The result
Personalised offers that adapt dynamically. Think of a cold drink promotion on a hot afternoon, or a pastry suggestion timed to a customer's usual morning visit.
The outcome
- Between 2023 and 2024, Starbucks attributed $2.1 billion in incremental revenue to AI-driven personalised recommendations.
- The company's mobile app, now a cornerstone of its digital strategy, generated $1.8 billion (25% of total US revenue) in the same period.
- The impact on store performance was equally striking: average order values jumped 12-15% in channels where personalisation was deployed, and same-store sales grew 4-6% as a direct result of AI-led campaigns.
In both cases, the lesson is the same.
- Weather data alone doesn't drive a promotion.
- A single purchase record alone doesn't predict a pregnancy.
- Loyalty data alone doesn't tell you when a customer might prefer an iced frappuccino over their usual flat white.
But unified, those signals become intelligence that no single source could deliver.
Aviation faces exactly this challenge, only with higher stakes, tighter margins, and zero tolerance for error.
The integration imperative
The pattern across every example in this article is the same: the data often exists, but it doesn't connect.
Stitching together aviation's fragmented data landscape is extraordinarily difficult.
- Legacy infrastructure makes integration slow.
- Competitive sensitivities make data sharing complicated.
- And the sheer volume makes manual reconciliation impossible.
This is precisely the environment where AI is supposed to help.
But AI can't integrate what it can't access.
To ensure that AI can deliver on its promise of transforming the airline business, comprehensive, well-connected data is just as essential as accurate data. Data breadth alone isn't enough. What matters is what happens when those datasets are connected – when schedules, real-time flight status, connection intelligence, demand signals, and competitive pricing stop living in silos and start informing each other.
That's where data becomes intelligence.
This is the role OAG was built for. We provide aviation's broadest spectrum of datasets, and the intelligence layer that connects them. By turning fragmented inputs into coherent, AI-ready intelligence, OAG enables airlines, airports, and travel-tech partners to build systems that see the full picture and, most importantly, make decisions accordingly.
Fewer blind spots. Fewer gaps between what an AI system knows and what it needs to know. Because in aviation, what your AI doesn't know can ground your operation.
About Filip Filipov
Filip Filipov is Chief Executive Officer (CEO) at OAG, where he leads the company's strategic direction and drives innovation in products and technology for the travel ecosystem. His career spans roles across data analytics, product innovation, and travel technology, giving him a unique perspective on how data powers smarter decisions in aviation and beyond.

