• michaelthould

Case Study: Big Data solution for real-time customer insights at a leading airline

The client is a large Australian airline with over 30,000 employees and more than 6,000 daily flights.


In the airline industry, booking data is the essential driver for many business decisions and an important input to many operational and sales processes. To provide the best service to customers, the airline needs this data, and the customer insights derived from it, delivered to the wider business community in near real time.

The objective was to build a platform that collects data from a variety of operational data sources, analyses it in near real time and provides the resulting information to business systems.

Some data structures contain more than 3,000 fields that are updated over 1 million times per day, with peaks of over 80 transactions per second.
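To put these figures in perspective, a quick back-of-the-envelope calculation shows how far the peak rate sits above the daily average. This is only a sketch using the numbers quoted above; it assumes "over 1 million" means exactly 1,000,000 and that updates are spread evenly across 24 hours, which a real booking feed would not be.

```python
# Back-of-the-envelope load estimate using the figures quoted above.
# Assumptions: "over 1 million" is taken as exactly 1,000,000 and the
# updates are treated as evenly spread across a 24-hour day.
UPDATES_PER_DAY = 1_000_000
PEAK_TPS = 80
SECONDS_PER_DAY = 24 * 60 * 60

average_tps = UPDATES_PER_DAY / SECONDS_PER_DAY
peak_to_average = PEAK_TPS / average_tps

print(f"Average rate: {average_tps:.1f} updates per second")
print(f"Peak is roughly {peak_to_average:.1f}x the average")
```

Under these assumptions the average works out to about 12 updates per second, so the quoted peak is roughly seven times the average load. That gap is why the platform has to be sized for bursts rather than for the daily mean.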


The solution follows a typical big data architecture. It is based on the Hadoop ecosystem, hosted on Amazon EMR (Elastic MapReduce) in the AWS cloud, together with a number of peripheral systems that handle ingestion and event-driven distribution of data to business systems in near real time.

The platform ingests data from a variety of sources. The main source is the Amadeus booking system, which provides a continuous live feed of booking updates. Other real-time feeds include flight updates and customer profile information. More data sources will be added in the future to enrich the data and enable cross-domain analytics.

Processing is based on clustered computing models using the Apache Hadoop ecosystem. Spark Streaming is used for long-running streaming applications that process up to 500 transactions per second. Transient clusters use code-driven cloud deployments to create the infrastructure on the fly and shut it down once the analytics job is complete. This allows cost-effective, flexible processing for complex analytical jobs over large amounts of complex data.
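The transient-cluster pattern described above can be illustrated with a minimal sketch. It is not the project's deployment code: `provision_cluster`, `terminate_cluster`, `run_analytics_job` and the cluster name are hypothetical stand-ins that only record what happened, where a real deployment would call the cloud provider's provisioning tooling.

```python
from contextlib import contextmanager

# Hypothetical stand-ins: a real deployment would call the cloud provider's
# provisioning tooling. Here they only record the lifecycle steps.
events = []

def provision_cluster(name, nodes):
    events.append(f"provision {name} ({nodes} nodes)")
    return {"name": name, "nodes": nodes}

def terminate_cluster(cluster):
    events.append(f"terminate {cluster['name']}")

@contextmanager
def transient_cluster(name, nodes):
    """Create a cluster for one job and always tear it down afterwards."""
    cluster = provision_cluster(name, nodes)
    try:
        yield cluster
    finally:
        # Runs on success and on failure, so no idle cluster keeps billing.
        terminate_cluster(cluster)

def run_analytics_job(cluster, records):
    # Placeholder workload: a real job would submit work to the cluster.
    events.append(f"run job on {cluster['name']}")
    return sum(records)

with transient_cluster("nightly-analytics", nodes=10) as cluster:
    result = run_analytics_job(cluster, [1, 2, 3])

print(result)
print(events)
```

The point of the design is that teardown is tied to the job's lifetime rather than left to an operator, which is what keeps the cost of large, short-lived clusters under control.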

Data is stored predominantly in clustered relational database systems and in simple storage solutions, which together accommodate both structured and unstructured data.

Data is served to operational and business systems that require access to real-time customer insights. The Data API exposes the data through an API gateway to internal systems or third-party partners.

An Event API notifies interested systems of changes in the underlying data. This way, systems can make full use of the real-time nature of the data and the analytical capability.
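The notification idea behind the Event API can be sketched as a simple publish/subscribe registry. This is an in-memory illustration only, not the platform's actual interface: the `EventBus` class, the topic names and the payload fields are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class EventBus:
    """Minimal in-memory publish/subscribe registry (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        # Register a handler to be called for every event on this topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> int:
        """Deliver payload to every handler for topic; return how many were notified."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)

# Hypothetical topic names and payload fields, chosen for the example.
bus = EventBus()
received = []
bus.subscribe("booking.updated", received.append)

notified = bus.publish("booking.updated", {"booking_id": "ABC123", "status": "CONFIRMED"})
ignored = bus.publish("flight.delayed", {"flight": "XX1"})

print(notified, ignored, received)
```

Consumers register interest once and are then pushed each change, instead of polling the Data API for updates. That is what lets downstream systems react within the latency of the pipeline itself.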


Fusion Professionals provided the technical leadership that made this project a successful implementation. Our staff provided solution architecture and development management, established development processes and standards, and implemented large parts of the framework.


Fusion Professionals is very experienced in delivering streaming big data platforms based on the Hadoop and Spark frameworks. Our technical ability and architectural excellence help customers move to modern, integrated analytics platforms for 21st-century business models.


The delivered solution is a platform rather than a standalone system. It is an extensible data analytics platform that allows the airline to process large amounts of data, run machine learning experiments and do the heavy lifting in complex analytics processes that interact directly with operational systems.

Here are some of the highlights:

  1. Near real time analytics capability

  2. Big data processing platform

  3. Architecture framework

  4. Event-driven API interface to share real-time customer insights

  5. Fully cloud hosted

  6. Continuous deployment of infrastructure and applications

  7. Operational architecture for monitoring and management

Cloud-hosted platforms can provide extreme processing power at relatively low cost. This is achieved through on-demand deployment and automatic scaling of environments. Spot pricing can be used to create large processing clusters at a fraction of the cost of in-house analytics platforms.


Information cycles are becoming faster, with customers expecting personalised service experiences. This is only possible if data is analysed quickly and made available to business units in real time:

  1. Real-time customer insights

  2. Better ability to serve customers through faster data cycles

  3. Ability to engage with customers through a more personalised experience

  4. Improved profitability through better customer sentiment

Whilst this sort of business model is cutting edge today, in only a few years it will be a must for all companies to know their customers, and their daily interactions with the company, in real time.


This big data platform is designed to perform complex processing in extremely short time frames. Here are some of the benchmarks achieved during testing and in production:

  1. Over 500 transactions per second

  2. 15-20 second latency from business transaction to insights delivered

  3. Average processing time within the system of 2-3 seconds

  4. Over 900 API calls per second (testing limit)

  5. High Availability with inbuilt Disaster Recovery

  6. Auto Scaling and Auto Healing

  7. Deployment of complete environment in under 50 minutes

Start your journey into 21st-century data processing today. Talk to us about your requirements and move your business closer to your customers.
