On 30 September 2019, a 47-minute outage disrupted Airways New Zealand’s air traffic services. Christchurch controllers lost radar data and primary communications, operating in a degraded display mode and using backup systems. Auckland controllers experienced a shorter loss before switching to local bypass surveillance. Despite the disruption, all 41 domestic flights in controlled airspace landed safely with no loss of separation.
Executive summary Tuhinga whakarāpopoto
What happened
- On 30 September 2019, at approximately 1320 (times in this report are in New Zealand Daylight Time (universal coordinated time + 13 hours) expressed in 24-hour format), Airways Corporation of New Zealand Limited’s (Airways) air traffic services were unexpectedly disrupted for 47 minutes (the outage).
- For the duration of the outage the Christchurch domestic controllers lost processed radar surveillance target information from their control screens. The system automatically reverted to a degraded display mode that allowed controllers to continue to monitor aircraft positions. The outage also affected their primary communication method with aircraft pilots and other sector controllers, requiring them to switch to their backup communication systems. The Auckland controllers also lost the use of their surveillance control screens for a shorter period, until they switched over to local bypass surveillance information.
- At the time of the outage 41 domestic flights were airborne in New Zealand controlled airspace. All aircraft landed safely, with no airspace separation issues.
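The time convention above can be checked with a short calculation. This is a minimal sketch that assumes a fixed UTC + 13 hours offset for New Zealand Daylight Time, as stated in the report; the names `NZDT`, `outage_start` and `hhmm` are illustrative and do not come from the report.

```python
from datetime import datetime, timedelta, timezone

# New Zealand Daylight Time as a fixed offset (UTC + 13 hours), per the report's convention.
NZDT = timezone(timedelta(hours=13), name="NZDT")

# Approximate start of the outage: 1320 NZDT on 30 September 2019.
outage_start = datetime(2019, 9, 30, 13, 20, tzinfo=NZDT)
outage_duration = timedelta(minutes=47)
outage_end = outage_start + outage_duration


def hhmm(moment):
    """Format a datetime in the report's 24-hour style, for example 1320."""
    return moment.strftime("%H%M")


print("Start (NZDT):", hhmm(outage_start))
print("End (NZDT):  ", hhmm(outage_end))
print("Start (UTC): ", hhmm(outage_start.astimezone(timezone.utc)))
```

A start of about 1320 plus 47 minutes gives about 1407 local time, which corresponds to a start of about 0020 in universal coordinated time on the same date.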
Why it happened
- The outage was initiated by a capacitor failure in an uninterruptible power supply (UPS) unit in the Christchurch air traffic management centre. This failure alone should not have caused a loss of air traffic management services. However, incorrect power connections from the UPS to the core digital network equipment supporting the air traffic management services caused some of that equipment to also lose power. That power loss caused the outage.
- The Commission found maintenance checks had not been conducted on the core digital network equipment in accordance with Airways’ procedures. If they had been, Airways would very likely have discovered the incorrect power connections and prevented this outage.
What we can learn
- Safety-critical equipment should be stress tested to ensure it is resilient to power supply interruption.
Who may benefit
- This report will benefit air traffic management centres, and anyone required to maintain electrical equipment that performs essential safety functions.
Factual information Pārongo pono
Narrative
- On 30 September 2019 at approximately 1320 the New Zealand domestic air traffic services were unexpectedly disrupted for 47 minutes (the outage) when an uninterruptible power supply (UPS) failed.
- For the duration of the outage the Christchurch domestic controllers lost processed radar surveillance target information from their workstation control screens. This meant that the displayed target information and aircraft position accuracy gradually degraded, then the workstations automatically switched over to radar data processor bypass mode (RDP Bypass) after 20 seconds. In RDP Bypass mode, the preferred surveillance sensor for each sector workstation continued to provide aircraft position, altitude and identity to the controller. Three sectors (Christchurch Terminal Manoeuvring Area (CH TMA), Area South (STH) and Queenstown Approach (QN APP) sectors) did not have a bypass surveillance sensor, so those workstation control screens reverted to displaying predicted position data generated from the aircraft’s flight plans.
- The outage also affected the controllers' primary communication method with aircraft pilots and other sector controllers. All sector controllers had to change over to their backup communication system.
- The Auckland controllers temporarily lost the same surveillance target information until they switched over to local bypass surveillance information. Processed data displayed on their screens was initially sent from the Christchurch air traffic management centre (ATMC) until it was disrupted by the outage.
- At the time of the outage, 41 aircraft were airborne. During the outage, pilots who could not contact controllers reverted to their non-normal procedures and continued with their flight plan. (Airline standard operating procedures are grouped as ‘normal’ procedures. In unusual or abnormal situations, they are called ‘non-normal’ procedures. Loss of contact with a ground controller calls for a non-normal procedure.) All aircraft landed safely at their intended destination, except two that returned to their points of departure.
- At 1407 air traffic services resumed as normal.
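The workstation display fallback described in this narrative can be sketched as a small decision function. This is an illustration only, not Airways' software: the names `DisplayMode` and `display_mode` are hypothetical, and it assumes that the sectors without a bypass sensor changed display after the same 20-second delay, which the report does not state.

```python
from enum import Enum


class DisplayMode(Enum):
    PROCESSED = "processed radar surveillance data"
    DEGRADING = "last processed data, accuracy degrading"
    RDP_BYPASS = "RDP Bypass from the preferred surveillance sensor"
    FLIGHT_PLAN = "predicted position from the flight plan"


BYPASS_DELAY_S = 20  # switchover delay given in the report


def display_mode(seconds_without_processed_data, has_bypass_sensor):
    """Return the assumed display mode for one sector workstation."""
    if seconds_without_processed_data <= 0:
        return DisplayMode.PROCESSED
    if seconds_without_processed_data < BYPASS_DELAY_S:
        return DisplayMode.DEGRADING
    # Assumption: sectors with no bypass sensor fall back at the same moment.
    if has_bypass_sensor:
        return DisplayMode.RDP_BYPASS
    return DisplayMode.FLIGHT_PLAN


# The three sectors the report names as having no bypass surveillance sensor.
for sector in ("CH TMA", "STH", "QN APP"):
    print(sector, "->", display_mode(60, has_bypass_sensor=False).value)
```

Calling `display_mode(60, has_bypass_sensor=True)` instead gives the RDP Bypass mode that the remaining sector workstations used.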
Background
- Airways Corporation of New Zealand Limited (Airways) has two ATMCs – one located in Christchurch and the other in Auckland. The Christchurch ATMC was physically located in the Andy Heard Building at the time of this incident. The ATMCs provide facilities and radar surveillance data processing for controllers to control airborne aircraft in New-Zealand-controlled airspace. The controllers are responsible for the safe and efficient movement of aircraft, including providing the required separation from other aircraft.
- The Airways aeronautical telecommunications network equipment connects all air traffic management equipment in New Zealand and provides a system for the transport of digital data and communications. The core of the network equipment includes multi-protocol label switches (MPLS) and internet protocol multiplexers (IPMux). The outage affected the MPLS equipment in the Christchurch ATMC.
- To prevent a power loss to essential services, the Christchurch ATMC is supported by two independent power supply systems (system A and system B). Each of these incorporates mains power with a backup diesel generator and a UPS (power supply system A includes UPS A; power supply system B includes UPS B) (see appendix 1).
Organisational information
- Airways is certified by the Civil Aviation Authority (CAA) as an Aeronautical Telecommunication Service provider under Civil Aviation Rules (CAR) Part 171: Aeronautical Telecommunications Services – Operation and Certification.
- Airways provides air traffic services for the New Zealand domestic airspace and for the international Oceanic area (this includes the OCR and OCS areas and covers significant areas of both the Pacific and Southern Oceans, as detailed in appendix 2). Most of the domestic airspace sectors are managed by controllers (there are three types of controllers: en-route, approach and aerodrome) in the Christchurch ATMC, except for the Raglan sector and Oceanic area, which are managed by controllers in the Auckland ATMC.
Uninterruptible power supply
- A UPS is designed to ensure continuous power is supplied to essential equipment even when there is an interruption or loss of the mains power supply (for example, when the external mains supply is disrupted or during the change-over from mains to generator power).
- A UPS has its own internal batteries that are charged through a rectifier (an electrical device that converts alternating current into direct current by allowing a current to flow through it in one direction only), from the mains power supply (see Figure 4). In the event of a loss of mains power supply, the UPS draws power from its batteries and continues to provide power through an inverter. The UPS will continue to supply power until either the mains power supply is re-established, the backup generator power supply is available, or the UPS batteries are discharged.
- The UPS has a reserve mains power supply that is connected to both the automatic and manual bypass switches. If the UPS experiences an internal fault, it automatically switches to the reserve mains power supply and should continue to supply uninterrupted power to essential equipment.
- The external manual bypass switch is provided so that the entire UPS can be bypassed to facilitate maintenance or replacement while the load is still connected to mains power.
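The power paths described above can be summarised in a small sketch. It is a simplified model under stated assumptions, not the UPS manufacturer's logic: the function `ups_output_source` and its parameters are hypothetical, `input_ok` stands for either mains or generator power at the UPS input, and battery discharge is reduced to a single flag.

```python
def ups_output_source(input_ok, battery_charged, internal_fault, reserve_ok,
                      manual_bypass=False):
    """Return the assumed source feeding the load, or None if the load is unpowered."""
    if manual_bypass:
        # External manual bypass: the whole UPS is bypassed for maintenance.
        return "reserve mains via manual bypass" if reserve_ok else None
    if internal_fault:
        # Internal fault: the UPS switches automatically to the reserve supply.
        return "reserve mains via automatic bypass" if reserve_ok else None
    if input_ok:
        # Normal operation: the rectifier charges the batteries and feeds the inverter.
        return "mains via rectifier and inverter"
    if battery_charged:
        # Input lost: the inverter draws on the batteries until they discharge.
        return "battery via inverter"
    return None


print(ups_output_source(True, True, False, True))
print(ups_output_source(False, True, False, True))
print(ups_output_source(True, True, True, True))
```

In this model an internal fault combined with an unavailable reserve supply returns `None`, meaning the connected load loses power.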
Previous air traffic services outage in 2015
- In about 2013, Airways started a major digital network upgrade called the IPMux project, which delivered the IPMux and MPLS network equipment. Most of the new aeronautical telecommunications network equipment had been installed and services migrated across by June 2015. On 23 June 2015 the Christchurch ATMC experienced a network outage caused by a broadcast storm. That outage was investigated by the Commission (AO-2015-005: Unplanned interruption to national air traffic control services, 23 June 2015).
- The 2019 outage occurred in the same network equipment cabinets that were involved in the previous outage in 2015.
Analysis Tātaritanga
Introduction
- The power supply systems in the Christchurch ATMC were designed to be resilient to equipment failure. This analysis examines how the electrical power supply system A for the aeronautical telecommunications network did not have the resilience expected from the design and, when it failed, how it disrupted the provision of domestic air traffic services.
- The outage to the Christchurch ATMC aeronautical telecommunications network resulted in Airways’ sector controllers not being able to provide normal air traffic services for the duration of the outage.
- The scope of this analysis is limited to the loss of air traffic services and to the associated equipment, personnel and procedures. Relevant safety actions taken after the outage are also covered.
- The Commission identified one safety issue related to maintenance checks of essential equipment.
Why the outage occurred
- The aeronautical telecommunications network equipment had two independent generator-backed power supply systems, system A and system B. The redundancy in power supply systems was so designed that if one power supply system failed, the equipment would continue to operate from the other power supply system.
- The outage involved three significant technical factors:
- the failure of a capacitor in UPS A
- a short circuit that tripped a circuit breaker, which disconnected UPS A as a power supply
- the incorrect power connection of the MPLS network equipment.
- The initiating point for the Airways outage was the failure of a capacitor in the inverter of UPS A. The capacitor’s external casing was compromised in the failure, allowing debris from inside the capacitor to be spread over surrounding circuit boards (see Figure 3) and exposed copper terminals. This debris was very conductive and created a short circuit inside the UPS cabinet.
- This type of electrolytic power capacitor operates under high electrical loads, and Airways technicians were aware that such capacitors are prone to occasional premature failure. Airways had managed this risk by replacing the capacitors within the manufacturer’s recommended reliable operating lifetime of six years. The capacitor that failed had been replaced 12 months before this outage.
- The capacitor failure caused an internal problem in UPS A when the failed capacitor expelled debris within the UPS. The UPS detected the internal capacitor failure and switched to the reserve supply. However, the conductive debris from the capacitor failure created a short circuit inside the UPS.
- The short circuit created an overload on the reserve supply, which tripped the reserve circuit breaker at the main switchboard. This removed all power to the UPS A connected loads.
- The essential network equipment (see appendix 1) was provided with dual internal direct current (DC) power supplies. These were intended to provide power supply redundancy to the equipment and to be fed from two different external power supply sources, systems A and B.
- When power supply system A stopped providing power to the aeronautical telecommunications network equipment, power supply system B should have taken over. However, both power supply system connections for the MPLS equipment were plugged into UPS A electrical power outlet distribution (EPOD) (see Figure 4). The dashed cable from EPOD A to MPLS B indicates the incorrect power connection. The correct connection should have been from EPOD B to MPLS B, as shown in the diagram in appendix 1.
- Therefore, when UPS A stopped providing power, the MPLS equipment experienced a complete loss of power and stopped functioning.
- The MPLS network equipment is duplicated in a separate cabinet to provide resilience and redundancy. That separate equipment should have been operational and maintained air traffic control services. However, the incorrect connection was repeated in both cabinets, so all the MPLS network equipment lost power.
- The power loss to the MPLS equipment in Christchurch caused a degradation to the surveillance information presented to domestic air traffic controllers and a loss of their primary voice communications systems.
- A second incorrect cabling issue was detected with the network equipment, although this cabling error did not contribute to the outage. Both power supply system connections for the IPMux equipment were plugged into UPS B EPOD (see Figure 4). The dashed cable from EPOD B to IPMux A indicates the incorrect power connection. The correct connection should have been from EPOD A to IPMux A, as shown in the diagram in appendix 1.
- The MPLS and IPMux incorrect power connections had likely existed since installation, before the network outage in 2015.
- At the time of the outage in 2015, Airways used black cables for all UPS power supply system connections. In 2018 they introduced a new cabling standard that ensured UPS power cables were colour coded for UPS A and UPS B. The IPMux cabinets involved with this incident at the Christchurch ATMC had not been brought up to this standard because they would become redundant after Airways relocated operations in 2023 from the Andy Heard Building to the new ATMC being built nearby.
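The effect of the incorrect connections can be illustrated with a small model of the wiring. This is a sketch only: the dictionaries `AS_DESIGNED` and `AS_FOUND` and both functions are hypothetical, and each equipment type is reduced to a single entry with two power inputs, although the report notes the same error was repeated in both cabinets.

```python
# Each item of equipment has two internal DC power supplies, each fed from one EPOD.
AS_DESIGNED = {
    "MPLS": ("EPOD A", "EPOD B"),
    "IPMux": ("EPOD A", "EPOD B"),
}

# Connections as found after the outage, per the report.
AS_FOUND = {
    "MPLS": ("EPOD A", "EPOD A"),
    "IPMux": ("EPOD B", "EPOD B"),
}


def single_fed(wiring):
    """List equipment whose two power supplies share the same feed."""
    return sorted(name for name, feeds in wiring.items() if len(set(feeds)) < 2)


def unpowered_after_loss(wiring, failed_feed):
    """List equipment that loses all power when one feed fails."""
    return sorted(
        name for name, feeds in wiring.items()
        if all(feed == failed_feed for feed in feeds)
    )


print("Single-fed as found:", single_fed(AS_FOUND))
print("Lost with EPOD A down, as found:", unpowered_after_loss(AS_FOUND, "EPOD A"))
print("Lost with EPOD A down, as designed:", unpowered_after_loss(AS_DESIGNED, "EPOD A"))
```

In this model the loss of the UPS A feed removes power from the MPLS equipment only. The IPMux equipment stays powered from UPS B, which is consistent with the second cabling error not contributing to the outage.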
Essential equipment resilience
Safety issue: Airways did not conduct maintenance checks of essential equipment to ensure it was resilient to a power supply failure.
- Airways intended to conduct annual UPS power outage checks to ensure that all essential equipment had a dual power supply from systems A and B.
- The Commission found Airways had not conducted the required maintenance checks on their aeronautical telecommunications network equipment in accordance with their preventative maintenance action procedures.
- Airways had conducted significant work on the network equipment cabinets before the 2015 outage. Airways’ technicians had predicted that incorrect power connections could exist and defined a maintenance checking procedure in 2018 (UPS power outage checks).
- Airways’ management deferred the UPS power outage checks in 2018 and 2019. At the time they believed that there was a greater risk of an outage to air traffic services from conducting the UPS power outage checks than not.
- If Airways had conducted the UPS power outage checks, it would have exposed the incorrect power connection issue with the MPLS and IPMux network equipment. This would have provided the opportunity for Airways to correct the power connection issues, thereby preventing the 2019 outage.
Findings Ngā kitenga
- The unexpected interruption in the Christchurch ATMC to the aeronautical telecommunications network equipment prevented Airways sector controllers from providing normal air traffic services during the 47-minute outage.
- The outage was initiated when a capacitor prematurely failed within UPS A, then escalated to a full UPS failure and caused a loss of power to aeronautical telecommunications network equipment.
- The aeronautical equipment systems were designed with redundancy to allow for a UPS outage of this nature. However, the dual power supply units on essential network equipment (the MPLS) had only been connected to UPS A, which failed, instead of being connected to UPS A and UPS B.
- The incorrect power connections to the MPLS and IPMux network equipment had likely existed since installation, and before the network outage in 2015.
- Airways had not carried out the required preventative maintenance power outage checks to establish which power supply was connected to each item of equipment. Had they done so, Airways would very likely have discovered the incorrect power connections.
Safety issues and remedial action Ngā take haumanu me ngā mahi whakatika
General
- Safety issues are an output from the Commission’s analysis. They typically describe a system problem that has the potential to adversely affect future operations on a wide scale.
- Safety issues may be addressed by safety actions taken by a participant, otherwise the Commission may issue a recommendation to address the issue.
Safety issue: Airways did not conduct maintenance checks of essential equipment to ensure it was resilient to a power supply failure.
- Maintenance checks on essential equipment provide an opportunity to identify any defects within a system, rectify them, and thereby ensure the operational integrity of the system. Maintenance checks are especially important for equipment that provides an essential safety function, such as aeronautical telecommunications network equipment.
- Airways had not been conducting UPS power outage checks on its aeronautical telecommunications network equipment.
- Airways management has taken the following safety action to address this issue:
- updated their preventative maintenance action procedures requiring UPS power outage checks to be completed on each UPS every six months
- integrated their risk evaluation framework into business practices to provide assurance reporting of completed preventative maintenance action checks to their executive leadership team and the Airways’ Board.
- In the Commission's view, this safety action has addressed the safety issue. Therefore, the Commission has not made a recommendation.
Other safety action
Airways’ safety action
- Participants may take safety actions to address issues that would not normally result in the Commission issuing a recommendation.
- Airways’ internal investigation into the outage identified 26 safety actions. The safety actions predominantly focused on Airways’ business contingency plans and providing increased resilience in their ATMC systems to prevent a similar outage in the future.
- Airways has made significant changes to its ATMC infrastructure since the 2019 outage, with the aim of improving the resilience of its aeronautical telecommunications network. The Christchurch ATMC aeronautical telecommunications network equipment has been moved to a new building specifically designed for it. The building has separate rooms for UPS A and UPS B, as well as for their associated backup generators. The controllers are scheduled to shift into the new building in early 2023.
- Airways has changed its UPSs to a more robustly designed model with physically divided internal compartments. This is intended to isolate internal faults and prevent them escalating into a full UPS failure.
- Airways has applied common wiring standards for UPS connections across all their facilities, including the Auckland and Christchurch ATMCs as well as airport towers. UPS power distribution cables are now all colour coded to distinguish the UPS they are fed from. The differentiation in colour makes it easier for technicians to identify any incorrect power connection issues.
CAA safety action
- The CAA opened an investigation into this outage. Its investigation examined Airways’ investigation and subsequent findings and actions.
- The CAA conducted a special purpose audit of Airways on 11 February 2021, to review whether Airways had effectively implemented the safety actions identified in their outage investigation report.
- The CAA special purpose audit found Airways had effectively implemented most of the safety actions, with only two being outstanding. As a result, the CAA raised two findings.
- The first finding related to Airways reviewing its organisational safety culture in accordance with its exposition. The CAA has since closed this finding, recording that Airways appointed external safety culture experts to conduct a safety culture survey and implement the results. Airways also carried out several other related actions including establishing a safety culture programme team that includes the New Zealand Airline Pilots Association and the Aviation and Marine Engineering Association.
- The second finding related to Airways’ Safety and Assurance team educating, monitoring and enforcing the timely completion of corrective actions across the business. The CAA has since closed this finding, recording that the Safety and Assurance team has implemented monthly reporting of overdue actions to all executive members, as well as improved action management practices including prioritisation of completing actions.
- The resolution of the findings closed the CAA audit, confirming all safety actions identified by Airways in its investigation into the outage had been effectively implemented.
- On review of the draft report Airways disagreed with the CAA’s interpretation of the intent of the audit and provided the following comment:
The scope of the CAA audit conducted on 11 February 2021 was the following:
CA Act 1990, CAR Part 12, Part 100 and Part 172.
Quality of Safety Office Investigation Knowledge and Expertise.
To our knowledge, the special purpose audit was not to review whether Airways had effectively implemented the safety actions identified in their outage investigation report.
The CAA requested action implementation status reports regarding this investigation on separate occasions however the audit focus was different to the one described in the TAIC report, and the findings raised by the CAA during this audit were not issued as an outcome of two actions not being implemented. The resolution of the findings raised during the audit had also had no link to the last two actions being effectively implemented.
Safety recommendation from TAIC inquiry into the Airways outage 2015
- Airways experienced a similar outage to their aeronautical telecommunication network equipment on 23 June 2015. TAIC opened an inquiry (AO-2015-005: Unplanned interruption to national air traffic control services, 23 June 2015) and on 28 September 2017 issued one safety recommendation (028/17) to the Secretary of Transport:
“to update and restructure CAR Part 171 to include the wider scope of technology, software and navigation aids that are normal for a modern air navigation service and to make provision for the rule to cater for future changes in technology”.
- CAR Part 171 was last amended 10 March 2017.
- The CAA has initiated an Air Navigation Services Regulatory Framework (it includes the review of CAR Part 171, 172, and 174) policy project with the intention of making CAR Part 171 more performance based and fit for purpose with respect to current and future technologies. The project is in the policy development phase and is expected to be completed by October 2023.
- The CAA advised the Commission that the policy project may result in recommendations for amendments to the relevant Civil Aviation Rules, and that any decision on whether to amend Civil Aviation Rules rests with the Minister of Transport.
- The policy project does not include making the rule amendments, which is a separate process.
Recommendations Ngā tūtohutanga
General
- The Commission issues recommendations to address safety issues found in its investigations. Recommendations may be addressed to organisations or people and can relate to safety issues found within an organisation or within the wider transport system that have the potential to contribute to future transport accidents and incidents.
- No new recommendations were issued.
Key lessons Ngā akoranga matua
- Safety-critical equipment should be stress tested to ensure that it is resilient to power supply interruption.
Conduct of the Inquiry He tikanga rapunga
- On 30 September 2019 at approximately 1400 hours the Commission became aware through the news media that Airways had experienced an air traffic services outage.
- At approximately 1440 hours on the day of the outage a Commission investigator contacted the Airways Safety Analysis, Change and Resilience Manager requesting information about the outage, and received a full briefing the following day.
- On 30 September 2019 the Civil Aviation Authority notified the Commission of the Airways outage occurrence.
- On 2 October 2019 the Commission opened an inquiry into the outage under section 13(1) of the Transport Accident Investigation Commission Act 1990 and appointed an investigator in charge.
- On 7 October 2019 two Commission investigators conducted a scene investigation of the Christchurch air traffic management centre and interviewed eleven Airways personnel.
- On 22 August 2022 two Commission investigators viewed the newly established Christchurch air traffic management centre building and gathered further evidence.
- On 7 December 2022 the Commission approved a draft report for circulation to three interested parties for their comment.
- The Commission received three responses: one submission and two with no comment. Changes as a result of the submission have been included in this final report.
- On 22 March 2023, the Commission approved this final report for publication.
Glossary Kuputaka
- Aeronautical telecommunications network equipment
- A digital data network that facilitates ground-to-ground and ground-to-air communications
- ATMC
- The building that contains aeronautical telecommunications network equipment and sector air traffic controllers
- IPMux
- Internet protocol multiplexer
- Inverter
- An electrical device that converts direct current to alternating current
- MPLS
- Multi-protocol label switching
- Power supply system
- Two separate power supply systems, A and B. Each system includes an electrical mains power supply, a reserve mains power supply, an uninterruptible power supply and a backup diesel generator
- Rectifier
- An electrical device that converts alternating current into direct current by allowing a current to flow through it in one direction only.
- UPS
- Uninterruptible power supply
Appendix 1. ATMC power supply schematic

Appendix 2. New Zealand air traffic control sectors