Problems with RIPE NCC Access

Incident Report for RIPE NCC

Postmortem

Network Outage Post-Mortem

Date: August 7 - August 8, 2025

Incident Duration: 40 minutes (23:50 – 00:30 UTC)

Impact: Hosts in the NorthC datacenter were unreachable.

Severity: Critical

Executive Summary

On August 8, 2025, scheduled maintenance was carried out by our fiber provider Eurofiber on one of the redundant links between the AM3 and NorthC datacenters. The link was backed up by another fiber routed along a different path, so at most a few brief network blips were expected while traffic moved over to the other path.

While no service impact was anticipated due to the redundant network design, an unexpected hardware issue occurred on a core switch interface during the maintenance window. This resulted in network degradation and connectivity loss to several hosts located in the NorthC datacenter.

The failure was detected by our monitoring systems within five minutes. The engineering team promptly began an investigation, and partial recovery was observed approximately 40 minutes after the incident began. The affected interface is currently under investigation with our hardware vendor.

Timeline (UTC)

23:05

Initial alerts fired because the link went down due to the maintenance. The 24/7 engineer was alerted but took no action, as this was expected and no services were impacted.

23:50

Incident begins: Connectivity to hosts in NorthC datacenter lost.

Network degradation impacts services.

23:55

Monitoring alerts indicating loss of connectivity are triggered.

Initial suspicion falls on the ongoing fiber maintenance.

00:00 – 00:25

Engineering team validates alert data and confirms network issue.

Network troubleshooting begins, focusing on switch interface behavior.

00:30

Services begin to recover; connectivity is gradually restored.
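
Because the timeline above crosses midnight UTC while the status updates are stamped in CEST, the intervals are easy to misread. The following is a minimal sketch of the arithmetic, not part of our tooling. The calendar dates attached to each time and the variable names are assumptions made for illustration; the report itself gives only the times and the date range August 7 - August 8, 2025.

```python
from datetime import datetime, timedelta, timezone

# Times taken from the timeline (UTC). The calendar dates are an assumption:
# 23:50 and 23:55 are placed on August 7, and 00:30 on August 8.
incident_start = datetime(2025, 8, 7, 23, 50, tzinfo=timezone.utc)
alert_fired = datetime(2025, 8, 7, 23, 55, tzinfo=timezone.utc)
recovery_start = datetime(2025, 8, 8, 0, 30, tzinfo=timezone.utc)


def minutes_between(start, end):
    """Return the whole minutes elapsed between two timezone-aware datetimes."""
    return int((end - start).total_seconds() // 60)


time_to_detect = minutes_between(incident_start, alert_fired)
incident_duration = minutes_between(incident_start, recovery_start)

# CEST is UTC+2. The first status update was posted at 02:21 CEST on August 8.
CEST = timezone(timedelta(hours=2), "CEST")
first_status_post = datetime(2025, 8, 8, 2, 21, tzinfo=CEST)
post_delay = minutes_between(incident_start, first_status_post)

print(f"Time to detect:      {time_to_detect} min")
print(f"Incident duration:   {incident_duration} min")
print(f"First status update: {post_delay} min after incident start")
```

Using timezone-aware datetimes lets the subtraction handle both the date rollover and the CEST offset, so no manual conversion is needed.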

Root Cause Analysis

The exact cause of the issue is still under investigation. Preliminary findings suggest a malfunction in one of the interfaces on a core switch, which coincided with the fiber maintenance. Although the redundant network topology was expected to prevent disruption, the failure of a single interface led to an unexpected service impact.

We are working with the hardware vendor to determine whether the issue stems from a physical fault, firmware bug, or misbehavior under failover conditions.

Posted Aug 08, 2025 - 14:13 CEST

Resolved

This incident has now been resolved. It seems to have been related to a temporary network issue, but we will investigate further in the morning.
Posted Aug 08, 2025 - 02:51 CEST

Update

We are continuing to investigate this issue.
Posted Aug 08, 2025 - 02:25 CEST

Update

Multiple RIPE NCC services are inaccessible at the moment due to an issue with our single sign-on service.
Posted Aug 08, 2025 - 02:24 CEST

Investigating

We are currently investigating the issue.
Posted Aug 08, 2025 - 02:21 CEST
This incident affected: RIPE Database, LIR (Member) Portal, RIPE NCC Access and RPKI (RPKI Dashboard).