RPKI CA Software downtime

Incident Report for RIPE NCC

Postmortem

At 02:20 UTC, the RPKI core systems and Publication as a Service (PaaS) became unavailable. Around 06:00 UTC, Publication as a Service was fully operational again. Any changes made during the outage are lost. Since these changes come from automated systems, recovery should be automatic.

Due to an issue with an NFS volume share, both RPKI CA and PaaS were unavailable.
The RPKI on-call engineer began by troubleshooting the RPKI CA issue, which delayed troubleshooting of Publication as a Service (PaaS). Unlike for the RPKI CA systems, the backup of the PaaS service was also unreachable, so restoring directly on the filesystem was not an available workaround. At 04:00 UTC, the owners of the NFS service were contacted and began investigating the NFS unavailability. Around 06:00 UTC, the service was restored and Publication as a Service was operational again.

During the NFS outage, the Krill instance that publishes objects for our PaaS had no data and therefore initialized an empty repository. As a result, the RRDP and rsync repositories for PaaS temporarily contained no objects.
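
The failure mode described above, where a service treats a missing data directory as a fresh installation, can be illustrated with a small sketch. This is not Krill's code and not the fix applied by the RIPE NCC; the function `open_repository`, the marker file `.initialized`, and the `allow_init` flag are hypothetical names chosen for illustration. The sketch assumes a service that writes a marker file into its data directory on first setup and, on later starts, refuses to create an empty repository unless explicitly told to.

```python
import tempfile
from pathlib import Path


class DataDirUnavailable(RuntimeError):
    """Raised when the data directory holds no previously written marker."""


def open_repository(data_dir: Path, allow_init: bool = False,
                    marker_name: str = ".initialized") -> str:
    """Load an existing repository, or initialise one only on request.

    Hypothetical sketch: a missing marker file is treated as "storage
    unavailable" rather than "fresh install", unless allow_init is set.
    """
    marker = data_dir / marker_name
    if marker.exists():
        # The marker written at first setup is visible, so the volume
        # that holds the repository data is mounted and readable.
        return "loaded"
    if allow_init:
        # Explicit operator decision: create a new, empty repository.
        data_dir.mkdir(parents=True, exist_ok=True)
        marker.write_text("ok")
        return "initialized"
    # No marker and no explicit request: fail instead of publishing
    # an empty repository.
    raise DataDirUnavailable(f"no marker found in {data_dir}")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = Path(tmp) / "repo"
        try:
            open_repository(repo_dir)
        except DataDirUnavailable as exc:
            print("refused to start:", exc)
        print(open_repository(repo_dir, allow_init=True))
        print(open_repository(repo_dir))
```

With a guard of this kind, an unreachable NFS share would cause a startup failure rather than an empty RRDP and rsync repository. Whether that trade-off suits a given deployment is a design decision this sketch does not settle.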

Apologies for any inconvenience this has caused.

Posted Jun 19, 2025 - 14:48 CEST

Resolved

This incident has been resolved.
Posted Jun 19, 2025 - 06:05 CEST

Investigating

We are currently investigating this issue.
Posted Jun 19, 2025 - 04:30 CEST
This incident affected: RPKI (RPKI Dashboard, Publication as a Service (API endpoints)).