NFS File Share Unresponsive in Storage Cluster.
One of the NFS file shares on the storage cluster, which hosts the www static content, was taking a long time to serve its files. As a result, static content was unavailable on the Atlas web page, the LIR Portal and the RPKI Dashboard.
(all times in UTC)
(May 3, 2024 4:41 AM) Alert from atlas.ripe.net because none of the members were active.
(May 3, 2024 5:20 AM) Investigation points to a fault with the underlying hardware cluster.
(May 3, 2024 6:50 AM) Identified the specific hardware and isolated the server which was hosting the affected file share.
(May 3, 2024 7:18 AM) Maintenance mode is enabled on the isolated host to prevent further disruption.
(May 3, 2024 7:40 AM) Alert closed as all members are back online and static content is available. IT opens a case with the vendor to investigate the issue further.
An ESXi host experienced a hardware issue which caused the file share to become unresponsive. The issue is being investigated with our supplier.
The affected host was placed into maintenance mode so that all virtual machines could be migrated to a different host. The NFS share was also migrated to ensure the static content was accessible. After these migrations, the issue stopped and static content became available again.