System and Infrastructure Status News

SDSC Expanse Lustre filesystem issues

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org, expanse-ps.sdsc.access-ci.org

Start Date: January 28, 2025, 5:00 a.m.

End Date: January 28, 2025, 3:00 p.m.

Update: The Lustre MDS issue was resolved this morning and filesystem access is back to normal.

Dear Expanse User,

We are currently seeing issues with the Expanse Lustre filesystem. This is leading to very slow responses or timeouts on access. We will post an update once the problem is resolved.

SDSC User Services Staff

Posted: March 20, 2026

Update idp.access-ci.org

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: identity.access-ci.org

Start Date: January 27, 2025, 8:00 p.m.

End Date: January 27, 2025, 8:30 p.m.

On January 27, 2025, the ACCESS CI Identity Provider (https://idp.access-ci.org/idp/) will be updated to the latest Tomcat version. No downtime is expected.

Posted: March 20, 2026

Reminder: Bridges-2 Additional Hardware and Maintenance Schedule January 27-30

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: bridges2-em.psc.access-ci.org, bridges2-gpu.psc.access-ci.org, bridges2-rm.psc.access-ci.org, bridges2-ocean.psc.access-ci.org

Start Date: January 27, 2025, 2:00 p.m.

End Date: January 30, 2025, 11:00 p.m.

As announced in August, we are excited to add ten HPE Cray 670 nodes to the Bridges-2 system. Each node has eight (8) H100-SXM5-80GB GPUs and 2 TB of node memory, and the nodes are interconnected by a high-performance InfiniBand network.

The installation and testing will require an extended maintenance period beginning on Monday, January 27 at 8:00AM Eastern time and running through Thursday, January 30 at 5:00PM Eastern time. During this time, all Bridges-2 nodes, VMs, and filesystems will be unavailable.

We thank you for your patience and understanding. As always, if you have any questions or problems, please send them to help@psc.edu.

Posted: March 20, 2026

Update registry.access-ci.org Email Verification Plugin

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: registry.access-ci.org

Start Date: January 27, 2025, 2:00 p.m.

End Date: January 27, 2025, 2:30 p.m.

On January 27, 2025, the Email Verification Plugin (https://github.com/cilogon/EmailVerificationEnroller) used by the ACCESS User Registry (https://registry.access-ci.org/) will be updated for compatibility with future versions of COmanage Registry (https://spaces.at.internet2.edu/display/COmanage/Registry+4.4.0+Release+Announcement). Server instances will be restarted during this update, which may cause in-progress registrations and logins to fail.

Posted: March 20, 2026

Jira services are unavailable and have degraded performance in certain regions

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: tickets.access-ci.org

Start Date: January 23, 2025, 4:30 p.m.

End Date: January 24, 2025, 1:00 p.m.

Update: This has been resolved.

We have been informed of degraded performance for Jira Work Management, Jira Service Management, and Jira Cloud customers in certain regions. We will provide more details as soon as we have them.

This incident affects: Jira Service Management Web, Service Portal, Opsgenie Incident Flow, Opsgenie Alert Flow, Jira Service Management Email Requests, Authentication and User Management, Purchasing & Licensing, Signup, Automation for Jira, and Assist.

https://jira-service-management.status.atlassian.com/incidents/4s58pz6sk3zj

Posted: March 20, 2026

Unscheduled Anvil Outage

Published

Infrastructure News Type: Degraded

Affected Infrastructure: anvil.purdue.access-ci.org, anvil-gpu.purdue.access-ci.org

Start Date: January 21, 2025, 7:30 p.m.

End Date: January 21, 2025, 8:57 p.m.

Update: As of Tuesday, January 21st, 2025 at 3:57pm EST, this has been resolved and capacity has been restored.

The Anvil cluster began experiencing issues with electrical power around 2:30 PM EST. RCAC engineers are working with Purdue electricians to safely restore power. Anvil is operating at reduced capacity because a handful of nodes were shut down as a precaution. If your jobs were running on these nodes, please resubmit them.

If you have any questions, please submit a ticket through ACCESS Help Desk at https://support.access-ci.org/help-ticket. We will provide an update by 5:00 PM.

Posted: March 20, 2026

Anvil Cluster Open OnDemand Maintenance - January 17, 2025

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: anvil.purdue.access-ci.org, anvil-gpu.purdue.access-ci.org

Start Date: January 17, 2025, 2:00 p.m.

End Date: January 17, 2025, 6:00 p.m.

Update: As of 12:00pm EDT, Jan 17, the Anvil team has completed maintenance and returned the Open OnDemand service on the Anvil cluster to normal service. Please enjoy the new features on this dashboard and let us know if you notice any bugs or want more features by submitting a ticket through ACCESS Help Desk at https://support.access-ci.org/help-ticket.

Update: The maintenance has been postponed to Friday, January 17, 2025.

The Open OnDemand service for Anvil will be unavailable from Friday, January 17, 2025 at 9:00am EDT to Friday, January 17, 2025 at 5:00pm EDT. During the maintenance, the Anvil team will reconfigure the Open OnDemand dashboard for Anvil, which includes a brand new design of the dashboard with the new features listed below.

What’s new on the dashboard?

- Service Unit Balance and Usage: Monitor your allocation usage and remaining balance on Anvil.
- Disk Usage: Monitor your storage utilization across Anvil's file systems.
- Job Queue: View and manage your running and queued jobs on Anvil.
- News Feed: Stay updated with the latest Anvil news and announcements.
- Partition Status: Monitor the current state of partitions/queues on Anvil.
- My Jobs Page: Re-designed page to show detailed job information for your jobs and jobs in your allocation(s), as well as job management.
- Performance Metrics Page: Analyze your job performance and resource utilization patterns over time.

What will impact you?

- All Slurm jobs on Anvil (including jobs that have already been submitted through Open OnDemand before this maintenance) will continue and will NOT be impacted.
- All functions, including login to Open OnDemand, will be unavailable during the maintenance.

The Anvil Open OnDemand service will return to full production by Friday, January 17, 2025 at 5:00pm EDT. Please submit a ticket through ACCESS Help Desk at https://support.access-ci.org/help-ticket if you have any questions or suggestions.

Posted: March 20, 2026

ACES Maintenance - January 16

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: aces.tamu.access-ci.org

Start Date: January 16, 2025, 3:00 p.m.

End Date: January 17, 2025, 2:00 a.m.

The ACES cluster will be unavailable during maintenance from 9am to 8pm CST on Thursday, January 16. A reservation is in place to prevent jobs from running past the start time of the maintenance period. After the maintenance has been completed, the maximum permitted time limit for jobs in the cpu queue will be reduced from 7 days to 3 days.

Posted: March 20, 2026

DeltaAI compute outage January 13

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: delta.ncsa.access-ci.org

Start Date: January 13, 2025, 2:00 p.m.

End Date: January 13, 2025, 6:00 p.m.

DeltaAI resource users,

On Monday, January 13th, the DeltaAI compute resource will be unavailable from 8AM to 5PM. During that time, all compute nodes will be offline and no jobs will run.

During the outage, software controlling the high-speed network will be upgraded and modified to allow the addition of 18 new nodes. There will be no changes to the compute image or other user space software.

The DeltaAI logins, storage, scheduler, and Open OnDemand systems will all remain available throughout the outage. Jobs can be submitted during the outage but will not run until the outage ends.

The DeltaAI team

Posted: March 20, 2026

Hive Gateway Retirement

Published

Infrastructure News Type: Retirement

Affected Infrastructure: hive.gatech.access-ci.org

Start Date: January 13, 2025, 6:00 a.m.

End Date: Not Specified

Dear ACCESS Community,

The Hive Gateway resource at Georgia Tech will enter retirement on January 13th, 2025. The original hardware has reached end-of-life after an additional year of performance post-award and can no longer be supported. The Gateway will remain accessible for existing users, in order to access any data on the system, but it will be impossible to run new jobs. We currently plan to fully turn off Gateway access for ACCESS accounts by March 2025.

Please feel free to reach out if you have any concerns via email at pace-support@oit.gatech.edu.

Best,
The PACE Team

Posted: March 20, 2026

ACCESS XDMoD Unplanned Partial Outage

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: xdmod.access-ci.org

Start Date: January 9, 2025, 6:00 p.m.

End Date: January 10, 2025, 11:00 p.m.

UPDATE 01/10 - The service has been fully restored at this time.

There is an unplanned partial outage for ACCESS XDMoD from approximately 12:00 EDT on Thursday, January 9th until 17:00 EDT on Friday, January 10th. This partial outage impacts viewing job performance data in the Single Job Viewer. An update will be posted once the issue is resolved.

Posted: March 20, 2026

Jetstream2 Planned Outage: January 6–9, 2025

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: jetstream2.indiana.access-ci.org, jetstream2-gpu.indiana.access-ci.org, jetstream2-lm.indiana.access-ci.org, jetstream2-storage.indiana.access-ci.org

Start Date: January 6, 2025, 3:00 p.m.

End Date: January 9, 2025, 3:00 p.m.

On Monday, January 6, 2025 at 9AM EST, Jetstream2 will begin a maintenance outage that will last through approximately 9AM EST on Thursday, January 9 (subject to change). This infrastructure maintenance is being done in conjunction with a third-party vendor to update Jetstream2’s cooling system in order to accommodate a resource expansion.

This maintenance outage will affect all primary Jetstream2 resources (CPU, GPU, Large Memory, and Storage) and user interfaces (Exosphere, CACAO). While existing instances at satellite regions will not be affected by the maintenance, the Exosphere and CACAO user interfaces will be inaccessible for all regions.

When maintenance is complete, all instances will be returned to either a shelved or active state. If your instance was in an active, unshelved state, it will be brought back to that state. If it was shelved, it will remain shelved. NOTE: If your instance was in an errored, shutoff, or suspended state, it will be restored to an active state.

We strongly advise all Jetstream2 users to review the states of their existing instances, as well as save and close their work prior to January 6. You can preserve your work by:

- Safely shelving any active instances
- Backing up essential data outside of Jetstream2 or creating images (https://docs.jetstream-cloud.org/general/instancemgt/#image) of your instances

During the outage, please refer to the Jetstream2 status page (https://jetstream.status.io/) for the most up-to-date information.

We appreciate your understanding and hope to mitigate any inconvenience this might cause. If you have any questions, please contact the Jetstream2 Support team at help@jetstream-cloud.org.
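The instance-state rules in the Jetstream2 announcement above can be summarized as a small lookup. This is only an illustrative sketch: the function name and the lowercase state strings are assumptions drawn from the wording of the announcement, not from the Exosphere, CACAO, or OpenStack APIs.

```python
# Hypothetical helper: state names mirror the announcement text only,
# not identifiers from any Jetstream2 or OpenStack API.
RESTORED_TO_ACTIVE = {"active", "errored", "shutoff", "suspended"}


def expected_state_after_maintenance(state_before: str) -> str:
    """Return the state the announcement says an instance will be in afterwards."""
    state = state_before.strip().lower()
    if state == "shelved":
        # Shelved instances remain shelved.
        return "shelved"
    if state in RESTORED_TO_ACTIVE:
        # Active instances return to active; errored, shutoff, and
        # suspended instances are also restored to active.
        return "active"
    raise ValueError(f"State not covered by the announcement: {state_before!r}")
```

Users who would rather not have a shutoff or suspended instance come back as active may prefer to shelve it before the outage, since shelved is the only state described as being preserved unchanged.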

Posted: March 20, 2026

Update idp.access-ci.org

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: identity.access-ci.org

Start Date: January 2, 2025, 7:00 p.m.

End Date: January 2, 2025, 7:30 p.m.

On January 2, 2025, the ACCESS CI Identity Provider (https://idp.access-ci.org/idp/) will be updated to the latest Tomcat version. No downtime is expected.

Posted: March 20, 2026

Update Terms and Conditions for ACCESS User Registry

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: registry.access-ci.org

Start Date: January 2, 2025, 2:00 p.m.

End Date: January 2, 2025, 3:00 p.m.

On January 2, 2025, a new "Terms and Conditions" document (a.k.a., Acceptable Use Policy (https://access-ci.org/acceptable-use/)) will be configured for the ACCESS User Registry (https://registry.access-ci.org/).

The ACCESS CI Acceptable Use Policy (AUP) requires users to re-acknowledge responsibilities regarding use of ACCESS systems and services every 12 months. Configuring a new Terms and Conditions document in the ACCESS User Registry essentially resets the 12-month clock so that the next time a user logs in to https://registry.access-ci.org/, they will be prompted to read and agree to the AUP.

Enforcement of the 12-month AUP agreement has not yet been put in place, so users should not notice any change in behavior for other ACCESS websites or HPC resources. However, users will begin to receive email notifications that they need to agree to the updated Terms and Conditions document. At some point in the future, enforcement of the 12-month AUP agreement will be put in place, requiring users to visit https://registry.access-ci.org/ and agree to the new Terms and Conditions document.

No downtime is expected for this configuration update.

Posted: March 20, 2026

DeltaAI Emergency Outage

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: delta.ncsa.access-ci.org

Start Date: December 27, 2024, 12:00 a.m.

End Date: December 27, 2024, 9:00 p.m.

DeltaAI Resource Users:

A campus-wide facilities event triggered a partial power outage at our data center. To safely restore full power to the system, we must shut down DeltaAI now. We hope DeltaAI will be back online sometime tomorrow, but we will send an update in the morning when we know more.

The DeltaAI Team

Posted: March 20, 2026

Delta Emergency Outage

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: delta-cpu.ncsa.access-ci.org, delta-gpu.ncsa.access-ci.org

Start Date: December 27, 2024, 12:00 a.m.

End Date: December 27, 2024, 9:00 p.m.

Delta Resource Users:

A campus-wide facilities event triggered a partial power outage at our data center. To safely restore full power to the system, we must shut down Delta now. We hope Delta will be back online sometime tomorrow, but we will send an update in the morning when we know more.

The Delta Team

Posted: March 20, 2026

Expanse Lustre filesystem issues [Resolved]

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org, expanse-ps.sdsc.access-ci.org

Start Date: December 18, 2024, 10:30 a.m.

End Date: December 18, 2024, 7:00 p.m.

Update: The Lustre OSS that was having problems has been fixed and returned to service, and the filesystem is now accessible on Expanse.

Dear Expanse User,

We are currently seeing issues with the Expanse Lustre filesystem. This is leading to very slow responses or timeouts on access. We will post an update once the problem is resolved.

SDSC User Services Staff

Posted: March 20, 2026

Update "Reply-to" address for email from ACCESS User Registry

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: registry.access-ci.org

Start Date: December 3, 2024, 2:00 p.m.

End Date: December 3, 2024, 2:00 p.m.

On December 3, 2024, the "Reply-to" mail header for email sent from the ACCESS User Registry (https://registry.access-ci.org/) will be changed from "registry@cilogon.org" to "support@access-ci.atlassian.net". This enables users who have problems with their new ACCESS IDs to easily open an ACCESS Help Ticket (instead of contacting CILogon support staff). Note that this only affects the "Reply-to" mail header. The "From" mail header will remain "registry@cilogon.org" so as not to affect any existing email filters.
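The difference between the two headers can be illustrated with Python's standard `email` library. This is a minimal sketch, not the Registry's actual mail configuration: the function name, recipient, subject, and body are hypothetical, and only the two addresses come from the announcement.

```python
from email.message import EmailMessage


def build_registry_message(recipient: str) -> EmailMessage:
    """Build an example message with the headers described in the announcement."""
    msg = EmailMessage()
    # "From" is unchanged, so existing email filters keep matching.
    msg["From"] = "registry@cilogon.org"
    # "Reply-To" is where a mail client addresses a reply.
    msg["Reply-To"] = "support@access-ci.atlassian.net"
    msg["To"] = recipient
    msg["Subject"] = "Example notification"  # hypothetical subject
    msg.set_content("Example body.")  # hypothetical body
    return msg


message = build_registry_message("user@example.org")
print(message["From"])
print(message["Reply-To"])
```

Mail clients generally address a reply to the "Reply-To" value when it is present, which is why changing only that header redirects user replies without altering the visible sender.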

Posted: March 20, 2026

Expanse racks impacted by power maintenance

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org

Start Date: November 25, 2024, 4:30 p.m.

End Date: November 26, 2024, 1:00 a.m.

Dear Expanse User,

Maintenance is ongoing to update part of the datacenter power infrastructure. This has impacted more Expanse racks than we originally anticipated (due to cooling considerations), and as a result some of the jobs running on Expanse were impacted. The jobs will get a NODE_FAIL error in Slurm, so they will not be charged SUs.

We will post an update once the maintenance is complete and the nodes are returned to service. In the interim, Expanse will have fewer available nodes than normal, so wait times will likely increase today.

We are sorry for the unexpected impact. Please send us a ticket (through either the ACCESS or SDSC ticketing system) if you have any questions.

Thanks,
SDSC User Services Staff

Posted: March 20, 2026

Discounted Exchange rate for ACCESS credits on Anvil CPU

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: anvil.purdue.access-ci.org

Start Date: November 19, 2024, 6:00 a.m.

End Date: March 31, 2025, 5:00 a.m.

Anvil CPU is now offering a discounted exchange rate for researchers with Explore, Discover, and Accelerate allocations. The revised exchange rate is now 1 ACCESS credit = 1 Anvil CPU service unit (core hour). This new rate will be applicable for all exchanges to Anvil CPU starting 11/19/2024. Please submit a ticket through ACCESS Help Desk at https://support.access-ci.org/help-ticket if you have any questions.

Posted: March 20, 2026