Securing HPC using Federated Authentication
Federated authentication can drastically reduce the overhead of basic account maintenance while simultaneously improving overall system security. Integrating with the user’s more frequently used account at their primary organization both provides a better experience to the end user and makes account compromise or changes in affiliation more likely to be noticed and acted upon. Additionally, with many organizations transitioning to multi-factor authentication for all account access, the ability to leverage external federated identity management systems provides the benefit of their efforts without the additional overhead of separately implementing a distinct multi-factor authentication process. This paper describes our experiences and the lessons we learned by enabling federated authentication with the U.S. Government PKI and InCommon Federation, scaling it up to the user base of a production HPC system, and the motivations behind those choices. We have received only positive feedback from our users.
Research Summary
This paper presents a practical deployment of federated authentication for a large‑scale high‑performance computing (HPC) environment at the MIT Lincoln Laboratory Supercomputing Center (LLSC). The authors describe how they integrated two major federated identity ecosystems—the academic‑focused InCommon Federation and the U.S. Government Public Key Infrastructure (PKI)—into the MIT SuperCloud Portal, thereby reducing administrative overhead, improving user experience, and strengthening security through existing multi‑factor authentication (MFA) mechanisms.
The motivation stems from managing over a thousand users spread across MIT, the Massachusetts Green High‑Performance Computing Center (MGHPCC) member institutions, and external collaborators. Traditional local account management required frequent password resets, identity verification, and manual group assignments, consuming a large fraction of staff time and exposing the system to delayed detection of compromised or orphaned accounts. By delegating authentication to users’ home institutions, the LLSC could offload these routine tasks while gaining automatic lifecycle management (e.g., account termination when the home institution disables the user).
The InCommon Federation provides SAML 2.0‑based single sign‑on (SSO) for more than 1,000 higher‑education and research organizations. MIT’s Touchstone identity provider, a member of InCommon, enforces Duo‑based MFA. Many other InCommon IdPs also require or support MFA, allowing the HPC platform to inherit strong authentication without additional infrastructure. The U.S. Government PKI comprises three programs—DoD (Common Access Card), Federal Common Policy (PIV cards), and the Federal Bridge for commercial participants. These programs issue X.509 certificates with policy OIDs that encode the Level of Assurance (LoA). Cross‑certificates enable mutual trust among the programs, creating a web of trust that encompasses over 5.4 million smart‑card holders.
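The idea that a certificate's policy OIDs encode its Level of Assurance can be illustrated with a small lookup. This is a minimal sketch, not the authors' implementation: the OIDs use the 2.999 example arc and are placeholders rather than real DoD or Federal PKI policy identifiers, and the numeric levels, the `POLICY_LOA` table, and both function names are assumptions made for illustration.

```python
# Hypothetical policy table. The OIDs below sit under the 2.999 example arc and
# are placeholders, NOT real DoD or Federal PKI policy identifiers. The numeric
# assurance levels are likewise assumed for illustration.
POLICY_LOA = {
    "2.999.1.1": 2,  # placeholder: software-certificate policy
    "2.999.1.2": 3,  # placeholder: hardware-token policy
    "2.999.1.3": 4,  # placeholder: hardware token with in-person proofing
}


def assurance_level(policy_oids):
    """Return the highest LoA among recognized policy OIDs, or 0 if none match."""
    return max((POLICY_LOA.get(oid, 0) for oid in policy_oids), default=0)


def is_acceptable(policy_oids, minimum_loa=3):
    """Accept a certificate only if some asserted policy meets the minimum LoA."""
    return assurance_level(policy_oids) >= minimum_loa
```

A site would pass in the OIDs read from the certificate's certificate-policies extension. Unknown OIDs map to level 0, so an unrecognized policy never grants access by accident.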
Implementation was built around the MIT SuperCloud Portal, an Apache httpd‑based web interface that already impersonates the authenticated user for all OS calls via a custom multi‑processing module (MPM). To support federated authentication, the authors replaced the original HTTP Basic authentication (which is stateless and unsuitable for token‑based flows) with a form‑based login backed by cookie‑based session tracking, while still using Linux PAM for the initial credential check. They adopted the open‑source SimpleSAMLphp library because the portal’s codebase is PHP‑centric and SimpleSAMLphp offers straightforward metadata handling, automatic federation metadata refresh, and flexible attribute extraction.
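The move from stateless HTTP Basic authentication to cookie-based session tracking can be sketched as follows. The summary does not describe the portal's cookie format, so everything here is an assumption: the `username|expiry|signature` layout, the HMAC-SHA256 signature, the one-hour lifetime, and the function names are illustrative only. The actual portal is PHP-based; Python is used here purely for illustration.

```python
import hashlib
import hmac
import secrets
import time

# Assumption: a per-process random key. A real portal would persist and rotate it.
SECRET_KEY = secrets.token_bytes(32)
SESSION_LIFETIME = 3600  # assumption: one-hour sessions


def _sign(payload):
    """Return the hex HMAC-SHA256 signature of the payload string."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_cookie(username, now=None):
    """Build a cookie value 'username|expiry|signature' after a successful login."""
    issued = int(now if now is not None else time.time())
    payload = f"{username}|{issued + SESSION_LIFETIME}"
    return f"{payload}|{_sign(payload)}"


def verify_cookie(cookie, now=None):
    """Return the username if the cookie is authentic and unexpired, else None."""
    try:
        username, expiry, signature = cookie.rsplit("|", 2)
        expiry_ts = int(expiry)
    except ValueError:
        return None  # malformed cookie
    expected = _sign(f"{username}|{expiry}")
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    current = now if now is not None else time.time()
    return username if current < expiry_ts else None
```

Once the initial credential check succeeds (via PAM or a federated assertion), the server issues the cookie and later requests are authorized by verifying it, so the credential itself is not resent on every request.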
During integration, the portal generates its own SAML Service Provider (SP) metadata (entity ID, endpoints, X.509 certificate) and registers it with InCommon. In the other direction, the portal periodically refreshes the identity providers' metadata from the signed InCommon metadata feed. When a user selects an IdP, the browser is redirected to the IdP's SSO service; after successful authentication, the IdP posts a signed SAML assertion containing attributes such as eduPersonPrincipalName. The portal extracts this attribute and looks up a matching local Unix account. For MIT users the attribute matches the email address, but many partner institutions provide opaque identifiers. In those cases, the authors employ a manual "failed-login" workflow: the failed attempt is logged, an administrator reviews the log, extracts the identifier, and populates the ts_principal field of the user record. This workaround addresses privacy-driven attribute suppression by some IdPs.
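The lookup and the manual "failed-login" fallback might look roughly like the sketch below. Only the `eduPersonPrincipalName` attribute and the `ts_principal` field name come from the text. The user records, the logger name, and the assumption that attributes arrive as a mapping from attribute name to a list of values are illustrative, and the attribute may appear under a different name depending on the attribute map in use.

```python
import logging

log = logging.getLogger("portal.federated_login")  # hypothetical logger name

# Hypothetical user records; only the ts_principal field name comes from the text.
USERS = [
    {"unix_account": "alice", "ts_principal": "alice@mit.edu"},
    {"unix_account": "bob", "ts_principal": None},  # not yet linked by an administrator
]


def resolve_account(attributes, users=USERS):
    """Map an assertion's eduPersonPrincipalName to a local Unix account name."""
    values = attributes.get("eduPersonPrincipalName") or []
    if not values:
        log.warning("federated login failed: IdP released no eduPersonPrincipalName")
        return None
    eppn = values[0]
    for user in users:
        if user["ts_principal"] == eppn:
            return user["unix_account"]
    # Record the identifier so an administrator can later copy it into ts_principal.
    log.warning("federated login failed: no account linked to %r", eppn)
    return None
```

For example, `resolve_account({"eduPersonPrincipalName": ["alice@mit.edu"]})` returns `"alice"`, while an unknown opaque identifier returns `None` and leaves a log entry for the administrator to review.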
Beyond web access, the authors extended the portal to support SSH key registration using the PKCS#11 interface. Smart‑card public keys can be extracted and stored as authorized_keys entries, enabling MFA‑protected SSH sessions via OpenSSH (v5.4+) or PuTTY‑CAC. The portal’s MPM ensures that any action performed through the web UI runs with the same privileges as a direct SSH session, preserving filesystem permission semantics and simplifying auditing.
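Registering a smart-card public key as an authorized_keys entry can be sketched as below. This assumes the key has already been exported in OpenSSH's one-line format, for instance with `ssh-keygen -D` pointed at a PKCS#11 provider library. The function names, the default comment, and the duplicate check are assumptions, not the portal's actual code.

```python
import base64
from pathlib import Path


def normalize_public_key(line):
    """Validate an OpenSSH public key line and return 'type base64' without the comment."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    key_type, blob = parts[0], parts[1]
    raw = base64.b64decode(blob, validate=True)  # raises ValueError on bad base64
    # The decoded blob starts with a 4-byte big-endian length and the key type name.
    name_len = int.from_bytes(raw[:4], "big")
    if raw[4:4 + name_len] != key_type.encode():
        raise ValueError("key type does not match the encoded blob")
    return f"{key_type} {blob}"


def register_key(authorized_keys, line, comment="smartcard"):
    """Append the key to the authorized_keys file unless it is already present."""
    key = normalize_public_key(line)
    path = Path(authorized_keys)
    existing = path.read_text().splitlines() if path.exists() else []
    # Compare type and blob only. Entries with leading options are not recognized here.
    if any(entry.split()[:2] == key.split() for entry in existing):
        return False
    with path.open("a") as handle:
        handle.write(f"{key} {comment}\n")
    return True
```

Only the public key is stored on the server. The private key stays on the card, so each SSH login still requires the card and its PIN.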
Security benefits observed include: (1) automatic de‑provisioning when the home institution disables the user, (2) inheritance of institutional MFA (Duo, CAC, PIV), eliminating the need for a separate HPC‑specific MFA rollout, (3) unified logging of web and SSH activities under the same user identity, and (4) reduction of password‑related attack surface. Operational challenges involved handling SAML metadata synchronization, dealing with IdPs that omit requested attributes, and establishing a reliable manual onboarding process for non‑standard identifiers. The authors mitigated these issues through automated metadata refresh, clear error‑logging, and a documented admin workflow.
The deployment scaled to the user base of a production HPC system. It can accept credentials from the more than 5.4 million government smart-card holders and from users at InCommon member organizations, and the authors report only positive feedback from users. The paper concludes by outlining future work: fully automated attribute mapping, policy-driven role-based access control within the HPC environment, and leveraging federated authentication logs for real-time threat detection. The authors demonstrate that federated authentication, when thoughtfully integrated, can streamline HPC user management while strengthening the overall security posture.