Computer Security and the Grid
… or how I learned to stop worrying and love The Grid.
Dane Skow
Fermilab Computer Security Awareness Day
8 March 2005
Grid, grid(s), and more grids
• No commonly accepted definition of a grid.
  – Grid: New function parallel to Web noosphere
  – grid: Particular instance of an internally consistent grid.
• Parallels Internet being network of networks
• FNAL participates in Grid at many different levels
  – Open Science Grid (OSG)
  – LHC Computing Grid (LCG)
  – Particle Physics Data Grid (PPDG)
  – FermiGrid
  – SAMGrid
• Resources may participate in multiple grids
Why allow Grids?
• To get the work done
• Better integration of host lab and remote resources
• Facilitate resource sharing among larger communities
• Establish common standards for use of services
• There’s nothing in the Grid that users can’t do (and aren’t already doing) independently.
• Faith that facilitating this leads to innovation and improvement
Four Pillars of (Grid) Security
• Identity (DN or public key)
  – Isolation
  – Traceability
• Authentication (TLS handshake)
  – Prevent Identity Theft
• Authorization (gridmapfile or Globus+OGSA-AuthZ+Services)
  – Access Control
  – Resource Control
• Audit (logfiles)
  – Troubleshooting
  – Forensics
  – Accounting
Identity
• Needs globally unique name
  – Cert (or DN) is managed namespace
    • /etc/grid-security/certificates lists trusted CAs
    • /etc/grid-security/xxxxxxxx.signing_policy files are the tool for this
  – Public keys statistically unique
• Granularity of identity under debate
  – Process level?
  – Human vs. agent moderated?
Authentication
• OpenSSL is your friend
• Debate over scope of authentication
  – Session level
  – 0.1 Msec ≈ 1 day
  – 1 Msec ≈ AFS token lifetime
  – 10-100 Msec ≈ account revalidation lifetime
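The lifetimes above are quoted in megaseconds (1 Msec = 10^6 seconds). As a rough aid to reading that scale, the conversion to days can be sketched as follows; the function name is purely illustrative and not part of any grid toolkit.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def msec_to_days(megaseconds: float) -> float:
    """Convert megaseconds (10**6 seconds) to days."""
    return megaseconds * 1_000_000 / SECONDS_PER_DAY


if __name__ == "__main__":
    # The scopes listed on the slide: 0.1, 1, and 10-100 Msec.
    for msec in (0.1, 1, 10, 100):
        print(f"{msec:>5} Msec = {msec_to_days(msec):8.1f} days")
```

So 0.1 Msec is a little over one day, 1 Msec is about a week and a half, and 10-100 Msec spans roughly four months to three years.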
• Privacy is serious concern
Authorization
• Gridmapfile is the default mechanism, at /etc/grid-security/grid-mapfile
• Authorization callouts are configured in /etc/grid-security/gsi-authz.conf
• Expression and interpretation of policy by 3rd parties is TBD
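To make the gridmapfile concrete: each entry pairs a quoted certificate DN with one or more local account names. The sketch below parses that layout, assuming double-quoted DNs, comma-separated account names, and "#" comment lines. The sample DNs and accounts are invented for illustration, and this is not the parser Globus itself uses.

```python
import shlex


def parse_gridmap(text: str) -> dict[str, list[str]]:
    """Map each certificate DN to the local account names listed for it."""
    mapping: dict[str, list[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            continue  # skip entries that do not fit the assumed layout
        dn, accounts = parts
        mapping[dn] = accounts.split(",")
    return mapping


# Hypothetical entries, for illustration only.
SAMPLE = '''
# DN                                                  local account(s)
"/DC=org/DC=example/OU=People/CN=Jane Doe 12345" jdoe
"/DC=org/DC=example/OU=Services/CN=host/node.example.org" griduser,backup
'''

if __name__ == "__main__":
    for dn, accounts in parse_gridmap(SAMPLE).items():
        print(dn, "->", accounts)
```

A listing like this is often the quickest way to answer "which local account did this certificate run as?" during an investigation.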
Audit
• No common standard or requirement yet.
• Rely on local expertise and experience to guide
• Not clear what tools are useful/needed
Putting it all together
[Diagram. Labels: VOMS Attribute Server (push: user, VO); AuthZ Provisioning (Site, e.g. GUMS; Resource); AuthZ Access Control (VO, e.g. RB; Site, e.g. SAZ; Resource); 3rd party authN; Grid Resource; Management of ACLs; ACLs on Resource Object; AuthZ Policy Engine]
Incident Response
• Identify
  – Requester identity (full cert preferred)
  – Requesting IP address
  – Requesting identity (full cert preferred)
• Contain
  – Local action first priority
  – Now frequently requires coordinated action
  – FCSC Alert grid incident response channel
• Explain
  – Need to identify thresholds of investigation (wipe vs. investigate decision)
• Respond
  – Authorization is “big stick”, not network directly
Vulnerability Assessment
• Troubles of new software/ideas
  – Exploit of software vulnerabilities
  – Configuration errors
  – Logic errors
• All the same old stuff only more so
  – Inventory attacks (worms)
  – Broad authentication cells
    • Explicit, not just shared/sniffed passwords
  – Trojan applications and system software
Forensics
• New services
  – Gatekeeper is port 2119 TCP
  – GSIftp (aka Gridftp) is port 2811 TCP
  – Various monitoring/directory services
• Grid Service logs
  – GLOBUS_LOCATION default is /usr/local/grid/globus
  – Gatekeeper log is at $GLOBUS_LOCATION/var/globus-gatekeeper.log
  – GridFTP log is at $GLOBUS_LOCATION/var/gridftp.log
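As a starting point for an investigation, the log locations above can be resolved from GLOBUS_LOCATION and searched for a string of interest, such as a certificate DN or an IP address. This is a minimal sketch: the helper names are hypothetical, and it assumes nothing about the log line format beyond plain text.

```python
import os
from pathlib import Path

# Default install location quoted on the slide.
DEFAULT_GLOBUS_LOCATION = "/usr/local/grid/globus"


def grid_log_paths(env=None):
    """Return the gatekeeper and GridFTP log paths under GLOBUS_LOCATION."""
    env = os.environ if env is None else env
    base = Path(env.get("GLOBUS_LOCATION", DEFAULT_GLOBUS_LOCATION))
    return {
        "gatekeeper": base / "var" / "globus-gatekeeper.log",
        "gridftp": base / "var" / "gridftp.log",
    }


def lines_mentioning(path, needle):
    """Yield (line_number, line) for each line of the file containing needle."""
    with open(path, errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if needle in line:
                yield number, line.rstrip("\n")


if __name__ == "__main__":
    suspect = "/DC=org/DC=example/OU=People/CN=Jane Doe 12345"  # hypothetical DN
    for name, path in grid_log_paths().items():
        if path.exists():
            for number, line in lines_mentioning(path, suspect):
                print(f"{name}:{number}: {line}")
```

Plain substring matching is deliberately crude; it avoids guessing at a log format that varies between toolkit versions.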
Authorization Services
• SAZ at FNAL
  – Site AuthoriZation service
  – Provides single point access control
  – Offloads CRL maintenance from servers
• GUMS identity mapping
  – Grid Identity is X509 SubjectName and Issuer
  – Local Identity is uid (resource scope)
  – Kerberos/AFS not mapped (site scope)
• Local Service authorization may do provisioning
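The identity mapping described above can be pictured as a lookup keyed on the (SubjectName, Issuer) pair and scoped to a resource, returning a local uid. The sketch below illustrates that data structure only. The class and method names are hypothetical, and it does not reflect the actual GUMS interface or storage.

```python
from typing import NamedTuple, Optional


class GridIdentity(NamedTuple):
    """A grid identity: X509 subject name plus the issuing CA's name."""
    subject: str
    issuer: str


class IdentityMapper:
    """Toy per-resource mapping from grid identity to local uid."""

    def __init__(self) -> None:
        self._table: dict[tuple[str, GridIdentity], str] = {}

    def add(self, resource: str, identity: GridIdentity, uid: str) -> None:
        self._table[(resource, identity)] = uid

    def lookup(self, resource: str, identity: GridIdentity) -> Optional[str]:
        # None means "no mapping", which a caller would treat as a denial.
        return self._table.get((resource, identity))


if __name__ == "__main__":
    # Hypothetical names throughout.
    jane = GridIdentity(
        subject="/DC=org/DC=example/OU=People/CN=Jane Doe 12345",
        issuer="/DC=org/DC=example/CN=Example CA",
    )
    mapper = IdentityMapper()
    mapper.add("cluster-a", jane, "jdoe")
    print(mapper.lookup("cluster-a", jane))
    print(mapper.lookup("cluster-b", jane))
```

Keying on the issuer as well as the subject means the same subject string signed by a different CA is treated as a different identity.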
Forensics (cont’d)
• System tools
  – Same as before
• Grid tools
  – Non-existent
  – Not clear what can/should be automated
  – Need to involve VO in most investigations
Roles and Responsibilities
• FCIRT continues as FNAL Incident Response Team
• Site Autonomy
  – Focus on local defense first
  – Second priority contain damage
• Now include consideration of Grid partners in assessment
• Coordinate with others using OSG Incident Response Plan
  – Currently in effect for OSG
  – Adopted as the new model for LCG; migration in progress
Roles and Responsibilities (cont’d)
• Notification
  – Notification infrastructure to support ad hoc and best-effort collaboration on incident response.
  – Issues:
    • Skill level and tolerance vary widely
    • Reasonable response expectations need to be developed
  – Likely that some contention will occur while level-setting is achieved.
Recommended Reading
• SSL and TLS Essentials, Stephen Thomas
• OSG Incident Response Plan
  – http://computing.fnal.gov/docdb/osg_documents/0000/000019/002/OSG_incident_handling_v1.0.pdf
• LCG Security Policy Documents
  – http://proj-lcg-security.web.cern.ch/proj-lcg-security/documents.html
• Globus Documentation
  – http://www-unix.globus.org/toolkit/docs/3.2/index.html