GridPP3 Storage Perspective, Achievements, Challenges Jens Jensen, STFC RAL GridPP20 TCD Dublin, 11-12 March 2008

Mar 28, 2015

Page 1

GridPP3 Storage Perspective, Achievements, Challenges

Jens Jensen, STFC RAL

GridPP20

TCD Dublin, 11-12 March 2008

Page 2

“Bear with me for a moment”

• View of the past
  – Achievements
  – Lessons learned
• Present
  – SRM 2 deployment
• Future
  – Todo
  – Really high level stuff

Page 3

Who we are…

• GridPP storage community
• As defined by mailing list, has ~55 members
  – Covers every UK site
  – Also in .ie, .nl, .ca, .pl, .it, .de
• However, not all are equally active…
  – But that’s OK
  – Isn’t it?

Page 4

Support

[Diagram labels: Developers; Dev support; Depl. support; GridPP support; community support; (local) users]

Page 5

Support

[Diagram labels: Developers; Dev support; Depl. support; GridPP support; community support; (local) users]

1 person…

Page 6

Support

[Diagram labels: Developers; Dev support; Depl. support; GridPP support; community support; users]

Maybe reality is a little more complicated

Page 7

Your name appeared among the beneficiaries who will receive a part-payment of US$2.8 million and has been approved already for months. You are requested to get back to me for more direction and instruction on how to receive your fund. We want to hear from you before we can make the transfer

• Open for questions, goes to Greig and Jens
• Almost all spam
• Promising to solve our financial problems
• They tell us: “Storage, size matters”

[email protected]

Page 8

Status

Page 9

Status

Page 10

Status

• 2/3 of sites running DPM
  – Experimentally on Lustre
  – (Cambridge, UCL)
• 1/3 of sites running dCache
• Tier 1 running CASTOR
  – (and dCache)
• Bristol (Jon) running StoRM

Page 11

Status

• Finished CCRC 08
• Should have SRM2 deployed
  – At least for ATLAS (sites)
    • Need space token descriptions
    • Problems with space manager in dCache
  – And CMS (sites)
    • More static token descriptions initially
  – Information system secondary (tokens static)
    • Still required for accounting
• Many people worked hard to make it a success

Page 12

Experiences

• Went well, mostly
• SRM2 used at RAL
  – Few odd bugs and issues
  – E.g. “-0.00P” free
  – Negative file sizes (gridftp 32 bit issue?)
• Took time to get space token (descr) agreed
• Who speaks for expts?
• Using spaces at T2s
  – OK for DPMers
    • Needs firewall open
    • Endpoint published
    • Spaces set up
  – Harder for dCache
    • Problems with space mgr
    • But running on same port
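
The two oddities reported on the Experiences slide ("-0.00P" free and negative file sizes) look like familiar numeric pitfalls. The following is a minimal sketch, assuming (as the slide itself only guesses) that a size passed through a signed 32-bit field somewhere, and that free space was printed in petabytes with two decimals. The function names are illustrative and are not taken from any SRM or GridFTP implementation.

```python
def as_signed_32(n: int) -> int:
    """Interpret the low 32 bits of n as a signed two's-complement integer."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def format_petabytes(n_bytes: float) -> str:
    """Format a byte count in petabytes with two decimals (assumed format)."""
    return "%.2fP" % (n_bytes / 1e15)


if __name__ == "__main__":
    three_gib = 3 * 1024**3
    # A 3 GiB file no longer fits in a signed 32-bit size field.
    print(as_signed_32(three_gib))
    # A slightly negative free-space figure keeps its sign after rounding.
    print(format_petabytes(-1e9))
```

Any file of 2 GiB or more wraps to a negative number in this model, and any small negative free-space value rounds to zero but keeps its minus sign.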

Page 13

Lessons

• No way to get through to everyone
  – Needs some effort at sites (to do what we need)
  – Workshop at NeSC was a success
• Storage is more difficult than you'd think
  – Particularly the occasional peaks
  – Implementation specific optimisations
  – Locating the problem in complex implementations
• Need to manage risks more carefully
  – GridPP2: surprising number of risks happened!

Page 14

Risks... (dating back to Dec06-Feb07, needs revision)

Page 15

Special Achievements

• Beyond the call of duty
• Recognised internationally
• Or special benefits to users

Page 16

Information Systems

[Diagram labels: Information collected globally; Used for accounting; Users locate resources]

Page 17

Information Systems

• Much work done on information system backends in GridPP
  – GIP plugin easier
  – DPM (Graeme, then Greig)
  – dCache debug (owned by SARA then DESY)
  – CASTOR
    • Disk servers: Tier 1
    • CASTOR, LSF, tape robot: RAL Storage
    • Oracle databases: RAL DB group
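
To give a feel for what an information system backend publishes, here is a rough sketch of a provider that prints an LDIF fragment for one storage area. It assumes the GLUE 1.x attribute names `GlueSAStateAvailableSpace` and `GlueSAStateUsedSpace` and a resource-level DN suffix; both should be checked against the GLUE schema in use. The function name, the host name, and the numbers are hypothetical and do not come from any of the GridPP plugins mentioned on the slide.

```python
def ldif_for_storage_area(se_id: str, sa_id: str,
                          available_kb: int, used_kb: int) -> str:
    """Render one storage-area entry as LDIF text (illustrative only)."""
    lines = [
        # DN layout is an assumption; real deployments may differ.
        f"dn: GlueSALocalID={sa_id},GlueSEUniqueID={se_id},"
        "mds-vo-name=resource,o=grid",
        f"GlueSAStateAvailableSpace: {available_kb}",
        f"GlueSAStateUsedSpace: {used_kb}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical storage element and values.
    print(ldif_for_storage_area("se.example.org", "atlas", 5000000, 1200000))
```

A real provider would obtain the two numbers from the storage system's own database or admin interface, which is where most of the backend work lies.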

Page 18

Special Achievements

• Accounting
  – Space “available” and “used”
  – Resource overview and selection
  – (or non-selection)
• Numerous subtle issues with space
• What is used? Available?
• Can info be relied on for selection?
• Subtle implementation issues
• Long propeller head discussions
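
The question "What is used? Available?" can be made concrete with a small sketch. The data model below is hypothetical (it does not correspond to DPM, dCache, or CASTOR internals) and only shows how two reasonable definitions give different answers for the same storage element.

```python
from dataclasses import dataclass


@dataclass
class SpaceReport:
    total: int     # bytes of disk installed
    files: int     # bytes occupied by stored files
    reserved: int  # bytes promised to reservations but not yet written
    offline: int   # bytes of empty capacity on pools that are currently down


def used_strict(r: SpaceReport) -> int:
    """Only bytes actually written count as used."""
    return r.files


def used_with_reservations(r: SpaceReport) -> int:
    """Reserved but unwritten space also counts as used."""
    return r.files + r.reserved


def available_optimistic(r: SpaceReport) -> int:
    """Everything not occupied by files is available."""
    return r.total - r.files


def available_conservative(r: SpaceReport) -> int:
    """Exclude reservations and capacity that cannot be reached right now."""
    return r.total - r.files - r.reserved - r.offline


if __name__ == "__main__":
    report = SpaceReport(total=100, files=40, reserved=25, offline=10)
    print(used_strict(report), used_with_reservations(report))
    print(available_optimistic(report), available_conservative(report))
```

Whether a client should select a storage element depends on which of these numbers is published, which is one reason the published values may not be safe to use for selection.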

Page 19

SRM/SRB interoperation using gLite

• Pretend SRB is a “Classic SE”
• Classic SE still supported by gLite FTS

[Diagram labels: FTS; LFC; SRB; SRM; GridFTP in front of disk storage (several instances); “SRM selects pool node…”]

Page 20

Achievements - FTS monitoring

Page 21

Achievements - standards

• SRM 2.2 is now an OGF standard
  – Collaboration between SRM developers
  – …and WLCG
  – New challenges ahead
• GLUE
  – Contributed to GLUE SE schema
  – 1.3, also some for 2.0

Page 22

What Keeps the Unreasonable (Wo)Man Awake at Night?

• CUS: Campaign for Usable Storage
• Fabric
• Staff...!!
• Coordination

Page 23

What is Usable Storage

• Users: “we want usable storage”
• Deployment: “storage is usable if it’s being used”
• Not necessarily…
• Identified (currently) 13 areas
  – Somewhat overlapping
  – But that is normal

Page 24

What is Usable Storage

• Robust
  – Doesn’t fall over
  – Measure uptime (for some definition of uptime)
• Good performance
  – Requests per second, concurrent users
  – Can be tested: DESY did this for dCache
  – Can be tested! (Dave Newbold for CASTOR, ScotGrid for DPM and dCache)
  – (Also tests the SRM itself)
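
Both measures named on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the test harness used by DESY, ScotGrid, or anyone else mentioned: uptime is taken to be the fraction of successful probes (one definition among many), and `fake_request` is a hypothetical stand-in for a real SRM call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def availability(probe_results):
    """Fraction of probes that succeeded; one possible definition of uptime."""
    results = list(probe_results)
    if not results:
        raise ValueError("no probe results")
    return sum(1 for ok in results if ok) / len(results)


def requests_per_second(request, n_requests, n_clients):
    """Call request() n_requests times from n_clients threads.

    Returns (rate, outcomes), where rate is completed calls per second.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        outcomes = list(pool.map(lambda _: request(), range(n_requests)))
    elapsed = time.perf_counter() - start
    return len(outcomes) / elapsed, outcomes


def fake_request():
    """Hypothetical stand-in for a real storage request."""
    time.sleep(0.001)
    return True


if __name__ == "__main__":
    print(availability([True, True, False, True]))
    rate, outcomes = requests_per_second(fake_request, 50, 5)
    print(f"{rate:.0f} requests/s, {availability(outcomes):.0%} succeeded")
```

Varying `n_clients` while keeping `n_requests` fixed gives a first impression of how the request rate changes with the number of concurrent users.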

Page 25

What is Usable Storage

• Good Overall Data Performance
  – Tests the data movers and networks
  – Experiments are good at this
  – Also 3rd party transfers, and to tape
  – Optimisations
• Ensures resource availability
  – Concurrent users (other experiments, same expt)
  – Ancient available/used metrics
  – Load balancing, dynamic alloc.
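
Load balancing across disk pools can be illustrated with a toy selection rule. The policy below (fewest active transfers first, then most free space) is an assumption chosen for simplicity; it is not the algorithm of DPM, dCache, or CASTOR, and the `Pool` type is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Pool:
    name: str
    free_bytes: int
    active_transfers: int


def select_pool(pools: Sequence[Pool], file_size: int) -> Optional[Pool]:
    """Pick a pool that can hold the file, preferring the least busy one."""
    candidates = [p for p in pools if p.free_bytes >= file_size]
    if not candidates:
        return None
    # Fewest active transfers wins; ties go to the pool with more free space.
    return min(candidates, key=lambda p: (p.active_transfers, -p.free_bytes))


if __name__ == "__main__":
    pools = [
        Pool("a", free_bytes=10, active_transfers=1),
        Pool("b", free_bytes=100, active_transfers=5),
        Pool("c", free_bytes=50, active_transfers=1),
    ]
    chosen = select_pool(pools, file_size=20)
    print(chosen.name if chosen else "no pool has enough space")
```

A rule like this depends on the free-space and load figures being current, which ties it back to the available/used metrics on the same slide.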

Page 26

What is Usable Storage

• Monitored. Accountable.
  – See when something goes wrong
  – Reliable accounting data
  – Minimise downtime
• Maintainable
  – Ease upgrade, installation and configuration
  – Minimise downtime
• Tested (prior to release)

Page 27

What is Usable Storage

• Standards compliant and interoperable
  – Provides SRM 2.2 / GLUE 1.3 / GridFTP
  – Extensive test suite available
• Secure
  – Access control, secure implementations
• Supported
  – Upstream: developers
• Publishing metadata in current schema
• Usable by applications (interfaces)

Page 28

Challenges

[Diagram labels: Services; Capabilities; Scale, Performance; Economy, Sustainability; Middleware; State of the Art; Users]

Page 29

Users

[Diagram labels: Applications; Culture, History; Customer mgmt; Usability]

Page 30

Services

[Diagram labels: Trust; Availability; Accounting; Discovery]

Page 31

State of the Art

[Diagram labels: Web Services; Virtualisation; Media]

Page 32

Middleware

[Diagram labels: Stability; Applications; Maintenance, Support; Ease of install and config]

Page 33

Scale, Performance

[Diagram labels: Staging; Transfer rates; Size of files; Number of files; Volume]

Page 34

Sustainability, Economy

[Diagram labels: Scale; Trust; Dynamic Agreement; Cost Model]

Page 35

Capabilities

[Diagram labels: Content; Access; Curation; SECURITY]

Page 36

Conclusion

• Lots of things achieved
• Lots of stuff to do
  – Somehow always harder than expected
  – Doesn’t asymptotically tend to zero
  – Plus there are regular peaks so it doesn’t even converge
• Storage is important! It should not be underestimated
• Good community to go forward into GridPP3