Hashing THEN AND NOW
MIKE SMORUL – ADAPT PROJECT
May 26, 2015
Commodity Storage Performance
2003: JetStor III IDE-FC, 62MB/s large block
2013: 218MB/s workstation SSD; Perc 6/MD1000, 400MB/s+
Chip Speed
2003: Pentium 4 3.2GHz
2013: Core i7 Extreme 3.5GHz
Hashing Performance
SHA-256 Hashing
Java: 85MB/s
Crypto++: 111-134MB/s
Real World Penalty
Java: 20-40% penalty on slow seek disk
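A rough way to get a comparable figure on current hardware is to time SHA-256 over an in-memory buffer, so no disk is involved. This is only a sketch: the function name `sha256_throughput` and its parameters are made up for illustration, and it is not the benchmark behind the Java and Crypto++ numbers above.

```python
import hashlib
import time


def sha256_throughput(total_mb=64, chunk_kb=1024):
    """Return in-memory SHA-256 throughput in MB/s (no disk I/O)."""
    chunk = b"\0" * (chunk_kb * 1024)        # one reusable buffer of zeros
    rounds = (total_mb * 1024) // chunk_kb   # number of chunks to feed
    digest = hashlib.sha256()
    start = time.perf_counter()
    for _ in range(rounds):
        digest.update(chunk)
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


if __name__ == "__main__":
    print(f"SHA-256: {sha256_throughput():.0f} MB/s")
```

Comparing this figure with the sequential read rate of the storage shows which side is the bottleneck on a given machine.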
Implications
Flipped bottlenecks: storage now outruns the digest
Parallelize Digesting
Independent IO and digest threads
Always have work for the digest algorithm.
Large files saw over 95% of algorithm potential.
Small files unchanged.
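The independent IO and digest threads can be sketched as a bounded producer/consumer queue: one thread only reads, the other only hashes, so the digest rarely waits on the disk. This is a minimal Python illustration, not the ADAPT implementation; `threaded_sha256`, `block_size`, and `depth` are assumed names and defaults.

```python
import hashlib
import queue
import threading


def threaded_sha256(path, block_size=1 << 20, depth=8):
    """Hash a file with a reader thread feeding a separate digest loop."""
    blocks = queue.Queue(maxsize=depth)  # bounded: reader cannot outrun memory
    errors = []

    def reader():
        try:
            with open(path, "rb") as f:
                while True:
                    block = f.read(block_size)
                    if not block:
                        break
                    blocks.put(block)    # blocks when the queue is full
        except OSError as exc:
            errors.append(exc)           # report read failures to the caller
        finally:
            blocks.put(None)             # sentinel: no more data

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    digest = hashlib.sha256()
    while True:
        block = blocks.get()
        if block is None:
            break
        digest.update(block)             # hashing overlaps with the next read
    thread.join()

    if errors:
        raise errors[0]
    return digest.hexdigest()
```

CPython's `hashlib` releases the global interpreter lock while hashing large buffers, so the read and the digest can overlap. For small files the per-file open and seek cost dominates, which matches the "small files unchanged" result.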
Securing Data in Motion
?
Integrity across the network
Internal Auditing: Prove your hardware
Peer-Auditing: Prove your friends
Digital Signatures: Prove identity
Token Based: Prove time
Chronopolis Integrity
Current:
Producer supplied authoritative manifest
Peers locally monitor integrity
Manually trace back to point of ingest
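Local monitoring against a producer-supplied manifest can be sketched as below. The manifest format is an assumption (one `<sha256-hex>  <relative path>` entry per line, as `sha256sum` writes it); the slides do not specify the Chronopolis format, and `audit` and `file_sha256` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_sha256(path, block_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def audit(manifest_path, root):
    """Return {relative_path: "ok" | "corrupt" | "missing"} per manifest entry."""
    results = {}
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue                      # skip blank lines
        expected, name = line.strip().split(maxsplit=1)
        target = Path(root) / name
        if not target.is_file():
            results[name] = "missing"
        elif file_sha256(target) == expected.lower():
            results[name] = "ok"
        else:
            results[name] = "corrupt"
    return results
```

An audit like this shows only that the local copy matches the manifest. Tracing a file back to the point of ingest still depends on trusting how the manifest itself was produced and handed over.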
Chronopolis Integrity
In-progress:
Single integrity token back to ingest
Ideal:
Tokens issued prior to arrival
‘Prove’ the state of data to point before Chronopolis
Manifests 2.0
Token manifests
Portable, embeddable
Python, etc.
Integrity supporting Provenance
Digests in a cloud validate transfer only
HTTP headers can pass extended integrity information
End-user verification
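One existing way to carry a digest in a header is the `Digest` field from RFC 3230, written as `SHA-256=<base64 of the raw digest>`. The slides do not say which header is meant, so the choice here is an assumption; `digest_header` and `verify_digest` are illustrative names. An end user could check a downloaded body like this:

```python
import base64
import hashlib


def digest_header(body: bytes) -> str:
    """Build a Digest header value in the RFC 3230 style: SHA-256=<base64>."""
    raw = hashlib.sha256(body).digest()
    return "SHA-256=" + base64.b64encode(raw).decode("ascii")


def verify_digest(body: bytes, header_value: str) -> bool:
    """Check a body against the SHA-256 entry of a Digest header value."""
    expected = digest_header(body).split("=", 1)[1]
    for item in header_value.split(","):          # header may list several digests
        algorithm, _, value = item.strip().partition("=")
        if algorithm.upper() == "SHA-256":
            return value == expected
    return False                                  # no SHA-256 entry present
```

A bare digest in a header validates the transfer only. To support provenance, the header would need to carry something that ties the digest back to an earlier, independently verifiable state.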
Closing
Why are you hashing?
What do you want to prove?
Hashing Cost/performance