Page 1: Hashing


Hash Algorithm

Page 2: Hashing

Hash Algorithm

A hash algorithm is a one-way function that converts a data string into a numeric string output of fixed length. The output string is generally much smaller than the original data. Therefore it is also called a message digest or message compression algorithm.

Hash algorithms are designed to be collision-resistant, meaning that there is a very low probability that the same string would be created for different data.

Two of the most common hash algorithms are the MD5 (Message-Digest algorithm 5) and the SHA-1 (Secure Hash Algorithm). MD5 Message Digest checksums are commonly used to validate data integrity when digital files are transferred or stored.
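The fixed-length property can be illustrated with a short sketch. It assumes Python and its standard hashlib module (the slides do not name a language); the helper name digest_bits is hypothetical.

```python
import hashlib

def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits for the given algorithm and input."""
    return len(hashlib.new(algorithm, data).digest()) * 8

short_msg = b"abc"
long_msg = b"a" * 1_000_000  # roughly 1 MB of input data

for name in ("md5", "sha1"):
    # The digest length depends only on the algorithm, not on the input size.
    print(name, digest_bits(name, short_msg), digest_bits(name, long_msg))
```

Both inputs give a 128-bit digest under MD5 and a 160-bit digest under SHA-1, however large the input is.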

Page 3: Hashing

One-way Hash Function

The notion of a one-way function is central to public-key cryptography.

A one-way hash function is a mathematical function which takes a variable-length input string (called the pre-image) and converts it into a fixed-length binary sequence (called the hash value).

It is also known as a message digest, fingerprint or compression function.

Furthermore, a one-way hash function is designed in such a way that it is hard to reverse the process. That is, it is easy to compute a hash value from a pre-image, but it is hard to generate a pre-image that hashes to a particular value.

Page 4: Hashing

One-way Hash Function

[Figure: a document (e.g. 10 MB) passes through the hash function (compression function) and yields a hash value (e.g. 160 bits).]

A one-way hash function H(M) operates on an arbitrary-length pre-image message M and returns a fixed-length hash value h:

h = H(M), where h is of length m

Page 5: Hashing

One-way Hash Function

Many functions can take an arbitrary-length input and return an output of fixed length, but one-way hash functions have additional characteristics that make them one-way:

1. It is relatively easy to compute, but significantly harder to reverse. That is, given M it is easy to compute H(M), but given H(M) it is hard to compute M.

2. Moreover, it is also very hard to find another message M’ such that H(M’) = H(M). In other words, it is collision resistant.

In this context, "hard" is defined as something like: it would take millions of years to compute M from H(M), even if all the computers in the world were assigned to the problem.

Page 6: Hashing

One-way Hash Function

When applying a digital signature to a document, we no longer need to encrypt the entire document with the sender's private key, which can be extremely slow. It is sufficient to encrypt the document's hash value instead. Therefore a hash algorithm is used to digest the message before applying DSA.

Although a one-way hash function is used mostly for generating digital signatures, it has other practical applications as well, such as message integrity, password verification, generation of pseudorandom bits, file identification and message authentication codes (MAC).

The Microsoft cryptographic providers support these hash algorithms: MD4, MD5, SHA-1 and SHA256.

Page 7: Hashing

Length of One-way Hash Function

Hash functions of 64 bits are just too small to survive a birthday attack. Most practical one-way hash functions produce 128-bit hashes.

This forces anyone attempting the birthday attack to hash 2^64 random documents to find two that have the same hash value, which is not enough for lasting security.

NIST, in its Secure Hash Standard (SHS), uses a 160-bit hash value. This makes the birthday attack even harder, requiring 2^80 random hashes.
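The figures above follow from the rule of thumb that a birthday attack on an n-bit hash needs roughly 2^(n/2) random hashes. A minimal sketch in Python (an assumed language; the function name birthday_attack_work is hypothetical):

```python
def birthday_attack_work(hash_bits: int) -> int:
    """Approximate number of random hashes needed before a collision is expected."""
    return 2 ** (hash_bits // 2)

for bits in (64, 128, 160):
    work = birthday_attack_work(bits)
    print(f"{bits}-bit hash: about 2^{bits // 2} = {work:.3e} hashes")
```

This is only the order-of-magnitude estimate used on the slide, not an exact collision probability.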

Page 8: Hashing

Collision Resistance

A slight change in an input string should cause the hash value of the function to change drastically. Even if only 1 bit is flipped in the input string, about half of the bits in the hash value should flip on average as a result. This is called the avalanche effect.

Since there is almost no chance that two different messages have the same hash value, the function is called collision free or collision resistant.

Since it is computationally infeasible to produce a document that would hash to a given value, or to find two documents with the same hash value, a document's hash can serve as a cryptographic equivalent of the document. This makes a one-way hash function a central notion in public-key cryptography.
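The avalanche effect can be observed directly. The sketch below assumes Python's hashlib and uses SHA-256 purely as a convenient example; the helper bit_difference is a hypothetical name.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"The quick brown fox jumps over the lazy dog"
msg2 = bytearray(msg1)
msg2[0] ^= 0x01  # flip a single bit of the input

d1 = hashlib.sha256(msg1).digest()
d2 = hashlib.sha256(bytes(msg2)).digest()
changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} output bits changed")
```

For a good hash function the count should land close to half of the 256 output bits.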

Page 9: Hashing

Hash Collision

When different input messages result in the same hash value, it is called a hash collision.

Page 10: Hashing

Application: Hashing Password

Hashing passwords: It's a bad idea for computer systems to store passwords in cleartext (in their original form), because if the bad guy can somehow get to where they're stored, he gets all the passwords.

Knowing how many people foolishly use one password at multiple sites, getting a stash from one system may give access to others.

A more secure way is to store a hash of the password, rather than the password itself. Since these hashes are not reversible, there is no way to find out for sure "what password produced this hash?", and so the consequence of a compromise is much lower.

Page 11: Hashing

Application: Hashing Password

[Figure: How a password is stored using a hash. The original password “Hello” passes through the hash algorithm; the resulting hash of the password (9a46ba811185c194762) is stored as the hashed password.]

Page 12: Hashing

Applications of Hash

[Figure: How a password is verified using a hash. The wrong password “World” passes through the hash algorithm and gives er4a46b7w0534894789, which is compared with the stored hash of the password, 9a46ba811185c194762. Do the hashes match? Yes: access granted. No: hash value mismatched, access denied.]
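The store-and-verify flow shown in the two figures can be sketched as follows. This assumes Python with SHA-256 as the hash; the function names are hypothetical, and the sketch deliberately mirrors the plain scheme on the slides (real systems typically also add a per-user salt and key stretching).

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the hex digest that is stored in place of the password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def verify_password(attempt: str, stored_hash: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(attempt), stored_hash)

stored = hash_password("Hello")          # what the system keeps
print(verify_password("Hello", stored))  # matching hashes: access granted
print(verify_password("World", stored))  # mismatched hashes: access denied
```

Only the hash is kept by the system; the original password is never written to storage.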

Page 13: Hashing

Application: Verifying the Integrity

Verifying file integrity: The most obvious use is "verifying file integrity".

If you have just downloaded a large piece of software from a website, how do you know that you've received it correctly and that it has not been tampered with?

One way is to download the file again and compare the bits: if the bits are the same, you're probably ok, but if they're different, which ones are the right bits?

Finding out means yet another download with compare, and this gets very tedious very quickly. Instead, if the website publishes the hash values of its download bundles, you can check it yourself.
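A minimal sketch of such a check, assuming Python's hashlib and a site that publishes SHA-256 digests (the algorithm choice, function names and demo file contents are all assumptions for illustration):

```python
import hashlib
import os
import tempfile

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the value published by the website."""
    return file_sha256(path) == published_hex.strip().lower()

# Demo: a temporary file stands in for the downloaded bundle.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example download contents")
    demo_path = tmp.name

published = hashlib.sha256(b"example download contents").hexdigest()
ok = matches_published(demo_path, published)
tampered = matches_published(demo_path, "0" * 64)
os.remove(demo_path)
print(ok, tampered)
```

A single local hash computation replaces the repeated downloads described above.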

Page 14: Hashing

Application: With Digital Signature

Hashing is used to digest the original message when signing a document digitally.

[Figure: The document to be signed (“I agree to pay $50 for the software”) passes through the hash algorithm to give the hash of the document (er4a46b7w0534894789). The hash is then encrypted with the private key to give the digital signature (43985dlfslfnsv9064klj79dsflk6).]

Page 15: Hashing

Trapdoor One-way Function

A trapdoor one-way function is a special type of one-way function, one with a secret trapdoor. It is easy to compute in one direction and hard to compute in the other direction. But, if you know the secret, you can easily compute the function in the other direction.

That is, it is easy to compute f(x) given x, and hard to compute x given f(x). However, there is some secret information, y, such that given f(x) and y it is easy to compute x.

A watch is a good example of a trapdoor one-way function. It is easy to disassemble a watch into hundreds of pieces. It is very difficult to put those tiny pieces back together into a working watch. However, with the assembly instructions it is much easier.

Page 16: Hashing

What is Birthday Attack

A birthday attack is a name used to refer to a class of brute-force attacks. It is a type of cryptographic attack that exploits the mathematics behind the birthday problem in probability theory. This attack can be used to abuse communication between two or more parties.

It gets its name from the surprising result that the probability that two or more people in a group of 23 people share the same birthday is about 50.7%, which is greater than one half. Such a result is called the birthday paradox.

Birthday attacks are often used to find collisions of hash functions. However, to understand the birthday attack we have to study the birthday problem.

Page 17: Hashing

Birthday Problem

In probability theory, the birthday problem or birthday paradox concerns the probability that, in a set of randomly chosen people, some pair of them will have the same birthday.

By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367, since there are 366 possible birthdays, including February 29.

However, 99.9% probability is reached with just 70 people, and 50% probability with 23 people. These conclusions include the assumption that each day of the year (except February 29) is equally probable for a birthday.

The mathematics behind this problem led to a well-known cryptographic attack called the birthday attack, which uses this probabilistic model to reduce the complexity of finding a collision for a hash function.

Page 18: Hashing

Mathematical base of Birthday Problem

The problem is to compute the approximate probability that in a group of n people, at least two have the same birthday.

The goal is to compute P(A), the probability that at least two people in the room have the same birthday.

However, it is simpler to calculate P(A'), the probability that no two people in the room have the same birthday. Because A and A' are the only two possibilities and are also mutually exclusive, P(A) = 1 − P(A').

When events are independent of each other, the probability of all of the events occurring is equal to a product of the probabilities of each of the events occurring. Therefore, if P(A') can be described as 23 independent events, P(A') could be calculated as P(1) × P(2) × P(3) × ... × P(23).

Page 19: Hashing

Mathematical base of Birthday Problem

The 23 independent events correspond to the 23 people, and can be defined in order. Each event can be defined as the corresponding person not sharing his/her birthday with any of the previously analyzed people.

For Event 1, there are no previously analyzed people. Therefore, the probability, P(1), that Person 1 does not share his/her birthday with previously analyzed people is 1, or 100%.

Ignoring leap years for this analysis, the probability of person 1 can also be written as 365/365, for reasons that will become clear below.

For Event 2, the only previously analyzed person is Person 1. Assuming that birthdays are equally likely to happen on each of the 365 days of the year, the probability, P(2), that Person 2 has a different birthday than Person 1 is 364/365. This is because, if Person 2 was born on any of the other 364 days of the year, Persons 1 and 2 will not share the same birthday.

Page 20: Hashing

Mathematical base of Birthday Problem

Similarly, if Person 3 is born on any of the 363 days of the year other than the birthdays of Persons 1 and 2, Person 3 will not share their birthday. This makes the probability P(3) = 363/365.

P(A') is equal to the product of these individual probabilities:

$$P(A') = \frac{365}{365} \times \frac{364}{365} \times \frac{363}{365} \times \cdots \times \frac{343}{365}$$

Then

$$P(A') = \left(\frac{1}{365}\right)^{23} \times (365 \times 364 \times 363 \times \cdots \times 343) = \frac{365!}{342!\,365^{23}}$$

Finally, P(A') = 0.492703.

Now, as P(A) = 1 − P(A'), we get P(A) = 1 − 0.492703 = 0.507297, or 50.7%.

So the probability that at least two people in a group of 23 share a birthday is 50.7%.

Page 21: Hashing

Mathematical base of Birthday Problem

So the probability that at least two people in a group of 23 share a birthday is P(A) = 1 − P(A'), where P(A') is the probability that no two of them share a birthday. For a group of 23 people:

$$P(A') = \frac{365!}{342!\,365^{23}}$$

In general, the probability that at least two people in a group of n people (n ≤ 365) share a birthday is P(A) = 1 − P(A'), where

$$P(A') = \frac{365!}{(365-n)!\,365^{n}}$$
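The general formula can be evaluated numerically. A small sketch in Python (an assumed language; the function name p_shared_birthday is hypothetical) that multiplies the individual probabilities instead of computing large factorials:

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday."""
    if n > days:
        return 1.0  # pigeonhole principle: a shared birthday is certain
    p_none = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_none *= (days - k) / days
    return 1.0 - p_none

print(round(p_shared_birthday(23), 6))  # 0.507297
print(p_shared_birthday(70) > 0.999)    # True
```

The loop computes P(A') exactly as the product P(1) × P(2) × ... × P(n), ignoring leap years as the analysis above does.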

Page 22: Hashing

Snefru: Hash Algorithm

• Snefru is a cryptographic hash function invented by Ralph Merkle in 1990 while working at Xerox PARC. It was named after the Egyptian Pharaoh Sneferu, continuing the tradition of the Khufu and Khafre block ciphers.

• The function supports 128-bit and 256-bit output, meaning Snefru hashes arbitrary-length messages into either 128-bit or 256-bit values.

• The original design of Snefru was shown to be insecure by Eli Biham and Adi Shamir, who were able to use differential cryptanalysis to find hash collisions.

Page 23: Hashing

MD4 Hashing

Both MD4 and MD5 were invented by Ron Rivest. MD stands for Message Digest. The algorithms produce 128-bit hash values.

MD4 takes a message of arbitrary length as input and produces a 128-bit “fingerprint”, “message digest” or hash value as output.

It was designed so that it would be computationally infeasible to produce two messages having the same message digest.

This algorithm has influenced later algorithms such as MD5, SHA and RIPEMD.

MD4 is used to compute NTLM password-derived key digests on Microsoft Windows NT, XP, Vista and 7.

Page 24: Hashing

MD4 Hashing

MD4("The quick brown fox jumps over the lazy dog") = 1bee69a46ba811185c194762abaeae90

Even a small change in the message will (with overwhelming probability) result in a completely different hash, e.g. changing d to c:

MD4("The quick brown fox jumps over the lazy cog") = b86e130ce7028da59e672d56ad0113df

The hash of the zero-length string is:

MD4("") = 31d6cfe0d16ae931b73c59d7e0c089c0

Page 25: Hashing

MD4 Algorithm

We begin by supposing that we have a b-bit message as input, and that we wish to find its message digest. Here b is an arbitrary nonnegative integer; b may be zero, it need not be a multiple of 8, and it may be arbitrarily large. We imagine the bits of the message written down as follows: m_0 m_1 m_2 m_3 ... m_{b-1}

Page 26: Hashing

Little VS Big Endian

We've defined a word to mean 32 bits. This is the same as 4 bytes. Integers, single-precision floating point numbers, and MIPS instructions are all 32 bits long. How can we store these values into memory? After all, each memory address can store a single byte, not 4 bytes.

The answer is simple. We split the 32-bit quantity into 4 bytes. For example, suppose we have a 32-bit quantity, written as 90AB12CD in hexadecimal (base 16). Since each hex digit is 4 bits, we need 8 hex digits to represent the 32-bit value.

So, the 4 bytes are: 90, AB, 12, CD where each byte requires 2 hex digits.

It turns out there are two ways to store this in memory.

Page 27: Hashing

Little VS Big Endian

Little Endian

In little endian, you store the least significant byte in the smallest address. That is, the least significant byte is stored first.

Address  Value
1000     CD
1001     12
1002     AB
1003     90

Big Endian

In big endian, you store the most significant byte in the smallest address, which is the reverse order compared to little endian.

Address  Value
1000     90
1001     AB
1002     12
1003     CD
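The two layouts can be reproduced with Python's standard struct module (Python is an assumed language here); "<" selects little-endian and ">" big-endian byte order for the same 32-bit value:

```python
import struct

value = 0x90AB12CD

little = struct.pack("<I", value)  # least significant byte at the lowest address
big = struct.pack(">I", value)     # most significant byte at the lowest address

print(little.hex(" "))  # cd 12 ab 90
print(big.hex(" "))     # 90 ab 12 cd
```

The byte at index 0 of each result corresponds to address 1000 in the tables above.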

Page 28: Hashing

MD4 : Append bits

Step 1. Append padding bits: The message is padded (extended) so that its length (in bits) is congruent to 448, modulo 512. That is, the message is extended so that it is just 64 bits short of being a multiple of 512 bits long. Padding is always performed, even if the length of the message is already congruent to 448, modulo 512 (in which case 512 bits of padding are added).

Padding is performed as follows: a single “1” bit is appended to the message, and then enough zero bits are appended so that the length in bits of the padded message becomes congruent to 448, modulo 512. (This padding operation is invertible, so that different inputs yield different outputs; this would not be true if we merely padded with 0’s.)

Page 29: Hashing

MD4 : Append Length

Step 2. Append length: A 64-bit representation of b (the length of the message before the padding bits were added) is appended to the result of the previous step. These bits are appended as two 32-bit words, low-order word first, in accordance with the previous conventions. In the unlikely event that b is greater than 2^64, only the low-order 64 bits of b are used.

At this point the resulting message (after padding with bits and with b) has a length that is an exact multiple of 512 bits. Equivalently, this message has a length that is an exact multiple of 16 (32-bit) words. Let M[0 ... N−1] denote the words of the resulting message, where N is a multiple of 16.
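Steps 1 and 2 can be sketched as a single helper. This is a simplified illustration in Python (an assumed language) that only handles whole-byte messages, so b is always a multiple of 8; the function name md_pad is hypothetical.

```python
import struct

def md_pad(message: bytes) -> bytes:
    """Pad a byte message as in Steps 1 and 2 (whole-byte messages only)."""
    bit_length = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF  # keep low-order 64 bits of b
    padded = message + b"\x80"                            # a single "1" bit, then seven "0" bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)    # zero bits up to 448 mod 512
    padded += struct.pack("<Q", bit_length)               # 64-bit length, low-order word first
    return padded

for size in (0, 55, 56, 64):
    print(size, "bytes ->", len(md_pad(b"a" * size)), "bytes after padding")
```

A 56-byte message is already 448 bits long, so a full extra 512-bit block is added, matching the rule that padding is always performed.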

Page 30: Hashing

MD4 : Initialize MD Buffer

Step 3. Initialize MD buffer: A 4-word buffer (A, B, C, D) is used to compute the message digest. Here each of A, B, C, D is a 32-bit register. These registers are initialized to the following values (in hexadecimal, low-order bytes first):

word A: 01 23 45 67

word B: 89 ab cd ef

word C: fe dc ba 98

word D: 76 54 32 10

Page 31: Hashing

MD4 : Process Message

Step 4. Process message in 16-word blocks: The message is processed in 16-word blocks. Processing consists of 3 rounds of 16 steps (operations) each (MD5 has 4 rounds).

We first define three auxiliary functions that each take as input three 32-bit words and produce as output one 32-bit word.

F(X, Y, Z) = (X ∧ Y) ∨ (¬X ∧ Z) [Step 0 to 15]

G(X, Y, Z) = (X ∧ Y) ∨ (X ∧ Z) ∨ (Y ∧ Z) [Step 16 to 31]

H(X, Y, Z) = X ⊕ Y ⊕ Z [Step 32 to 47]

where ⊕ is XOR, ∧ is AND, ∨ is OR and ¬ is NOT.

In each bit position F acts as a conditional: if x then y else z. In each bit position G acts as a majority function: if at least two of x, y, z are one, then G has a one in that position. The function H is the bit-wise xor or parity function.

MD4 utilizes two “magic constants” in rounds two and three.

Page 32: Hashing

MD4 : Output Message

Step 5. Output: The message digest produced as output is A, B, C, D. That is, we begin with the low-order byte of A, and end with the high-order byte of D.
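Steps 1 to 5 can be put together in a compact sketch. It is written in Python (an assumed language) for whole-byte messages only. The per-round word orders, shift amounts and the two hexadecimal round constants are not given on the slides; they are taken here from the published MD4 specification (RFC 1320), and the sketch is meant for study rather than production use.

```python
import struct

MASK = 0xFFFFFFFF

def rotl(x, s):
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK

def F(x, y, z): return (x & y) | (~x & z)           # conditional
def G(x, y, z): return (x & y) | (x & z) | (y & z)  # majority
def H(x, y, z): return x ^ y ^ z                    # parity

def md4(message: bytes) -> str:
    # Steps 1 and 2: append padding bits and the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack("<Q", (len(message) * 8) & 0xFFFFFFFFFFFFFFFF)
    # Step 3: initialize the MD buffer (01 23 45 67 low-order byte first, etc.).
    a, b, c, d = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Step 4: process the message in 16-word (64-byte) blocks.
    for off in range(0, len(padded), 64):
        X = struct.unpack("<16I", padded[off:off + 64])
        aa, bb, cc, dd = a, b, c, d
        for i in range(16):                      # round 1
            a = rotl((a + F(b, c, d) + X[i]) & MASK, (3, 7, 11, 19)[i % 4])
            a, b, c, d = d, a, b, c
        for i in range(16):                      # round 2
            k = (i % 4) * 4 + i // 4             # word order 0, 4, 8, 12, 1, 5, ...
            a = rotl((a + G(b, c, d) + X[k] + 0x5A827999) & MASK, (3, 5, 9, 13)[i % 4])
            a, b, c, d = d, a, b, c
        order3 = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)
        for i, k in enumerate(order3):           # round 3
            a = rotl((a + H(b, c, d) + X[k] + 0x6ED9EBA1) & MASK, (3, 9, 11, 15)[i % 4])
            a, b, c, d = d, a, b, c
        a, b, c, d = (a + aa) & MASK, (b + bb) & MASK, (c + cc) & MASK, (d + dd) & MASK
    # Step 5: output A, B, C, D, low-order byte of A first.
    return struct.pack("<4I", a, b, c, d).hex()

print(md4(b""))
print(md4(b"The quick brown fox jumps over the lazy dog"))
```

The two printed digests can be compared with the example values given on the MD4 Hashing slide.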

Page 33: Hashing

MD4 Design Goal

Rivest outlined his design goals for the algorithm:

1. Security: It is computationally infeasible to find two messages that hash to the same value.

2. Direct Security: MD4’s security is not based on any assumption, such as the difficulty of factoring.

3. Attack: No attack is more efficient than brute force.

4. Speed: MD4 is suitable for high-speed software implementations. It is based on a simple set of bit manipulations on 32-bit operands.

5. Simplicity and Compactness: MD4 is as simple as possible, without large data structures or a complicated program.

6. Favor Little-Endian Architectures: Little-endian means storing the least-significant byte of a word in the low-address byte position. MD4 is optimized for microprocessor architectures (specifically Intel microprocessors); larger and faster computers make any necessary translations.

Page 34: Hashing

Security of MD4

The security of MD4 has been severely compromised. The first full collision attack against MD4 was published in 1995 and several newer attacks have been published since then. As of 2007, an attack can generate collisions in less than 2 MD4 hash operations. A theoretical preimage attack also exists.

For evaluating the strength of a hash function, two concepts are in use.

1. Resistance to preimage attack: given a hash value, it should be hard to obtain a message that has that hash value.

2. Resistance to collision attack: it should be hard to obtain two messages that have the same hash.

Page 35: Hashing

MD4 VS MD5

MD4 and MD5 are not encryption algorithms; they are hash functions that produce a 128-bit hash value. They are sometimes used together with encryption algorithms.

The following are the differences between MD4 and MD5:

1. A fourth round has been added.

2. Each step now has a unique additive constant.

3. The function G in round 2 was changed from (XY v XZ v YZ) to (XZ v Y not(Z)) to make G less symmetric.

4. Each step now adds in the result of the previous step. This promotes a faster "avalanche effect".

5. The order in which input words are accessed in rounds 2 and 3 is changed, to make these patterns less like each other.

6. The shift amounts in each round have been approximately optimized, to yield a faster "avalanche effect." The shifts in different rounds are distinct.

Page 36: Hashing

MD5 Hashing

The MD5 message-digest algorithm is a widely used cryptographic hash function producing a 128-bit (16-byte) hash value, typically expressed in text format as a 32-digit hexadecimal number. MD5 was invented by Ron Rivest as an improved version of MD4.

MD5 has been utilized in a wide variety of cryptographic applications, and is also commonly used to verify data integrity. It was intended for applications where a large file must be “compressed” in a secure manner before being encrypted with a private key under a public-key cryptosystem such as PGP.

MD5 can be used to store a one-way hash of a password, often with key stretching.

Page 37: Hashing

Security of MD5

In 2004 it was shown that MD5 is not collision resistant. As such, MD5 is not suitable for applications like SSL certificates or digital signatures that rely on this property for digital security.

Also in 2004 more serious flaws were discovered in MD5, making further use of the algorithm for security purposes questionable. Specifically, a group of researchers described how to create a pair of files that share the same MD5 checksum.

Further advances were made in breaking MD5 in 2005, 2006, and 2007.

In December 2008, a group of researchers used this technique to fake SSL certificate validity, and the CMU Software Engineering Institute now says that MD5 “should be considered cryptographically broken and unsuitable for further use”.

Page 38: Hashing

MD5 : Append bits

Step 1. Append padding bits: The message is padded (extended) so that its length (in bits) is congruent to 448, modulo 512. That is, the message is extended so that it is just 64 bits short of being a multiple of 512 bits long. Padding is always performed, even if the length of the message is already congruent to 448, modulo 512 (in which case 512 bits of padding are added).

Padding is performed as follows: a single “1” bit is appended to the message, and then enough zero bits are appended so that the length in bits of the padded message becomes congruent to 448, modulo 512. (This padding operation is invertible, so that different inputs yield different outputs; this would not be true if we merely padded with 0’s.)

Page 39: Hashing

MD5 : Append Length

Step 2. Append length: A 64-bit representation of b (the length of the message before the padding bits were added) is appended to the result of the previous step. These bits are appended as two 32-bit words, low-order word first, in accordance with the previous conventions. In the unlikely event that b is greater than 2^64, only the low-order 64 bits of b are used.

At this point the resulting message (after padding with bits and with b) has a length that is an exact multiple of 512 bits. Equivalently, this message has a length that is an exact multiple of 16 (32-bit) words. Let M[0 ... N−1] denote the words of the resulting message, where N is a multiple of 16.

Page 40: Hashing

MD5 : Initialize MD Buffer

Step 3. Initialize MD buffer: A 4-word buffer (A, B, C, D) is used to compute the message digest. Here each of A, B, C, D is a 32-bit register. These registers are initialized to the following values (in hexadecimal, low-order bytes first):

word A: 01 23 45 67

word B: 89 ab cd ef

word C: fe dc ba 98

word D: 76 54 32 10

These are called chaining variables.

Page 41: Hashing

MD5 : Process Message

Step 4. Process message in 16-word blocks: The message is processed in 16-word blocks. Processing consists of 4 rounds of 16 steps (operations) each (MD4 has 3 rounds).

We first define four auxiliary functions that each take as input three 32-bit words and produce as output one 32-bit word.

F(X,Y,Z) = (X∧Y) ∨ ((¬X)∧Z) [Step 0 to 15]

G(X,Y,Z) = (X∧Z) ∨ (Y∧(¬Z)) [Step 16 to 31]

H(X,Y,Z) = X ⊕ Y ⊕ Z [Step 32 to 47]

I(X,Y,Z) = Y ⊕ (X∨(¬Z)) [Step 48 to 63]

In each bit position F acts as a conditional: if x then y else z. In each bit position G also acts as a conditional: if z then x else y (in MD4, G was instead a majority function). The function H is the bit-wise xor or parity function.

MD4 utilizes two “magic constants” in rounds two and three. The round two constant is √2 and the round three constant is √3.
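The bit-wise behaviour of the four MD5 auxiliary functions can be checked with a small truth-table sketch. It assumes Python, and the helper truth_table is a hypothetical name; only the lowest bit of each result is inspected, since the functions act on every bit position independently.

```python
from itertools import product

def F(x, y, z): return (x & y) | (~x & z)
def G(x, y, z): return (x & z) | (y & ~z)
def H(x, y, z): return x ^ y ^ z
def I(x, y, z): return y ^ (x | ~z)

BITS = list(product((0, 1), repeat=3))  # all (x, y, z) single-bit combinations

def truth_table(func):
    """Evaluate func on every single-bit input, keeping only the lowest bit."""
    return [func(x, y, z) & 1 for x, y, z in BITS]

print("x y z | F G H I")
for (x, y, z), f, g, h, i in zip(BITS, *map(truth_table, (F, G, H, I))):
    print(x, y, z, "|", f, g, h, i)
```

In the printed table, the F column equals y wherever x is 1 and z otherwise, while the G column equals x wherever z is 1 and y otherwise.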

Page 42: Hashing

MD5 : Output Message

Step 5. Output: The message digest produced as output is A, B, C, D. That is, we begin with the low-order byte of A, and end with the high-order byte of D.

[Figure: Main loop of MD5]

Page 43: Hashing

One MD5 Operation

MD4 consists of 48 of these operations, grouped in 3 rounds of 16 operations.

MD5 consists of 64 of these operations, grouped in 4 rounds of 16 operations.

F is a nonlinear function; one function is used in each round.

Mi denotes a 32-bit block of the message input, and Ki denotes a 32-bit constant, different for each operation.
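A single operation of this kind can be sketched as a function. The sketch assumes Python, with additions taken modulo 2^32 and a left rotation by s bits as in the published MD5 specification (RFC 1321); the name md5_step and the sample input values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
MASK = 0xFFFFFFFF

def rotl(x, s):
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK

def F(x, y, z):
    """Round-1 nonlinear function: if x then y else z."""
    return ((x & y) | (~x & z)) & MASK

def md5_step(a, b, c, d, func, m_i, k_i, s):
    """Apply one operation and return the rotated registers (A, B, C, D)."""
    t = (a + func(b, c, d) + m_i + k_i) & MASK  # mix register A, message word and constant
    new_b = (b + rotl(t, s)) & MASK             # rotate, then add B (mod 2^32)
    return d, new_b, b, c                       # A <- D, B <- new value, C <- B, D <- C

print(md5_step(1, 0, 0, 0, F, 0, 0, 7))           # (0, 128, 0, 0)
print(md5_step(0, 0xFFFFFFFF, 5, 9, F, 0, 0, 1))  # (9, 9, 4294967295, 5)
```

Passing a different func (G, H or I) gives the operation used in the other rounds.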

Page 44: Hashing

MD5 : Applications

MD5 digests have been widely used in the software world to provide some assurance that a transferred file has arrived intact. For example, file servers often provide a pre-computed MD5 checksum (known as md5sum) for the files, so that a user can compare the checksum of the downloaded file to it.

Most Unix-based operating systems include MD5 sum utilities in their distribution packages.

Windows users may install a Microsoft utility or use third-party applications.

Android ROMs also utilize this type of checksum.

Page 45: Hashing

SHA: Secure Hashing Algorithm

The Secure Hash Algorithm is a family of cryptographic hash functions published by the National Institute of Standards and Technology (NIST) as a U.S. Federal Information Processing Standard (FIPS).

It includes the following variations:

1. SHA

2. SHA-0

3. SHA-1

4. SHA-2

5. SHA-3

Page 46: Hashing

SHA: Secure Hashing Algorithm

The SHA is called secure because it is designed to be computationally infeasible to find two different messages which produce the same message digest. Any change to a message in transit will result in a different message digest, and the signature will fail to verify.

The Secure Hash Algorithm (SHA) is necessary to ensure the security of the Digital Signature Algorithm (DSA). It takes a message of any length < 2^64 bits as input and produces a 160-bit message digest as output.

The message digest is then input to the DSA, which computes the signature for the message. Signing the message digest rather than the message often improves the efficiency of the process, because the message digest is usually much smaller than the message.

Page 47: Hashing

SHA: Variations

SHA: The original version, a 160-bit hash function published in 1993.

SHA-0: A retronym applied to the original version of the 160-bit hash function published in 1993 under the name "SHA". It was withdrawn shortly after publication due to an undisclosed "significant flaw" and replaced by the slightly revised version SHA-1.

SHA-1: A 160-bit hash function which resembles the earlier MD5 algorithm. This was designed by the NSA to be part of the Digital Signature Algorithm. Cryptographic weaknesses were discovered in SHA-1, and the standard was no longer approved for most cryptographic uses after 2010.

Page 48: Hashing

SHA: Variations

• SHA-2: A family of two similar hash functions with different block sizes, known as SHA-256 and SHA-512. They differ in the word size: SHA-256 uses 32-bit words whereas SHA-512 uses 64-bit words. There are also truncated versions of each standard, known as SHA-224, SHA-384, SHA-512/224 and SHA-512/256. These were also designed by the NSA.

• SHA-3: A hash function formerly called Keccak, chosen in 2012 after a public competition among non-NSA designers. It supports the same hash lengths as SHA-2 but its internal structure differs significantly from the rest of the SHA family.

Page 49: Hashing

SHA: Append Bits

Suppose we are given a b-bit message as input and need to find its message digest.

Step 1. Append padding bits: The message is padded exactly as in MD5. The message is padded (extended) so that its length (in bits) is congruent to 448, modulo 512. That is, the message is extended so that it is just 64 bits short of being a multiple of 512 bits long. Padding is always performed, even if the length of the message is already congruent to 448, modulo 512 (in which case 512 bits of padding are added).

Padding is performed as follows: a single “1” bit is appended to the message, and then enough zero bits are appended so that the length in bits of the padded message becomes congruent to 448, modulo 512.

Page 50: Hashing

SHA: Append Length and Initialize MD Buffer

Step 2. Append length: A 64-bit representation of b, the length of the message before padding, is appended to the result of the previous step. The resulting message has a length that is an exact multiple of 512 bits.

Step 3. Initialize MD buffer: A five-word buffer (A, B, C, D, E) is used to compute the message digest. Here each of A, B, C, D, E is a 32-bit register. These registers are initialized to the following values in hexadecimal:

Word A : 67 45 23 01

Word B : ef cd ab 89

Word C : 98 ba dc fe

Word D : 10 32 54 76

Word E : c3 d2 e1 f0

These are called chaining variables.

Page 51: Hashing

SHA: Process Message

Step 4. Process message in 16-word blocks: Processing consists of 4 rounds of 20 steps (operations) each (MD4 has 3 rounds and MD5 has 4 rounds). Four auxiliary functions each take three 32-bit words as input and produce one 32-bit word as output.

f_t(X,Y,Z) = (X∧Y) ∨ ((¬X)∧Z) for t = 0 to 19

f_t(X,Y,Z) = X ⊕ Y ⊕ Z for t = 20 to 39

f_t(X,Y,Z) = (X∧Y) ∨ (X∧Z) ∨ (Y∧Z) for t = 40 to 59

f_t(X,Y,Z) = X ⊕ Y ⊕ Z for t = 60 to 79

Step 5. Output: The message digest produced as output is A, B, C, D, E. Because SHA works in big-endian order, the output begins with the high-order byte of A and ends with the low-order byte of E.
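The SHA steps can be assembled into a compact sketch of SHA-1 and checked against Python's built-in hashlib. Python is an assumed language and the sketch handles whole-byte messages only. The four round constants and the message expansion (each new word is a one-bit rotation of the XOR of four earlier words) are not listed on the slides; they are taken here from the published SHA-1 standard (FIPS 180).

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def rotl(x, s):
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK

def f(t, x, y, z):
    """Return (f_t(x, y, z), round constant) for step t."""
    if t < 20:
        return (x & y) | (~x & z), 0x5A827999
    if t < 40:
        return x ^ y ^ z, 0x6ED9EBA1
    if t < 60:
        return (x & y) | (x & z) | (y & z), 0x8F1BBCDC
    return x ^ y ^ z, 0xCA62C1D6

def sha1(message: bytes) -> str:
    # Steps 1 and 2: pad as in MD5, but store the length in big-endian order.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", (len(message) * 8) & 0xFFFFFFFFFFFFFFFF)
    # Step 3: initialize the five chaining variables A, B, C, D, E.
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Step 4: process each 16-word block with 80 steps.
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for t in range(16, 80):  # expand 16 message words to 80
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h
        for t in range(80):
            ft, k = f(t, b, c, d)
            a, b, c, d, e = (rotl(a, 5) + ft + e + k + w[t]) & MASK, a, rotl(b, 30), c, d
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]
    # Step 5: output A..E, high-order byte of A first.
    return struct.pack(">5I", *h).hex()

msg = b"The quick brown fox jumps over the lazy dog"
print(sha1(msg) == hashlib.sha1(msg).hexdigest())  # True
```

Comparing against hashlib keeps the check independent of any hand-copied digest values.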

Page 52: Hashing

SHA-1 VS MD5

1. Brute force attack is harder (160 bits vs 128 bits for MD5)

2. Was not vulnerable to any known attacks when this comparison was made (compared to MD4/5); cryptographic weaknesses in SHA-1 were discovered later

3. A little slower than MD5 (80 vs 64 steps)

4. Both designed to be simple and compact

5. Optimised for big-endian CPUs (vs MD5, which is optimised for little-endian CPUs)

Page 53: Hashing

Security of SHA

Ron Rivest outlined the improvements of MD5 over MD4. SHA can be compared against each of them:

1. “A fourth round has been added.” SHA does this too, but in SHA the 4th round uses the same f function as the 2nd round.

2. “Each step now has a unique additive constant.” SHA instead reuses the constants for each group of 20 steps, like MD4.

3. “The function G in round 2 was changed from ((X∧Y) ∨ (X∧Z) ∨ (Y∧Z)) to ((X∧Z) ∨ (Y∧(¬Z))) to make G less symmetric.” SHA uses the MD4 version ((X∧Y) ∨ (X∧Z) ∨ (Y∧Z)).

Page 54: Hashing

Security of SHA

4. “The order in which message sub-blocks are accessed in rounds 2 and 3 is changed.” SHA is completely different: it uses a cyclic error-correcting code.

5. “Each step now adds in the result of the previous step. This promotes a faster avalanche effect.” SHA also follows this change.

6. “The amounts of left circular shift in each round have been approximately optimized, to yield a faster avalanche effect. The four shifts used in each round are different from the ones used in other rounds.” SHA uses a constant shift amount in each round, like MD4.

Page 55: Hashing

References

http://en.wikipedia.org/wiki/

Applied Cryptography by Bruce Schneier; 10th Anniversary edition

Page 56: Hashing


Md. Shakhawat Hossain
Student of Department of Computer Science & Engineering
University of Rajshahi
E-mail: [email protected]