HTTPS Bicycle Attack
By Guido Vranken
ABSTRACT
It is usually assumed that HTTP traffic encapsulated in TLS doesn't reveal the exact sizes of its parts, such as the length of a Cookie header, or the payload of an HTTP POST request that may contain variable-length credentials such as passwords. In this paper I show that the redundancy of the plaintext HTTP headers included in each and every request can be exploited in order to reveal the length of particular components (such as passwords) of particular requests (such as authentication to a web application). The redundancy of HTTP in practice allows for an iterative resolution of the length of 'unknowns' in an HTTP message until the lengths of all its components are known except for a coveted secret, such as a password, whose length is then implied. The attack furthermore exploits the property of stream-oriented cipher suites, such as those based on Galois/Counter Mode, that the exact size of the plaintext can be known to a man-in-the-middle.
The paper furthermore gives insight into how very small differences in the length of intercepted (encrypted) GPS coordinates can be used to estimate the location on the world map for a particular encrypted coordinate. Another example demonstrates that differences in length of intercepted (encrypted) IPv4 addresses are bound to specific IP ranges.
The paper concludes with a set of proposed mitigations against this attack.
Table of Contents
1. Introduction
    1. On TLS side-channel leaks
    2. A note on TLS records and cipher modes
2. Overview of the attack
    1. Fingerprinting
    2. Length deduction through subtraction
        Step 1: preparation
        Step 2: analysis
3. Implications
4. Other examples
    1. Location leaks through encrypted GPS coordinates
5. Prevention
    1. Hashing before transmission
    2. Padding the secret
    3. On variable-length padding schemes
    4. Using constant-length identifiers to refer to objects
1. Introduction
1. On TLS side-channel leaks
It has long been known that SSL/TLS (from hereon referred to as TLS) is no silver bullet for obscuring the behavior of a user on a network. While a sound configuration of both endpoints of a connection is understood to prevent the decoding of ciphertext to plaintext (without access to the private keys), transactions conducted over a channel embedded in TLS leak various types of information. These side-channel leaks can be the result of delegating actions required for proper data transmission on a network to protocols at lower layers that offer no means of obscuring key information, such as the source and destination IPs encoded in the Internet Protocol layer, and the source port, destination port, payload fragmentation and so forth encoded in the TCP layer. Other types of side-channel leaks consist of variability introduced entirely outside of TLS's control, such as spatial and temporal discrepancies between different payloads. Moreover, the aggregate of the properties of packets sent back and forth between two TLS endpoints constitutes a sequence that may be uniquely linked to path and resource access on a web application. Some properties of a TLS session are left unobscured by design, such as the exact ciphersuite used and the exact length of the plaintext when stream ciphers are used.
A lot of research has been performed on how to stack up these different 'knowns' in order to meticulously reconstruct the user's actions, given that the encrypted streams are known to an observer who is or has been listening in on the 'secure' transmission between two endpoints.
In this paper I will show that for a presumably large subset of web applications, it is easy to infer the length of parts of the plaintext, or certain attributes thereof, from a recorded stream of encrypted messages. Having access to the private key is not necessary. In fact, the actual ciphertexts embedded in the stream are irrelevant to the deduction, and entry-level arithmetic suffices.
This attack has the property of being entirely passive. That is, unlike attacks such as BREACH (which relies on CSRF attacks), the attacker doesn't need to interfere with a user's session. While my attack typically reveals less detailed information than BREACH, its advantage lies in the fact that it cannot raise alarm bells, and that it can be applied retroactively; that is, encrypted streams recorded years ago can still be picked apart in order to divulge confidential information.
Furthermore, the attack requires some information about the victim to be known to the attacker. The more information the attacker knows, the more information about the victim's plaintexts can be deduced. Broadly speaking, the attack works by subtracting the lengths of known parts of the plaintext from the total plaintext size. If knowing the length of the user's password for a specific website is the attacker's objective, then the attacker must also know the user name belonging to that password, since user name and password are often sent together in an authentication process. The attack can reveal the length of the concatenation of the user name and the password. Subtracting the length of the user name from this value reveals the length of the password.
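The final subtraction can be sketched in a few lines of Python. The function name and the example figures are hypothetical; the sketch assumes the combined length of user name and password has already been isolated by subtracting every other known part of the request.

```python
def password_length(combined_length: int, username: str) -> int:
    """Length of the password, given the length of user name + password.

    `combined_length` is assumed to be what remains of the plaintext
    size after all other known parts of the request were subtracted.
    """
    return combined_length - len(username)


# Hypothetical figures: user name and password together occupy 13 bytes
# and the attacker knows that the user name is "alice" (5 bytes).
print(password_length(13, "alice"))  # 13 - 5 = 8
```

If the user name contains characters that are percent-encoded in the request body, the encoded length rather than the raw length would have to be subtracted.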
Another user property that is helpful to know is the browser used. This aids in predicting which headers a browser will send for various types of web resources. This shouldn't be too difficult to determine in a directed attack on a specific person, since just a single HTTP (i.e., insecure) request will reveal the User-Agent string.
The name TLS Bicycle Attack was chosen because of the conceptual similarity between how encryption hides content and how gift wrapping hides physical objects. My attack relies heavily on the property of stream-based ciphers in TLS that the size of TLS application data payloads is directly known to the attacker, and this inadvertently reveals information about the plaintext size; similar to how a draped or gift-wrapped bicycle is still identifiable as a bicycle, because cloaking it like that retains the underlying shape. The reason that I've named this attack at all is only to make referring to it easier for everyone.
2. A note on TLS records and cipher modes
A TLS record, in which encrypted data is encapsulated, has two fields that are of importance to the attacker. One is the Content Type field. In this document I will focus exclusively on records whose content type has the value 23 (0x17), since these records are used to embed the actual encrypted payloads. The other field of interest to us is the 'length' field. This is a 16-bit field which reflects the exact size in bytes of the encrypted payload.
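To make the two fields concrete, the sketch below walks over a captured byte stream and collects the 'length' field of every record with content type 23. It relies on the standard 5-byte TLS record header (1 byte content type, 2 bytes protocol version, 2 bytes length); the function names and the sample bytes are hypothetical, and the input is assumed to be one direction of a reassembled TCP stream that starts on a record boundary.

```python
import struct

APPLICATION_DATA = 23  # 0x17


def parse_record_header(header: bytes):
    """Split a 5-byte TLS record header into (content_type, version, length)."""
    return struct.unpack("!BHH", header[:5])


def application_data_lengths(stream: bytes):
    """Yield the 'length' field of every Application Data record in `stream`."""
    offset = 0
    while offset + 5 <= len(stream):
        content_type, _version, length = parse_record_header(
            stream[offset:offset + 5])
        if content_type == APPLICATION_DATA:
            yield length
        offset += 5 + length  # skip the header and the encrypted payload


# Hypothetical capture: one handshake record (type 22) followed by two
# Application Data records with payloads of 3 and 300 bytes.
sample = (bytes([22, 3, 3, 0, 4]) + b"\x00" * 4
          + bytes([23, 3, 3, 0, 3]) + b"abc"
          + bytes([23, 3, 3, 1, 44]) + b"\x00" * 300)
print(list(application_data_lengths(sample)))
```

Note that the payload bytes themselves are never inspected; only the header fields are read.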
Another important property to be aware of is the ciphersuite being used, in particular whether it concerns a block cipher or a stream cipher. I will focus on stream ciphers, which have the convenient property that their output length corresponds 1:1 with the input (plaintext) size, although the two do not necessarily have the same size. That is, each byte added to the plaintext results in one byte added to the encrypted message, though the encrypted message may be larger than the plaintext (i.e., encryption may add a constant number of bytes, such as overhead).
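As an illustration of the constant overhead, the sketch below recovers the plaintext size from a record's 'length' field. The constant of 24 bytes is an assumption that holds for AES-GCM cipher suites in TLS 1.2 (an 8-byte explicit nonce plus a 16-byte authentication tag); other cipher suites and protocol versions add a different constant, and the function name is hypothetical.

```python
# Assumed per-record overhead for AES-GCM in TLS 1.2:
# 8-byte explicit nonce + 16-byte authentication tag.
GCM_OVERHEAD = 8 + 16


def plaintext_length(record_length: int, overhead: int = GCM_OVERHEAD) -> int:
    """Plaintext size implied by the 'length' field of a TLS record."""
    return record_length - overhead


# One extra plaintext byte results in exactly one extra record byte.
print(plaintext_length(524), plaintext_length(525))
```

For the attack itself the exact constant matters less than the fact that it is constant: differences between record lengths equal differences between plaintext lengths.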
2. Overview of the attack
1. Fingerprinting
Once an encrypted stream has been intercepted, the attacker must employ some form of fingerprinting in order to know which resources in a web application were accessed. There are many ways this can be achieved. This paper will not elaborate extensively on fingerprinting strategies. For demonstration purposes I will be employing a simple fingerprinting mechanism, which consists of taking the full sequence of payload lengths of requests from the client to the server and calculating the Pearson correlation coefficient of each sub-sequence with a precomputed sequence in order to locate the user's retrieval of an authentication page.
For instance, loading a page that consists of a couple of JavaScript, CSS and image files will be reflected as a sequence of requests from the browser to the server, each with a distinct size. This sequence of distinct sizes must be computed by the attacker before the attack is executed. This means that the attacker will have to load the page in their browser and record the size of each request. This precomputation will serve as a template. The attacker can compute the Pearson correlation coefficient from this template sequence and the sequences found in the recorded encrypted stream. Once a 1:1 match is found, the attacker can safely assume that the client has been accessing the same page at this point in the encrypted stream.
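A minimal sketch of this sliding-window comparison follows. The function names, the threshold of 0.999 and all request sizes are made up for illustration; the sketch only assumes that the recorded request sizes and the template are available as lists of integers.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def find_template(stream_lengths, template, threshold=0.999):
    """Start indices of all windows that correlate with the template."""
    n = len(template)
    hits = []
    for start in range(len(stream_lengths) - n + 1):
        window = stream_lengths[start:start + n]
        if pearson(window, template) >= threshold:
            hits.append(start)
    return hits


# Hypothetical request sizes; the template occurs at index 2 of the stream.
template = [517, 402, 398, 455, 431]
stream = [310, 622, 517, 402, 398, 455, 431, 290]
print(find_template(stream, template))
```

Because the Pearson coefficient is invariant to a constant offset, a window whose sizes all differ from the template by the same amount also scores 1; in practice a hit would be confirmed by comparing the absolute sizes as well.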
The method can be summarized as follows:
Let S be the sequence of the values of the 'length' field of the TLS Application Data records of HTTP requests (not responses) to a particular web application (identified by its IP address and the hostname encoded in the TLS SNI extension). Each HTTP request corresponds to a separate TLS Application Data payload.
Let T be the sequence of request sizes that is unique for access to a particular resource.
Loading wordpress/wp-login.php on a WordPress installation implies the loading of other resources and results in five requests with the following sizes:
Length 11 -- reduces candidate pool to 5.398029% of original
    000.00.0.00  000.0.000.0  00.00.0.000  000.00.00.0  0.0.000.000
    00.00.000.0  00.00.00.00  0.000.0.000  0.000.00.00  00.000.0.00
    000.000.0.0  0.00.00.000  0.000.000.0  000.0.00.00  0.00.000.00
    00.0.000.00  000.0.0.000  00.0.00.000  00.000.00.0
Length 12 -- reduces candidate pool to 16.710833% of original
    0.00.000.000  0.000.00.000  00.000.000.0  00.000.00.00
Length 15 -- reduces candidate pool to 13.789183% of original
    000.000.000.000
If you manage to isolate an IPv4 address with string length 7 (for example 1.2.3.4) embedded in encrypted traffic, you can know that the plaintext IP is in the range 0-9.0-9.0-9.0-9. The total IPv4 space constitutes 256*256*256*256 = 4294967296 different addresses. Observing that an IP with string length 7 is sent reduces this space to 10*10*10*10 = 10000. This is only 0.000232830643654 percent of the original space.
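The same counting argument can be applied to every possible string length. The sketch below is an illustration rather than the tool used for this paper; it assumes addresses are written in dotted-decimal notation without leading zeros, and the names are hypothetical.

```python
from collections import Counter
from itertools import product

# Number of octet values (0-255) per decimal digit count:
# 0-9 -> 10 values, 10-99 -> 90 values, 100-255 -> 156 values.
OCTETS_BY_DIGITS = {1: 10, 2: 90, 3: 156}


def pool_by_string_length():
    """Map each dotted-decimal string length to the number of addresses."""
    pools = Counter()
    for digits in product(OCTETS_BY_DIGITS, repeat=4):
        length = sum(digits) + 3  # four octets plus three dots
        count = 1
        for d in digits:
            count *= OCTETS_BY_DIGITS[d]
        pools[length] += count
    return pools


pools = pool_by_string_length()
total = 256 ** 4
for length in sorted(pools):
    share = 100 * pools[length] / total
    print("Length {} -- {:.6f}% of original".format(length, share))
```

The shortest (7) and longest (15) string lengths each correspond to a single digit layout, which is why they narrow the candidate pool to one contiguous pattern.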
If the attacker manages to isolate a page on a web application where an IPv4 address is displayed, and the rest of the page is static or known (think of a web application that displays the IPv4 address from where the last login was performed), then an estimate can be made as to which set of IP ranges it concerns. The pool may be further reduced by mapping the resultant IP ranges to Autonomous Systems (AS) and discarding those Autonomous Systems that are not Internet Service Providers (in the case the attacker knows that the IP address they seek to reveal belongs to a home connection and not a company server, for instance).
5. Prevention
1. Hashing before transmission
An obvious one-size-fits-all solution is to compute a hash of the password inside the user's browser using JavaScript before it is sent off to the server.
    $ echo -n "password" | sha256sum
    5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
    $ echo -n "apasswordthatismuchlonger" | sha256sum
    36a9268776dc62211aa00e768052a628d564e3d0548c1aa65af6c0cfa6570d4
Both passwords result in a hash with a length of 64 bytes in its hexadecimal representation.
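The constant output length is easy to check programmatically. The sketch below uses Python's hashlib purely as an illustration; in the scheme described here the hash would be computed by JavaScript in the browser, and the function name is hypothetical.

```python
import hashlib


def password_digest(password: str) -> str:
    """Hex-encoded SHA-256 digest of a password (UTF-8 encoding assumed)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


for pw in ("password", "apasswordthatismuchlonger"):
    # The digest length does not depend on the password length.
    print(len(pw), len(password_digest(pw)))
```

Whatever the length of the input, the value that ends up in the request body always occupies 64 characters.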
This happens to have the additional advantage that the plaintext password is never stored anywhere except temporarily in the user's browser, as opposed to collectively in the browser, in (encrypted) transit, and on your server prior to storing it encrypted in your database. The downside of this approach is that you can't evaluate the password strength on the server side. You could construct a list of a certain amount of passwords that are known to be very common or too short, compute their hashes and refuse to proceed once a user tries to change their password into one of these strings. However, you cannot verify whether the user has been using all of your required sets of characters (such as letters, numbers and special characters). Obviously, validation can be performed within the browser using JavaScript. Users may undermine this by tampering with the JavaScript, but this is beside the point because it is specifically the user whose safety these measures attempt to strengthen, and their own attempts at overthrowing our security considerations are but their own responsibility.
A theoretical issue that automatically emerges is that the leak of password length is moved from the spatial to the temporal domain. That is to say, the observer will no longer be able to infer the length of the password from the payload size, but the length might influence the time and resources required to compute the hash. However, the detection of such microscopic details is usually confined to laboratory settings, and as long as the client-side code isn't programmed to signal the server that it is currently computing the hash (thereby allowing the observer to discern the exact amount of time elapsed between computation and form submission), this shouldn't really be a problem.
2. Padding the secret
An alternative to this approach is to simply pad the password right before form submission to a length that you consider to be a safe maximum-size constraint for a user password, say, 1000 characters. How to actually implement the padding might not be straightforward.
Trailing spaces might be part of the actual password the user has thought up. Trailing spaces might also be truncated by some browsers.
Padding it with zeros (as in the ASCII character 0x00) might break some things as well.
Embedding the zero padding in JSON won't work either, because JSON will replace those with Unicode escapes such as \u0000, which again leaks password lengths.
Adding an additional parameter to the POST submission, say, 'A', which will be represented as 'A=......' (the A and the is-equal sign are 2 characters), and padding this with 1000 minus 2 minus password_length characters is a hack using hard-coded values and it's ugly.
What I suggest is to pad the password string with zeros (as in the ASCII character 0x00), then convert to a readable hexadecimal representation, and submit.
So if the password is 'password' and the padding length is 15 (for the sake of demonstration), then:
    >>> pw = 'password' + (chr(0x00)*7)
    >>> len(pw)
    15
    >>> pw2 = ''.join(hex(ord(c))[2:].zfill(2) for c in pw)
    >>> pw2
    '70617373776f726400000000000000'
    >>> len(pw2)
    30
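The interactive session above can be wrapped into a pair of helper functions. This is a sketch under stated assumptions: the function names are hypothetical, the password is encoded as UTF-8, and passwords are assumed never to contain the 0x00 byte themselves (otherwise trailing zero bytes could not be told apart from padding).

```python
def pad_and_hex(password: str, padded_length: int) -> str:
    """Pad the password with 0x00 bytes to a fixed length and hex-encode it."""
    raw = password.encode("utf-8")
    if len(raw) > padded_length:
        raise ValueError("password is longer than the padded length")
    padded = raw + b"\x00" * (padded_length - len(raw))
    return padded.hex()


def unhex_and_unpad(encoded: str) -> str:
    """Server-side inverse: hex-decode and strip the trailing 0x00 padding."""
    return bytes.fromhex(encoded).rstrip(b"\x00").decode("utf-8")


encoded = pad_and_hex("password", 15)
print(encoded, len(encoded))
print(unhex_and_unpad(encoded))
```

Every submitted value then has a length of twice the padded length, independent of the password.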
3. On variable-length padding schemes
If you decide to implement some padding mechanism, beware of variable-length padding schemes. Padding a string with a different (and random) amount of characters upon each consecutive run does not make it safer but in fact introduces insecurity:
    #!/usr/bin/env python3
    import random


    def server():
        """This is the server, which, in an attempt to thwart a man in the
        middle from inferring the size of the secret, pads it with a
        variable and random amount of bytes before each transmission."""
        secret = 'thesecret'
        secret += ' ' * random.randint(0, 50)
        return secret


    def observer():
        """This is the man in the middle, passively taking note of all
        payload sizes emitted by the server. Once a sufficient amount of
        payloads with variable-length padding have been transmitted, an
        accurate guess at the size of the secret can be made."""
        lengths = []
        for _ in range(1000):
            padded_secret = server()
            lengths.append(len(padded_secret))
        # Spread between the longest and the shortest payload seen.
        probable_padding_length = max(lengths) - min(lengths)
        probable_secret_length = max(lengths) - probable_padding_length
        print('Length of the secret is probably: {}'.format(
            probable_secret_length))


    observer()
By observing a page that contains a variable-length padded secret, a sufficient amount of encrypted transmissions of this page (in this example 1000 times) allows the observer to determine the upper and the lower limit of the variable-length padding, which in this case can be expressed as the tuple (0, 50). Once these have been determined, the observer can deduce the length of the secret.
In order to subvert a padding scheme with a lower limit larger than 0, the attacker will first need to observe known values of the secret being transmitted in order to determine the lower limit, or figure this out by scrutinizing the inner workings of the web application itself.
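The calibration step can be sketched as follows. All names and numbers are hypothetical; the sketch assumes that enough transmissions were observed for the minimum padding to have occurred at least once in both the calibration run and the victim's traffic.

```python
def padding_lower_limit(observed_lengths, known_secret_length):
    """Smallest padding seen while a secret of known length was sent."""
    return min(observed_lengths) - known_secret_length


def secret_length(observed_lengths, lower_limit):
    """Length of an unknown secret once the lower padding limit is known."""
    return min(observed_lengths) - lower_limit


# Calibration: the attacker transmits a secret of their own (9 bytes).
calibration = [31, 24, 45, 19, 38]
lower = padding_lower_limit(calibration, 9)

# Attack: padded lengths observed for the victim's unknown secret.
victim = [27, 50, 22, 41]
print(lower, secret_length(victim, lower))
```

With too few observations the minimum may not have been reached yet, in which case the estimate is an upper bound on the secret's length.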
4. Using constant-length identifiers to refer to objects
Numeric identifiers are often used to refer to various database objects; index.php?pageid=10 loads a different page than index.php?pageid=100. The use of identifiers of constant length, such as UUIDs, can prevent the linking of identifier lengths to particular (sets of) resources.
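The difference can be illustrated with Python's uuid module. The URL layout mirrors the example above; the helper name is hypothetical, and random (version 4) UUIDs are just one possible choice of constant-length identifier.

```python
import uuid


def page_url(page_id) -> str:
    """Build a request path for a page identifier (hypothetical layout)."""
    return "index.php?pageid={}".format(page_id)


# Numeric identifiers: the URL length depends on the identifier.
print(len(page_url(10)), len(page_url(100)))

# UUIDs: the textual form always has 36 characters, so the URL
# length no longer varies with the object that is requested.
uuid_lengths = {len(page_url(uuid.uuid4())) for _ in range(5)}
print(uuid_lengths)
```

The numeric URLs differ by one byte per extra digit, whereas all UUID-based URLs share a single length.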