Detecting network attacks with mathematical methods and OCR algorithms
With novel case studies :)
Chetvertakov Vitaliy, Yarochkin Fedor, Kropotov Vladimir
Pattern Recognition
Pattern recognition is the assignment of a label to a given input value. Pattern recognition algorithms generally aim to provide a reasonable answer for all possible inputs and to perform "most likely" matching of the inputs, taking into account their statistical variation.
(Source: Wikipedia)
Recognizing patterns in HTTP traffic
- Feature selection: identify the features of an HTTP request that uniquely describe the “object”
- Pick a classification algorithm
- Refine the classification process by updating and extending the feature set
Feature Selection
- Use abstract features
- Identify the optimal feature set
- Identify features which are specific to a particular class
http://tmffh9d6.inspectionimagination.biz:34412/8b3fb0644ab7b15a8e4934b02ac816de.html
http://tmffh9d6.inspectionimagination.biz:34412/2442801631/1383804540.tpl
http://tmffh9d6.inspectionimagination.biz:34412/2442801631/1383804540.jar
Feature selection
r3k29.custardpeach.biz:51423/915511fb10676aa529fc4228b67066df.html
Domain
- length of the domain name
- TLD/domain level
- use of a non-standard port
- use of constant strings in the DGA
- fixed parts of the domain/subdomain
URL
- length of the URL
- depth
- number of parameters
- use of constants in URL generation
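The domain and URL features listed above can be computed directly from a request URL. The following is a minimal sketch, not the authors' implementation: the function name `extract_features`, the dictionary keys, and the rule that a 32-character lowercase hex token counts as a "hash" are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumption: an MD5-like token (32 lowercase hex characters) counts as a hash.
HASH_RE = re.compile(r"[0-9a-f]{32}")


def extract_features(url: str) -> dict:
    """Return abstract domain and URL features for one request URL."""
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the host
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path.lstrip("/")
    return {
        "domain_length": len(host),
        "domain_levels": host.count(".") + 1,
        "tld": host.rsplit(".", 1)[-1],
        "nonstandard_port": parts.port not in (None, 80, 443),
        # Path plus query string, without the leading "/".
        "url_length": len(path) + (len(parts.query) + 1 if parts.query else 0),
        "depth": len([segment for segment in path.split("/") if segment]),
        "n_params": len(parse_qsl(parts.query, keep_blank_values=True)),
        "hash_in_url": bool(HASH_RE.search(path + "?" + parts.query)),
    }


features = extract_features(
    "r3k29.custardpeach.biz:51423/915511fb10676aa529fc4228b67066df.html"
)
```

For the sample URL this yields a three-level domain, a non-standard port, a depth of 1 and a hash in the path, which are the kinds of abstract features the slides propose.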
Example 01
GET;http://l03dn5.presidentsdaypretty.biz:39031/04e0f64971f8e578f67c06fdca6eaa19/1/5f59cefce9266019c504ca4a71621b1c.html;HTTP/1.1
GET;http://pzsch506.groundhogdayglamour.biz:39031/b2f7b3d0871f2c12e234af9b256544cc/1/5f59cefce9266019c504ca4a71621b1c.html;HTTP/1.1
GET;http://yvuia4.purimpearl.biz:39031/2a3550cbd3647dc0ce8d1f9fba0834bf/1/5f59cefce9266019c504ca4a71621b1c.html;HTTP/1.1
GET;http://zw3mj5wj.purimpearl.biz:39031/a1ba1f03f1b2aa47616cfb3444e7cffd/1/5f59cefce9266019c504ca4a71621b1c.html;HTTP/1.1
GET;http://hza8wu.fathersdaydelight.biz:39031/ca59b25fe5eb041b0981ba47437f6165/1/5f59cefce9266019c504ca4a71621b1c.html;HTTP/1.1
GET;http://kutro.kwanzaavanity.biz:39031/d4ec6dcbc4ef5bf2a6baf84b341cea8f.html;HTTP/1.1
GET;http://mvgui.chinesenewyeartrendy.biz:39031/2f8a9fba52135e1dde57aba3073f18be/1/5f59cefce9266019c504ca4a71621b1c.html;HTTP/1.1
GET;http://edbl5bx.newyearsevegrace.biz:39031/0d33e5b8022599e83176ed9c43b2c955/1/5f59cefce9266019c504ca4a71621b1c.html;HTTP/1.1
GET;http://u68mbrep.purimpearl.biz:39031/559d030ea579f78dd8e81e394fb084e3.html;HTTP/1.1
GET;http://ouytbyb.kwanzaavanity.biz:39031/1a66d53d7d940ac8d1a5fb6302f9eb2d.html;HTTP/1.1
GET;http://wiizc.purimpearl.biz:39031/6bfa9cf7fc9b6d0f16728765a19f7ee8.html;HTTP/1.1
GET;http://a0e9nd.kwanzaavanity.biz:39031/110e3640c62e5e4cfdc1821c6cd57447.html;HTTP/1.1
GET;http://xfapcjn.purimpearl.biz:39031/665025fd70aedc98dea3939f576f2466.html;HTTP/1.1
GET;http://qr47s.kwanzaavanity.biz:39031/772cd514992254f1c052988cdaaa36e9.html;HTTP/1.1
kutro.kwanzaavanity.biz:39031/d4ec6dcbc4ef5bf2a6baf84b341cea8f.html
<method>GET</method>
<domain>
  <domainlevels>3</domainlevels>
  <domainlengthmin>20</domainlengthmin>
  <domainlengthmax>32</domainlengthmax>
  <domainconstants>biz</domainconstants>
</domain>
<URL>
  <urlmin>37</urlmin>
  <urlmax>72</urlmax>
  <numberofdirs>3#1</numberofdirs>
  <urlconstants>html</urlconstants>
  <hashinurl>True</hashinurl>
</URL>
<Ports>39031</Ports>
<Useragent></Useragent> /* Only for POST method */
<ref>True</ref>
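A signature like the one above can plausibly be derived from a cluster of similar requests by taking minimum and maximum lengths and the tokens shared by every URL. The sketch below is one way to do that and is not taken from the slides. It assumes that "URL length" means path plus query without the leading slash, that `numberofdirs` counts path segments, and that only tokens of three or more characters count as constants. The names `derive_signature` and `common_tokens` are hypothetical, and the `hashinurl` and `ref` fields are not derived here.

```python
import re
from urllib.parse import urlsplit


def common_tokens(token_lists):
    """Tokens present in every list, in order of first appearance."""
    shared = set(token_lists[0]).intersection(*map(set, token_lists[1:]))
    ordered = []
    for token in token_lists[0]:
        if token in shared and token not in ordered:
            ordered.append(token)
    return ordered


def derive_signature(urls):
    """Summarise a cluster of request URLs as length ranges and constants."""
    hosts, rests, ports = [], [], set()
    for url in urls:
        parts = urlsplit(url)
        hosts.append(parts.hostname or "")
        # Assumption: URL length is path plus query, without the leading "/".
        rest = parts.path.lstrip("/")
        if parts.query:
            rest += "?" + parts.query
        rests.append(rest)
        if parts.port is not None:
            ports.add(parts.port)
    # Assumption: only tokens of three or more characters count as constants.
    url_tokens = [
        [t for t in re.split(r"[^0-9A-Za-z]+", rest) if len(t) >= 3]
        for rest in rests
    ]
    return {
        "domainlevels": sorted({h.count(".") + 1 for h in hosts}),
        "domainlengthmin": min(map(len, hosts)),
        "domainlengthmax": max(map(len, hosts)),
        "domainconstants": common_tokens([h.split(".") for h in hosts]),
        "urlmin": min(map(len, rests)),
        "urlmax": max(map(len, rests)),
        "numberofdirs": sorted(
            {len([s for s in r.split("?", 1)[0].split("/") if s]) for r in rests},
            reverse=True,
        ),
        "urlconstants": common_tokens(url_tokens),
        "ports": sorted(ports),
    }


sample = [
    "http://kutro.kwanzaavanity.biz:39031/d4ec6dcbc4ef5bf2a6baf84b341cea8f.html",
    "http://wiizc.purimpearl.biz:39031/6bfa9cf7fc9b6d0f16728765a19f7ee8.html",
    "http://pzsch506.groundhogdayglamour.biz:39031/b2f7b3d0871f2c12e234af9b256544cc/1/5f59cefce9266019c504ca4a71621b1c.html",
]
signature = derive_signature(sample)
```

Under these assumptions, the three sample requests from Example 01 give the same ranges and constants as the signature shown for that example (domain length 20 to 32, URL length 37 to 72, directory counts 3 and 1, constants `biz` and `html`, port 39031).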
Example 02
GET;http://wrauac.dns-dns.com/profile.php?exp=atom&b=029c737&k=1d7b5b52cde770adae2f2de599b2c6aa;HTTP/1.1
GET;http://wrauac.dns-dns.com/profile.php?exp=atom&b=029c737&k=1d7b5b52cde770adae2f2de599b2c6aa;HTTP/1.1
GET;http://wrauac.dns-dns.com/profile.php?exp=byte&b=029c737&k=1d7b5b52cde770adae2f2de599b2c6aa;HTTP/1.1
GET;http://viwocasl.sendsmtp.com/profile.php?exp=byte&b=029c737&k=6edca1a2b6fd09b6d34b09c89338b7b1;HTTP/1.1
GET;http://viwocasl.sendsmtp.com/profile.php?exp=atom&b=029c737&k=6edca1a2b6fd09b6d34b09c89338b7b1;HTTP/1.1
GET;http://chopupruu.esmtp.biz/profile.php?exp=atom&b=2266390&k=a851e76a784b954dac7616e2e825cdb8;HTTP/1.1
GET;http://chopupruu.esmtp.biz/profile.php?exp=byte&b=2266390&k=a851e76a784b954dac7616e2e825cdb8;HTTP/1.1
GET;http://gukasoui.dns04.com/profile.php?exp=byte&b=029c737&k=e4697d1bc39e76560b8c17333efb252c;HTTP/1.1
GET;http://gukasoui.dns04.com/profile.php?exp=atom&b=029c737&k=e4697d1bc39e76560b8c17333efb252c;HTTP/1.1
GET;http://trajespaw.sellclassics.com/profile.php?exp=byte&b=029c737&k=df5583b1ef47cafe68949194862ae422;HTTP/1.1
GET;http://trajespaw.sellclassics.com/profile.php?exp=atom&b=029c737&k=df5583b1ef47cafe68949194862ae422;HTTP/1.1
GET;http://trajespaw.sellclassics.com/profile.php?exp=atom%26b=029c737%26k=f19c80074df797aa3a59aaeabc8a78fd;HTTP/1.1
GET;http://trajespaw.sellclassics.com/profile.php?exp=byte%26b=029c737%26k=f19c80074df797aa3a59aaeabc8a78fd;HTTP/1.1
http://wrauac.dns-dns.com/profile.php?exp=atom&b=029c737&k=1d7b5b52cde770adae2f2de599b2c6aa
<method>GET</method>
<domain>
  <domainlevels>3</domainlevels>
  <domainlengthmin>18</domainlengthmin>
  <domainlengthmax>26</domainlengthmax>
  <domainconstants></domainconstants>
</domain>
<URL>
  <urlmin>65</urlmin>
  <urlmax>69</urlmax>
  <numberofdirs>1</numberofdirs>
  <urlconstants>profile#php#exp</urlconstants>
  <hashinurl>True</hashinurl>
</URL>
<Ports></Ports>
<Useragent></Useragent> /* Only for POST method */
<ref>False</ref>
Example 03
GET;http://122007dd1019.kathell.com/get2.php?
c=HNCHAIMZ&d=26606B67393435363E2F676268307D3F222022222525253177757E4469747A22461516161D1444440E5C434F116E1E6B76000A7006760E06090D0A080D7F0A0674707B00007007057F087C7A756B2C263E273721696461647E31333F61683B6C520557505643070305545A4D031E180A024C472C455329031B12474A4D494C4EB8B2B0BBB5A3F6F5E7EAB7CEF4FDE2E0E2F4E0BDD1CDD3B1F4FDABC4F9A0AFB9C3CDCCD7FBC09B978EDE9C9F919D88C98D818095D0D0DAD6C1848A8B8C8D8E8FF0F0E4A0AAA1AAFA859A81FAF5F8FBE1BDA2B9FDA6BBF8A5ADFFB3BFBDA9BBE2D1D3DADBDCD7D4D6D8D9CCAEA3BF
GET;http://162007dd1009.kathell.com/get2.php?c=AFVJIJKP&d=26606B67393435363E2F676268307D3F222022222525253177757E4469747A2218191A17124343160E5C434F1168181C03017101000474750A0D7E0C097F7D7D027906027D7307700C7972720E6B2C263E273721696461647E31333F616B6C39515204015043070305545A4D031E180A024C442C455329031B12474B484A4B46B2B1B2B3B7A3F6F5E7EAB7F9F9E3EAE3FCA2A0BDF1EDF3B1F4FDABC4F9A0AFB9C3CDCCD7FBC09B978EDE9C9F919C88C98D8094C1898490D4D6DDD686F8FFF08DF5F5F4EDA9B6ADE9BAA7E4B9B9EBA7ABB1A5B7EEE5E7EEEFE0EBE8EAECEDF89AAFB3
GET;http://162007dd100a.kathell.com/get2.php?c=AFVJIJKP&d=26606B67393435363E2F676268307D3F222022222525253177757E4469747A2218191A17124343160E5C434F1168181C03017101000474750A0D7E0C097F7D7D027906027D7307700C7972720E6B2C263E273721696461647E31333F616B6C39515204015043070305545A4D031E180A024C442C455329031B12474B484A4B46B8B3B0BAB0A3F6F5E7EAB7F9F9E3EAE3FCA2A0BDF1EDF3B1F4FDABC4F9A0AFB9C3CDCCD7FBC09B978EDE9C9F919C88C98D818095D0D0DAD6C1848A8B8C8D8E8FF0F0E4A0AAA1AAFA8C8B84F9F9F9F8E1BDA2B9FDA6BBF8A5ADFFB3BFBDA9BBE2D1D3DADBDCD7D4D6D8D9CCAEA3BF
GET;http://201907dd1008.kathell.com/get2.php?c=HNCHAIMZ&d=26606B67393435363E2F676268307D3F222022222525253177757E4469747A22461516161D1444440E5C434F116E1E6B76000A7006760E06090D0A080D7F0A0674707B00007007057F087C7A756B2C263E273721696461647E31333F61683B6C520557505643070305545A4D031E180A024C472C455329031B12474A4C454D4AB6B1BAB1BCA3F6F5E7EAB7CEF4FDE2E0E2F4E0BDD1CDD3B1F4FDABC4F9A0AFB9C3CDCCD7FBC09B978EDE9C9F919D88C98D8094C1898490D4D6DDD686F1EEF58EF9F4F7EDA9B6ADE9BAA7E4B9B9EBA7ABB1A5B7EEE5E7EEEFE0EBE8EAECEDF89AAFB3
GET;http://201907dd1009.kathell.com/get2.php?c=HNCHAIMZ&d=26606B67393435363E2F676268307D3F222022222525253177757E4469747A22461516161D1444440E5C434F116E1E6B76000A7006760E06090D0A080D7F0A0674707B00007007057F087C7A756B2C263E273721696461647E31333F61683B6C520557505643070305545A4D031E180A024C472C455329031B12474A4C454D49B2B2B2B4BCA3F6F5E7EAB7CEF4FDE2E0E2F4E0BDD1CDD3B1F4FDABC4F9A0AFB9C3CDCCD7FBC09B978EDE9C9F919D88C98D818095D0D0DAD6C1848A8B8C8D8E8FF0F0E4A0AAA1AAFA859A81FAF5F8FBE1BDA2B9FDA6BBF8A5ADFFB3BFBDA9BBE2D1D3DADBDCD7D4D6D8D9CCAEA3BF
<method>GET</method>
<domain>
  <domainlevels>3</domainlevels>
  <domainlengthmin>24</domainlengthmin>
  <domainlengthmax>24</domainlengthmax>
  <domainconstants>kathell#com</domainconstants>
</domain>
<URL>
  <urlmin>438</urlmin>
  <urlmax>498</urlmax>
  <numberofdirs>1</numberofdirs>
  <urlconstants>get2#php</urlconstants>
  <hashinurl>False</hashinurl>
</URL>
<Ports></Ports>
<Useragent></Useragent> /* Only for POST method */
<ref>False</ref>
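Once a signature exists, a comparison-style check can test whether a new request falls inside every range and carries every constant. The sketch below is a hedged illustration rather than the authors' matcher: it uses the values of the Example 01 signature, while the dictionary layout, the name `matches`, and the interpretation of each field are assumptions.

```python
import re
from urllib.parse import urlsplit

# Values copied from the Example 01 signature; the dict layout is hypothetical.
SIGNATURE_01 = {
    "domainlevels": 3,
    "domainlength": (20, 32),
    "domainconstants": ["biz"],
    "urllength": (37, 72),
    "numberofdirs": {3, 1},
    "urlconstants": ["html"],
    "hashinurl": True,
    "ports": {39031},
}


def matches(url, sig):
    """Return True only if every feature of the URL agrees with the signature."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Assumption: URL length is path plus query, without the leading "/".
    rest = parts.path.lstrip("/") + ("?" + parts.query if parts.query else "")
    labels = host.split(".")
    tokens = re.split(r"[^0-9A-Za-z]+", rest)
    dirs = len([segment for segment in parts.path.split("/") if segment])
    # Assumption: a 32-character lowercase hex token counts as a hash.
    has_hash = re.search(r"[0-9a-f]{32}", rest) is not None
    checks = [
        len(labels) == sig["domainlevels"],
        sig["domainlength"][0] <= len(host) <= sig["domainlength"][1],
        all(constant in labels for constant in sig["domainconstants"]),
        sig["urllength"][0] <= len(rest) <= sig["urllength"][1],
        dirs in sig["numberofdirs"],
        all(constant in tokens for constant in sig["urlconstants"]),
        has_hash == sig["hashinurl"],
        not sig["ports"] or parts.port in sig["ports"],
    ]
    return all(checks)


hit = matches(
    "http://kutro.kwanzaavanity.biz:39031/d4ec6dcbc4ef5bf2a6baf84b341cea8f.html",
    SIGNATURE_01,
)
miss = matches("http://www.example.com/index.html", SIGNATURE_01)
```

Because every check must pass, a single feature outside its range rejects the request. That strictness is what a weight-based variant would relax.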
Classification methods
- Expert Systems
- Supervised Learning & Unsupervised Learning
- Probabilistic Classifiers
- Bayes Classifiers
- Neural Network Classifier
- Markov chain
- and so on
Expert System
An expert system is a computer system that emulates the decision-making ability of a human expert. (Thank you, Wikipedia ;-))
Neural Networks
Artificial neural networks (ANNs) are computational models inspired by animal central nervous systems (in particular the brain) that are capable of machine learning and pattern recognition. They are usually presented as systems of interconnected "neurons" that can compute values from inputs by feeding information through the network. (Wikipedia ;))
First try
Method                         | Input data      | Need to recognize | Total matches | Found     | False positives
Comparison-based expert system | >15000000 lines | 361 lines         | 667 lines     | 335 lines | 332 lines
Weight-based expert system     | >15000000 lines | 361 lines         | 436 lines     | 340 lines | 96 lines
Neural network                 | >15000000 lines | 361 lines         | 5060 lines    | 294 lines | 4766 lines
Correction
- Add new features, or modify existing ones
- Correct the decision system logic
- Add whitelists
- Extend the training set
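The slides do not describe how the weight-based expert system assigns its weights or applies whitelists, so the following is only a sketch of the general idea: each feature that fires contributes a weight, whitelisted domains are skipped, and a request is flagged when the total reaches a threshold. The weights, the threshold, the whitelist entry and the list of suspicious TLDs are all hypothetical values chosen for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical weights, threshold and whitelist; none of these come from the slides.
WEIGHTS = {
    "nonstandard_port": 3,
    "hash_in_url": 3,
    "three_level_domain": 2,
    "suspicious_tld": 2,
}
THRESHOLD = 7
WHITELIST = {"example.com"}
SUSPICIOUS_TLDS = {"biz", "info"}


def score(url):
    """Sum the weights of all features that fire; whitelisted hosts score 0."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if any(host == domain or host.endswith("." + domain) for domain in WHITELIST):
        return 0
    fired = {
        "nonstandard_port": parts.port not in (None, 80, 443),
        "hash_in_url": re.search(r"[0-9a-f]{32}", parts.path + "?" + parts.query)
        is not None,
        "three_level_domain": host.count(".") == 2,
        "suspicious_tld": host.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS,
    }
    return sum(WEIGHTS[name] for name, hit in fired.items() if hit)


def is_suspicious(url):
    """Flag the request when the weighted score reaches the threshold."""
    return score(url) >= THRESHOLD
```

Tuning in this scheme corresponds to the correction steps listed above: adding or changing entries in `WEIGHTS`, adjusting `THRESHOLD`, and extending `WHITELIST`.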
After correction
Method                         | Input data      | Need to recognize | Total matches | Found     | False positives
Comparison-based expert system | >15000000 lines | 361 lines         | 347 lines     | 345 lines | 2 lines
Weight-based expert system     | >15000000 lines | 361 lines         | 350 lines     | 347 lines | 3 lines
Neural network                 | >15000000 lines | 361 lines         | 788 lines     | 310 lines | 478 lines
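The two result tables are easier to compare as precision (found / total matches) and recall (found / need to recognize). The sketch below recomputes those ratios from the reported counts; the metric names are standard, but applying them here is an addition and not part of the slides.

```python
NEED_TO_RECOGNIZE = 361  # malicious lines in the input, as reported in both tables

# (total matches, found) per method, copied from the two tables.
RESULTS = {
    "first try": {
        "comparison-based expert system": (667, 335),
        "weight-based expert system": (436, 340),
        "neural network": (5060, 294),
    },
    "after correction": {
        "comparison-based expert system": (347, 345),
        "weight-based expert system": (350, 347),
        "neural network": (788, 310),
    },
}


def metrics(total, found, need=NEED_TO_RECOGNIZE):
    """Derive false positives, precision and recall from the reported counts."""
    return {
        "false_positives": total - found,
        "precision": found / total,
        "recall": found / need,
    }


for stage, methods in RESULTS.items():
    for method, (total, found) in methods.items():
        m = metrics(total, found)
        print(f"{stage:17} {method:31} "
              f"FP={m['false_positives']:5d} "
              f"precision={m['precision']:.3f} recall={m['recall']:.3f}")
```

The derived false-positive counts agree with the last column of both tables, and the ratios make the effect of the correction step visible for each method.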
eonudr.newyearsevemagical.biz 142.4.194.2
iewc7.valentinesdaypearl.biz 142.4.194.2
m32fa1o.electiondaypretty.biz 142.4.194.2
mcojt0a.purimcharming.biz 142.4.194.2
rum49.newyearsevemagical.biz 142.4.194.2
hword.meok.info 67.211.199.230
katadod.pp.ua 82.146.58.179
tatytol.pp.ua 82.146.58.179
142.4.194.2
67.211.199.230
82.146.58.179
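The listing above shows the detected domains collapsing onto three IP addresses. A minimal sketch of that pivot follows: group the domains by resolved IP, then pull every log line that ends with one of those IPs. The function names are hypothetical, and the assumption that the destination IP is the last whitespace-separated field follows the log listing in this deck.

```python
from collections import defaultdict

# Domain/IP pairs copied from the listing above.
RESOLUTIONS = [
    ("eonudr.newyearsevemagical.biz", "142.4.194.2"),
    ("iewc7.valentinesdaypearl.biz", "142.4.194.2"),
    ("m32fa1o.electiondaypretty.biz", "142.4.194.2"),
    ("mcojt0a.purimcharming.biz", "142.4.194.2"),
    ("rum49.newyearsevemagical.biz", "142.4.194.2"),
    ("hword.meok.info", "67.211.199.230"),
    ("katadod.pp.ua", "82.146.58.179"),
    ("tatytol.pp.ua", "82.146.58.179"),
]


def group_by_ip(pairs):
    """Map each IP address to the list of domains that resolved to it."""
    groups = defaultdict(list)
    for domain, ip in pairs:
        groups[ip].append(domain)
    return dict(groups)


def lines_for_ips(log_lines, ips):
    """Keep log lines whose last field (assumed to be the IP) is in `ips`."""
    selected = []
    for line in log_lines:
        fields = line.split()
        if fields and fields[-1] in ips:
            selected.append(line)
    return selected


groups = group_by_ip(RESOLUTIONS)
log = [
    "GET;http://hword.meok.info/counter.php;HTTP/1.1 67.211.199.230",
    "GET;http://www.example.com/;HTTP/1.1 192.0.2.1",  # hypothetical benign line
]
related = lines_for_ips(log, set(groups))
```

Searching by IP rather than by domain also surfaces requests to other hosts on the same address that the classifiers did not flag directly.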
Log extract by detected IP addresses
Method                         | Distinct IPs detected | Lines found when searching by IP | Need to recognize
Comparison-based expert system | 3 IPs                 | 361 lines                        | 361 lines
Weight-based expert system     | 3 IPs                 | 361 lines                        | 361 lines
Neural network                 | 3 IPs                 | 361 lines                        | 361 lines
GET;http://img.downloadcontent.info/jquery.phtml?jsoncallback=jQuery1101008466942954555645_1383617770561&_=1383617770562;HTTP/1.1 67.211.199.230
GET;http://img.downloadcontent.info/jquery.phtml?jsoncallback=jQuery1101008466942954555645_1383617770561&_=1383617770562;HTTP/1.1 67.211.199.230
GET;http://img.downloadcontent.info/jquery.phtml?jsoncallback=jQuery1101008466942954555645_1383617770561&_=1383617770562;HTTP/1.1 67.211.199.230
GET;http://hellobodydown.info/iRDgI.gif?44662;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/viewtopic.php?p=23438&sid=8dba5ca35;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/engine/classes/js/jquery.js;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/counter.php;HTTP/1.1 67.211.199.230
GET;http://img.downloadcontent.info/jquery.phtml?jsoncallback=jQuery110106749859406100843_1383632813402&_=1383632813403;HTTP/1.1 67.211.199.230
GET;http://hellobodydown.info/BLPXdu5e.gif?179612;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/viewtopic.php?p=99505&sid=a83f64f;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/engine/classes/js/jquery.js;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/counter.php;HTTP/1.1 67.211.199.230
GET;http://img.downloadcontent.info/jquery.phtml?jsoncallback=jQuery110107773372794230158_1383647950662&_=1383647950663;HTTP/1.1 67.211.199.230
GET;http://hellobodydown.info/KeyAKRx.gif?1755;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/viewtopic.php?p=58087&sid=bf5532a8118;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/engine/classes/js/jquery.js;HTTP/1.1 67.211.199.230
GET;http://hword.meok.info/counter.php;HTTP/1.1 67.211.199.230
Kolmogorov complexity
In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity (also known as descriptive complexity, Kolmogorov–Chaitin complexity, algorithmic entropy, or program-size complexity) of an object, such as a piece of text, is a measure of the computational resources needed to specify the object.
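Kolmogorov complexity itself is not computable, so in practice it is approximated. One common stand-in, which is not the method shown on these slides, is the length of a compressed encoding: it gives an upper bound up to a constant. The sketch below uses `zlib` for that purpose; the function names are hypothetical.

```python
import zlib


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed text: a rough upper bound."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def complexity_ratio(text: str) -> float:
    """Compressed size relative to raw size; higher means less regular."""
    raw = text.encode("utf-8")
    return compressed_size(text) / len(raw)


regular = "a" * 64
hash_like = "915511fb10676aa529fc4228b67066df5f59cefce9266019c504ca4a71621b1c"
```

A repetitive string such as `regular` compresses far better than a hash-like token of the same length. For very short inputs the fixed zlib overhead dominates, so the ratio is only meaningful when comparing strings of similar length.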
Example
'http://uhvfsd.servebbs.net/haperka.php'
>++++++++++++++++[>++++++>+++++++>+++>++<<<<-]>>--------.++++++++++++..----.>+++
+++++++.-----------..<+++++.<++++++++.>+.<--.>---.<--.>>-.<.<+.>-.++++.<.---..>---.>.<-----.<+++.>++++++.>+.<<+++.-------.>----.<++++.>++.<++++++.----------.>>-.<--.<+++++++.>.>>[[-]<]<
Example 2
http://adylody.pp.ua/hit-partner-stat/f8850826775010c65885e32c1a7f29ff?x=160
>++++++++++++++++[>++++++>+++++++>+++>++<<<<-]>>--------.++++++++++++..----.>++++++++++.-----------..<<+.+++.>+++++++++.<++++++++.+++.-----------.>.>-.<---------..>.<+++++.<---.>>+.<<+++++++.+.>-.>--.<----.<--------.>++.++.------.<++++.>++++.>.<+.+.<----.>.>++.<<+++++.>>+++++++++..---.-----.++++++++.------.++++.+..--.-----.+.-.<<---.>>++++++.-.+++..---.<<++.>>--.-.<<--.>>-.<<--.>>++++++.<<+++++.>>-----.+++++++.<<..>>++++++.<++++.>--.------------.+++++.------.>[[-]<]<
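The symbol strings in the two examples appear to be Brainfuck programs that print the URL shown above them, so the program length serves as a description length for the URL. A small interpreter is enough to check such a descriptor. The sketch below is an assumption-laden illustration: it ignores the input command `,` (which these programs do not use), wraps cells at 256, and rebuilds only the opening of the programs from repetition counts for readability.

```python
def run_brainfuck(program: str, tape_len: int = 30000) -> str:
    """Run a Brainfuck program without input and return what it prints."""
    # Pre-compute the positions of matching brackets.
    stack, jump = [], {}
    for i, ch in enumerate(program):
        if ch == "[":
            stack.append(i)
        elif ch == "]":
            j = stack.pop()
            jump[i], jump[j] = j, i
    tape = [0] * tape_len
    ptr = pc = 0
    out = []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]  # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]  # jump back to the loop start
        pc += 1
    return "".join(out)


# Opening of the programs above, rebuilt from repetition counts.
PREFIX = (
    ">" + "+" * 16 + "[>" + "+" * 6 + ">" + "+" * 7 + ">+++>++<<<<-]"
    + ">>" + "-" * 8 + "." + "+" * 12 + "..----."
    + ">" + "+" * 10 + "." + "-" * 11 + ".."
)
scheme = run_brainfuck(PREFIX)
```

Running the full program text from either example through `run_brainfuck` and comparing the result with the URL is a direct way to confirm the descriptor, and `len(program)` then gives the description length in this particular language.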