1
.NET
Matthew Conover
May 2002
2
What is .NET?
• .NET = dumb name
• .NET is a framework
• .NET is OS and platform independent
• .NET is language-independent
• .NET specs are publicly available
3
Topics of Discussion
• Introduction to .NET
• Assemblies and Metadata
• Microsoft’s implementation of .NET
• .NET Hook Library (dotNetHookLibrary)
4
Introduction to .NET
• .NET Specifications
– Partition I – Architecture
– Partition II – Metadata
– Partition III – Common Intermediate Language
– Partition IV – Library
– Partition V – Annexes
– Class Library (XML specification)
5
Introduction to .NET
• Base Class Library (BCL)
– Shared among all languages
• Common Language Runtime (CLR)
– Hosts managed code
6
Introduction to .NET
Base Class Library
• Similar to Java’s standard class library (the java.* packages)
• Used by all .NET applications
• Has classes for I/O, threading, database, text, graphics, console, sockets/web/mail, security, cryptography, COM, run-time type discovery/invocation, assembly generation
7
Introduction to .NET
• Common Language Runtime (CLR)
– Common Type System (CTS)
– Execution Engine (EE)
8
Introduction to .NET
Common Language Runtime
• Common Type System
– Specifies certain types required to be hosted by the CLR
– Specifies rules for classes, structs, enums, interfaces, delegates, etc.
– Everything is actually an object
9
Introduction to .NET
Common Language Runtime
• Execution Engine
– Compiles Microsoft Intermediate Language (MSIL) into native code
– Handles garbage collection
– Handles exceptions
– Enforces code access security (sandbox)
– Handles verification
• Managed vs. unmanaged
10
Introduction to .NET
[Diagram: Assembly, BCL, and External Assembly → Class Loader → JIT → Machine Code, all hosted within the CLR]
11
Assemblies
• .NET Library/Executable (PE file format)
• Single-file or multi-file assemblies
• Modular design
– Eliminates DLL problems
– Locations resolved at runtime
• Components:
– Metadata
– MSIL (or native) code
12
Assemblies
Physical Layout (single-file assembly)
• MSDOS Header
• PE Header (includes .NET Header)
• PE Section Headers
• Code section: .text (includes Metadata)
• Data section: .rsrc or .data
• Relocations section: .reloc or .rdata
13
Assemblies
• .NET Executable (PE file format)
• Single-file or multi-file assemblies
• Modular design
• Components:
– Metadata
– MSIL (or native) code
14
Assemblies
Metadata
• Contains all .NET application data
• Very revealing!
– Needed for MSIL compilation
– Assembly can be converted to native format
• Organized into streams or heaps (sections of related data)
15
Assemblies
Metadata
• Metadata Header
– Signature, Version, Flags
– Stream count (n)
• Stream Header 1 … Stream Header n, each with:
– Data offset
– Stream size
– Name (variable length)
• Stream bodies 1-n
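The header layout on this slide can be walked with a few struct reads. The following is a minimal sketch, not the author's code: the exact field widths (a 4-byte "BSJB" signature, two 2-byte version fields, a reserved dword, a length-prefixed version string, then 2-byte flags and stream count) are taken from ECMA-335 Partition II rather than from the slide, and the function name is made up for illustration.

```python
import struct

METADATA_SIGNATURE = 0x424A5342  # the bytes "BSJB", read as a little-endian dword


def parse_metadata_root(buf: bytes):
    """Return (version string, flags, [(stream name, offset, size), ...])."""
    signature, _major, _minor, _reserved, version_len = struct.unpack_from("<IHHII", buf, 0)
    if signature != METADATA_SIGNATURE:
        raise ValueError("not a metadata header")
    pos = 16
    # The version string is null-padded up to version_len bytes.
    version = buf[pos:pos + version_len].split(b"\0", 1)[0].decode("utf-8")
    pos += version_len
    flags, stream_count = struct.unpack_from("<HH", buf, pos)
    pos += 4
    streams = []
    for _ in range(stream_count):
        offset, size = struct.unpack_from("<II", buf, pos)
        pos += 8
        end = buf.index(b"\0", pos)
        name = buf[pos:end].decode("ascii")
        # The name is null-terminated and padded to a 4-byte boundary.
        pos += (end - pos + 1 + 3) & ~3
        streams.append((name, offset, size))
    return version, flags, streams
```

Each stream's data offset is relative to the start of the metadata header, so a caller would add it to the header's own file offset before reading the stream body.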
16
Assemblies
Streams
• #Strings (a.k.a. strings heap)
– Array of strings
• #US (a.k.a. user strings heap)
– Array of strings used by application at runtime
• #GUID
– Array of GUIDs (16 bytes each)
• #Blob
– Contains compressed binary data
• #- or #~
– Contains tables of methods, fields, etc.
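As a small illustration of how a table entry resolves a name, the sketch below reads one entry from a #Strings heap. It assumes, per ECMA-335 rather than the slide, that entries are null-terminated UTF-8 strings addressed by byte offset; the helper name and the sample heap are hypothetical.

```python
def read_heap_string(strings_heap: bytes, offset: int) -> str:
    """Read the null-terminated string that starts at `offset` in #Strings."""
    end = strings_heap.index(b"\0", offset)
    return strings_heap[offset:end].decode("utf-8")


# Hypothetical heap: offset 0 holds the empty string, followed by two names.
sample_heap = b"\0func\0arg\0"
method_name = read_heap_string(sample_heap, 1)
param_name = read_heap_string(sample_heap, 6)
```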
17
Assemblies
#~ and #- Stream
• Tables Header
– Version
– Heap sizes
– Valid tables (n)
– Table row count for valid tables 1 … n
• Tables 1-n
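The tables header can be decoded along these lines. This is a sketch under assumptions: the byte layout (reserved dword, version bytes, heap-sizes byte, reserved byte, a 64-bit valid mask, a 64-bit sorted mask, then one 4-byte row count per valid table) follows ECMA-335 Partition II, and the function name is illustrative only.

```python
import struct


def parse_tables_header(stream: bytes):
    """Return (version, wide-index flags per heap, {table number: row count})."""
    _res, major, minor, heap_sizes, _res2, valid, _sorted = struct.unpack_from(
        "<IBBBBQQ", stream, 0
    )
    # Bit n of the valid mask is set when table n is present.
    table_numbers = [n for n in range(64) if (valid >> n) & 1]
    counts = struct.unpack_from("<%dI" % len(table_numbers), stream, 24)
    # A set bit means indexes into that heap are 4 bytes wide instead of 2.
    wide_indexes = {
        "#Strings": bool(heap_sizes & 0x01),
        "#GUID": bool(heap_sizes & 0x02),
        "#Blob": bool(heap_sizes & 0x04),
    }
    return (major, minor), wide_indexes, dict(zip(table_numbers, counts))
```

The row counts matter beyond bookkeeping: together with the heap-size flags they determine how wide each column is, which a parser needs before it can step through the table rows that follow.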
18
Assemblies
Tables in #~/#- Stream
• In a predefined order
– MethodDef = table 6
– Param table = table 8
• Each table contains specific types
– MethodDef = method definitions
– TypeDef = type definitions
– AssemblyRef = assembly references
• Tables interact with each other
• Tables interact with certain heaps
19
Assemblies
Sample - MethodDef Table
• Relative Virtual Address (RVA): offset to method
• Implementation flags
• Method flags
• Method name offset: in #Strings
• Method signature offset: in #Blob
• Parameters index: in Param table
20
Assemblies
Sample - MethodDef Table
• Param Table
– Flags
– Sequence number
– Parameter name offset: in #Strings
• Method Signature Blob
– Flags
– Parameter count
– Return type
– Parameter types
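To make the signature blob fields concrete, here is a minimal decoder sketch. It only handles primitive types and assumes fewer than 128 parameters; the element type codes and the HASTHIS flag value come from ECMA-335 Partition II, while the function name and the printable type names are illustrative choices.

```python
# Element type codes for a few primitive types (ECMA-335, Partition II).
ELEMENT_TYPES = {
    0x01: "void", 0x02: "bool", 0x03: "char",
    0x04: "int8", 0x05: "uint8", 0x06: "int16", 0x07: "uint16",
    0x08: "int32", 0x09: "uint32", 0x0A: "int64", 0x0B: "uint64",
    0x0C: "float32", 0x0D: "float64", 0x0E: "string",
}
HASTHIS = 0x20  # flag bit set for instance methods


def decode_simple_method_sig(sig: bytes):
    """Decode flags, parameter count, return type, and parameter types.

    `sig` is the signature itself, without the length prefix that
    precedes every entry in the #Blob heap.
    """
    flags, param_count = sig[0], sig[1]
    if param_count & 0x80:
        raise ValueError("multi-byte parameter counts are not handled here")
    # One byte for the return type, then one byte per parameter.
    types = [ELEMENT_TYPES[b] for b in sig[2:3 + param_count]]
    return {
        "has_this": bool(flags & HASTHIS),
        "return_type": types[0],
        "param_types": types[1:],
    }


# A static method shaped like `void func(int arg)`.
decoded = decode_simple_method_sig(bytes([0x00, 0x01, 0x01, 0x08]))
```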
21
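Assemblies
Sample - func(int arg)
[Diagram: the MethodDef row for func points to “func” in #Strings and to the func method signature in #Blob; the Param row for arg points to “arg” in #Strings; the arg type signature is stored in #Blob.]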
22
Assemblies
• .NET Executable (PE file format)
• Single-file or multi-file assemblies
• Components:
– Metadata
– MSIL (or native) code
23
Assemblies
MSIL
• Pseudo-assembly
– Converted into native code
– Object “aware” intermediate language
– Examples: nop, break, ret, call, callvirt, newobj, newarr, add, mul, xor, arglist, sizeof, throw, catch, dup
• Supports up to 512 opcodes
– 0xFE = first byte of two-byte opcodes
• All calls are stack-based
24
Assemblies
Call Stack
• C#:
ClassType a;
a.func(1, 2);
• MSIL:
ldc.i4.1
ldc.i4.2
call ClassType::func(Int32, Int32)
• Left-to-right ordering: the this pointer is pushed first, then 1, then 2, which ends up at the stack top
25
Assemblies
• Sample IL
26
Assemblies
MSIL
• Uses “tokens” instead of pointers
• MSIL Assembler output:
– ldc.i4.s 9 → 0x1f 0x09
– call Print(Int32) → 0x28 0x06000006 (method token)
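The byte encoding from this slide can be reproduced with a short sketch. The opcode values 0x1f and 0x28 are the ones shown on the slide; the helper name is hypothetical, and the assumption that the 4-byte token operand is stored little-endian in the file comes from ECMA-335.

```python
import struct

LDC_I4_S = 0x1F  # push a 1-byte signed constant as an int32
CALL = 0x28      # call the method named by a 4-byte metadata token


def assemble_print_call(value: int, method_token: int) -> bytes:
    """Encode `ldc.i4.s value` followed by `call <method_token>`."""
    load = struct.pack("<Bb", LDC_I4_S, value)    # opcode + signed byte operand
    call = struct.pack("<BI", CALL, method_token)  # opcode + little-endian token
    return load + call


encoded = assemble_print_call(9, 0x06000006)
```

Written out as individual bytes, the token 0x06000006 appears in the file as 06 00 00 06, so the table number ends up in the last byte of the operand.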
27
Assemblies
Tokens
• A replacement for pointers
• References a row in a table
• Token layout:
– Table number: upper 8 bits
– Row index: lower 24 bits
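The token layout amounts to a shift and a mask. A minimal sketch, with illustrative function names:

```python
def split_token(token: int):
    """Split a metadata token into (table number, row index)."""
    return token >> 24, token & 0x00FFFFFF


def make_token(table: int, row: int) -> int:
    """Build a token from a table number and a row index."""
    return (table << 24) | (row & 0x00FFFFFF)


# The method token from the earlier call example: MethodDef is table 6.
table, row = split_token(0x06000006)
```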
28
Assemblies
MSIL Samples
• ld = load onto stack, st = store from stack
• stloc
– Stores a value from the stack into a local variable
• ldarg
– Puts an argument on the stack
• ldelem
– Puts the value of an element on the stack
29
Microsoft’s .NET Implementation
• File locations
• System libraries
• .NET application flow
30
Microsoft’s .NET Implementation
File Locations
• Framework: %SystemRoot%\Microsoft.NET
• Global Assembly Cache (GAC): %SystemRoot%\Assembly
– \GAC
– \NativeImages*
31
Microsoft’s .NET Implementation
• File locations
• System libraries
• .NET application flow
32
Microsoft’s .NET Implementation
System Libraries
• mscoree.dll (execution engine)
• mscorwks.dll (does most initialization)
• mscorjit.dll (contains JIT)
• mscorlib.dll (BCL)
• fusion.dll (assembly binding)
33
Microsoft’s .NET Implementation
System Libraries
[Diagram: mscoree.dll at the top, mscorwks.dll beneath it, and mscorlib.dll, fusion.dll, and mscorjit.dll beneath mscorwks.dll]
34
Microsoft’s .NET Implementation
• File locations
• System libraries
• .NET application flow
35
Microsoft’s .NET Implementation
.NET Application Flow
[Diagram: the application’s entry point jumps to _CorExeMain in mscoree.dll, which calls _CorExeMain in mscorwks.dll, which calls CoInitializeEE; Main belongs to the application.]
36
Microsoft’s .NET Implementation
.NET Application Flow
• Jumps to _CorExeMain (mscoree)
• Calls _CorExeMain in mscorwks.dll
• _CorExeMain calls CoInitializeEE
• CoInitializeEE calls:
– EEStartup
– ExecuteEXE
37
EEStartup
• GCHeap.Initialize
• ECall.Init
– SetupGenericPInvokeCalliStub
– PInvokeCalliWorker
• NDirect.Init
• UMThunkInit.UMThunkInit
• COMDelegate.Init
• ExecutionManager.Init
• COMNlsInfo.InitializeNLS
38
EEStartup (cont.)
• Security::Start
• SystemDomain.Init
• SystemDomain.NotifyProfilerStartup (ICorProfiler)
• SystemDomain.NotifyNewDomainLoads
• SystemDomain.PublishAppDomainAndInformDebugger (ICorPublish/ICorDebug)
39
SystemDomain.Init
• LoadBaseSystemClasses
• SystemDomain.CreatePreallocatedExceptions
40
LoadBaseSystemClasses
• SystemDomain.LoadSystemAssembly
– Loads mscorlib.dll
• Binder::StartupMscorlib
• Binder::FetchClass(OBJECT)
• MethodTable::InitForFinalization
• InitJITHelpers2
• Binder::FetchClass(VALUE)
• Binder::FetchClass(ARRAY)
41
LoadBaseSystemClasses
• Binder.FetchType(OBJECT_ARRAY)
• Binder.FetchClass(STRING)
• Binder.FetchClass(ENUM)
• Binder.FetchClass(ExceptionClass)
• Binder.FetchClass(OutOfMemoryExceptionClass)
• Binder.FetchClass(StackOverflowExceptionClass)
42
LoadBaseSystemClasses
• Binder.FetchClass(ExecutionEngineExceptionClass)
• Binder.FetchClass(DelegateClass)
• Binder.FetchClass(MultiDelegateClass)
43
.NET Application Flow
• Jumps to _CorExeMain (mscoree)
• Calls _CorExeMain in mscorwks.dll
• _CorExeMain calls CoInitializeEE
• CoInitializeEE calls:
– EEStartup
– ExecuteEXE
44
ExecuteEXE
• StrongNameSignatureVerification
– In mscorsn.dll
• PEFile::Create
– Loads executable
• ExecuteMainMethod
• FusionBind.CreateFusionName
• Assembly.ExecuteMainMethod
45
ExecuteMainMethod
• Thread.EnterRestrictedContext
• PEFile::GetMDImport
• SystemDomain.SetDefaultDomainAttributes
– Sets entry point
• SystemDomain.InitializeDefaultDomain
• BaseDomain.LoadAssembly
46
ExecuteEXE
• StrongNameSignatureVerification
– In mscorsn.dll
• PEFile::Create
– Loads executable
• ExecuteMainMethod
• FusionBind.CreateFusionName
• Assembly.ExecuteMainMethod
47
Assembly.ExecuteMainMethod
• Assembly::GetEntryPoint
• ClassLoader::ExecuteMainMethod
– EEClass::FindMethod(entry point token)
48
EEClass.FindMethod
• ValidateMainMethod
• CorCommandLine.GetArgvW
• MethodDesc.Call
– MethodDesc.IsRemotingIntercepted
– MethodDesc.CallDescr calls MethodDesc.CallDescrWorker
– CallDescrWorker calls Main()
49
.NET Application
• Main() needs to be compiled
• Main() calls PreStubWorker (mscorwks)
• PreStubWorker
– Compiles all MSIL methods
– Calls MethodDesc.DoPrestub
50
MethodDesc.DoPrestub
• MethodDesc.GetSecurityFlags
• MethodDesc.GetUnsafeAddrofCode
• MethodDesc.GetILHeader
• MethodDesc.GetRVA
• COR_DECODE_METHOD
– Decode tiny/fat format
• Security._CanSkipVerification
51
MethodDesc.DoPrestub (cont.)
• EEConfig.ShouldJitMethod
• MakeJitWorker
– JITFunction
52
JITFunction
• ExecutionManager::GetJitForType
– EEJitManager::LoadJIT
– Loads mscorjit.dll (in LoadJIT)
– Calls getJit in mscorjit (in LoadJIT)
• CallCompileMethodWithSEHWrapper
– Debugger.JitBeginning
– CILJit.compileMethod
– Debugger.JitComplete
53
CILJit.compileMethod
• Calls jitNativeCode
• jitNativeCode
– Compiler.compInit
– Compiler.compCompile
54
Compiler.compCompile
• Compiler.eeGetMethodClass
• Compiler.eeGetClassAttribs
• emitter.emitBegCG
• Compiler.eeGetMethodAttribs
• Compiler.compInitDebuggingInfo
• Compiler.genGenerateCode
• emitter.emitEndCG
55
Compiler.genGenerateCode
• emitter.emitBegFN
• Compiler.genCodeForBBlist
• Compiler.genFnProlog
• Compiler.genFnEpilog
• emitter.emitEndCodeGen
• Compiler.gcInfoBlockHdrSave
• emitter.emitEndFN
56
.NET Hook – What It Is
• An API for hooking .NET assemblies
• Includes a sample application that will insert a NOP into all “interesting” methods
57
.NET Hook – What It Does
• Reads through method table
• Reads method
– Parses header, code, EH data
• Hooks interesting functions
– Inserts hooked code at front of method
– Stored at the end of the .text section
• Updates PE and section headers
• Changes function RVAs in Metadata
58
.NET Hook - API
• Load(string AssemblyName)
• Hook(HookedFunction Function)
• Save()
59
.NET Hook - Hook
• Specifies a callback function
• Callback function receives a HookedFunction
60
.NET Hook - HookedFunction
• Name (e.g., “Main”)
• FullName (e.g., “void Class1::Main(string[] args)”)
• DeclaringTypeName (e.g., “Class1”)
• ReturnType (e.g., “void”)
• Parameters[] (includes name and type)
• Header[] and HeaderSize
• Code[] and CodeSize
• EHData[] and EHSize
61
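.NET Hook
Hooked Assembly
[Diagram: the .text section holds Metadata, Functions, and the Import Address Table up to the end of the old .text section; Hooked Functions are appended after that point, up to the end of the new .text section. Metadata references both the original and the hooked functions.]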
62
Assemblies
Hooked Method
• MethodDef table entry:
– RVA
– Implementation flags
– Method flags
– Method name offset
– Signature offset
– Parameters index
[Diagram: the entry’s RVA is shown alongside both the original method and the hooked method.]
63
.NET Hook
Tiny Method Body
• Header size = 1 byte
• Used when:
– Code size < 64 bytes
– Maximum stack size does not exceed 8
– The method has no local variables
– No exceptions
• Layout: Header (flags and code size), then Method body (MSIL)
64
.NET Hook
Hooked Tiny Method
• Header (flags and code size): updated
• Hooking code (MSIL): inserted
• Method body (MSIL)
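The update-and-insert step for a tiny method might look like the sketch below. It is not the .NET Hook source: the function name is hypothetical, the hook defaults to a single nop (opcode 0x00), and the header encoding (low two bits 0x2 for the tiny format, upper six bits for the code size) is taken from ECMA-335. It also assumes the hooking code needs no locals and no more than 8 stack slots, otherwise the method would have to be rewritten in the fat format.

```python
NOP = b"\x00"      # MSIL nop opcode
TINY_FORMAT = 0x2  # low two bits of a tiny method header


def hook_tiny_method(method: bytes, hook_code: bytes = NOP) -> bytes:
    """Insert hook_code at the front of a tiny method and fix up the header."""
    header = method[0]
    if header & 0x3 != TINY_FORMAT:
        raise ValueError("not a tiny method body")
    code_size = header >> 2
    body = method[1:1 + code_size]
    new_size = code_size + len(hook_code)
    if new_size >= 64:
        # Tiny headers only have 6 bits for the code size.
        raise ValueError("hooked method no longer fits the tiny format")
    new_header = bytes([(new_size << 2) | TINY_FORMAT])
    return new_header + hook_code + body


# A one-instruction method consisting of `ret` (0x2a).
original = bytes([(1 << 2) | TINY_FORMAT, 0x2A])
hooked = hook_tiny_method(original)
```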
65
.NET Hook
Fat Method
• Header size = 12 bytes
– Flags
– Header size
– Max. stack size
– Code size
– Local var. signature (describes local variables)
• Method body (MSIL)
• Extra data sections (currently only used for exceptions)
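Telling the two header formats apart and reading their fields can be sketched as follows. The bit layout is an assumption drawn from ECMA-335 rather than from the slides: the low two bits of the first byte select the format, a fat header packs 12 bits of flags and a 4-bit header size (counted in 4-byte units) into its first two bytes, and a tiny method implies a maximum stack size of 8. The function name is illustrative.

```python
import struct


def parse_method_header(body: bytes) -> dict:
    """Decode a tiny or fat method header from the start of a method body."""
    first = body[0]
    if first & 0x3 == 0x2:  # tiny format: one byte, code size in the top 6 bits
        return {"format": "tiny", "header_size": 1, "max_stack": 8,
                "code_size": first >> 2, "local_var_sig": 0}
    if first & 0x3 == 0x3:  # fat format: 12-byte header
        flags_and_size, max_stack, code_size, local_sig = struct.unpack_from(
            "<HHII", body, 0
        )
        return {"format": "fat",
                "flags": flags_and_size & 0x0FFF,
                "header_size": (flags_and_size >> 12) * 4,
                "max_stack": max_stack,
                "code_size": code_size,
                "local_var_sig": local_sig}
    raise ValueError("unrecognized method header")


# A fat header with the init-locals flag (0x10), max stack 2, 10 bytes of code.
fat = parse_method_header(struct.pack("<HHII", 0x3013, 2, 10, 0x11000001))
```

The local variable signature field is itself a metadata token, which is why the sample value above starts with a table number in its top byte.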
66
.NET Hook
Hooked Fat Method
• Header: Flags, Header size, Max. stack size, Code size, Local var. signature
• Hooking code (MSIL): inserted
• Method body (MSIL)
• Extra data sections
[Diagram: two of the existing fields are marked “Updated”.]
67
.NET Hook Demo
68
.NET Hook - Next Steps
• Better type handling
• Don’t break exception handling
• More developers needed
69
Summary
• .NET Framework is made up of BCL & CLR
• .NET applications stored in assemblies
• .NET Hook manipulates assemblies
• Assemblies contain Metadata & MSIL code
• Metadata contains streams
• The #~/#- stream contains tables
• Tables contain the important stuff
70
More Information
• .NET Specifications:
– http://msdn.microsoft.com/net/ecma
• SSCLI and .NET Framework SDK
– http://msdn.microsoft.com/netframework/
• .NET Hook
– http://dotnethook.sourceforge.net
71
Acknowledgements
• Entercept’s Ricochet Team
– http://www.entercept.com/ricochet
• w00w00
– http://www.w00w00.org