Page 1: PE

Portable Executable

From Wikipedia, the free encyclopedia

Not to be confused with Portable application.

Portable Executable

Filename extension: .cpl, .exe, .dll, .ocx, .sys, .scr, .drv, .tlb

Developed by: Microsoft

Type of format: Binary, executable, object, shared libraries

Extended from: DOS MZ executable, COFF

The Portable Executable (PE) format is a file format for executables, object code and DLLs, used in 32-bit and 64-bit versions of Windows operating systems. The term "portable" refers to the format's versatility in numerous environments of operating system software architecture. The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code. This includes dynamic library references for linking, API export and import tables, resource management data and thread-local storage (TLS) data. On NT operating systems, the PE format is used for EXE, DLL, SYS (device driver), and other file types. The Extensible Firmware Interface (EFI) specification states that PE is the standard executable format in EFI environments.

PE is a modified version of the Unix COFF file format. PE/COFF is an alternative term in Windows development.

On Windows NT operating systems, PE currently supports the IA-32, IA-64, and x86-64 (AMD64/Intel64) instruction set architectures (ISAs). Prior to Windows 2000, Windows NT (and thus PE) supported the MIPS, Alpha, and PowerPC ISAs. Because PE is used on Windows CE, it continues to support several variants of the MIPS, ARM (including Thumb), and SuperH ISAs.

Contents

1 Brief history
2 Technical details
    2.1 Layout
    2.2 Import Table
    2.3 Relocations
3 .NET, metadata, and the PE format
4 Use on other operating systems
5 See also
6 References
7 External links

Brief history

Microsoft migrated to the PE format with the introduction of the Windows NT 3.1 operating system. All later versions of Windows, including Windows 95/98/ME, support the file structure. The format has retained limited legacy support to bridge the gap between DOS-based and NT systems. For example, PE/COFF headers still include an MS-DOS executable program, which is by default a stub that displays the simple message "This program cannot be run in DOS mode" (or similar). PE also continues to serve the changing Windows platform. Some extensions include the .NET PE format (see below), a 64-bit version called PE32+ (sometimes PE+), and a specification for Windows CE.

Technical details

Layout

A PE file consists of a number of headers and sections that tell the dynamic linker how to map the file into memory. An executable image consists of several different regions, each of which requires different memory protection, so the start of each section must be aligned to a page boundary. For instance, the .text section (which holds program code) is typically mapped as execute/read-only, and the .data section (holding global variables) is mapped as no-execute/read-write. However, to avoid wasting space, the different sections are not page-aligned on disk. Part of the job of the dynamic linker is to map each section into memory individually and assign the correct permissions to the resulting regions, according to the instructions found in the headers.
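The header walk described above can be sketched in a few lines of Python. This is a minimal illustration and not a full parser: it assumes a well-formed image held in memory, reads only the DOS header's pointer to the PE signature, the COFF file header and the section table, and skips the optional header entirely. The function name `parse_pe_sections` and the `r/w/x` permission string are choices made for this example.

```python
import struct

# Section characteristic flags from the PE/COFF specification.
MEM_EXECUTE = 0x20000000
MEM_READ = 0x40000000
MEM_WRITE = 0x80000000


def parse_pe_sections(data: bytes):
    """Return (machine, sections) read from the headers of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The DOS header stores the file offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header (20 bytes) follows the 4-byte signature.
    machine, count, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    # The section table starts right after the optional header.
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for i in range(count):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _, _, _, _, flags) = struct.unpack_from("<8sIIIIIIHHI", data, table + 40 * i)
        perms = (
            ("r" if flags & MEM_READ else "-")
            + ("w" if flags & MEM_WRITE else "-")
            + ("x" if flags & MEM_EXECUTE else "-")
        )
        sections.append({
            "name": name.rstrip(b"\0").decode("ascii", "replace"),
            "virtual_address": vaddr,   # where the loader maps the section
            "virtual_size": vsize,
            "raw_offset": raw_ptr,      # where the section lives on disk
            "raw_size": raw_size,
            "perms": perms,
        })
    return machine, sections
```

Comparing `raw_offset` with `virtual_address` for each entry shows the difference between the compact on-disk layout and the page-aligned in-memory layout.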

Import Table

One section of note is the import address table (IAT), which is used as a lookup table when the application calls a function in a different module. Imports can be specified either by ordinal or by name. Because a compiled program cannot know the memory location of the libraries it depends upon, an indirect jump is required whenever an API call is made. As the dynamic linker loads modules and joins them together, it writes actual addresses into the IAT slots, so that they point to the memory locations of the corresponding library functions. Though this adds an extra jump over the cost of an intra-module call, resulting in a performance penalty, it provides a key benefit: dynamic libraries are much more flexible and reduce code redundancy (which would occur if common libraries had to be linked statically to each program). If the compiler knows ahead of time that a call will be inter-module (via a dllimport attribute), it can produce more optimized code that simply results in an indirect call opcode.
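The binding step can be pictured with a toy model. In the sketch below nothing is real: the addresses are made-up integers, the `memory` dictionary stands in for process memory, and `bind_imports` and `call_through_iat` are hypothetical helpers invented for this example. It only shows the idea of resolving each import (by name or by ordinal) to an address, storing it in a slot, and calling through that slot.

```python
def bind_imports(imports, loaded_modules):
    """Resolve each (dll, symbol) import and return the filled-in IAT.

    `symbol` is a str for import by name or an int for import by ordinal.
    """
    iat = []
    for dll, symbol in imports:
        exports = loaded_modules[dll]
        if isinstance(symbol, int):
            address = exports["by_ordinal"][symbol]
        else:
            address = exports["by_name"][symbol]
        iat.append(address)  # one slot per imported function
    return iat


def call_through_iat(iat, slot, memory, *args):
    """Indirect call: look up the target address in the slot, then jump."""
    target = iat[slot]
    return memory[target](*args)
```

The calling code only ever refers to a slot number, so the same compiled program works wherever the library ends up being loaded.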

Texe and LordPE are tools that can be used to view the Import and Export tables of PE Files.

Relocations

PE files normally do not contain position-independent code. Instead, they are compiled to a preferred base address, and all addresses emitted by the compiler/linker are fixed ahead of time. If a PE file cannot be loaded at its preferred address (because it is already taken by something else), the operating system will rebase it. This involves recalculating every absolute address and modifying the code to use the new values. The loader does this by comparing the preferred and actual load addresses and calculating a delta value. The delta is then added to each address that was computed against the preferred base, yielding the new address of the memory location. Base relocations are stored in a list and applied, as needed, to existing memory locations.

The resulting code is now private to the process and no longer shareable, so many of the memory-saving benefits of DLLs are lost in this scenario. Rebasing also slows down loading of the module significantly. For this reason rebasing is to be avoided wherever possible, and the DLLs shipped by Microsoft have base addresses pre-computed so as not to overlap. In the no-rebase case PE therefore has the advantage of very efficient code, but in the presence of rebasing the memory usage hit can be expensive. This contrasts with ELF, which uses fully position-independent code and a global offset table, trading execution time for lower memory usage.
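The delta calculation can be shown with a small sketch. It makes simplifying assumptions: the image is a mutable byte buffer, every relocation patches a 32-bit little-endian absolute address, and the relocation list is a flat list of offsets. A real PE file groups relocations into per-page blocks of typed entries, which this example does not model. The function name `rebase` is made up for the illustration.

```python
import struct


def rebase(image: bytearray, preferred_base: int, actual_base: int, reloc_offsets):
    """Patch every listed 32-bit absolute address in place; return the delta."""
    delta = actual_base - preferred_base
    for offset in reloc_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        # Same address, expressed relative to the new load address.
        struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)
    return delta
```

For example, an image built for base 0x00400000 that stores the pointer 0x00401020 would, after being loaded at 0x10000000, hold 0x10001020 in the same location. Every page touched by such a patch differs from the copy on disk, which is why rebased pages can no longer be shared between processes.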

.NET, metadata, and the PE format

Microsoft's .NET Framework has extended the PE format with features which support the Common Language Runtime. Among the additions are a CLR Header and CLR Data section. Upon loading a binary, the OS loader yields execution to the CLR via a reference in the PE/COFF IMPORT table. The CLR then loads the CLR Header and Data sections.

The CLR Data section contains two important segments: Metadata and Intermediate Language (IL) code:

Metadata contains information relevant to the assembly, including the assembly manifest. A manifest describes the assembly in detail including unique identification (via a hash, version number, etc.), data on exported components, extensive type information (supported by the Common Type System (CTS)), external references, and a list of files within the assembly. The CLR environment makes extensive use of metadata.

Intermediate Language (IL) code is abstracted, language independent code that satisfies the .NET CLR's Common Intermediate Language (CIL) requirement. The term "Intermediate" refers to the nature of IL code being cross-language and cross-platform compatible. This intermediate language, similar to Java bytecode, allows platforms and languages to support the common .NET CLR. IL supports object-oriented programming (polymorphism, inheritance, abstract types, etc.), exceptions, events, and various data structures. IL code is assembled into a .NET PE for execution by the CLR.

Use on other operating systems

The PE format is also used by ReactOS, as ReactOS is intended to be binary-compatible with Windows. It has also historically been used by a number of other operating systems, including SkyOS and BeOS R3. However, both SkyOS and BeOS eventually moved to ELF.

As the Mono development platform intends to be binary compatible with Microsoft .NET, it uses the same PE format as the Microsoft implementation.

On x86, Unix-like operating systems, some Windows binaries (in PE format) can be executed with Wine. The HX DOS Extender also uses the PE format for native DOS 32-bit binaries, plus it can to some degree execute existing Windows binaries in DOS, thus acting like a Wine for DOS.

Mac OS X 10.5 has the ability to load and parse PE files, but is not binary compatible with Windows. [1]

See also

EXE

a.out

Comparison of executable file formats

Executable compression

Application virtualization

References

1. Chartier, David (2007-11-30). "Uncovered: Evidence that Mac OS X could run Windows apps soon". Ars Technica. http://arstechnica.com/journals/apple.ars/2007/11/30/uncovered-evidence-that-mac-os-x-could-run-windows-apps-soon. Retrieved 2007-12-03. "... Steven Edwards describes the discovery that Leopard apparently contains an undocumented loader for Portable Executables, a type of file used in 32-bit and 64-bit versions of Windows. More poking around revealed that Leopard's own loader tries to find Windows DLL files when attempting to load a Windows binary."

External links

Microsoft Portable Executable and Common Object File Format Specification (latest edition, OOXML format)

Microsoft Portable Executable and Common Object File Format Specification (latest edition, HTML format)

Microsoft Portable Executable and Common Object File Format Specification (1999 edition, .doc format)

The original Portable Executable article by Matt Pietrek (MSDN Magazine, March 1994)

Part I. An In-Depth Look into the Win32 Portable Executable File Format by Matt Pietrek (MSDN Magazine, February 2002)

Part II. An In-Depth Look into the Win32 Portable Executable File Format by Matt Pietrek (MSDN Magazine, March 2002)

The .NET File Format by Daniel Pistelli

Creating the smallest possible PE executable (97 bytes)

Detailed description of the PE format by Johannes Plachy

Windows Authenticode Portable Executable Signature Format

LUEVELSMEYER's description about PE file format Mirror

A tool to inspect the content of any PE File

Executable compression

From Wikipedia, the free encyclopedia (Redirected from EXE packer)

Executable compression is any means of compressing an executable file and combining the compressed data with the decompression code it needs into a single executable.

Running a compressed executable essentially unpacks the original executable code, then transfers control to it. The effect is the same as if the original uncompressed executable had been run, so compressed and uncompressed executables are indistinguishable to the casual user.

A compressed executable is one variety of self-extracting archive, where compressed data is packaged along with the relevant decompression code in an executable file. It is often possible to decompress a compressed executable without directly executing it (two such programs are CUP386 and UNP).

Most packed executables decompress directly into the memory and need no free file system space to start. However, some decompressor stubs are known to write the uncompressed executable to the file system in order to start it.
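The "decompressor stub plus compressed payload" idea can be imitated with Python source code instead of machine code. The sketch below is only an analogy under that assumption: `pack` is a hypothetical helper that compresses a script with zlib and emits a new script whose single job is to decompress the payload in memory and hand control to it.

```python
import base64
import zlib


def pack(source: str) -> str:
    """Return a self-extracting script that runs `source` when executed."""
    compressed = zlib.compress(source.encode("utf-8"), 9)
    payload = base64.b64encode(compressed).decode("ascii")
    # The two lines below play the role of the decompressor stub.
    return (
        "import base64, zlib\n"
        f"exec(zlib.decompress(base64.b64decode('{payload}')).decode('utf-8'))\n"
    )
```

Running the packed script has the same effect as running the original, and nothing is written to the file system, which mirrors the in-memory unpacking most packers use. For very short inputs the packed script is larger than the original because the stub has a fixed size.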

Contents

1 Advantages and disadvantages
2 List of packers
3 See also
4 References

Advantages and disadvantages

Software distributors use executable compression for a variety of reasons, primarily to reduce the secondary storage requirements of their software; as executable compressors are specifically designed to compress executable code, they often achieve better compression ratio than standard data compression facilities such as gzip, zip or bzip2[citation needed]. This allows software distributors to stay within the constraints of their chosen distribution media (such as CD-ROM, DVD-ROM, or Floppy disk), or to reduce the time and bandwidth customers require to access software distributed via the Internet.

Executable compression is also frequently used to deter reverse engineering or to obfuscate the contents of the executable (for example, to hide the presence of malware from antivirus scanners) by proprietary methods of compression and/or added encryption. Executable compression can be used to prevent direct disassembly, mask string literals and modify signatures. Although this does not eliminate the chance of reverse engineering, it can make the process more costly.

A compressed executable requires less storage space in the file system, thus less time to transfer data from the file system into memory. On the other hand, it requires some time to decompress the data before execution begins. However, the speed of various storage media has not kept up with average processor speeds, so the storage is very often the bottleneck. Thus the compressed executable will load faster on most common systems. On modern desktop computers, this is rarely noticeable unless the executable is unusually big, so loading speed is not a primary reason for or against compressing an executable.

On operating systems which read executable images on demand from the disk (see virtual memory), compressed executables make this process less efficient. The decompressor stub allocates a block of memory to hold the decompressed data, which stays allocated as long as the executable stays loaded, whether it is used or not, competing for memory resources with other applications all along. If the operating system uses a swap file, the decompressed data has to be written to it to free up the memory instead of simply discarding unused data blocks and reloading them from the executable image if needed again. This is usually not noticeable, but it becomes a problem when an executable is loaded more than once at the same time—the operating system cannot reuse data blocks it has already loaded, the data has to be decompressed into a new memory block, and will be swapped out independently if not used. The additional storage and time requirements mean that it has to be weighed carefully whether to compress executables which are typically run more than once at the same time.

Another disadvantage is that some utilities can no longer identify run-time library dependencies, as only the statically linked extractor stub is visible.

Also, some older virus scanners simply report all compressed executables as viruses because the decompressor stubs share some characteristics with those. Most modern virus scanners can unpack several different executable compression layers to check the actual executable inside, but some popular anti-virus and anti-malware scanners have had troubles with false alarms on compressed executables.

Executable compression used to be more popular when computers were limited to the storage capacity of floppy disks and small hard drives; it allowed the computer to store more software in the same amount of space, without the inconvenience of having to manually unpack an archive file every time the user wanted to use the software. However, executable compression has become less popular because of increased storage capacity on computers.

List of packers

For Portable Executable (Windows) files:

ASPack

ASPR (ASProtect)

Armadillo Packer

AxProtector

BeRoEXEPacker

CExe

exe32pack

EXE Bundle

EXECryptor

EXE Stealth

eXPressor

MPRESS – Freeware

FSG (Fast Small Good)

HASP Envelope

kkrunchy – Freeware

MEW – development stopped

NeoLite

Obsidium

PECompact

PEPack

PKLite32

PELock

PESpin

PEtite

Privilege Shell

RLPack

Sentinel CodeCover (Sentinel Shell)

Shrinker32

Smart Packer Pro

SmartKey GSS

tElock

Themida

UniKey Enveloper

Upack (software) – Freeware

UPX – free software

VMProtect

WWPack

BoxedApp Packer

XComp/XPack – Freeware

For New Executable (Windows) files:

PackWin

WinLite

PKLite 2.01

For OS/2 executables only:

NeLite

LxLite

For DOS executables only:

32LiTE

624

AINEXE

aPACK

DIET

HASP Envelope

LGLZ

LZEXE – First widely publicly used executable compressor for microcomputers.

PKLite

PMWLITE

UCEXE

UPX

WDOSX

WWpack

XE

For ELF files:

gzexe

HASP Envelope

UPX

For .NET assembly files:

.NETZ

NsPack

HASP Envelope

For Mach-O (Apple Mac OS X) files:

HASP Envelope

UPX

For Java JAR files:

HASP Envelope

pack200

For Java WAR files:

HASP Envelope

See also

Data compression

Disk compression

Executable

Kolmogorov complexity

UPX

Self-extracting archive

File format

From Wikipedia, the free encyclopedia

A file format is a particular way that information is encoded for storage in a computer file.

Since a disk drive, or indeed any computer storage, can store only bits, the computer must have some way of converting information to 0s and 1s and vice-versa. There are different kinds of formats for different kinds of information. Within any format type, e.g., word processor documents, there will typically be several different formats. Sometimes these formats compete with each other.

File formats are divided into proprietary and open formats.

Contents

1 Generality
2 Specifications
3 Identifying the type of a file
    3.1 Filename extension
    3.2 Internal metadata
        3.2.1 File header
        3.2.2 Magic number
    3.3 External metadata
        3.3.1 Mac OS type-codes
        3.3.2 Mac OS X Uniform Type Identifiers (UTIs)
        3.3.3 OS/2 Extended Attributes
        3.3.4 POSIX extended attributes
        3.3.5 PRONOM Unique Identifiers (PUIDs)
        3.3.6 MIME types
        3.3.7 File format identifiers (FFIDs)
        3.3.8 File content based format identification
4 File structure
    4.1 Unstructured formats (raw memory dumps)
    4.2 Chunk-based formats
    4.3 Directory-based formats
5 See also
6 References
7 External links

Generality

Some file formats are designed for very particular sorts of data: PNG files, for example, store bitmapped images using lossless data compression. Other file formats, however, are designed for storage of several different types of data: the Ogg format can act as a container for many different types of multimedia, including any combination of audio and/or video, with or without text (such as subtitles), and metadata. A text file can contain any stream of characters, encoded for example as ASCII or Unicode, including possible control characters. Some file formats, such as HTML, Scalable Vector Graphics and the source code of computer software, are also text files with defined syntaxes that allow them to be used for specific purposes.

Specifications

Many file formats, including some of the most well-known file formats, have a published specification document (often with a reference implementation) that describes exactly how the data is to be encoded, and which can be used to determine whether or not a particular program treats a particular file format correctly. There are, however, two reasons why this is not always the case. First, some file format developers view their specification documents as trade secrets, and therefore do not release them to the public. Second, some file format developers never spend time writing a separate specification document; rather, the format is defined only implicitly, through the program(s) that manipulate data in the format.

Using file formats without a publicly available specification can be costly. Learning how the format works will require either reverse engineering it from a reference implementation or acquiring the specification document for a fee from the format developers. This second approach is possible only when there is a specification document, and typically requires the signing of a non-disclosure agreement. Both strategies require significant time, money, or both. Therefore, as a general rule, file formats with publicly available specifications are supported by a large number of programs, while non-public formats are supported by only a few programs.

Patent law, rather than copyright, is more often used to protect a file format. Although patents for file formats are not directly permitted under US law, some formats require the encoding of data with patented algorithms. For example, using compression with the GIF file format requires the use of a patented algorithm, and although initially the patent owner did not enforce it, they later began collecting fees for use of the algorithm. This has resulted in a significant decrease in the use of GIFs, and is partly responsible for the development of the alternative PNG format. However, the patent expired in the US in mid-2003, and worldwide in mid-2004. Algorithms are usually held not to be patentable under current European law, which also includes a provision that members "shall ensure that, wherever the use of a patented technique is needed for a significant purpose such as ensuring conversion of the conventions used in two different computer systems or networks so as to allow communication and exchange of data content between them, such use is not considered to be a patent infringement", which would apparently allow implementation of a patented file system where necessary to allow two different computers to interoperate.[1]

Identifying the type of a file

A method is required to determine the format of a particular file within the filesystem—an example of metadata. Different operating systems have traditionally taken different approaches to this problem, with each approach having its own advantages and disadvantages.

Of course, most modern operating systems, and individual applications, need to use all of these approaches to process various files, at least to be able to read 'foreign' file formats, if not work with them completely.

Filename extension

Main article: Filename extension

One popular method in use by several operating systems, including Windows, Mac OS X, CP/M, DOS, VMS, and VM/CMS, is to determine the format of a file based on the section of its name following the final period. This portion of the filename is known as the filename extension. For example, HTML documents are identified by names that end with .htm (or .html), and GIF images by .gif. In the original FAT filesystem, filenames were limited to an eight-character identifier and a three-character extension, which is known as 8.3 filename. Many formats thus still use three-character extensions, even though modern operating systems and application programs no longer have this limitation. Since there is no standard list of extensions, more than one format can use the same extension, which can confuse the operating system and consequently users.

One artifact of this approach is that the system can easily be tricked into treating a file as a different format simply by renaming it—an HTML file can, for instance, be easily treated as plain text by renaming it from filename.html to filename.txt. Although this strategy was useful to expert users who could easily understand and manipulate this information, it was frequently confusing to less technical users, who might accidentally make a file unusable (or 'lose' it) by renaming it incorrectly.
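Extension-based identification amounts to a table lookup on the text after the final period. The sketch below uses a tiny, hypothetical table (`KNOWN_EXTENSIONS`) purely for illustration; real systems keep far larger registries.

```python
import os

# Hypothetical extension table, for illustration only.
KNOWN_EXTENSIONS = {
    ".htm": "HTML document",
    ".html": "HTML document",
    ".gif": "GIF image",
    ".txt": "plain text",
}


def format_from_name(filename: str) -> str:
    """Guess a file's format from the part of its name after the final period."""
    _, extension = os.path.splitext(filename)
    return KNOWN_EXTENSIONS.get(extension.lower(), "unknown")
```

Because only the name is inspected, `format_from_name("filename.html")` and `format_from_name("filename.txt")` give different answers for the same file contents, which is exactly the renaming effect described above.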

This led more recent operating system shells, such as Windows 95 and Mac OS X, to hide the extension when displaying lists of recognized files. This separates the user from the complete filename, preventing the accidental changing of a file type, while allowing expert users to still retain the original functionality through enabling the displaying of file extensions.

A downside of hiding the extension is that it then becomes possible to have what appear to be two or more identical filenames in the same folder. This is especially true when image files are needed in more than one format for different applications. For example, a company logo may be needed both in .tif format (for publishing) and .gif format (for web sites). With the extensions visible, these would appear as the unique filenames "CompanyLogo.tif" and "CompanyLogo.gif". With the extensions hidden, these would both appear to have the identical filename "CompanyLogo", making it more difficult to determine which to select for a particular application.

A further downside is that hiding such information can become a security risk[2]. On a system that relies on filename extensions, all usable files have an extension (for example, all JPEG images have ".jpg" or ".jpeg" at the end of their name), so seeing extensions is a common occurrence and users may depend on them to judge a file's format. With extensions hidden, a malicious user can create an executable program with an innocent-looking name such as "Holiday photo.jpg.exe". The ".exe" is hidden, so a user sees the file as "Holiday photo.jpg", which appears to be a JPEG image, unable to harm the machine save for bugs in the application used to view it. However, the operating system still sees the ".exe" extension and will run the program, which is then able to cause harm. To further trick users, it is possible to store an icon inside the program, as done on Microsoft Windows, in which case the operating system's icon assignment can be overridden with an icon commonly used to represent JPEG images, so that the program both looks like and appears to be named as an image until it is opened. This issue requires users with extensions hidden to be vigilant and never open files that seem to display a known extension despite the hiding option being enabled (since such a file must have two extensions, the real one being unknown until hiding is disabled). This presents a practical problem for Windows systems, where extension hiding is turned on by default.
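The double-extension trick is easy to reproduce and to check for. The sketch below is a simplified model: it hides the final extension of every file (real shells typically hide only extensions they recognize), and the two extension sets are short, hypothetical lists chosen for the example.

```python
import os

# Illustrative, deliberately incomplete lists.
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".com", ".bat"}
DOCUMENT_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".txt", ".pdf"}


def displayed_name(filename: str, hide_extensions: bool = True) -> str:
    """Name as a shell that hides the final extension would show it."""
    stem, extension = os.path.splitext(filename)
    return stem if hide_extensions and extension else filename


def looks_disguised(filename: str) -> bool:
    """True if an executable extension follows a harmless-looking one."""
    stem, outer = os.path.splitext(filename)
    _, inner = os.path.splitext(stem)
    return outer.lower() in EXECUTABLE_EXTENSIONS and inner.lower() in DOCUMENT_EXTENSIONS
```

With these definitions, `displayed_name("Holiday photo.jpg.exe")` yields the misleading `Holiday photo.jpg`, while `looks_disguised` flags the same name as suspicious.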

Internal metadata

A second way to identify a file format is to store information regarding the format inside the file itself. Usually, such information is written as one or more binary strings, or as tagged or raw text, placed in fixed, specific locations within the file. Since the easiest place to locate it is at the beginning of the file, this area is usually called a file header when it is larger than a few bytes, or a magic number if it is just a few bytes long.

File header

The metadata contained in a file header is not necessarily stored only at the beginning of the file, but may be present in other areas too, often including the end of the file; that depends on the file format or the type of data it contains. Character-based (text) files have character-based, human-readable headers, whereas binary formats usually feature binary headers, although that is not a rule: a human-readable file header may require more bytes, but is easily discernible with simple text or hexadecimal editors. File headers may contain not only the information required by algorithms to identify the file format, but also real metadata about the file and its contents. For example, most image file formats store information about image size, resolution, and colour space/format, and optionally other authoring information such as who made the image, when and where it was made, and which camera model and shooting parameters were used (if any; cf. Exif). Such metadata may be used by a program reading or interpreting the file both during the loading process and afterwards, but can also be used by the operating system to quickly capture information about the file itself without loading it all into memory.
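As one concrete case, a PNG file keeps its image dimensions in the first chunk after its signature, so they can be read from the first few dozen bytes without decoding any pixels. The choice of PNG is an assumption made for this sketch, and `read_png_header` is a name invented here; the example also skips the chunk's CRC check.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes) -> dict:
    """Read image size and pixel format from the leading IHDR chunk of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Each chunk starts with a big-endian length and a 4-byte type code.
    length, chunk_type = struct.unpack_from(">I4s", data, 8)
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    width, height, bit_depth, color_type = struct.unpack_from(">IIBB", data, 16)
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,
    }
```

Only the first 26 bytes of the file are examined, which is the kind of quick lookup a file browser could use to show image dimensions.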

Using a file header to identify the file format has at least two downsides. First, at least a few (initial) blocks of the file need to be read in order to gain such information; those could be fragmented in different locations of the same storage medium, thus requiring more seek and I/O time, which is particularly bad when identifying large numbers of files at once (for example, a GUI browsing a folder with thousands of files and determining icons or thumbnails for all of them). Second, if the header is binary hard-coded (i.e. the header itself requires non-trivial interpretation in order to be recognized), especially for the sake of protecting metadata content, there is some risk that the file format is misinterpreted at first sight, or even badly written at the source, often resulting in corrupt metadata (which, in extremely pathological cases, might even render the file unreadable).

A more logically sophisticated example of file header is that used in wrapper (or container) file formats.

Magic number

See also: Magic number (programming)

One way to incorporate such metadata, often associated with Unix and its derivatives, is just to store a "magic number" inside the file itself. Originally, this term was used for a specific set of 2-byte identifiers at the beginning of a file, but since any undecoded binary sequence can be regarded as a number, any feature of a file format which uniquely distinguishes it can be used for identification. GIF images, for instance, always begin with the ASCII representation of either GIF87a or GIF89a, depending upon the standard to which they adhere. Many file types, most especially plain-text files, are harder to spot by this method. HTML files, for example, might begin with the string <html> (which is not case sensitive), or an appropriate document type definition that starts with <!DOCTYPE, or, for XHTML, the XML identifier, which begins with <?xml. The files can also begin with HTML comments, random text, or several empty lines, but still be usable HTML.

The magic number approach offers better guarantees that the format will be identified correctly, and can often determine more precise information about the file. Since reasonably reliable "magic number" tests can be fairly complex, and each file must effectively be tested against every possibility in the magic database, this approach is relatively inefficient, especially for displaying large lists of files (in contrast, filename and metadata-based methods need check only one piece of data, and match it against a sorted index). Also, data must be read from the file itself, increasing latency as opposed to metadata stored in the directory. Where filetypes don't lend themselves to recognition in this way, the system must fall back to metadata. It is, however, the best way for a program to check if a file it has been told to process is of the correct format: while the file's name or metadata may be altered independently of its content, failing a well-designed magic number test is a pretty sure sign that the file is either corrupt or of the wrong type. On the other hand a valid magic number does not guarantee that the file is not corrupt or of a wrong type.


So-called shebang lines in script files are a special case of magic numbers. Here, the magic number is human-readable text that identifies a specific command interpreter and options to be passed to the command interpreter.
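A shebang line can be recognised and taken apart with a few lines of Python. This is only a sketch: the function name parse_shebang is hypothetical, and because systems differ in how they split the text after the interpreter into arguments, the remainder is kept as a single string here.

```python
def parse_shebang(first_line: str):
    """Return (interpreter, argument_string), or None if there is no shebang."""
    if not first_line.startswith("#!"):
        return None
    rest = first_line[2:].strip()
    if not rest:
        return None
    # Everything up to the first space is the interpreter path;
    # the remainder is left unsplit (an assumption of this sketch).
    interpreter, _, args = rest.partition(" ")
    return interpreter, args.strip()
```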

Another operating system using magic numbers is AmigaOS, where magic numbers were called "Magic Cookies" and were adopted as a standard system to recognize executables in Hunk executable file format and also to let single programs, tools and utilities deal automatically with their saved data files, or any other kind of file types when saving and loading data. This system was then enhanced with the Amiga standard Datatype recognition system. Another method was the FourCC method, originating in OSType on Macintosh, later adapted by Interchange File Format (IFF) and derivatives.

External metadata

A final way of storing the format of a file is to explicitly store information about the format in the file system, rather than within the file itself.

This approach keeps the metadata separate from both the main data and the name, but is also less portable than either file extensions or "magic numbers", since the format has to be converted from filesystem to filesystem. While this is also true to an extent with filename extensions — for instance, for compatibility with MS-DOS's three character limit — most forms of storage have a roughly equivalent definition of a file's data and name, but may have varying or no representation of further metadata.

Note that zip files or archive files solve the problem of handling metadata. A utility program collects multiple files together along with metadata about each file and the folders/directories they came from all within one new file (e.g. a zip file with extension .zip). The new file is also compressed and possibly encrypted, but now is transmissible as a single file across operating systems by FTP systems or attached to email. At the destination, it must be unzipped by a compatible utility to be useful, but the problems of transmission are solved this way.

Mac OS type-codes

The Mac OS' Hierarchical File System stores codes for creator and type as part of the directory entry for each file. These codes are referred to as OSTypes, and for instance a HyperCard "stack" file has a creator of WILD (from Hypercard's previous name, "WildCard") and a type of STAK. The type code specifies the format of the file, while the creator code specifies the default program to open it with when double-clicked by the user. For example, the user could have several text files all with the type code of TEXT, but which each open in a different program, due to having differing creator codes. RISC OS uses a similar system, consisting of a 12-bit number which can be looked up in a table of descriptions — e.g. the hexadecimal number FF5 is "aliased" to PoScript, representing a PostScript file.

Mac OS X Uniform Type Identifiers (UTIs)


A Uniform Type Identifier (UTI) is a method used in Mac OS X for uniquely identifying "typed" classes of entity, such as file formats. It was developed by Apple as a replacement for OSType (type & creator codes).

The UTI is a Core Foundation string, which uses a reverse-DNS string. Common or standard types use the public domain (e.g. public.png for a Portable Network Graphics image), while other domains can be used for third-party types (e.g. com.adobe.pdf for Portable Document Format). UTIs can be defined within a hierarchical structure, known as a conformance hierarchy. Thus, public.png conforms to a supertype of public.image, which itself conforms to a supertype of public.data. A UTI can exist in multiple hierarchies, which provides great flexibility.
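The conformance hierarchy can be pictured as a small graph walk. The Python sketch below uses a hand-written table containing only the three types named above; on a real system these declarations come from the operating system and from applications, and the names CONFORMS_TO and conforms_to are hypothetical.

```python
# Hand-written conformance table for illustration only.
CONFORMS_TO = {
    "public.png": ["public.image"],
    "public.image": ["public.data"],
    "public.data": [],
}


def conforms_to(uti, ancestor, table=CONFORMS_TO):
    """Return True if uti equals ancestor or reaches it through supertypes."""
    seen = set()
    stack = [uti]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        # A type may list several supertypes, so it can sit in several hierarchies.
        stack.extend(table.get(current, []))
    return False
```

Because each entry holds a list of supertypes, the same walk works when a UTI belongs to more than one hierarchy.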

In addition to file formats, UTIs can also be used for other entities which can exist in OS X, including:

Pasteboard data

Folders (directories)

Translatable types (as handled by the Translation Manager)

Bundles

Frameworks

Streaming data

Aliases and symlinks

OS/2 Extended Attributes

The HPFS, FAT12 and FAT16 (but not FAT32) filesystems allow the storage of "extended attributes" with files. These comprise an arbitrary set of triplets with a name, a coded type for the value and a value, where the names are unique and values can be up to 64 KB long. There are standardized meanings for certain types and names (under OS/2). One such is that the ".TYPE" extended attribute is used to determine the file type. Its value comprises a list of one or more file types associated with the file, each of which is a string, such as "Plain Text" or "HTML document". Thus a file may have several types.

The NTFS filesystem can also store OS/2 extended attributes, as one of its file forks, but this feature is present only to support the OS/2 subsystem (not present in XP), so the Win32 subsystem treats this information as an opaque block of data and does not use it. Instead, it relies on other file forks to store meta-information in Win32-specific formats. OS/2 extended attributes can still be read and written by Win32 programs, but the data must be parsed entirely by the applications.

POSIX extended attributes

On Unix and Unix-like systems, the ext2, ext3, ReiserFS version 3, XFS, JFS, FFS, and HFS+ filesystems allow the storage of extended attributes with files. These include an arbitrary list of "name=value" strings, where the names are unique and a value can be accessed through its related name.

PRONOM Unique Identifiers (PUIDs)

The PRONOM Persistent Unique Identifier (PUID) is an extensible scheme of persistent, unique and unambiguous identifiers for file formats, which has been developed by The National Archives of the UK as part of its PRONOM technical registry service. PUIDs can be expressed as Uniform Resource Identifiers using the info:pronom/ namespace. Although not yet widely used outside of UK government and some digital preservation programmes, the PUID scheme does provide greater granularity than most alternative schemes.

MIME types

MIME types are widely used in many Internet-related applications, and increasingly elsewhere, although their usage for on-disc type information is rare. They form a standardised system of identifiers (managed by IANA), each made up of a type and a sub-type separated by a slash, for instance text/html or image/gif. They were originally intended as a way of identifying what type of file was attached to an e-mail, independent of the source and target operating systems. MIME types identify files on BeOS, AmigaOS 4.0 and MorphOS, and also store unique application signatures for application launching. In AmigaOS and MorphOS the MIME type system works in parallel with the Amiga-specific Datatype system.
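For a concrete view of the type/sub-type structure, Python's standard mimetypes module maps file names to MIME types. The helper names below are hypothetical, and the lookup depends on the name alone, not on the file contents.

```python
import mimetypes


def guess_mime(filename):
    """Return the MIME type suggested by the file name, or None if unknown."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


def split_mime(mime):
    """Split 'type/subtype' into its two parts."""
    major, sep, minor = mime.partition("/")
    if not sep or not major or not minor:
        raise ValueError("not a type/subtype pair: %r" % (mime,))
    return major, minor
```

For example, guess_mime("index.html") yields text/html, which split_mime separates into the type text and the sub-type html.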

There are problems with the MIME types though; several organisations and people have created their own MIME types without registering them properly with IANA, which makes the use of this standard awkward in some cases.

File format identifiers (FFIDs)

File format identifiers (FFIDs) are another, not widely used, way to identify file formats according to their origin and their file category. The scheme was created for the Description Explorer suite of software. An FFID is a string of the form NNNNNNNNN-XX-YYYYYYY. The first part indicates the originating or maintaining organisation (the number represents a value in a company/standards organisation database), and the two following digits categorise the type of file in hexadecimal. The final part is composed of the usual file extension of the file or the international standard number of the file, padded left with zeros. For example, the PNG file specification has the FFID 000000001-31-0015948, where 31 indicates an image file, 0015948 is the standard number and 000000001 indicates the ISO organisation.
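The three-part layout can be checked and split mechanically. The following Python sketch assumes only what is stated above (nine decimal digits, two hexadecimal digits, seven zero-padded characters); the function name parse_ffid and the dictionary keys are hypothetical.

```python
import re

# NNNNNNNNN-XX-YYYYYYY: organisation, hexadecimal category, padded identifier.
FFID_PATTERN = re.compile(r"^(\d{9})-([0-9A-Fa-f]{2})-([0-9A-Za-z]{7})$")


def parse_ffid(ffid):
    """Split an FFID string into its three components."""
    match = FFID_PATTERN.match(ffid)
    if match is None:
        raise ValueError("not a valid FFID: %r" % (ffid,))
    organisation, category, identifier = match.groups()
    return {
        "organisation": int(organisation),
        "category": int(category, 16),         # category is written in hexadecimal
        "identifier": identifier.lstrip("0"),  # remove the left zero padding
    }
```

Applied to the PNG example, this gives organisation 1, category 0x31 and identifier 15948.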

File content based format identification

Another, though the least popular, way to identify a file format is to look at the file contents for patterns that distinguish file types. A file's contents are a sequence of bytes, and a byte can take 256 distinct values (0 to 255). Counting how often each byte value occurs, often referred to as the byte frequency distribution, therefore yields patterns that can distinguish file types. Many content-based file type identification schemes use the byte frequency distribution to build representative models of each file type, and then apply statistical and data mining techniques to identify file types.[3]
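A byte frequency distribution is straightforward to compute. The Python sketch below builds the 256-entry distribution and compares two distributions with a plain sum of absolute differences; that comparison is an assumption chosen for brevity, not the method of any particular published scheme, and both function names are hypothetical.

```python
from collections import Counter


def byte_frequency(data: bytes):
    """Return a 256-entry list of relative byte frequencies."""
    total = len(data)
    if total == 0:
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    return [counts.get(value, 0) / total for value in range(256)]


def distribution_distance(p, q):
    """Sum of absolute differences between two distributions (smaller = more alike)."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

A classifier built on this idea would store one reference distribution per file type and pick the type whose reference is closest to the file being examined.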

File structure

There are several ways to structure data in a file. The most usual ones are described below.

Unstructured formats (raw memory dumps)

Earlier file formats used raw data formats that consisted of directly dumping the memory images of one or more structures into the file.

This has several drawbacks. Unless the memory images also have reserved spaces for future extensions, extending and improving this type of structured file is very difficult. It also creates files that might be specific to one platform or programming language (for example a structure containing a Pascal string is not recognized as such in C). On the other hand, developing tools for reading and writing these types of files is very simple.

The limitations of the unstructured formats led to the development of other types of file formats that could be easily extended and be backward compatible at the same time.

Chunk-based formats

Electronic Arts and Commodore-Amiga pioneered this kind of format in 1985 with their IFF (Interchange File Format). In this kind of file structure, each piece of data is embedded in a container that contains a signature identifying the data, as well as the length of the data (for binary encoded files). This type of container is called a "chunk". The signature is usually called a chunk id, chunk identifier, or tag identifier.

With this type of file structure, tools that do not know certain chunk identifiers simply skip those that they do not understand.
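The skip-what-you-do-not-know behaviour can be sketched in Python. The layout assumed here follows the IFF convention of a 4-byte identifier, a 4-byte big-endian length and a pad byte after odd-length data; the outer wrapper chunk that real IFF files use is left out, and the chunk identifiers and function names are invented for the example.

```python
import struct


def make_chunk(chunk_id, payload):
    """Build one chunk: 4-byte id, big-endian length, data, pad to even size."""
    pad = b"\x00" if len(payload) % 2 else b""
    return struct.pack(">4sI", chunk_id.encode("ascii"), len(payload)) + payload + pad


def iter_chunks(data):
    """Yield (chunk_id, payload) for each chunk in data."""
    offset = 0
    while offset + 8 <= len(data):
        raw_id, length = struct.unpack_from(">4sI", data, offset)
        start = offset + 8
        end = start + length
        if end > len(data):
            raise ValueError("truncated chunk")
        yield raw_id.decode("ascii"), data[start:end]
        offset = end + (length & 1)  # step over the pad byte, if any


KNOWN_IDS = {"NAME", "BODY"}  # identifiers this hypothetical reader understands


def read_known(data):
    """Collect known chunks and silently skip all others."""
    return {cid: payload for cid, payload in iter_chunks(data) if cid in KNOWN_IDS}
```

Because the length is stored with every chunk, the reader can step over a chunk it does not recognise without understanding its contents.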

This concept has been reused again and again, by RIFF (the Microsoft-IBM equivalent of IFF), PNG, JPEG storage, DER (Distinguished Encoding Rules) encoded streams and files (which were originally described in CCITT X.409:1984 and therefore predate IFF), and the Structured Data Exchange Format (SDXF). Even XML can be considered a kind of chunk-based format, since each data element is surrounded by tags which are akin to chunk identifiers.

Directory-based formats

This is another extensible format, that closely resembles a file system (OLE Documents are actual filesystems), where the file is composed of 'directory entries' that contain the location of the data within the file itself as well as its signatures (and in certain cases its type). Good examples of these types of file structures are disk images, OLE documents and TIFF images.
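To make the idea concrete, here is a toy directory-based container in Python. The on-disk layout (an entry count followed by fixed-size entries of tag, offset and size) is invented for this sketch and does not correspond to TIFF, OLE or any real format.

```python
import struct

# Invented entry layout: 4-byte tag, 32-bit offset, 32-bit size (little-endian).
ENTRY = struct.Struct("<4sII")


def build_container(parts):
    """Pack {tag: payload} as: entry count, directory entries, then the data."""
    offset = 4 + ENTRY.size * len(parts)  # data begins right after the directory
    directory = b""
    blob = b""
    for tag, payload in parts.items():
        directory += ENTRY.pack(tag.encode("ascii"), offset, len(payload))
        blob += payload
        offset += len(payload)
    return struct.pack("<I", len(parts)) + directory + blob


def read_entry(data, wanted):
    """Find one tag in the directory and return only its payload."""
    (count,) = struct.unpack_from("<I", data, 0)
    for index in range(count):
        tag, offset, size = ENTRY.unpack_from(data, 4 + index * ENTRY.size)
        if tag.decode("ascii") == wanted:
            return data[offset:offset + size]
    return None
```

Unlike a chunk-based reader, which walks the data in order, this reader jumps straight to the wanted item using the offsets recorded in the directory.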

See also

Audio file format
Chemical file format
Container format (digital)
Document file format
DROID file format identification utility
File (command), a file type identification utility
File Formats, Transformation, and Migration (related wikiversity article)
FormatFactory, a free omni file format converter
Future proofing
Graphics file format summary
List of archive formats
Image file formats
List of file formats
List of free file formats
List of motion and gesture file formats
Magic number (programming)
List of file signatures, or "magic numbers"
Object file
Open format
TrID, a freeware file type identification utility
Windows file types

References

1. Foundation for a Free Information Infrastructure. "Europarl 2003-09-24: Amended Software Patent Directive". http://swpat.ffii.org/papers/europarl0309/index.en.html. Retrieved 2007-01-07.

2. PC World. "Windows Tips: For Security Reasons, It Pays To Know Your File Extensions". http://www.pcworld.com/article/id,113758-page,1/article.html. Retrieved 2008-06-20.

3. "File Format Identification". http://www.forensicswiki.org/wiki/File_Format_Identification.

"Extended Attribute Data Types". REXX Tips & Tricks, Version 2.80. http://markcrocker.com/rexxtipsntricks/rxtt28.2.0301.html. Retrieved February 9, 2005.

"Extended Attributes used by the WPS". REXX Tips & Tricks, Version 2.80. http://markcrocker.com/rexxtipsntricks/rxtt28.2.0300.html. Retrieved February 9, 2005.

"Extended Attributes - what are they and how can you use them?". Roger Orr. http://www.howzatt.demon.co.uk/articles/06may93.html. Retrieved February 9, 2005.

Dynamic-link library


Dynamic link library

Filename extension .dll

Internet media type application/x-msdownload

Uniform Type Identifier com.microsoft.windows-dynamic-link-library

Magic number MZ

Developed by Microsoft

Container for Shared library

Dynamic-link library (also written without the hyphen), or DLL, is Microsoft's implementation of the shared library concept in the Microsoft Windows and OS/2 operating systems. These libraries usually have the file extension DLL, OCX (for libraries containing ActiveX controls), or DRV (for legacy system drivers). The file formats for DLLs are the same as for Windows EXE files: Portable Executable (PE) for 32-bit and 64-bit Windows, and New Executable (NE) for 16-bit Windows. As with EXEs, DLLs can contain code, data, and resources, in any combination.

In the broader sense of the term, any data file with the same file format can be called a resource DLL. Examples of such DLLs include icon libraries, sometimes having the extension ICL, and font files, having the extensions FON and FOT.[citation needed]

Contents

1 Background for DLL
2 Features of DLL
    2.1 Memory management
    2.2 Import libraries
    2.3 Symbol resolution and binding
    2.4 Explicit run-time linking
    2.5 Delayed loading
3 Compiler and language considerations
    3.1 Delphi
    3.2 Microsoft Visual Basic
    3.3 C and C++
4 Programming examples
    4.1 Creating DLL exports
    4.2 Using DLL imports
    4.3 Using explicit run-time linking
        4.3.1 Microsoft Visual Basic
        4.3.2 Delphi
        4.3.3 C and C++
        4.3.4 Python
5 Component Object Model
6 DLL Hijacking
7 See also
8 External links
9 References

Background for DLL

The first versions of Microsoft Windows ran every program in a single address space. Every program was meant to co-operate by yielding the CPU to other programs so that the GUI was capable of multitasking and could be as responsive as possible. All operating-system-level operations were provided by the underlying operating system, MS-DOS. All higher-level services were provided by Windows libraries, the Dynamic Link Libraries. The drawing API, GDI, was implemented in a DLL called GDI.EXE, and the user interface in USER.EXE. These extra layers on top of DOS had to be shared across all running Windows programs, not just to enable Windows to work in a machine with less than a megabyte of RAM, but to enable the programs to co-operate with each other. The Graphics Device Interface code in GDI needed to translate drawing commands to operations on specific devices. On the display, it had to manipulate pixels in the frame buffer. When drawing to a printer, the API calls had to be transformed into requests to a printer. Although it would have been possible to provide hard-coded support for a limited set of devices (such as the Color Graphics Adapter display or the HP LaserJet Printer Command Language), Microsoft chose a different approach. GDI would work by loading different pieces of code to work with different output devices; these pieces of code were called 'Device Drivers'.

The same architectural concept that allowed GDI to load different device drivers is that which allowed the Windows shell to load different windows programs, and for these programs to invoke API calls from the shared USER and GDI libraries. That concept was Dynamic Linking.

In a conventional non-shared, static library, sections of code are simply added to the calling program when its executable is built at the linking phase; if two programs use the same routine, the code has to be included in both. With dynamic linking, shared code is placed into a single, separate file. The programs that call this file are connected to it at run time, with the operating system (or, in the case of early versions of Windows, the OS-extension), performing the binding.

For those early versions of Windows (1.0 to 3.11), the DLLs were the foundation for the entire GUI.

Display drivers were merely DLLs with a .DRV extension that provided custom implementations of the same drawing API through a unified Device Driver Interface (DDI).

The Drawing (GDI) and GUI (USER) APIs were merely the function calls exported by the GDI and USER, system DLLs with .EXE extension.


This notion of building up the operating system from a collection of dynamically loaded libraries is a core concept of Windows that persists even today. DLLs provide the standard benefits of shared libraries, such as modularity. Modularity allows changes to be made to code and data in a single self-contained DLL shared by several applications without any change to the applications themselves.

Another benefit of the modularity is the use of generic interfaces for plug-ins. A single interface may be developed which allows old as well as new modules to be integrated seamlessly at run-time into pre-existing applications, without any modification to the application itself. This concept of dynamic extensibility is taken to the extreme with the Component Object Model, the underpinnings of ActiveX.

In Windows 1.x, 2.x and 3.x, all windows applications shared the same address space, as well as the same memory. A DLL was only loaded once into this address space; from then on all programs using the library accessed it. The library's data was shared across all the programs. This could be used as an indirect form of Inter-process communication, or it could accidentally corrupt the different programs. With Windows 95 and successors every process runs in its own address space. While the DLL code may be shared, the data is private except where shared data is explicitly requested by the library. That said, large swathes of Windows 95, Windows 98 and Windows Me were built from 16-bit libraries, a feature which limited the performance of the Pentium Pro microprocessor when launched, and ultimately limited the stability and scalability of the DOS-based versions of Windows.

While DLLs are the core of the Windows architecture, they have a number of drawbacks, collectively called "DLL hell".[1] Currently, Microsoft promotes Microsoft .NET as one solution to the problems of DLL hell, although they now promote Virtualization based solutions such as Microsoft Virtual PC and Microsoft Application Virtualization, because they offer superior isolation between applications. An alternative mitigating solution to DLL hell has been the implementation of Side-by-Side Assembly.

Features of DLL

Memory management

In Win32, the DLL files are organized into sections. Each section has its own set of attributes, such as being writable or read-only, executable (for code) or non-executable (for data), and so on.

The code in a DLL is usually shared among all the processes that use the DLL; that is, they occupy a single place in physical memory, and do not take up space in the page file. If the physical memory occupied by a code section is to be reclaimed, its contents are discarded, and later reloaded directly from the DLL file as necessary.

In contrast to code sections, the data sections of a DLL are usually private; that is, each process using the DLL has its own copy of all the DLL's data. Optionally, data sections can be made shared, allowing inter-process communication via this shared memory area. However, because user restrictions do not apply to the use of shared DLL memory, this creates a security hole; namely, one process can corrupt the shared data, which will likely cause all other sharing processes to behave undesirably. For example, a process running under a guest account can in this way corrupt another process running under a privileged account. This is an important reason to avoid the use of shared sections in DLLs.

If a DLL is compressed by certain executable packers (e.g. UPX), all of its code sections are marked as read-and-write, and will be unshared. Read-and-write code sections, much like private data sections, are private to each process. Thus DLLs with shared data sections should not be compressed if they are intended to be used simultaneously by multiple programs, since each program instance would have to carry its own copy of the DLL, resulting in increased memory consumption.

Import libraries

Linking to dynamic libraries is usually handled by linking to an import library when building or linking to create an executable file. The created executable then contains an import address table (IAT) by which all DLL function calls are referenced (each referenced DLL function has its own entry in the IAT). At run-time, the IAT is filled with appropriate addresses that point directly to a function in the separately loaded DLL.

Like static libraries, import libraries for DLLs are noted by the .lib file extension. For example, kernel32.dll, the primary dynamic library for Windows' base functions such as file creation and memory management, is linked via kernel32.lib.

Symbol resolution and binding

Each function exported by a DLL is identified by a numeric ordinal and optionally a name. Likewise, functions can be imported from a DLL either by ordinal or by name. The ordinal represents the position of the function's address pointer in the DLL Export Address table. It is common for internal functions to be exported by ordinal only. For most Windows API functions only the names are preserved across different Windows releases; the ordinals are subject to change. Thus, one cannot reliably import Windows API functions by their ordinals.

Importing functions by ordinal provides only slightly better performance than importing them by name: export tables of DLLs are ordered by name, so a binary search can be used to find a function. The index of the found name is then used to look up the ordinal in the Export Ordinal table. In 16-bit Windows, the name table was not sorted, so the name lookup overhead was much more noticeable.
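The two lookup paths can be compared in a small Python model. The tables, names and addresses below are invented for illustration; the model only mirrors the structure described above (a name table sorted by name, a parallel ordinal table, and an address table indexed by ordinal relative to a base).

```python
import bisect

# Toy export tables for a hypothetical DLL.
EXPORT_NAMES = ["AddNumbers", "CloseThing", "OpenThing"]  # sorted by name
NAME_ORDINALS = [2, 0, 1]           # address-table index for each name above
EXPORT_ADDRESSES = [0x1000, 0x1040, 0x1080]
ORDINAL_BASE = 1                    # ordinal = address-table index + base


def lookup_by_name(name):
    """Binary-search the sorted name table, then map through the ordinal table."""
    index = bisect.bisect_left(EXPORT_NAMES, name)
    if index == len(EXPORT_NAMES) or EXPORT_NAMES[index] != name:
        return None
    return EXPORT_ADDRESSES[NAME_ORDINALS[index]]


def lookup_by_ordinal(ordinal):
    """Index the address table directly; no search is needed."""
    index = ordinal - ORDINAL_BASE
    if 0 <= index < len(EXPORT_ADDRESSES):
        return EXPORT_ADDRESSES[index]
    return None
```

The ordinal path skips the search entirely, but since the search is logarithmic in the number of names, the saving is small, which matches the remark above.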

It is also possible to bind an executable to a specific version of a DLL, that is, to resolve the addresses of imported functions at compile-time. For bound imports, the linker saves the timestamp and checksum of the DLL to which the import is bound. At run-time Windows checks to see if the same version of library is being used, and if so, Windows bypasses processing the imports. Otherwise, if the library is different from the one which was bound to, Windows processes the imports in a normal way.
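The run-time decision can be modelled as a simple comparison. In this Python sketch the dictionaries stand in for the bound-import record in the executable and for the DLL found at load time; the field names and the function name are hypothetical, and only the timestamp is compared, which simplifies the check described above.

```python
def import_addresses(bound, dll):
    """Return (addresses, how): reuse bound addresses only if the DLL matches.

    bound: {"timestamp": int, "addresses": {name: address}}
    dll:   {"timestamp": int, "exports":   {name: address}}
    """
    if bound["timestamp"] == dll["timestamp"]:
        # Same library version as at bind time: skip import processing.
        return dict(bound["addresses"]), "bound"
    # Different library: resolve every import in the normal way.
    resolved = {name: dll["exports"][name] for name in bound["addresses"]}
    return resolved, "resolved"
```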


Bound executables load somewhat faster if they are run in the same environment that they were compiled for, and exactly the same time if they are run in a different environment, so there's no drawback for binding the imports. For example, all the standard Windows applications are bound to the system DLLs of their respective Windows release. A good opportunity to bind an application's imports to its target environment is during the application's installation. This keeps the libraries 'bound' until the next OS update. It does, however, change the checksum of the executable, so it is not something that can be done with signed programs, or programs that are managed by a configuration management tool that uses checksums (such as MD5 checksums) to manage file versions. As more recent Windows versions have moved away from having fixed addresses for every loaded library (for security reasons), the opportunity and value of binding an executable is decreasing.

Explicit run-time linking

DLL files may be explicitly loaded at run-time, a process referred to simply as run-time dynamic linking by Microsoft, by using the LoadLibrary (or LoadLibraryEx) API function. The GetProcAddress API function is used to look up exported symbols by name, and FreeLibrary is used to unload the DLL. These functions are analogous to dlopen, dlsym, and dlclose in the POSIX standard API.

    // LSPaper draw using OLE2 function if available on client
    HINSTANCE hOle2Dll;

    hOle2Dll = LoadLibrary("OLE2.DLL");
    if (hOle2Dll != NULL)
    {
        FARPROC lpOleDraw;

        lpOleDraw = GetProcAddress(hOle2Dll, "OleDraw");
        if (lpOleDraw != (FARPROC)NULL)
        {
            (*lpOleDraw)(pUnknown, dwAspect, hdcDraw, lprcBounds);
        }
        FreeLibrary(hOle2Dll);
    }

The procedure for explicit run-time linking is the same in any language that supports pointers to functions, since it depends on the Windows API rather than language constructs.

Delayed loading

Normally, an application that was linked against a DLL's import library will fail to start if the DLL cannot be found, because Windows will not run the application unless it can find all of the DLLs that the application may require. However, an application may be linked against an import library to allow delayed loading of the dynamic library.[2] In this case the operating system will not try to find or load the DLL when the application starts; instead, it will only try to find and load the DLL when one of its functions is called. If the DLL cannot be found or loaded, or the called function does not exist, the operating system will generate an exception, which the application can catch and handle appropriately. If the application does not handle the exception, it will be caught by the operating system, which will terminate the program with an error message.

The delay-loading mechanism also provides notification hooks, allowing the application to perform additional processing or error handling when the DLL is loaded and/or any DLL function is called.
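The behaviour is loosely analogous to a lazy proxy. The Python sketch below is only an analogy built on Python modules, not the Windows delay-load mechanism itself; the class DelayLoaded and its on_load callback (standing in for a notification hook) are hypothetical.

```python
import importlib


class DelayLoaded:
    """Load a library only when one of its functions is first requested."""

    def __init__(self, name, loader=importlib.import_module, on_load=None):
        self._name = name
        self._loader = loader
        self._on_load = on_load
        self._library = None

    def __getattr__(self, attribute):
        # Only reached for names not found on the proxy itself,
        # that is, for the functions of the wrapped library.
        if self._library is None:
            self._library = self._loader(self._name)  # raises if not found
            if self._on_load is not None:
                self._on_load(self._name)              # notification hook
        return getattr(self._library, attribute)
```

Creating the proxy never fails, even for a missing library; the error surfaces as an exception at the first call, where the application can catch it, much as described above.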

Compiler and language considerations

Delphi

In the heading of a source file, the keyword library is used instead of program. At the end of the file, the functions to be exported are listed in the exports clause.

Delphi does not require LIB files to import functions from DLLs; to link to a DLL, the external keyword is used in the function declaration.

Microsoft Visual Basic

In Visual Basic (VB), only run-time linking is supported; but in addition to using LoadLibrary and GetProcAddress API functions, declarations of imported functions are allowed.

When importing DLL functions through declarations, VB will generate a run-time error if the DLL file cannot be found. The developer can catch the error and handle it appropriately.

When creating DLLs in VB, the IDE will only allow you to create ActiveX DLLs; however, methods have been created [3] to allow the user to explicitly tell the linker to include a .DEF file which defines the ordinal position and name of each exported function. This allows the user to create a standard Windows DLL using Visual Basic (Version 6 or lower) which can be referenced through a "Declare" statement.

C and C++

Microsoft Visual C++ (MSVC) provides a number of extensions to standard C++ which allow functions to be specified as imported or exported directly in the C++ code; these have been adopted by other Windows C and C++ compilers, including Windows versions of GCC. These extensions use the attribute __declspec before a function declaration. Note that when C functions are accessed from C++, they must also be declared as extern "C" in C++ code, to inform the compiler that the C linkage should be used.[4]


Besides specifying imported or exported functions using __declspec attributes, they may be listed in IMPORT or EXPORTS section of the DEF file used by the project. The DEF file is processed by the linker, rather than the compiler, and thus it is not specific to C++.

DLL compilation will produce both DLL and LIB files. The LIB file is used to link against a DLL at compile-time; it is not necessary for run-time linking. Unless your DLL is a COM server, the DLL file must be placed in one of the directories listed in the PATH environment variable, in the default system directory, or in the same directory as the program using it. COM server DLLs are registered using regsvr32.exe, which places the DLL's location and its globally unique ID (GUID) in the registry. Programs can then use the DLL by looking up its GUID in the registry to find its location.

Programming examples

Creating DLL exports

The following examples show language-specific bindings for exporting symbols from DLLs.

Delphi

    library Example;

    // function that adds two numbers
    function AddNumbers(a, b : Double): Double;
    begin
      Result := a + b;
    end;

    // export this function
    exports AddNumbers;

    // DLL initialization code: no special handling needed
    begin
    end.

C and C++

    #include <windows.h>

    // DLL entry function (called on load, unload, ...)
    BOOL APIENTRY DllMain(HANDLE hModule, DWORD dwReason, LPVOID lpReserved)
    {
        return TRUE;
    }

    // Exported function - adds two numbers
    extern "C" __declspec(dllexport) double AddNumbers(double a, double b)
    {
        return a + b;
    }

Using DLL imports

The following examples show how to use language-specific bindings to import symbols for linking against a DLL at compile-time.

Delphi

    {$APPTYPE CONSOLE}

    program Example;

    // import function that adds two numbers
    function AddNumbers(a, b : Double): Double; external 'Example.dll';

    // main program
    var
      R: Double;

    begin
      R := AddNumbers(1, 2);
      Writeln('The result was: ', R);
    end.

C and C++

Make sure the file Example.lib is included in the project (for example through the project's Add Existing Item option) before linking; this assumes that Example.dll has already been built. Example.lib is generated automatically when the DLL is compiled. If it is left out, the linker reports an error because it cannot find the definition of AddNumbers. The DLL Example.dll must also be copied to the location where the .exe file built from the following code will be placed.

    #include <windows.h>
    #include <stdio.h>

    // Import function that adds two numbers
    extern "C" __declspec(dllimport) double AddNumbers(double a, double b);

    int main(int argc, char *argv[])
    {
        double result = AddNumbers(1, 2);
        printf("The result was: %f\n", result);
        return 0;
    }

Using explicit run-time linking

The following examples show how to use the run-time loading and linking facilities using language-specific WIN32 API bindings.

Microsoft Visual Basic

    Option Explicit
    Declare Function AddNumbers Lib "Example.dll" _
        (ByVal a As Double, ByVal b As Double) As Double

    Sub Main()
        Dim Result As Double
        Result = AddNumbers(1, 2)
        Debug.Print "The result was: " & Result
    End Sub

Delphi

    program Example;
    {$APPTYPE CONSOLE}
    uses Windows;

    var
      AddNumbers: function (a, b: Double): Double;
      LibHandle: HMODULE;

    begin
      // Load the DLL; give up if it cannot be found
      LibHandle := LoadLibrary('example.dll');
      if LibHandle = 0 then
        Exit;

      // Look up the exported function by name
      AddNumbers := GetProcAddress(LibHandle, 'AddNumbers');
      if Assigned(AddNumbers) then
        Writeln('1 + 2 = ', AddNumbers(1, 2))  // no semicolon before "else"
      else
        Writeln('Error: unable to find DLL function');

      FreeLibrary(LibHandle);
    end.

C and C++

    #include <windows.h>
    #include <stdio.h>

    // DLL function signature
    typedef double (*importFunction)(double, double);

    int main(int argc, char **argv)
    {
        importFunction addNumbers;
        double result;

        // Load DLL file
        HINSTANCE hinstLib = LoadLibrary(TEXT("Example.dll"));
        if (hinstLib == NULL) {
            printf("ERROR: unable to load DLL\n");
            return 1;
        }

        // Get function pointer
        addNumbers = (importFunction)GetProcAddress(hinstLib, "AddNumbers");
        if (addNumbers == NULL) {
            printf("ERROR: unable to find DLL function\n");
            FreeLibrary(hinstLib);
            return 1;
        }

        // Call function.
        result = addNumbers(1, 2);

        // Unload DLL file
        FreeLibrary(hinstLib);

        // Display result
        printf("The result was: %f\n", result);

        return 0;
    }

Python

import ctypes

my_dll = ctypes.cdll.LoadLibrary("Example.dll")

# The following "restype" attribute is needed to make
# Python understand what type is returned by the function.
my_dll.AddNumbers.restype = ctypes.c_double

p = my_dll.AddNumbers(ctypes.c_double(1.0), ctypes.c_double(2.0))

print("The result was:", p)

Component Object Model

The Component Object Model (COM) extends the DLL concept to object-oriented programming. Objects can be called from another process or hosted on another machine. COM objects have unique GUIDs and can be used to implement powerful back ends to simple GUI front ends such as Visual Basic and ASP. They can also be programmed from scripting languages. COM objects are more complex to create and use than DLLs.

DLL Hijacking

Because of a vulnerability commonly known as DLL hijacking, many programs will load and execute a malicious DLL that sits in the same folder as a data file they open, for example a file on a remote system. The vulnerability was discovered by ethical hacker HD Moore, who has published an exploit for the open-source penetration testing software Metasploit.[5]

See also

Dependency Walker, a utility which displays exported and imported functions of DLL and EXE files.

DLL Hijacking

Dynamic library

Library (computing)

Linker (computing)

Loader (computing)

Object file

Shared library

Static library

External links

dllexport, dllimport on MSDN

Dynamic-Link Libraries on MSDN

What is a DLL? on Microsoft support site

Dynamic-Link Library Functions on MSDN

Microsoft Portable Executable and Common Object File Format Specification

Microsoft specification for dll files

References

1. "The End of DLL Hell". Microsoft Corporation. http://msdn.microsoft.com/en-us/library/ms811694.aspx. Retrieved 2009-07-11.

2. "Linker Support for Delay-Loaded DLLs". Microsoft Corporation. http://msdn.microsoft.com/en-us/library/151kt790.aspx. Retrieved 2009-07-11.

3. Petrusha, Ron (2005-04-26). "Creating a Windows DLL with Visual Basic". O'Reilly Media. http://www.windowsdevcenter.com/pub/a/windows/2005/04/26/create_dll.html?page=1. Retrieved 2009-07-11.

4. MSDN, Using extern to Specify Linkage

5. TechWorld: Hacking toolkit publishes DLL hijacking exploit

Hart, Johnson. Windows System Programming Third Edition. Addison-Wesley, 2005. ISBN 0-321-25619-0

Rector, Brent et al. Win32 Programming. Addison-Wesley Developers Press, 1997. ISBN 0-201-63492-9.


DLL hijacking

DLL hijacking is a computing term for a vulnerability that is triggered when a vulnerable file type is opened from within a directory controlled by the attacker. The directory can be a USB drive, an extracted archive, or a remote network share. In most cases, the user has to browse to the directory and then open the target file type for the exploit to work. The file opened by the user can be completely harmless; the flaw is that the application launched to handle the file type will inadvertently load a DLL from the working directory.[1]

In practice, this flaw can be exploited by sending the target user a link to a network share containing a file they perceive as safe. For example, iTunes, which was affected by this flaw until patched, is associated with a number of media file types, and each of these would result in a specific DLL being loaded from the same directory as the opened file.[2] The user would be presented with a link in the form of \\server\movies\ and a number of media files would be present in this directory. If the user tried to open any of these files, iTunes would search the remote directory for one or more DLLs and then load these DLLs into the process. If the attacker supplied a malicious DLL containing malware or shellcode, the user would be open to further exploits.[3]
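The lookup behaviour described above can be pictured with a small simulation. The following Python sketch is illustrative only: the function names resolve_dll and demo, the file name codec.dll, the directory names, and the simplified search order (application directory, then system directory, then working directory) are all assumptions made for this example and do not reproduce the exact order used by any particular Windows version.

```python
import os
import tempfile


def resolve_dll(name, search_dirs):
    """Return the first path in search_dirs that contains name, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    """Build three throwaway directories and report which one supplies the DLL."""
    with tempfile.TemporaryDirectory() as root:
        app_dir = os.path.join(root, "app")        # hypothetical application directory
        system_dir = os.path.join(root, "system")  # hypothetical system directory
        working_dir = os.path.join(root, "share")  # attacker-controlled working directory
        for directory in (app_dir, system_dir, working_dir):
            os.mkdir(directory)

        # The attacker plants a DLL next to the harmless media file.
        planted = os.path.join(working_dir, "codec.dll")
        open(planted, "wb").close()

        # The DLL is absent from the trusted directories, so the
        # search falls through to the working directory.
        found = resolve_dll("codec.dll", [app_dir, system_dir, working_dir])
        return os.path.basename(os.path.dirname(found))


if __name__ == "__main__":
    print("DLL would be loaded from:", demo())
```

The sketch only shows why a DLL that is missing from the trusted locations ends up being taken from the directory of the opened file; it does not load or execute anything.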


The vulnerability was discovered by HD Moore, who has published an exploit for the open-source based penetration testing software Metasploit.[4]

Making DLLs easy to build and use

By bnn3nasdfasdfa | 8 Mar 2004

How to quickly build a DLL file from an existing class and how to easily use it.

Introduction

Working out how to build a DLL can be complicated, for beginners and experienced programmers alike. This article presents a simple method of building a DLL and then using that DLL in a project with no effort beyond including the header file. All you need is a class that is already designed and ready to become a DLL file. Although this is fairly basic material, some experience of working with MSVS projects is assumed. More advanced readers may also find some of the finer points useful.

Background

It can be hard to find, on the Internet, a very simple way of building DLLs. The information in this article is a collection of material found on this site and other sites about building DLLs, so it is nothing new; it only pulls those efforts together. A few articles explain in detail how to build DLLs and highlight the finer points of what a DLL actually is and how to use it. I suggest reading them first for broader definitions.

However, much of that material leaves gaps for the novice "DLL Builder" or is too complicated to follow. You may have heard that it is easy to build DLLs, but figuring out how it all fits together can be an effort of its own. Some articles also fail to highlight the importance of directory locations. A DLL, its library and its header file(s) must all be in a findable directory, either a Windows directory or somewhere the compiler can find them. Otherwise you may run into frustrating build problems when one or more of the DLL's files cannot be found, which usually leaves the DLL unusable.

Using the code

There are no code attachments other than comment areas. This should be simple enough to figure out by working alongside in your MSVS environment. What you will need to do is:

1. Have a class already prepared that needs to be a DLL file.

2. Start a new DLL project.

3. Insert MACRO definitions in your "StdAfx.h" file.

4. Add easy including of the DLL libraries in the main "foo.h" file.

5. Add some project preprocessor MACROs.

6. Make batch files for the "Post Builds", to copy the library, header and DLL files to findable directory(s).

Details

1. It is assumed that you have already created your class, so this step is skipped.

2. First is the easy part. Start up "Microsoft Visual Studio Visual C++". Select "File->New" to pull up the creation dialog. Under the "Projects" tab, select "MFC AppWizard (DLL)". Enter the project name and the directory to be used, then click "OK". The remaining options are not essential, so select whatever you want in the rest of the Wizard and click "Finish".

After you have started the new project, a source file and a header file will typically have been created for you. These are not really needed if you already have the code you are going to build; only the preset project definitions are needed from this step. It is suggested that you empty the source and header files created by the Wizard by highlighting and deleting their contents. Do not edit the "StdAfx" files yet, as these are needed.

The point of this step is simply to copy and paste your class ("foo") into the header and source files that were created. If you are skeptical, you can instead just add your "foo" source file and header file to the project so that they are built.

Next comes the editing. The build must compile correctly as a DLL file before it can be used. DLL files use a combination of exporting and importing. When the header is compiled in the DLL's own build project, you need to "Export". When it is included in another project that does not use the source code, you need to "Import". This is probably the harder part of understanding DLLs, as it is not set up for you automatically and takes a little effort to understand. As a side note, "resource.h" is only required by DLLs that have dialogs in them. In that case, remember that you may end up with several "resource.h" files and will have to include each one with a full path. The following example shows what is probably the best method (keeping in mind that if a project needs this resource file, it must be added to the project manually, including the full path to it).

3. Edit the "StdAfx.h" file and include the following MACROS and the header file "resource.h".

StdAfx.h

//
// MSVS included headers, definitions, etc.
//

// Somewhere near the bottom

// This is the project macro preprocessor definition
// you will be adding shortly.
#ifdef DLL_EXPORTS
// This is to be used for the class header file.
#define DLL_BUILD_MACRO __declspec(dllexport)
#else
#define DLL_BUILD_MACRO __declspec(dllimport)
#endif

#ifndef _DLL_BUILD_ // Why do this? Is it necessary? Yes.
#define _DLL_BUILD_ DLL_BUILD_MACRO
#endif

// Make sure resources are included here, if desired,
// to prevent ambiguous
// callings to different resource.h files.
#include "resource.h"

//
// Rest of file
//

4. Next, edit the main header file of your DLL code. Only a few simple lines need to be added to support correct DLL building and usage:

foo.h

#ifndef _FOO_H_
#define _FOO_H_

//
// Miscellaneous here
//

/*
This part automatically includes any libraries when called.
When this file is built within the DLL project, it will not be called because
of our preprocessor macro definition "_FOO_DLL_".
However, when this file is called from another project, not part of this build,
it appropriately chooses the correct library and includes it for you.
WHAT this means is that you will not have to add the library to the project link settings
for a project that requires this DLL. This helps to avoid the tedious task of
linking to several custom DLLs.

Note there are two different libraries here and probably not necessary but give
you an idea of how to separate debug versions from release versions.
*/
#ifndef _FOO_DLL_
#ifdef _DEBUG
// You will be building a debug program that
// uses this file, so in this case we
// want the debug library.
#pragma comment( lib, "food.lib" )
#else
// You will be building a release program that
// uses this file, so in this case we
// want the release library.
#pragma comment( lib, "foo.lib" )
#endif
#endif

#ifndef _DLL_BUILD_
#define _DLL_BUILD_ // Makes sure there are no compiler errors.
#endif

class _DLL_BUILD_ foo
{
private:
    //
    // Your members
    //

public:
    //
    // Your functions
    //
};

#endif

To continue, edit the "foo.cpp" file and make sure you include the appropriate headers.

foo.cpp

#include "stdafx.h" // place first
#include "foo.h"

//
// Your code
//

5. This next step requires that you actually add the MACROS to the project settings. In the menu bar, go to "Project->Settings..." or press Alt+F7. Click the "C/C++" tab. Under "Preprocessor definitions:" add the macro definitions "_FOO_DLL_" and "DLL_EXPORTS" at the end of the list of existing macro definitions. Be sure to separate each new MACRO with a comma (","). Make sure to do this for the release version too, as these steps have to be repeated for each type of build. If this is a debug version and you want to debug the DLL later using the MSVS debugger, make sure to build with "debug info" and/or "browse info".

Next, prepare for the last step. Go to the "Post-build step" tab in the same dialog. Under "Post-build command(s):" click an empty line and enter "Debug.bat". For the release version, go to the left pane, select "Win32 Release" in the "Settings For:" combo box, and under "Post-build command(s):" enter "Release.bat" instead of "Debug.bat". That is all for the project settings.

6. Now build the batch files. Create a blank file and add the command lines that copy your files to a findable directory. The point is that you may have a ton of DLLs, and it is easier to keep all the needed components in one directory than to link to several directories. Below are the commands used in this article, which you may want to adapt by adding other options:

Debug.bat

Copy "Debug\foo.lib" "c:\<libraries dir>\food.lib"
REM copy the dll file to the windows system32
REM directory to make the DLL easily
REM accessible. Note that you will have to install or include the
REM dll file in your distribution package for the program that uses it.
Copy "Debug\foo.dll" "c:\%system dir%\system32"
Copy "foo.h" "c:\<headers dir>"

Release.bat

Copy "Release\foo.lib" "c:\<libraries dir>"
REM copy the dll file to the windows system32 directory
REM to make the DLL easily accessible.
REM Note that you will have to install or include the dll
REM file in your distribution package.
REM Unfortunately this will overwrite any other DLL files
REM such as the Release/Debug version,
REM so accurate update compilations are needed.
REM You will have to note this yourself
REM before distributing to the public.
Copy "Release\foo.dll" "c:\%system dir%\system32"
Copy "foo.h" "c:\<headers dir>"
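For readers who prefer a script to batch files, the same copy steps could also be sketched in Python. This is only an illustration and not part of the original method: the function name post_build_copy and the destination directories passed to it are hypothetical, and the sketch assumes the file names used in this article (foo.lib, foo.dll, foo.h) with the debug library renamed to food.lib as in Debug.bat.

```python
import os
import shutil


def post_build_copy(config, project_dir, lib_dir, dll_dir, header_dir):
    """Copy foo.lib, foo.dll and foo.h to shared, findable directories.

    config is "Debug" or "Release". The debug import library is stored
    as food.lib so that it does not overwrite the release one.
    Returns the list of files that were written.
    """
    build_dir = os.path.join(project_dir, config)
    lib_name = "food.lib" if config == "Debug" else "foo.lib"
    copies = [
        (os.path.join(build_dir, "foo.lib"), os.path.join(lib_dir, lib_name)),
        (os.path.join(build_dir, "foo.dll"), os.path.join(dll_dir, "foo.dll")),
        (os.path.join(project_dir, "foo.h"), os.path.join(header_dir, "foo.h")),
    ]
    for source, target in copies:
        # Create the destination directory on first use.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(source, target)
    return [target for _, target in copies]
```

Passing the destination directories as arguments keeps the choice of location out of the script, so the same function serves both the debug and the release build.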

You are now finished. Provided you have already tested your class and considered where your library, header and DLL files are being copied, you should have no problems at all. Your class is now a DLL file that can easily be used in future projects without much work. All you need to remember is that including the header file is enough; everything else is done for you.

Notes

Remember that the DLL file must be accessible. This article uses the "c:\windows\system32" directory. That may be a bad idea if you later want to retrieve the DLL file and have to find it among the large collection of DLLs that probably already exists in that directory. It can also be annoying if you later decide to rename the project and build under the new name; in that case you will have to find the old DLL file and remove it manually, or just leave it there.

Alternatively, you can copy the DLL file(s), library file(s) and header file(s) to the directory of the project that will be using them. However, if you decide to use the same DLL in another project, you will have to go back and add copy command line(s) to both the "Debug.bat" and "Release.bat" files and then rebuild the project to have the files copied.

Also note that in MSVS it is easy to add custom directories to search for headers and libraries, but unfortunately not for DLLs. Go to "Tools->Options" in the menu bar and then to the "Directories" tab. The "<libraries dir>" used in the batch files above should be added under "Library files" in the "Show directories for" combo box, and "<headers dir>" under "Include files". Some MACROS may seem repetitive or unused in this article. However, testing shows that both the DLL build and the project using that DLL require this kind of MACRO setup. If you find that some are not needed, you can remove them yourself. The code is designed so that they are present and readable. Other possibilities and locations exist.

Points of Interest

Figuring out Firewalls, UNIX and why mail programs show the contents automatically in Windows.

History

No reformatting necessary.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


DLL Files in Windows - What Are They?

Dynamic Link Library (DLL) files are an essential part of the Windows operating system. Although they are ubiquitous, most PC users neither know nor care what these files do. Nonetheless, a little understanding of the role that DLL files play can make the computer a little less of a mystery box. Only programmers and computer technicians need to know any of the gory details of the structure and function of a DLL, but these files are so important that all of us should know a few simple facts about them. Here is some information for the non-technical PC user.

What Do DLL Files Do?

A DLL file is indicated by the extension DLL in its name. Microsoft gives several definitions of a DLL but the one that I think has the least jargon is this:

"A Dynamic Link Library (DLL) is a file of code containing functions that can be called from other executable code (either an application or another DLL). Programmers use DLLs to provide code that they can reuse and to parcel out distinct jobs. Unlike an executable (EXE) file, a DLL cannot be directly run. DLLs must be called from other code that is already executing."

Another way of putting it is that DLL files are like modules that can be plugged into different programs whenever a program needs the particular features that come with the DLL. The original concept behind DLL files was to simplify things. It was recognized that there were many functions common to a lot of software. For example, most programs need to create the graphical interface that appears on the screen. Instead of having to contain the code to create the interface themselves, programs call on a DLL for that function. The idea is to have a central library where everyone can obtain the commonly used functions, as they are needed. This cuts down on code, speeds things up, is more efficient, etc. They are called dynamic links because they are put to use only when a program calls on them and they are run in the program’s own memory space. More than one program can use the functions of a particular DLL at the same time.
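For readers who are curious what "calling on a DLL" looks like in practice, here is a minimal sketch in Python. It assumes a Windows machine and uses the standard ctypes module to ask KERNEL32.DLL for the current process ID; the helper name pid_from_kernel32 is made up for this example, and on other operating systems the function simply returns None.

```python
import ctypes
import os
import sys


def pid_from_kernel32():
    """Ask KERNEL32.DLL for the current process ID (Windows only).

    Returns None on other platforms, where KERNEL32.DLL does not exist.
    """
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32")      # load the DLL into this process
    get_pid = kernel32.GetCurrentProcessId    # look up the exported function
    get_pid.restype = ctypes.c_uint32         # the function returns a DWORD
    get_pid.argtypes = []                     # and takes no arguments
    return get_pid()


if __name__ == "__main__":
    print("kernel32 says:", pid_from_kernel32(), "Python says:", os.getpid())
```

The DLL is never run on its own here. The Python program is the code that is already executing, and it borrows one function from the library.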

Parenthetically, I have to say that the software developers (not least of all, Microsoft) have strayed from the path of keeping things simple. A computer today may contain a thousand or more different DLL files. Also, Microsoft seems to tinker endlessly with DLL files, giving rise to many different versions of a file with the same name, not all compatible. Microsoft maintains a database with information about various DLLs to help with version conflicts.

There are several very important DLLs that contain a large number of the basic Windows functions. Since they figure so importantly in the workings of Windows, it is worth noting their names.

Examples of Important DLL files

COMDLG32.DLL: Controls the dialog boxes

GDI32.DLL: Contains numerous functions for drawing graphics, displaying text, and managing fonts

KERNEL32.DLL: Contains hundreds of functions for the management of memory and various processes

USER32.DLL: Contains numerous user interface functions. Involved in the creation of program windows and their interactions with each other

It is the common use of these types of DLLs by most programs that ensures that all applications written for Windows will have a standard appearance and behavior. This standardization was a big factor in the rise of Windows to dominance of the desktop computer. Anyone who was working with computers in the days of DOS will remember that every program had its own interface and menus.

Error Messages involving DLLs

PC users often see DLLs (especially the ones mentioned above) mentioned in error messages. One might conclude, therefore, that something is always going wrong with DLLs. Very often, however, it is not the DLL itself that is at fault. DLL files figure prominently in the error messages when something in the system goes awry because they are involved in the most basic processes of Windows. They are in effect the messenger of trouble, not the actual trouble. It is beyond our scope to discuss any details of error messages but there are substantial references on interpreting them. One is at James Eshelman's site.

Using Regsvr32.exe to Register DLLs

First, let it be clear that the important system file regsvr32.exe should not be confused with the file regsrv.exe that is used by certain worms and Trojans.

In order for certain DLLs to be used (typically those that provide COM or ActiveX components), they have to be registered by having appropriate references entered in the Registry. It sometimes happens that a Registry reference gets corrupted and the functions of the DLL cannot be used anymore. The DLL can be re-registered by opening Start-Run and entering the command

regsvr32 somefile.dll

This command assumes that somefile.dll is in a directory or folder that is in the path. Otherwise, the full path for the DLL must be used. A DLL file can also be unregistered by using the switch "/u" as shown below.

regsvr32 /u somefile.dll

This can be used to toggle a service on and off.
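If the same registration commands have to be issued repeatedly, they could be wrapped in a small script. The Python sketch below is an assumption-laden illustration, not an official tool: the helper names regsvr32_command and run_regsvr32 are invented here, and besides the "/u" switch mentioned above it uses the "/s" switch of regsvr32, which suppresses the confirmation dialog.

```python
import subprocess
import sys


def regsvr32_command(dll_path, unregister=False, silent=False):
    """Build the argument list for regsvr32 without running it."""
    args = ["regsvr32"]
    if unregister:
        args.append("/u")   # unregister instead of register
    if silent:
        args.append("/s")   # suppress the confirmation dialog
    args.append(dll_path)
    return args


def run_regsvr32(dll_path, unregister=False):
    """Run regsvr32 on Windows and return its exit code."""
    if sys.platform != "win32":
        raise OSError("regsvr32 is only available on Windows")
    return subprocess.call(regsvr32_command(dll_path, unregister, silent=True))
```

For example, regsvr32_command("somefile.dll", unregister=True) yields the list ["regsvr32", "/u", "somefile.dll"], which corresponds to the second command shown above. Registering usually requires administrator rights, so run_regsvr32 may fail from an ordinary prompt.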