DotNetPELib is an unmanaged library written in C++ that allows
generation of .NET assemblies from C++ programs. It
has full support for creating namespaces, classes, fields,
methods, method bodies, and some advanced features such as
support for explicit classes and properties. Full support
for PInvoke is available for calling unmanaged DLL entry
points. This library uses the ECMA-335 standard as a
reference for the implementation, but also supports later
versions of .NET assemblies.
DotNetPELib also natively supports the 'argument array' and
'enumeration' features of C#.
Code generated by DotNetPELib may be accessed from other .NET
assemblies, and there is also full support for importing other
assemblies in order to access their fields and
methods. In the simplest case, one can import an
assembly and then search for declarations of interest using
various search functions. The library later uses those
declarations when generating the references needed by your
program. In a more advanced case, one can iterate
through all the definitions in an assembly and put them in a
separate symbol table. For example, the occil
compiler copies all the static functions in referenced
assemblies into its own symbol table, then allows one to access
them in C code using standard C++ semantics.
DotNetPELib 3.0 also allows signing an assembly, if one has a
strong name key file describing the signing keys.
As an output format, DotNetPELib currently supports .NET
assemblies in both EXE and DLL format. It will both
read and write them. It will also generate a .IL
formatted file which can be further compiled with the standard
.NET ILASM program to generate an assembly.
There is also support for a simple object file format, and the
library comes with a linker called 'netlink' which will read object
files and create an output file, similar to how module-based languages
such as C link object files into an executable.
A reference implementation of a C compiler called 'occil' makes
use of this library to generate managed assemblies.
This documentation will consider the available APIs in
DotNetPELib 3.0.
Overview
The DotNetPELib API is wrapped in the C++ namespace
DotNetPELib
for isolation from other libraries.
The main header file for DotNetPELib is "DotNetPELib.h".
DotNetPELib will manage memory if that is desirable; most of the
classes described in this documentation have their constructors
wrapped by an object of class
Allocator.
When this object is destroyed, it will call the destructor for
each allocated item then release its memory. In this way
the user is freed from keeping track of every created object.
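
The ownership scheme described above can be illustrated with a small, self-contained sketch. It is not the DotNetPELib Allocator class: the names `ArenaSketch`, `Make`, `Tracked`, and `LiveAfterArenaScope` are hypothetical and exist only to show an owner object that constructs items on request and destroys all of them when it is itself destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an allocator that owns everything it creates.
// This is NOT the DotNetPELib Allocator, only the general idea.
class ArenaSketch {
public:
    // Construct a T, keep ownership inside the arena, hand back a raw pointer.
    template <typename T, typename... Args>
    T* Make(Args&&... args) {
        auto holder = std::make_unique<Holder<T>>(std::forward<Args>(args)...);
        T* raw = &holder->value;
        items_.push_back(std::move(holder));
        return raw;
    }
    std::size_t Count() const { return items_.size(); }

private:
    // Type-erased base so objects of different types share one list.
    struct HolderBase { virtual ~HolderBase() = default; };
    template <typename T>
    struct Holder : HolderBase {
        template <typename... Args>
        explicit Holder(Args&&... args) : value(std::forward<Args>(args)...) {}
        T value;
    };
    std::vector<std::unique_ptr<HolderBase>> items_;
};

// Test helper that counts how many instances are currently alive.
struct Tracked {
    explicit Tracked(int& live) : live_(live) { ++live_; }
    ~Tracked() { --live_; }
    int& live_;
};

// Returns the number of live Tracked objects after the arena is destroyed.
int LiveAfterArenaScope() {
    int live = 0;
    {
        ArenaSketch arena;
        arena.Make<Tracked>(live);
        arena.Make<Tracked>(live);
        assert(live == 2 && arena.Count() == 2);
    }  // arena destructor runs here and destroys both objects
    return live;
}
```

The caller never deletes the returned pointers; they stay valid only as long as the owning arena exists.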
An object of type
PELibError may
be thrown during validation of MSIL code.
The main API is an object derived from the
PELib
class. PELib inherits from Allocator, which exposes all
the constructors for the other elements of the API.
Between that and the various utility functions it exposes, PELib
is usually the main entry point for creating things with the API
or probing existing values.
AssemblyDef objects are the
high-level objects that hold the data for each
assembly. An AssemblyDef can be either internally
generated, or loaded from an external source. With
DotNetPELib there will be one 'public' AssemblyDef object that
describes the assembly being generated, and one or more external
AssemblyDef objects which describe other assemblies.
It is possible to load an external assembly into an AssemblyDef
object, or one can describe an external assembly explicitly
in code. For example, 'mscorlib' could be loaded
and would then be considered an external assembly.
To be compatible with C#, an AssemblyDef object would usually
hold one or more
Namespace
objects. However, applications that don't need to be
compatible can put fields and methods directly into the
main AssemblyDef. One cannot put
Properties in an AssemblyDef, though.
A Namespace object will normally hold one or more
Class or
Enum
objects. A Class object can hold other Class and
Enum objects, and it can also hold various other types of
endpoint objects such as a
Method, a
Field, or a
Property.
An Enum object just holds Field objects that describe the
enumerated values.
The AssemblyDef, Class, Enum, and Namespace classes all inherit
from a base class
DataContainer, which
holds the functionality that is common to all of those
classes. A related class,
Qualifier,
holds qualifier flags for various containers and other objects,
such as whether the container defines an object or a value type,
whether an object is static, etc.
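
The nesting of containers and the use of qualifier flags can be pictured with the following sketch. Everything in it is an assumption made for illustration: `ContainerSketch`, `QualifierBits`, `Add`, `Path`, and `DemoPath` are hypothetical names, the flag values are invented, and the dotted path is only a display aid, not how the library or the .NET metadata names nested types.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flag bits; the real Qualifier class defines its own set.
enum QualifierBits : unsigned {
    kPublic = 1u << 0,
    kStatic = 1u << 1,
    kValueType = 1u << 2,
};

// Hypothetical stand-in for a shared container base class.
struct ContainerSketch {
    enum class Kind { Assembly, Namespace, Class, Enum };

    Kind kind;
    std::string name;
    unsigned flags = 0;
    ContainerSketch* parent = nullptr;
    std::vector<std::unique_ptr<ContainerSketch>> children;
    std::vector<std::string> fields;  // endpoint objects, reduced to names

    ContainerSketch(Kind k, std::string n, unsigned f = 0)
        : kind(k), name(std::move(n)), flags(f) {}

    // Create a child container, link it to this one, and return it.
    ContainerSketch* Add(Kind k, const std::string& n, unsigned f = 0) {
        children.push_back(std::make_unique<ContainerSketch>(k, n, f));
        children.back()->parent = this;
        return children.back().get();
    }

    // Dotted path built by walking up to, but not including, the assembly.
    std::string Path() const {
        if (!parent || parent->kind == Kind::Assembly) return name;
        return parent->Path() + "." + name;
    }
};

// Builds assembly -> namespace -> class -> enum and returns the enum's path.
std::string DemoPath() {
    ContainerSketch assembly(ContainerSketch::Kind::Assembly, "demo");
    auto* ns = assembly.Add(ContainerSketch::Kind::Namespace, "Shapes");
    auto* cls = ns->Add(ContainerSketch::Kind::Class, "Circle", kPublic);
    auto* colors =
        cls->Add(ContainerSketch::Kind::Enum, "Color", kPublic | kValueType);
    colors->fields = {"Red", "Green"};  // an enum only holds fields
    assert((colors->flags & kValueType) != 0);
    return colors->Path();
}
```

Keeping the flags in a separate bit set, as the text describes for Qualifier, lets the same flag type describe containers and endpoint objects alike.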
The Field object holds a
Type object
describing the field type, and possibly initialization
data. Fields are also used when describing
enumerated values.
The Method object holds a
MethodSignature
which describes the way the method looks to other code, and it
also holds a list of
Instruction
objects which describe the runtime behavior of the
method.
A base class
CodeContainer
actually holds most of the functionality related to MSIL
instructions. The MSIL instruction support is fairly
advanced. It chooses shorter instruction encodings in the
cases where they are available, and it checks stack balancing
as a sanity check on the generated code.
It also minimizes the size of the locals
area. Live variable analysis is also performed as
an aid to the stack checking (dead regions might be unbalanced).
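
To make the two ideas above concrete, here is a minimal sketch of short-form selection and a stack-balance check. It is not the library's algorithm: `ShortestLdcI4`, `InstrSketch`, and `StackBalanced` are hypothetical names, the check handles only straight-line code, and it ignores branches and the live variable analysis mentioned above. The opcode mnemonics themselves (`ldc.i4.m1`, `ldc.i4.0` through `ldc.i4.8`, `ldc.i4.s`, `ldc.i4`) are the ones defined by ECMA-335.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Pick the shortest ECMA-335 form for loading a 32-bit integer constant.
std::string ShortestLdcI4(std::int32_t v) {
    if (v == -1) return "ldc.i4.m1";                        // one-byte form
    if (v >= 0 && v <= 8) return "ldc.i4." + std::to_string(v);
    if (v >= -128 && v <= 127) return "ldc.i4.s " + std::to_string(v);
    return "ldc.i4 " + std::to_string(v);                   // full 32-bit form
}

// Hypothetical, simplified instruction: a mnemonic plus its stack effect.
struct InstrSketch {
    std::string mnemonic;
    int pops;
    int pushes;
};

// Returns true when a straight-line sequence never underflows the
// evaluation stack and leaves it empty at the end (as for a void method).
bool StackBalanced(const std::vector<InstrSketch>& code) {
    int depth = 0;
    for (const auto& ins : code) {
        depth -= ins.pops;
        if (depth < 0) return false;  // instruction needs more than is there
        depth += ins.pushes;
    }
    return depth == 0;
}
```

For example, the sequence `ldc.i4.1`, `ldc.i4.2`, `add`, `pop`, `ret` passes the check, while a lone `add` fails because it pops two values from an empty stack.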
The Instruction objects uniquely define MSIL instructions.
An Instruction object can hold an
Operand
object, which can hold a native object such as a number, string
or label, or a reference to a variable, type, or method
signature.
A special type of Instruction object is used to mark the boundaries
of regions used for structured exception handling (SEH). A try block
is defined by a beginning and an ending marker, and is immediately
followed by a catch block. Blocks of type finally, fault, and filter
are also supported.
Many Operand objects hold an instance of something derived from
the
Value class. The
base Value object usually gets rendered as a type (e.g. a class
instance), but the derived classes are rendered in different
ways. For example, a
Local
object describes a local variable, a
Param
object describes a parameter, a
FieldName
object references a Field object, and a
MethodName object references a
MethodSignature object.
The MethodSignature object holds a Type object for the return
type, a list of Param objects for the main parameter list, and
optionally a second list of Param objects. This
second list is only there to support unmanaged functions which
utilize C-style variable length argument lists.
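
The two parameter lists can be pictured with the following sketch. It is not the library's MethodSignature class: `ParamSketch`, `SignatureSketch`, `Render`, and `PrintfCallSite` are hypothetical names, and the rendered text only approximates the ILAsm call-site notation, in which `...` separates the fixed parameters from the extra arguments passed at one particular call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parameter: a type name plus a parameter name.
struct ParamSketch {
    std::string type;
    std::string name;
};

// Hypothetical signature with a fixed list and an optional vararg list.
struct SignatureSketch {
    std::string returnType;
    std::string name;
    bool vararg = false;
    std::vector<ParamSketch> params;        // fixed parameters
    std::vector<ParamSketch> varargParams;  // extra arguments at a call site

    // Render the signature roughly the way an IL call site is written.
    std::string Render() const {
        std::string out = vararg ? "vararg " : "";
        out += returnType + " " + name + "(";
        for (std::size_t i = 0; i < params.size(); ++i) {
            if (i) out += ", ";
            out += params[i].type;
        }
        if (!varargParams.empty()) {
            out += params.empty() ? "..." : ", ...";
            for (const auto& p : varargParams) out += ", " + p.type;
        }
        return out + ")";
    }
};

// A call to a C-style printf with one int and one double extra argument.
SignatureSketch PrintfCallSite() {
    SignatureSketch sig;
    sig.returnType = "int32";
    sig.name = "printf";
    sig.vararg = true;
    sig.params.push_back({"string", "format"});
    sig.varargParams.push_back({"int32", "arg0"});
    sig.varargParams.push_back({"float64", "arg1"});
    return sig;
}
```

Keeping the extra arguments in a separate list means the declared signature stays the same across calls, while each call site can carry its own set of trailing argument types.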
An instance of an auxiliary class
CustomAttributeContainer
holds custom attributes read in from an assembly. In
the current library implementation one can't add custom
attributes to the generated code, with the exception that the
library will automatically generate the custom attribute
required for the parameter array, i.e. the C# version of
variable length argument lists.
A special object
BoxedType is used
as an aid for boxing; it effectively transforms basic types into
their boxed version.
There are also two internal APIs used by the library; one is
used for generating the binary version of .NET assemblies, and
the other is used to load the binary version of .NET assemblies
into internal memory. These APIs will not normally
be directly used when utilizing the library to generate .NET
assemblies, and are beyond the scope of this
documentation. These APIs are described in the file
PEFILE.h for those who would like to study the
implementation.