Saturday, January 13, 2007
A long project is coming to an end. After one year of writing, my book will soon be finished. Three chapters need some minor changes; all other chapters are now going to be copy edited. The book will summarize essential parts of my research on C++/CLI over the last two years. Notice that the announcement on amazon.com is not 100% correct. The title of the book will be "Expert C++/CLI: .NET for Visual C++ programmers" and the release date will be mid-March.
1/13/2007 9:00:07 AM (GMT Standard Time, UTC+00:00)  #    Disclaimer  |   |  Trackback
 Tuesday, April 11, 2006

1) Since April 1st, I have been an MVP for Visual C++.

2) My first article in the MSDN Magazine has been published: http://msdn.microsoft.com/msdnmag/issues/06/05/MixAndMatch/default.aspx

4/11/2006 1:15:43 PM (GMT Daylight Time, UTC+01:00)
 Tuesday, March 21, 2006

In news://msnews.microsoft.com/microsoft.public.dotnet.languages.vc Edward Diener asked a very interesting question: Why can't I call a function having arguments of a native type like std::string from another assembly?

Assume you have some code like this one:
 
// Conversions.cpp
// compile with "CL /clr /LD conversions.cpp"
// output: mixed code assembly Conversions.dll
#include <string>
 
public ref class Conversions
{
public:
  static void S2S(System::String^ s1, std::string& s2) { /* ... */ }
};
 
This code compiles as expected; however, it does not give you the expected result!
 
The code below looks like a suitable client:
 
// ConversionsClient.cpp
// compile with "CL /clr ConversionsClient.cpp"
#using "Conversions.dll" 
#include <string>
int main()
{
  std::string s;
  Conversions::S2S("asdf", s);
}
 
If you try to compile this code, you will get a disappointing error message:
 
error C3767: 'Conversions::S2S': candidate function(s) not accessible
 
Why is the public static function S2S of the public type Conversions not accessible?
 
To use the native type std::string in managed code, the compiler generates a managed value type std::string in the assembly where std::string is used. This managed wrapper value type is private. Therefore, Conversions::S2S cannot be called from outside the assembly even though it is a public function of a public type.
 
At first glance, the key to the solution seems to be making sure the compiler generates a public type for std::string. In theory this is possible; however, it would not solve the problem. In fact, the native wrapper type has been defined as a private type for good reasons.
 
Assume the native wrapper type for std::string was public. To call S2S, one would have to pass a tracking handle to a System::String, defined in mscorlib.dll, and a value of the type std::string defined in the assembly Conversions.dll. The std::string type that we pass in ConversionsClient.cpp is a different one! It is the native wrapper type defined in ConversionsClient.cpp, not the native wrapper type defined in Conversions.dll. Therefore, the parameters would not match.
 
So how can we solve this problem?
 
The origin of the problem is the fact that the type identity rules of .NET do not always mix well with the type identity rules of native C++. To solve this problem, you can switch back to the world of native code sharing without losing your managed code features. Simply create a mixed code static library from the code below:
 
// ConversionsLib.cpp
// compile with "CL /c /clr ConversionsLib.cpp"
// make lib with "LIB ConversionsLib.obj"
// output: mixed code static library ConversionsLib.lib
 
#include <string>
void S2S(System::String^ s1, std::string& s2) { /* ... */ }
Create a client from the code below.
 
// ConversionsLibClient.cpp
// compile with "CL /clr ConversionsLibClient.cpp"
#include <string>
 
#pragma comment (lib, "ConversionsLib.lib")
void S2S(System::String^ s1, std::string& s2);
 
int main()
{
  std::string s;
  S2S("asdff", s);
}
 
Conclusion: Beware of the different type identity rules. Native types are identified by their namespace-qualified type name; managed types are identified by their assembly- and namespace-qualified type name. If you need native type identity rules, use native code sharing features; if you need managed type identity, use managed code sharing features.
3/21/2006 5:36:54 PM (GMT Standard Time, UTC+00:00)
 Friday, January 20, 2006
1/20/2006 11:07:12 AM (GMT Standard Time, UTC+00:00)
 Thursday, January 12, 2006

When a cpp file is compiled with /clr, a managed object file is created. In most scenarios, such a managed object file contains only managed code. However, there are two scenarios that end up in a managed object file with managed and native code:

  1. When #pragma unmanaged is used
  2. When the C++ file contains a function with C++ constructs that are not mappable to IL code.

Managed object files with native code carry a certain danger: their global variables can be read while still uninitialized, as shown in the following sample:

In the code below, i is a global variable initialized with the return value of getValue(). There are two exported functions that simply return i. One is the managed function fManaged and the other one is the unmanaged function fUnmanaged.

<code language="C++/CLI" filename="MixedLib.cpp" compileWith="CL /LD /clr MixedLib.cpp">
#pragma unmanaged
__declspec(noinline) int getValue() {
  return 42;
}

int i = getValue();

__declspec(dllexport) int fUnmanaged()
{
  return i;
}

#pragma managed
__declspec(dllexport) int fManaged()
{
  return i;
}
</code>

Before fManaged is executed the first time, the module constructor is called. The module constructor calls getValue to initialize the global variable i. However, if fUnmanaged is called before fManaged is called the first time, the variable i will be returned before it is initialized. To reproduce this scenario, you can use the client application below.

<code language="C++/CLI" filename="TestApp.cpp" compileWith="CL /clr TestApp.cpp">
#include <stdio.h>

// import library produced when MixedLib.cpp is built with /LD
#pragma comment(lib, "MixedLib.lib")

__declspec(dllimport) int fUnmanaged();
__declspec(dllimport) int fManaged();

int main()
{
  printf("fUnmanaged returns %d\n", fUnmanaged());
  printf("fManaged returns %d\n", fManaged());
  printf("fUnmanaged returns %d\n", fUnmanaged());
}
</code>

If you start TestApp.exe, you will get the following output:

fUnmanaged returns 0
fManaged returns 42
fUnmanaged returns 42

The easiest way to avoid these problems is to make sure that files compiled with /clr contain only managed code and to leave the native code in cpp files compiled without /clr. If you consider using #pragma unmanaged or unmappable C++ constructs in files compiled with /clr, you should be aware that all global variables and static member variables of that file are initialized by the module initializer. This is true even if the variable is of a native type.

The compiler's ability to automatically compile a function with unmappable C++ constructs to native code can leave you with mixed code object files without you even realizing it. Fortunately, the compiler can emit warning C4793 in this scenario. C4793 is a level 2 warning. To get it, you either have to set the warning level to 2, or you have to use a compiler switch to make it a level 1 warning, as shown in the following code.

<code language="C++/CLI"
     filename="UnmappableConstructs.cpp"
     compileWith="cl /c /clr /w14793 UnmappableConstructs.cpp">
void f()
{
  __asm int 3;
}
</code>

When you compile this code, the C++ compiler will emit the warning C4793 with the following message:

UnmappableConstructs.cpp(3) : warning C4793: '__asm' : causes native code generation for function 'void f(void)'
        UnmappableConstructs.cpp(1) : see declaration of 'f'

1/12/2006 3:29:45 PM (GMT Standard Time, UTC+00:00)
 Saturday, November 26, 2005

.NET has a very interesting new feature regarding type identity. When I tested this feature the last time (I think it was with the RC), it did not work, so it is likely you have not yet heard about it. But let's start from the beginning:

Type identity is a very important concept of a programming infrastructure. In classic C++, types are identified by their names. COM uses GUIDs to identify types. To achieve strong type safety, type identity in .NET is bound to assembly identity, and assembly identity can be bound to developers using a public/private key pair. "Type identity is bound to assembly identity" means that two types with the same name in two different assemblies have two different identities. "Assembly identity can be bound to developers using a public/private key pair" means that developers can use unique public/private key pairs to give their assemblies unique and uncloneable names. This also gives all types in the assembly unique and uncloneable names.

The following expression returns a string containing the unique and uncloneable type identity that Microsoft has given to the 32-bit signed integer type:

int::typeid->AssemblyQualifiedName

If you are not familiar with the chosen language, the equivalent C# expression would be typeof(int).AssemblyQualifiedName.

While this type identity is very helpful to achieve verifiability, it also had a drawback: this assembly-bound type identity meant that a type could not leave the assembly where it was defined without losing its identity. You can define a type from exactly the same code in another assembly, but to the runtime, it would be a different type. To understand this, let's have a look at the following example.

Let's assume you have an assembly lib1.dll created from the following code:

public ref class TheType {};

Let's further assume, you have a client application like this one:

#using "lib1.dll"
int main() {
  using namespace System;
  Console::WriteLine(TheType().GetType()->AssemblyQualifiedName);

  using System::Reflection::Assembly;
  for each (Assembly^ a in AppDomain::CurrentDomain->GetAssemblies())
    Console::WriteLine(a->FullName);
}

Notice the elegant way to add an assembly reference inside your code, and the neat alternative for creating a temporary object of type TheType - two out of so many reasons why I call C++/CLI the chosen language.

If you run the application, you will see that TheType is defined in Lib1 and that the assemblies mscorlib, app, and lib1 are loaded.

Let's further assume that for the next version, you are redesigning your application and you find out that TheType would now better fit into another assembly. Due to the type identity rules I have mentioned, this would mean that TheType would get another identity. Therefore, this would be a breaking change for lib1.dll: the new version of lib1 would not be backwards compatible with the old one, since it would no longer have the public type TheType.

Using the type identity mapping feature in 2.0, you can reorganize your libraries without causing a backwards incompatibility in your new lib1.dll. To explain the mechanics, let's assume the source for lib2.dll now contains TheType:

// lib2.cpp
public ref class TheType {};

To allow the old version of app.exe to execute even though TheType is no longer in the assembly where it is expected (lib1), you can define TheType in lib1.dll with a type identity mapping to the type in lib2.dll. These two lines are enough to achieve this:

#using "lib2.dll"
[assembly: TypeForwardedTo(TheType::typeid)];

Notice that this code creates lib1.dll with an assembly dependency on lib2.dll, which defines TheType. When using this code, the compiler emits the following metadata for TheType in lib1.dll:

.class extern forwarder TheType
{
  .assembly extern lib2
  .class 0x02000002
}

.class 0x02000002 refers to the metadata token for TheType in lib2.dll.

Almost the same metadata can be emitted by C#, too; however there is a slight difference:

[assembly: System.Runtime.CompilerServices.TypeForwardedTo(typeof(TheType))]

C# expects you to use the pseudo custom attribute System::Runtime::CompilerServices::TypeForwardedToAttribute, whereas C++/CLI uses a compiler-internal attribute TypeForwardedTo. The philosophies of the two languages differ here: C# tries to be consistent in the way attributes are used. Regarding attributes, C++/CLI has different roots anyway: .NET attributes and attributed ATL are two examples. Instead of trying to be consistent with one or the other attribute model, C++/CLI tries to spare the programmer from explicitly using features of the runtime that are intended to be used by compiler builders only.

Whatever language you use, this simple attribute allows you to reorganize your libraries without breaking compatibility with old applications.

11/26/2005 11:06:15 AM (GMT Standard Time, UTC+00:00)
 Friday, November 11, 2005

For most Visual Studio .NET language integrations, app.config files are treated specially. Before the target application is started (with or without a debug session), the language package ensures that the app.config file is automatically copied to the target application's configuration file. Visual C++ does not have this feature; however, it is easy to get the same behavior:

Add a file named app.config to a Visual C++ project and choose the following project settings:

Command line: type app.config > "$(TargetPath).config"

Description: "Updating target's configuration file"

Outputs: "$(TargetPath).config"

Notice that the file is copied via the type command and output redirection. I prefer this to using the copy command here, because it ensures that the date of the copied file is automatically updated whenever the file is copied.

 

11/11/2005 11:34:30 PM (GMT Standard Time, UTC+00:00)
 Wednesday, October 19, 2005

On microsoft.public.dotnet.languages.vc, bonk has asked this interesting question. This blog entry gives you a simple example and explains why this is possible. In further blog entries I will discuss when this can be dangerous.

Your idea is possible, but there may be some issues. Let's start with a simple sample:

<code language="CPPCLI" file="test.cpp" compileWith="CL /LD /clr test.cpp">
// pragma managed is the default
extern "C" __declspec(dllexport) void __stdcall f()
{
  System::Console::WriteLine("I am f(), a managed function that test.dll exports to native clients");
}
</code>

"dumpbin /exports test.dll" will show you that there is indeed a native exported function:

    ordinal hint RVA      name

          1    0 00001020 _f@0

What is going on here? How can unmanaged code call managed code?

To answer this question, let's have a look at a simple application that calls managed code from unmanaged code:

<code language="CPPCLI" file="testApp.cpp" compileWith="CL /clr testapp.cpp">
void f()
{
  System::Console::WriteLine("I am f(), a managed function that can be called by native clients");
}

void f2()
{
  System::Console::WriteLine("I am f2(), a managed function that can be called by native clients");
}

#pragma unmanaged
int main() {
  f();
  f2();
}
</code>

The native function main cannot call f without some magic going on under the hood. Think of a scenario where f has not even been JIT compiled: on the one hand, the call "f()" is compiled to a call to an address local to the exe file; on the other hand, the code that should really be executed is JIT compiled. So what is going on here?

"f()" and "f2()" are indeed calls to local addresses. This is an excerpt from the disassembly window:

  f();
00401053  call        f (401030h)
  f2();
00401058  call        f2 (401040h)

Notice that the calling addresses (e.g. 00401053) and the called addresses (e.g. 401030) both belong to testApp.exe's code.

Here is what the disassembly window tells us about the called addresses:

f:
00401030  jmp         dword ptr [__mep@?f@@$$FYAXXZ (409000h)]

f2:
00401040  jmp         dword ptr [__mep@?f2@@$$FYAXXZ (409004h)]

"jmp dword ptr [...x...]" is an indirect jump. It means: the address ...x... holds the address you have to jump to.

Here is what these addresses are in my debugger's memory window:
0x00409000:  00 CB 00 12
0x00409004:  00 CB 00 4E

Both addresses are far away from testApp.exe's base address, so they are clearly outside testApp's code. This is runtime-generated code, but we have not yet reached JIT compiled code. What we see here are runtime-generated unmanaged -> managed thunks. These thunks perform the unmanaged -> managed transition and call the managed functions (f or f2) in the end.

As I have mentioned, these thunks are generated at runtime. How does the runtime know that address 0x00409000 should hold a pointer to the unmanaged -> managed thunk for f() and that address 0x00409004 should hold a pointer to the thunk for f2()?

Well, the new linker is much smarter than you may expect: the linker generates .NET metadata that tells the runtime exactly that! If you view testApp.exe in ILDASM and inspect the assembly's manifest, you will find so-called vtfixups at the end of the manifest. Here is an excerpt:

.imagebase 0x00400000
.subsystem 0x0003       // WINDOWS_CUI
.corflags 0x00000000
.vtfixup [1] int32 retainappdomain at D_00009000 // 06000001
.vtfixup [1] int32 retainappdomain at D_00009004 // 06000002
...many other vtfixups elided for clarity here ...

Note that these lines contain familiar numbers: 00009000 and 00009004. If you add the .imagebase to these numbers, you will get:

00409000 and 00409004. Does this ring a bell? Using these addresses, the compiled code finds the unmanaged -> managed thunks.

So what is the other part of the vtfixup 06000001 and 06000002?

Well, these are metadata tokens. Metadata tokens starting with 06 are always method tokens, and now it is not very difficult to guess what is going on: 06000001 is the metadata token for the managed function f() and 06000002 is the metadata token for f2(). You can prove this by adding the following code to f() and f2():

System::Console::WriteLine("{0:x8}", (gcnew System::Diagnostics::StackTrace())->GetFrame(0)->GetMethod()->MetadataToken);

The .vtfixup metadata tells the runtime:
When the assembly is loaded:
  Generate an unmanaged->managed thunk for the method f() and store a pointer to it in 00409000 and
  generate another unmanaged->managed thunk for method f2() and store a pointer to it in 00409004.

Once this is done, the unmanaged function main can call the managed functions f and f2 as if they were native functions.

In the testApp.exe sample there is a simplification that does not hold for the test.dll I discussed right at the beginning: since testApp.exe is an exe, it is guaranteed that the CLR has been initialized already. (The CLR is initialized automatically when the EXE application starts.) This assumption is not true for managed functions exported via DLLs: the DLL's client may be a native client. To handle this case, a small stub is exported. This small stub ensures that the CLR is initialized properly and that the assembly is loaded properly into the default appdomain before the unmanaged -> managed thunk is called. This is often called delayed CLR initialization.

Although all this sounds nice, there are still some things to discuss:

* Issues when combining exported managed functions with #pragma managed

* Turning the generation of managed / unmanaged thunks off in cases where they are not needed

* What happens if managed code calls a managed entry point via P/Invoke?

I hope I will find some time to discuss these things in the next days.

 

10/19/2005 10:58:59 PM (GMT Daylight Time, UTC+01:00)
A long period of research will come to an end soon. I have spent the last months writing the Essential C++/CLI class for DevelopMentor. I still have 8 labs to write and some slides need a redesign, but at least I am able to spend some time sharing important details about C++/CLI, "It Just Works", and .NET in general.
10/19/2005 9:39:05 PM (GMT Daylight Time, UTC+01:00)