On microsoft.public.dotnet.languages.vc, bonk asked this interesting question. This blog entry gives a simple example and explains why this is possible. In later posts I will discuss when this can be dangerous.
Your idea is possible, but there may be some issues. Let's start with a simple sample:
<code language="CPPCLI" file="test.cpp" compileWith="CL /LD /clr test.cpp">
// pragma managed is the default
extern "C" __declspec(dllexport) void __stdcall f()
{
System::Console::WriteLine("I am f(), a managed function that test.dll exports to native clients");
}
</code>
"dumpbin /exports test.dll" will show you that there is indeed a native exported function:
ordinal hint RVA name
1 0 00001020 _f@0
What is going on here? How can unmanaged code call managed code?
To answer this question, let's have a look at a simple application that calls managed code from unmanaged code:
<code language="CPPCLI" file="testApp.cpp" compileWith="CL /clr testapp.cpp">
void f()
{
System::Console::WriteLine("I am f(), a managed function that can be called by native clients");
}
void f2()
{
System::Console::WriteLine("I am f2(), a managed function that can be called by native clients");
}
#pragma unmanaged
int main() {
f();
f2();
}
</code>
The native function main cannot call f without some magic going on under the hood. Think of a scenario in which f has not even been JIT compiled yet. On the one hand, the call "f()" is compiled to a call to an address local to the exe file; on the other hand, the code that should really be executed is only produced by the JIT compiler at runtime. So what is going on here?
"f()" and "f2()" are indeed calls to local addresses. This is an excerpt from the disassembly window:
f();
00401053 call f (401030h)
f2();
00401058 call f2 (401040h)
Notice that the calling addresses (00401053 and 00401058) and the called addresses (401030 and 401040) all belong to testApp.exe's code.
Here is what the disassembly window tells us about the called addresses:
f:
00401030 jmp dword ptr [__mep@?f@@$$FYAXXZ (409000h)]
f2:
00401040 jmp dword ptr [__mep@?f2@@$$FYAXXZ (409004h)]
"jmp dword ptr [...x...]" is an indirect jump. It means: stored at the address ...x... is the address you have to jump to.
Here is what these addresses are in my debugger's memory window:
0x00409000: 00 CB 00 12
0x00409004: 00 CB 00 4E
Both stored values are addresses far away from testApp.exe's base address, so they clearly point outside testApp's code. They point to runtime generated code, but we have not yet reached JIT compiled code. What we see here are runtime generated unmanaged -> managed thunks. These thunks perform the unmanaged -> managed transition and then call the managed functions (f or f2).
As I have mentioned, these thunks are runtime generated: How does the runtime know that address 0x00409000 should hold a pointer to the unmanaged -> managed thunk for f() and address 0x00409004 should hold a pointer to the thunk for f2()?
Well, the new linker is much smarter than you may expect: The linker generates .NET metadata that tells the runtime exactly that! If you view testApp.exe in ILDASM and inspect the assembly's manifest, you will find so-called vtfixups at the end of the manifest. Here is an excerpt:
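To make the indirection concrete, here is a minimal sketch in plain C++. It is only a model, not what the compiler emits: all names (f_slot, f_stub, generated_thunk, unpatched) are hypothetical, and an ordinary function stands in for the runtime generated thunk.

```cpp
// Plain C++ model of "jmp dword ptr [slot]". All names are hypothetical.
using Target = int (*)();

// Stands in for the thunk that the runtime generates outside the image.
int generated_thunk() { return 1; }

// Stands in for whatever the slot holds before it has been patched.
int unpatched() { return -1; }

// The slot: a pointer-sized cell, like the one at 409000h in the sample.
Target f_slot = unpatched;

// The local stub: it does not know the final target, it only knows
// where to look the target up at the moment of the call.
int f_stub() { return f_slot(); }
```

Callers always call f_stub, which lives at a fixed local address. Whoever owns the slot can redirect every future call by storing a different pointer in f_slot.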
.imagebase 0x00400000
.subsystem 0x0003 // WINDOWS_CUI
.corflags 0x00000000
.vtfixup [1] int32 retainappdomain at D_00009000 // 06000001
.vtfixup [1] int32 retainappdomain at D_00009004 // 06000002
...many other vtfixups elided for clarity here ...
Note that these lines contain familiar numbers: 00009000 and 00009004. If you add the .imagebase to these numbers, you will get 00409000 and 00409004. Does this ring a bell? Using these addresses, the compiled code finds the unmanaged -> managed thunks.
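The arithmetic can be checked with a few lines of plain C++. This is a small sketch with hypothetical names (kImageBase, rva_to_va), and it assumes that the image is actually loaded at the base address given by .imagebase:

```cpp
#include <cstdint>

// Value of .imagebase from the manifest excerpt.
constexpr std::uint32_t kImageBase = 0x00400000;

// Converts an offset relative to the image base into an absolute address.
// Assumption: the image is loaded at its preferred base address.
constexpr std::uint32_t rva_to_va(std::uint32_t image_base, std::uint32_t rva) {
    return image_base + rva;
}

// The two vtfixup locations map to the two slots seen in the disassembly.
static_assert(rva_to_va(kImageBase, 0x00009000) == 0x00409000, "slot for f");
static_assert(rva_to_va(kImageBase, 0x00009004) == 0x00409004, "slot for f2");
```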
So what is the other part of the vtfixup, 06000001 and 06000002?
Well, these are metadata tokens. Metadata tokens starting with 06 are always method tokens, and now it is not very difficult to guess what is going on: 06000001 is the metadata token for the managed function f() and 06000002 is the metadata token for f2(). You can prove this by adding the following code to f() and f2():
System::Console::WriteLine("{0:x8}", (gcnew System::Diagnostics::StackTrace())->GetFrame(0)->GetMethod()->MetadataToken);
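A metadata token is a 32-bit value whose top byte names the metadata table and whose lower three bytes hold the row number in that table. The following plain C++ sketch splits a token accordingly; the helper names are made up for this illustration:

```cpp
#include <cstdint>

// Table id for method definitions: the "06" prefix of the tokens above.
constexpr std::uint32_t kMethodDefTable = 0x06;

// The top byte of a token selects the metadata table.
constexpr std::uint32_t token_table(std::uint32_t token) { return token >> 24; }

// The lower 24 bits are the row number within that table.
constexpr std::uint32_t token_row(std::uint32_t token) { return token & 0x00FFFFFFu; }

// True if the token refers to a method definition.
constexpr bool is_method_def(std::uint32_t token) {
    return token_table(token) == kMethodDefTable;
}
```

With these helpers, 0x06000001 decodes to row 1 of the method table and 0x06000002 to row 2.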
The .vtfixup metadata tells the runtime:
When the assembly is loaded:
Generate an unmanaged->managed thunk for the method f() and store a pointer to it in 00409000 and
generate another unmanaged->managed thunk for method f2() and store a pointer to it in 00409004.
Once this is done, the unmanaged function main can call the managed functions f and f2 as if they were native functions.
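The load-time step described above can be modelled in plain C++ as a loop over the fixup entries. This is a simplified sketch, not the CLR's implementation: VtFixup, apply_fixups and the thunk functions are hypothetical, slots are addressed by index instead of by address, and the thunks are ordinary functions rather than generated code.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Thunk = int (*)();

// Hypothetical stand-ins for the thunks generated per managed method.
int thunk_for_f() { return 1; }
int thunk_for_f2() { return 2; }

// One simplified .vtfixup entry: which slot to patch, and for which method.
struct VtFixup {
    std::uint32_t slot_index;    // models the location (D_00009000, ...)
    std::uint32_t method_token;  // e.g. 0x06000001
};

// Stores the thunk for each fixup's method token into the fixup's slot.
// Entries with an unknown token or an out-of-range slot are skipped.
// Returns the number of slots that were patched.
std::size_t apply_fixups(const std::vector<VtFixup>& fixups,
                         const std::map<std::uint32_t, Thunk>& thunks_by_token,
                         std::vector<Thunk>& slots) {
    std::size_t patched = 0;
    for (const VtFixup& fixup : fixups) {
        auto it = thunks_by_token.find(fixup.method_token);
        if (it == thunks_by_token.end() || fixup.slot_index >= slots.size()) {
            continue;
        }
        slots[fixup.slot_index] = it->second;
        ++patched;
    }
    return patched;
}
```

After apply_fixups has run, native code that calls through slot 0 reaches the thunk for token 0x06000001 without knowing anything about metadata.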
In the testApp.exe sample there is a simplification that does not hold for the test.dll I discussed right at the beginning: Since testApp.exe is an EXE, it is guaranteed that the CLR has already been initialized. (The CLR is initialized automatically when the EXE application starts.) This assumption does not hold for managed functions exported via DLLs: The DLL's client may be a native client. To handle this case, a small stub is exported. This small stub ensures that the CLR is initialized properly and that the assembly is loaded properly into the default appdomain before the unmanaged -> managed thunk is called. This is often called delayed CLR initialization.
Although all this sounds nice, there are still some things to discuss:
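The idea of such an exported stub can be sketched in plain C++ as "initialize on first call, then forward". This is only a model under loud assumptions: initialize_runtime, managed_body and exported_stub are hypothetical names, and the real stub starts the CLR and loads the assembly instead of incrementing a counter.

```cpp
// Counts how often the (modelled) runtime initialization has run.
int g_init_count = 0;

// Hypothetical stand-in for "start the CLR and load the assembly".
bool initialize_runtime() {
    ++g_init_count;
    return true;
}

// Hypothetical stand-in for the unmanaged -> managed thunk and the method behind it.
int managed_body() { return 42; }

// Models the exported stub: the first call triggers initialization,
// every call then forwards to the actual target.
int exported_stub() {
    // A function-local static is initialized exactly once, on the first call.
    static const bool ready = initialize_runtime();
    (void)ready;
    return managed_body();
}
```

A native client pays the initialization cost only on its first call into the DLL; later calls go straight to the target.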
* Issues when combining exported managed functions with #pragma managed
* Turning the generation of managed / unmanaged thunks off in cases where they are not needed
* What happens if managed code calls a managed entry point via P/Invoke?
I hope I will find some time to discuss these things in the next few days.