Friday, October 09, 2009

One of the benefits of managed code is sandboxed execution: to prevent plug-ins from executing malicious code, an application does not need a separate language with restricted features. Plug-ins can be written in any .NET language, because the host can execute them in a sandbox with restricted permissions. Implementing a library that can be called by sandboxed code requires the infamous AllowPartiallyTrustedCallers attribute. With this attribute, you explicitly state that you are confident that your library does not have security vulnerabilities. If your library has to interoperate with native code, you should know about a pitfall that can easily allow a user of your library to bypass the sandbox. This potential vulnerability is caused by a CLR interop feature called thread promotion. In this post, I will explain this pitfall and how to avoid it in your libraries. However, before you can understand the impact of thread promotion on CAS (code access security), I have to explain a little more about sandboxing.

To execute code in a sandbox, the CLR offers an API called the simple sandboxing API. This API creates a new AppDomain in which only assemblies from the GAC and a set of explicitly specified assemblies are executed with full-trust permissions. All other assemblies are executed with the permissions specified in a permission set that is passed to this API.

Obviously, the permission set that is granted to plug-ins should exclude certain permissions. These include the skip-verification permission, which allows you to execute code that is not proven to be type-safe, and the permission to execute unmanaged code. Either permission could be used to bypass the sandbox. The more permissions an application grants to a sandbox, the more features the plug-in can use. In my opinion, most sandboxes should only have the permission to execute code. All other features a plug-in needs should be provided by custom libraries that the host application implements.

Many libraries that are called by plug-ins have to call native code. If the native code called by such a library uses threads that have not executed managed code so far (native-only threads), or if you use technologies that can automatically switch to other threads (like COM), you have to consider thread promotion. Thread promotion happens when a native-only thread invokes a native->managed thunk. Normally, a native->managed thunk simply stays in the AppDomain in which the managed thread is currently executing (due to the .retainappdomain flag of the .vtfixup metadata in the assembly manifest). However, when a native->managed thunk promotes a native-only thread to a managed thread, it has to pick an AppDomain. Because in this case the CLR has no information about which AppDomain is supposed to execute the managed function, the thread starts its managed life in the default AppDomain (the first AppDomain, which is created automatically when the CLR starts). Typically, the default AppDomain executes all assemblies with full trust, so the managed code now executes with full-trust permissions. If the library's code calls back into the plug-in, e.g. via a delegate, an event, or a virtual function call, the plug-in is now executing with full-trust permissions and no longer in its sandbox.
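
To make the domain selection rule concrete, here is a small toy model in plain standard C++. It is not CLR code: ModelThread, enterManagedCode and switchDomain are hypothetical names used only for illustration, and the sketch assumes nothing beyond the rule described above (a thunk stays in the current AppDomain of a managed thread, whereas a native-only thread is promoted into the default AppDomain).

```cpp
#include <cassert>
#include <string>

// Toy model only: none of these names are CLR APIs.
const std::string kDefaultDomain = "(default AppDomain)";

struct ModelThread {
  bool isManaged = false;     // false: native-only thread
  std::string currentDomain;  // meaningful only when isManaged is true
};

// Models a native->managed thunk that is not bound to an AppDomain.
// Returns the name of the domain in which the managed function would run.
std::string enterManagedCode(ModelThread& thread) {
  if (!thread.isManaged) {
    // Thread promotion: no domain information is available,
    // so the thread starts its managed life in the default domain.
    thread.isManaged = true;
    thread.currentDomain = kDefaultDomain;
  }
  return thread.currentDomain;
}

// Models a managed thread moving into another domain (as DoCallBack does).
void switchDomain(ModelThread& thread, const std::string& domain) {
  assert(thread.isManaged);  // only managed threads have a current domain
  thread.currentDomain = domain;
}
```

In this model, a thread that was moved into "childDomain" with switchDomain stays there when it calls enterManagedCode, while a fresh ModelThread ends up in the default domain. The latter is exactly the situation of a thread that native code has just created.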

The following code demonstrates this unintended switch to the default AppDomain. Notice that in main, a new AppDomain called child is created and that the method Program::ExecuteInChildDomain is called in this new AppDomain. ExecuteInChildDomain calls a native function fNative and passes a pointer to a native callback method named fNativeCallback. Now fNative creates a new native thread which invokes a managed function. When created, the new thread is a native-only thread. When the thunk is invoked to call the managed function, the native-only thread is promoted to a managed thread that executes in the default domain, not in the child domain that created the thread.

// DangersOfNativThreadsInSandboxedAppDomains.cpp : main project file.
// cl /clr DangersOfNativThreadsInSandboxedAppDomains.cpp
 
#include <windows.h>
 
typedef void (__stdcall*PFN_NATIVECALLBACK)();
 
void _stdcall fNative(PFN_NATIVECALLBACK pfnCallback);
void _stdcall fNativeCallback();
DWORD WINAPI ThreadMain(LPVOID pfnCallback);
 
using namespace System;
 
ref class Program
{
internal:
  static void DumpAppDomainInfo(String^ method)
  {
    AppDomain^ current = AppDomain::CurrentDomain;
    Console::WriteLine( "Method: {0}, AppDomain: {1}",
      method,
      current->IsDefaultAppDomain() ? "(default AppDomain)"
                                    : current->FriendlyName);
  }
 
  static void Main()
  {
    Program::DumpAppDomainInfo("Program::Main");
 
    AppDomain^ child = AppDomain::CreateDomain("childDomain");
    child->DoCallBack(gcnew CrossAppDomainDelegate(&ExecuteInChildDomain));
  }
 
  static void ExecuteInChildDomain()
  {
    Program::DumpAppDomainInfo("Program::ExecuteInChildDomain");
 
    fNative(&fNativeCallback);
  }
};
 
void main()
{
  Program::Main();
}
 
void __stdcall ManagedFunctionCalledByNativeCallback()
{
  Program::DumpAppDomainInfo("::ManagedFunctionCalledByNativeCallback");
}
 
#pragma unmanaged
 
DWORD WINAPI ThreadMain(LPVOID pfnCallback);
 
void _stdcall fNative(PFN_NATIVECALLBACK pfnCallback)
{
  // this new thread starts as a native-only thread and gets promoted
  // when the new thread calls a managed function for the first time
  HANDLE hThread = ::CreateThread(NULL, 0, &ThreadMain, pfnCallback, 0, 0);
  ::WaitForSingleObject(hThread, 1000);
  ::CloseHandle(hThread);
}
 
DWORD WINAPI ThreadMain(LPVOID pfnCallback)
{
  ((PFN_NATIVECALLBACK)pfnCallback)();
 
  return 0;
}
 
void _stdcall fNativeCallback()
{
  ManagedFunctionCalledByNativeCallback();
}

Now that we know about the problem, it is time to think about a solution: How can a native library that calls managed functions from native code ensure that its managed functions end up in the right application domain? There are two fundamentally different approaches that I would like to mention here: one that is based on an explicit switch of the application domain, and another one that is based on a switch that the CLR can do implicitly.

Explicitly switching application domains can be done either via AppDomain::DoCallBack or with helper functions defined in a Visual C++ header file called <msclr/appdomain.h>. This header defines several overloads of a function called call_in_appdomain; each of them takes the ID of the application domain you want to call into and a function pointer. Various overloads exist for the different calling conventions and for different numbers of arguments. To use this, you could modify your implementation of fNative as follows:

#include <msclr/appdomain.h>
 
DWORD WINAPI ThreadMain(LPVOID pvCallbackInfo);
 
struct CallbackInfo
{
  PFN_NATIVECALLBACK pfnCallback;
  DWORD appDomainId; // same type that GetCurrentAppDomainId fills in
};
 
void _stdcall fNative(PFN_NATIVECALLBACK pfnNativeCallback)
{
  DWORD appDomainId;
  HRESULT hr = msclr::_detail::get_clr_runtime_host()
    ->GetCurrentAppDomainId(&appDomainId);
  hr; // error handling ignored for simplicity here
 
  CallbackInfo callbackInfo = { pfnNativeCallback, appDomainId };
  // this newly created thread starts as a native-only thread and gets
  // promoted when the new thread calls a managed function for the first time
  HANDLE hThread = ::CreateThread(NULL, 0, &ThreadMain, &callbackInfo, 0, 0);
  // callbackInfo lives on this stack frame, so wait until the thread is done
  // with it instead of giving up after a timeout
  ::WaitForSingleObject(hThread, INFINITE);
  ::CloseHandle(hThread);
}
 
DWORD WINAPI ThreadMain(LPVOID pvCallbackInfo)
{
  CallbackInfo* pCallbackInfo = (CallbackInfo*)pvCallbackInfo;
  msclr::call_in_appdomain(pCallbackInfo->appDomainId,
    pCallbackInfo->pfnCallback);
 
  return 0;
}

This implementation passes not only the function pointer of the callback function to the thread, but a pointer to a structure that contains the callback function pointer as well as an integer value identifying the callback application domain. The thread then uses this information to invoke the callback function in the right application domain.

As I mentioned, there is a second approach, which makes sure that the CLR automatically marshals the call to the right application domain. This approach is based on a method that you have probably used before: Marshal::GetFunctionPointerForDelegate. This method generates a thunk that can be called by native code via a function pointer. Notice that, in contrast to other thunks, this thunk is aware of the application domain in which it was created. Therefore, it can automatically switch to the right application domain even when it is called by a thread that has just been promoted to a managed thread. The following code shows how to use this method:

[System::Runtime::InteropServices::UnmanagedFunctionPointer(
           System::Runtime::InteropServices::CallingConvention::StdCall)]
delegate void CallbackDelegate();
 
static void ManagedCallback()
{
  Program::DumpAppDomainInfo("Program::ManagedCallback");
}
 
static void ExecuteInChildDomain()
{
  Program::DumpAppDomainInfo("Program::ExecuteInChildDomain");
 
  CallbackDelegate^ cb = gcnew CallbackDelegate(&ManagedCallback);
 
  using System::Runtime::InteropServices::Marshal;
  // this call generates a thunk that can be called by native code
  IntPtr fnPtr = Marshal::GetFunctionPointerForDelegate(cb);
  fNative((PFN_NATIVECALLBACK)fnPtr.ToPointer());
 
  // the delegate’s lifetime determines the lifetime of the generated thunk
  // => keep the delegate alive as long as native code needs the thunk
  GC::KeepAlive(cb);
}
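
The difference between the two kinds of thunks can be illustrated with another toy model in plain standard C++. Again, this is not CLR code: ModelThunk, makeUnboundThunk and makeDomainBoundThunk are hypothetical names, and the sketch assumes nothing beyond the behaviour described above, namely that a thunk generated for a delegate remembers the application domain in which it was created.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Toy model only: none of these names are CLR APIs.
const std::string kDefaultAppDomain = "(default AppDomain)";

// A model thunk receives the caller's current domain (empty for a
// native-only thread) and returns the domain in which the callback runs.
using ModelThunk = std::function<std::string(const std::string&)>;

// Models an ordinary native->managed thunk: it stays in the caller's
// domain and falls back to the default domain for native-only threads.
ModelThunk makeUnboundThunk() {
  return [](const std::string& callerDomain) {
    return callerDomain.empty() ? kDefaultAppDomain : callerDomain;
  };
}

// Models a thunk generated for a delegate: it captures the domain in
// which it was created and always runs the callback there.
ModelThunk makeDomainBoundThunk(const std::string& creatingDomain) {
  assert(!creatingDomain.empty());
  return [creatingDomain](const std::string&) { return creatingDomain; };
}
```

Called with an empty caller domain (a native-only thread), the unbound model thunk lands in the default domain, while a thunk made with makeDomainBoundThunk("childDomain") still reports "childDomain".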

In this post, I have explained an issue that you have to understand if you want to write assemblies that use C++/CLI interop and allow partially trusted callers at the same time. In one of my next posts, I will describe a safety net that you can implement in your pluggable application to ensure that the sandbox cannot be bypassed unintentionally, even if a plug-in uses assemblies that are not aware of this issue.

10/9/2009 5:53:23 PM (GMT Daylight Time, UTC+01:00)