Curiosity is bliss

Julien Couvreur's programming blog and more

IL modification at runtime (step III)

 

Here is one more incremental step in instrumenting IL at runtime. This time, the method call that will be inserted will be implemented in a separate dll and will take an int as an input.
If you missed the previous episodes you can find them here: Step I, Step II and Step II+.

Calling a method from a separate assembly requires that some metadata be created so that we can reference the foreign method using tokens.
Multiple references (tokens) need to be generated: first the dll containing the Logger::Log method (using DefineAssemblyRef), then the Logger class itself (using DefineTypeRefByName) and finally the Logger::Log(int) method (using DefineMemberRef).
When we have the token for the Logger::Log method we can insert a call to it, with the regular IL code: 0x28 (call) followed by the method token.
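To make the byte layout concrete, here is a minimal sketch of how such a call instruction could be encoded. The helper names (TokenTable, EncodeCall) are hypothetical and not part of the profiling API. It assumes only the documented token layout, where the high byte of a metadata token identifies its table, so a token returned by DefineMemberRef starts with 0x0A.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// High byte of a metadata token identifies its table (values from corhdr.h).
constexpr std::uint8_t kTableMethodDef  = 0x06; // mdtMethodDef
constexpr std::uint8_t kTableMemberRef  = 0x0A; // mdtMemberRef
constexpr std::uint8_t kTableMethodSpec = 0x2B; // mdtMethodSpec

// Hypothetical helper: extract the table identifier from a token.
inline std::uint8_t TokenTable(std::uint32_t token) {
    return static_cast<std::uint8_t>(token >> 24);
}

// Hypothetical helper: encode "call <token>" as 5 IL bytes,
// opcode 0x28 followed by the token in little-endian byte order.
inline std::array<std::uint8_t, 5> EncodeCall(std::uint32_t token) {
    // The call opcode expects a token that designates a method.
    assert(TokenTable(token) == kTableMethodDef ||
           TokenTable(token) == kTableMemberRef ||
           TokenTable(token) == kTableMethodSpec);
    return {{
        0x28,
        static_cast<std::uint8_t>(token & 0xFF),
        static_cast<std::uint8_t>((token >> 8) & 0xFF),
        static_cast<std::uint8_t>((token >> 16) & 0xFF),
        static_cast<std::uint8_t>((token >> 24) & 0xFF),
    }};
}
```

For an illustrative token value such as 0x0A000001 (made up for this example), EncodeCall would produce the bytes 28 01 00 00 0A.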

When you are ready to call the Log(int) method, you need to decide which integer to pass it. You can either load a constant onto the stack (with the ldc.i4 family of IL opcodes; ldc.i4.4 loads the integer 4, for example) or pass a copy of an int from the current method's context. Here we'll use the second technique: the method being instrumented takes some integers as input parameters, and one of these inputs gets logged.
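The two options can be sketched as small encoders. This is only an illustration, with hypothetical helper names (EncodeLoadInt32, EncodeLoadArg); the opcode values are the standard IL encodings. Keep in mind that argument 0 is the first declared parameter only for static methods; for instance methods it is the this reference.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using ILBytes = std::vector<std::uint8_t>;

// Hypothetical helper: shortest IL encoding that pushes a 32-bit constant.
ILBytes EncodeLoadInt32(std::int32_t value) {
    if (value == -1) return {0x15};                        // ldc.i4.m1
    if (value >= 0 && value <= 8)
        return {static_cast<std::uint8_t>(0x16 + value)};  // ldc.i4.0 .. ldc.i4.8
    if (value >= -128 && value <= 127)
        return {0x1F, static_cast<std::uint8_t>(value)};   // ldc.i4.s <int8>
    const auto u = static_cast<std::uint32_t>(value);
    return {0x20,                                          // ldc.i4 <int32>
            static_cast<std::uint8_t>(u & 0xFF),
            static_cast<std::uint8_t>((u >> 8) & 0xFF),
            static_cast<std::uint8_t>((u >> 16) & 0xFF),
            static_cast<std::uint8_t>((u >> 24) & 0xFF)};
}

// Hypothetical helper: shortest IL encoding that pushes argument `index`.
ILBytes EncodeLoadArg(unsigned index) {
    assert(index <= 0xFFFF);                               // operand is a uint16
    if (index <= 3)
        return {static_cast<std::uint8_t>(0x02 + index)};  // ldarg.0 .. ldarg.3
    if (index <= 0xFF)
        return {0x0E, static_cast<std::uint8_t>(index)};   // ldarg.s <uint8>
    return {0xFE, 0x09,                                    // ldarg <uint16>
            static_cast<std::uint8_t>(index & 0xFF),
            static_cast<std::uint8_t>((index >> 8) & 0xFF)};
}
```

Either result can be placed in front of the call bytes; the short forms keep the injected sequence as small as possible.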

Now to the code:

HRESULT CProfilerCallback::JITCompilationStarted(UINT functionId,
     BOOL fIsSafeToBlock) {
  wchar_t wszClass[512];
  wchar_t wszMethod[512];

  HRESULT hr = S_OK;

  ClassID classId = 0;
  ModuleID moduleId = 0;
  mdToken tkMethod = 0;

  // Only execute for the blah method
  if (GetMethodNameFromFunctionId(functionId, wszClass, wszMethod))
  {
   ProfilerPrintf("JITCompilationStarted: %ls::%ls\n",wszClass,wszMethod);
   if (wcscmp(wszClass, L"Hello") != 0 ||
     wcscmp(wszMethod, L"blah") != 0) {
    goto exit;
   }
  } else {
   ProfilerPrintf( "JITCompilationStarted\n" );
   goto exit;
  }

  // Get the moduleID and tkMethod
  hr = m_pICorProfilerInfo->GetFunctionInfo(functionId, &classId, &moduleId, &tkMethod);
  if (FAILED(hr))
   { goto exit; }

  // Get the metadata import
  IMetaDataImport* pMetaDataImport = NULL;
  hr = m_pICorProfilerInfo->GetModuleMetaData(moduleId, ofRead, IID_IMetaDataImport,
     (IUnknown** )&pMetaDataImport);
  if (FAILED(hr))
   { goto exit; }


  //
  // Metadata modification
  //
  IMetaDataEmit* pMetaDataEmit = NULL;
  IMetaDataAssemblyEmit* pMetaDataAssemblyEmit = NULL;
  mdAssemblyRef tkLoggerLib;

  hr = m_pICorProfilerInfo->GetModuleMetaData(
     moduleId, ofRead | ofWrite, IID_IMetaDataEmit,
     (IUnknown** )&pMetaDataEmit);
  if (FAILED(hr)) { goto exit; }
  hr = pMetaDataEmit->QueryInterface(IID_IMetaDataAssemblyEmit,
     (void**)&pMetaDataAssemblyEmit);
  if (FAILED(hr)) { goto exit; }

  // Tokens for the Logger class and its Log method. Both live in another
  // assembly, so they are references (TypeRef/MemberRef), not definitions.
  mdTypeRef tkLogger = 0;
  mdMemberRef tkLog = 0;

  // Create a token for the Logger.dll assembly
  ASSEMBLYMETADATA amd;
  ZeroMemory(&amd, sizeof(amd));
  amd.usMajorVersion = 0;
  amd.usMinorVersion = 0;
  amd.usBuildNumber = 0;
  amd.usRevisionNumber = 0;
  hr = pMetaDataAssemblyEmit->DefineAssemblyRef(
     NULL, 0, // No public key token
     L"Logger",
     &amd, NULL, 0, 0,
     &tkLoggerLib);
  if (FAILED(hr)) { goto exit; }

  // Create a token for the Logger class
  hr = pMetaDataEmit->DefineTypeRefByName(tkLoggerLib,
     L"DumkyNamespace.Logger", &tkLogger);
  if (FAILED(hr)) { goto exit; }

  // Create a token for the Log method
  // Signature blob for: static void Log(int)
  BYTE Sig_void_I4[] = {
   0, // IMAGE_CEE_CS_CALLCONV_DEFAULT
   0x1, // argument count
   0x1, // ret = ELEMENT_TYPE_VOID
   ELEMENT_TYPE_I4 // arg 0 = int
  };

  hr = pMetaDataEmit->DefineMemberRef(tkLogger,
     L"Log",
     Sig_void_I4, sizeof(Sig_void_I4),
     &tkLog);
  if (FAILED(hr)) { goto exit; }

  //
  // IL modification
  //

#include <pshpack1.h>
  struct {
   BYTE opCode1;
   BYTE call; DWORD method_token;
  } ILCode;
#include <poppack.h>

  //ILCode.opCode1 = 0x19; // load integer '3' or CEE_LDC_I4_3 from opcode.def
  ILCode.opCode1 = 0x02; // ldarg.0 (CEE_LDARG_0): load argument 0, which is 'i' because blah is static
  ILCode.call = 0x28;
  ILCode.method_token = tkLog;

  hr = InsertIL(moduleId, tkMethod, (BYTE*) &ILCode, sizeof(ILCode));

exit:
  return hr;
}


HRESULT CProfilerCallback::InsertIL(ModuleID moduleId, mdToken tkMethod, BYTE* pbNewIL, int iNewILLen) {
  HRESULT hr = S_OK;

  //
  // Get the existing IL
  //
  LPCBYTE pMethodHeader = NULL;
  ULONG iMethodSize = 0;
  hr = m_pICorProfilerInfo->GetILFunctionBody(moduleId, tkMethod, &pMethodHeader, &iMethodSize);
  if (FAILED(hr))
   { goto exit; }

  //
  // Print the existing IL
  //
  IMAGE_COR_ILMETHOD* pMethod = (IMAGE_COR_ILMETHOD*)pMethodHeader;
  COR_ILMETHOD_FAT* fatImage = (COR_ILMETHOD_FAT*)&pMethod->Fat;

  // Only fat-format headers are handled; any other method is left untouched
  if(!fatImage->IsFat()) {
   goto exit;
  }

  ProfilerPrintf("\n");
  ProfilerPrintIL(fatImage);


  //
  // Get the IL Allocator
  //
  IMethodMalloc* pIMethodMalloc = NULL;
  IMAGE_COR_ILMETHOD* pNewMethod = NULL;
  hr = m_pICorProfilerInfo->GetILFunctionBodyAllocator(moduleId, &pIMethodMalloc);
  if (FAILED(hr))
   { goto exit; }

  //
  // Allocate IL space and copy the IL in it
  //
  pNewMethod = (IMAGE_COR_ILMETHOD*) pIMethodMalloc->Alloc(iMethodSize+iNewILLen);
  if (pNewMethod == NULL)
   { hr = E_OUTOFMEMORY; goto exit; }
  COR_ILMETHOD_FAT* newFatImage = (COR_ILMETHOD_FAT*)&pNewMethod->Fat;


  //
  // Modify IL
  //
  // Copy the header
  memcpy((BYTE*)newFatImage, (BYTE*)fatImage, fatImage->Size * sizeof(DWORD));

  // Add a call to "Log"
  memcpy(newFatImage->GetCode(), pbNewIL, iNewILLen);

  // Copy the rest of the method
  memcpy(newFatImage->GetCode() + iNewILLen,
   fatImage->GetCode(),
   fatImage->CodeSize);


  // Update the code size and reserve one stack slot for the pushed argument
  newFatImage->CodeSize += iNewILLen;
  newFatImage->MaxStack += 1;

  // Print modified IL
  ProfilerPrintf("\n");
  ProfilerPrintIL(newFatImage);

  // Push IL back in
  hr = m_pICorProfilerInfo->SetILFunctionBody(moduleId, tkMethod, (LPCBYTE) pNewMethod);
  if (FAILED(hr))
   { goto exit; }

  pIMethodMalloc->Release();
exit:
  return hr;
}
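The signature blob passed to DefineMemberRef in the listing above is just a short byte sequence: calling convention, parameter count, return type, then one entry per parameter. As a rough sketch (the helper BuildStaticMethodSig is hypothetical, and it only covers simple one-byte element types such as int or string), it could be assembled like this:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Values from corhdr.h (CorCallingConvention / CorElementType).
constexpr std::uint8_t kCallConvDefault   = 0x00; // IMAGE_CEE_CS_CALLCONV_DEFAULT
constexpr std::uint8_t kElementTypeVoid   = 0x01; // ELEMENT_TYPE_VOID
constexpr std::uint8_t kElementTypeI4     = 0x08; // ELEMENT_TYPE_I4
constexpr std::uint8_t kElementTypeString = 0x0E; // ELEMENT_TYPE_STRING

// Hypothetical helper: signature blob for a static method whose return type
// and parameter types are all simple one-byte element types.
std::vector<std::uint8_t> BuildStaticMethodSig(
    std::uint8_t returnType,
    const std::vector<std::uint8_t>& paramTypes)
{
    // The parameter count is stored as a compressed integer;
    // counts below 0x80 fit in a single byte.
    assert(paramTypes.size() < 0x80);

    std::vector<std::uint8_t> sig;
    sig.push_back(kCallConvDefault);
    sig.push_back(static_cast<std::uint8_t>(paramTypes.size()));
    sig.push_back(returnType);
    sig.insert(sig.end(), paramTypes.begin(), paramTypes.end());
    return sig;
}
```

Calling BuildStaticMethodSig(kElementTypeVoid, {kElementTypeI4}) gives the four bytes used for Log(int): 00 01 01 08. A mismatch between this blob and the real method signature is one of the mistakes that only shows up at runtime.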

A couple of notes:

Increasing MaxStack is not always necessary. Strictly speaking, it should be set to the maximum of the stack depth used by the injected code and the stack depth used by the existing code. The current code is still safe, though, because the resulting MaxStack is always at least as large as the stack actually used.
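Here is a minimal, self-contained sketch of that "max" variant, working on a plain byte buffer instead of the profiler interfaces. The function PrependIL and its parameters are hypothetical. It assumes the fat header layout (16 bits of flags and header size, 16-bit MaxStack, 32-bit CodeSize, 32-bit local signature token) and that the injected code starts and ends with an empty stack. Methods whose header announces extra data sections (exception handling tables, for instance) are rejected, because those sections follow the code on a 4-byte boundary and would have to be relocated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

constexpr std::uint32_t kFormatMask = 0x3; // low two bits select tiny/fat
constexpr std::uint32_t kFatFormat  = 0x3; // CorILMethod_FatFormat
constexpr std::uint32_t kMoreSects  = 0x8; // CorILMethod_MoreSects

// Read/write an n-byte little-endian value at offset `off`.
static std::uint32_t ReadLE(const Bytes& b, std::size_t off, int n) {
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    return v;
}
static void WriteLE(Bytes& b, std::size_t off, int n, std::uint32_t v) {
    for (int i = 0; i < n; ++i)
        b[off + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Hypothetical helper: returns a copy of `method` with `injected` placed in
// front of the original code, or nullopt if the method is not supported.
std::optional<Bytes> PrependIL(const Bytes& method, const Bytes& injected,
                               std::uint16_t injectedDepth) {
    assert(!injected.empty());
    if (method.size() < 12) return std::nullopt;       // too small for a fat header

    const std::uint32_t flagsAndSize = ReadLE(method, 0, 2);
    const std::uint32_t flags = flagsAndSize & 0x0FFF;
    const std::size_t headerBytes = (flagsAndSize >> 12) * 4; // size is in DWORDs
    const std::uint32_t maxStack = ReadLE(method, 2, 2);
    const std::uint32_t codeSize = ReadLE(method, 4, 4);

    if ((flags & kFormatMask) != kFatFormat) return std::nullopt;
    if (flags & kMoreSects) return std::nullopt;       // extra sections not handled
    if (headerBytes < 12 || headerBytes + codeSize > method.size())
        return std::nullopt;

    Bytes out(method.begin(), method.begin() + headerBytes);
    // MaxStack becomes the larger of the two depths instead of a blind +1.
    WriteLE(out, 2, 2, std::max<std::uint32_t>(maxStack, injectedDepth));
    WriteLE(out, 4, 4, codeSize + static_cast<std::uint32_t>(injected.size()));
    out.insert(out.end(), injected.begin(), injected.end());
    out.insert(out.end(), method.begin() + headerBytes,
               method.begin() + headerBytes + codeSize);
    return out;
}
```

With the injected sequence from this post (ldarg.0 followed by call), injectedDepth would be 1, so a method that already declares a MaxStack of 2 keeps that value.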

If you mistype the name of the assembly, class or method, you get quite interesting errors at runtime, and they are actually quite helpful. My thanks to the CLR team for that.


Here is the code for the assembly containing the "instrumentation" (Logger.cs):

using System;

namespace DumkyNamespace
{
  public class Logger
  {
   public static void Log(int i)
   {
    Console.WriteLine("Log!" + i);
   }
  }
}

You can compile it with "csc /t:library Logger.cs".

And as usual the code to be instrumented (Hello.cs):

using System;

public class Hello
{
  public static void Main(string[] prms)
  {
   Console.WriteLine("main!");
   blah(4,5);
  }

  public static void blah(int i, int j) {
   Console.WriteLine("blah!");
   Console.WriteLine(i);
   Console.WriteLine(j);
  }
}

When the blah method gets JITed, a call to Logger::Log(i) is added, so you get the following output:

main!
Log!4
blah!
4
5

______________________________________

MSDN Mag has a good article on IL modification at runtime at http://msdn.microsoft.com/msdnmag/issues/03/09/NETProfilingAPI/default.aspx (Rewrite MSIL Code on the Fly with the .NET Framework Profiling API -- MSDN Magazine, September 2003).

Also, I generated a new VS.Net ATL/COM project for this, to use the "standard" COM template. Matt's DNProfiler codebase was causing me some problems when trying to use some ATL includes. The VS.Net 2002 project is available via anonymous CVS access (see AnonCVS blog entry).

Posted by: Dumky (September 7, 2003 08:43 PM) ______________________________________

I wonder what happens when the dll is GACed or strongly named. Will the CLR verify the bytecode before calling JITCompilationStarted? If verification is performed prior to the execution of the actual IL code, the dynamic injection of IL may fail. Do you think this is the case? If so, what workaround can you think of? Thanks

Posted by: xin (March 27, 2004 02:49 PM) ______________________________________

Hi,

Do you know if there exists an unmanaged component that can pass the original IL to a C# application to be instrumented?

Posted by: AlexB (May 8, 2004 03:05 PM) ______________________________________

I can't seem to get this working. I've modified the existing DNProfiler with the code suggested here and it wouldn't work, so I downloaded the project provided in part II and it wouldn't work. I'm getting a System.InvalidProgramException. All I can seem to find about the error is that it means that the MSIL is corrupt, which kind of makes sense considering I am modifying the MSIL, however the changes made here should work. Any suggestions?

Thanks.

Posted by: JoshD (May 13, 2004 04:24 PM) ______________________________________

Josh, I'd recommend that you look closely at the IL that you generate. Maybe a bytecode you introduced is wrong. Also, you may need to modify the MaxStack.
What kind of modifications are you making? Start with no modification and then add them back in little by little.
Does the exception contain any more information?

Posted by: Dumky (May 14, 2004 11:11 AM) ______________________________________

Hi,

An excellent series of articles.

I've been experimenting myself with introducing IL at runtime for profiling the app. So far I've been able to
- modify the offsets for correct branch instructions
- modify the SEH sections to adjust the try and catch block offsets and length

The only place where the program fails is when an exception does occur and control moves to the exception block.

I take care that the first IL instruction in the catch block pops the element (the exception object) off the stack and stores it in a location. It's only after that that I introduce my call for instrumentation. However, I receive a runtime error.

Any help / suggestions??

Rgds

Nilesh

Posted by: Nilesh Malpekar (June 26, 2005 06:26 PM)