Curiosity is bliss

Julien Couvreur's programming blog and more

Modifying IL at runtime (step II)

 

A couple of days back, we tweaked the running IL a little bit. Today, let's modify it some more!

We'll insert a method call at the beginning of the body of the Main method. The method we'll call is as easy as can be: it is part of the same class, is static and has a void() signature.

The IL used during the modification is still hardcoded. But we'll try to start moving away from that by exploring the metadata, to try and find the method token at runtime. The first step of this is to list all the methods on the current class and print out their names, which we'll see how to do.

You'll need a running copy of the DNProfiler tool to try the code provided and I recommend that you read my previous blog on the subject if you haven't used the Profiler APIs before.

First, let's look at the class we'll play with:

using System;

public class Hello
{
  public static void Main(string[] prms)
  {
   Console.WriteLine("test");
  }

  public static void Log(){
   Console.WriteLine("log!");
  }
}

One of the goals is to modify the Main method just before it gets JITed and insert a call to Log. The other goal is to also list all the methods on the Hello class, using the unmanaged metadata API.


References
The more I look, the more information I find in the "Tool Developers Guide" that comes with the SDK. It is too bad that it isn't part of MSDN and that all the documentation files are Word documents.
Here is my pdf copy of the \FrameworkSDK\Tool Developers Guide\docs\Metadata Unmanaged API.doc SDK document: CLR Metadata Unmanaged API (Tool Developers Guide) (1Mb).
Although I didn't spend much time digging in the samples that come in this guide, they really seem promising: a profiler, a metadata inspector, a command-line debugger and a couple compilers.

A great set of slides on MetaData internals by John Lam. It describes the metadata framework as well as the managed and unmanaged APIs.


Inserting a method call
When you look at a disassembled class (with "ildasm /all"), you can see the actual bytes of the IL. You'll notice that many instructions (like ldstr and call) take the form /* xx | (yy) zz zz zz */. It turns out that xx is the opcode and (yy) zz zz zz is the operand. For instructions like ldstr and call, the operand is a metadata token that represents a resource (a string, a method). Other instructions take other kinds of operands, such as integer constants or branch offsets, or none at all (like ret). Within a module, all uses of the same resource are referenced using the same token.

By looking at the Log method in "ildasm /all Hello.exe" (see below for a copy of the output), we find the method token that identifies this method. In this case it is 06 00 00 02.
As you can probably guess, 06 00 00 01 and 06 00 00 03 are also valid method tokens for this class. As the profiler output further down shows, they refer to the Main and .ctor (constructor) methods, respectively.

.method /*06000002*/ public hidebysig static void Log() cil managed

// SIG: 00 00 01

{

  // Method begins at RVA 0x2068

  // Code size 11 (0xb)

  .maxstack 1

  IL_0000: /* 72 | (70)00000B */ ldstr "log!"

  IL_0005: /* 28 | (0A)000002 */ call void [mscorlib/* 23000001 */]System.Console/* 01000003 */::WriteLine(string)

  IL_000a: /* 2A | */ ret

} // end of method Hello::Log

The first byte (06) means this is a metadata token that represents a method (mdtMethodDef). You'll find the other prefixes in the CorTokenType enumeration.
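
To make the token layout concrete, here is a minimal sketch that splits a token into its table type (high byte) and row id (low three bytes). The enum and function names are my own stand-ins, not the SDK's; the numeric values are the ones visible in the ildasm output above. It uses plain standard types so it compiles without the SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical local names for a few CorTokenType values; the numbers match
// the prefixes that appear in the ildasm listing of this post.
enum TokenType : uint32_t {
    kTypeRef     = 0x01000000,  // mdtTypeRef,     e.g. System.Console /* 01000003 */
    kMethodDef   = 0x06000000,  // mdtMethodDef,   e.g. Hello::Log     /* 06000002 */
    kMemberRef   = 0x0A000000,  // mdtMemberRef,   e.g. WriteLine      (0A)000002
    kAssemblyRef = 0x23000000,  // mdtAssemblyRef, e.g. mscorlib       /* 23000001 */
    kString      = 0x70000000   // mdtString,      e.g. "log!"         (70)00000B
};

// High byte: which metadata table the token points into.
inline uint32_t tokenType(uint32_t token) { return token & 0xFF000000u; }

// Low three bytes: the row id (RID) inside that table.
inline uint32_t tokenRid(uint32_t token) { return token & 0x00FFFFFFu; }

// Build a token back from its two parts.
inline uint32_t makeToken(TokenType type, uint32_t rid) {
    assert(rid <= 0x00FFFFFFu);  // a RID must fit in three bytes
    return static_cast<uint32_t>(type) | rid;
}
```

With these helpers, tokenType(0x06000002) gives the mdtMethodDef prefix and tokenRid(0x06000002) gives 2, which is the second row of the method table.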

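We also notice that 28 is the IL opcode for call. So what we need to do is insert a call instruction with the token 06 00 00 02 at the beginning of Main. In the IL stream the token is stored little-endian, so the bytes actually written are 28 02 00 00 06.
If you modify the code in Hello.cs to call Log, you'll see this same instruction, shown as /* 28 | (06)000002 */, in the disassembly.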
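
As a small illustration of the byte layout, the sketch below encodes a call instruction by hand: the opcode byte first, then the token with its lowest byte first (little-endian). The function name encodeCall is hypothetical; the packed struct used later in this post produces the same five bytes on x86.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encode "call <methodToken>" as it appears in the IL stream:
// opcode 0x28 followed by the 4-byte token, low byte first.
inline std::array<uint8_t, 5> encodeCall(uint32_t methodToken) {
    // The operand of call is a method token: MethodDef (0x06),
    // MemberRef (0x0A) or MethodSpec (0x2B).
    const uint32_t table = methodToken >> 24;
    assert(table == 0x06 || table == 0x0A || table == 0x2B);

    return {{
        0x28,
        static_cast<uint8_t>(methodToken & 0xFF),
        static_cast<uint8_t>((methodToken >> 8) & 0xFF),
        static_cast<uint8_t>((methodToken >> 16) & 0xFF),
        static_cast<uint8_t>((methodToken >> 24) & 0xFF)
    }};
}
```

Calling encodeCall(0x06000002) yields 28 02 00 00 06, the same five bytes that show up at the start of the modified IL dump at the end of this post.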

In the JITCompilationStarted code below, notice how we generate this sequence by using a packed structure containing a BYTE and a DWORD.
Be careful with two things when you insert IL instructions: you need to allocate more space for the IL (see the Alloc call) and update the CodeSize in the IL method header.
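
Here is a stand-alone sketch of those two steps on a plain byte buffer, without the profiler API. It assumes a fat method header (12 bytes here, with CodeSize stored little-endian at offset 4) and a method with no extra data sections such as exception tables after the code. The function name prependIL is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return a copy of `method` (fat header + IL) with `patch` inserted
// in front of the existing IL and CodeSize updated accordingly.
inline std::vector<uint8_t> prependIL(const std::vector<uint8_t>& method,
                                      const std::vector<uint8_t>& patch) {
    assert(method.size() >= 12);
    assert((method[0] & 0x03) == 0x03);  // low bits 11 = fat header format

    // Header size is the top nibble of the second byte, counted in DWORDs.
    const std::size_t headerBytes = static_cast<std::size_t>(method[1] >> 4) * 4;
    const uint32_t codeSize = static_cast<uint32_t>(method[4]) |
                              (static_cast<uint32_t>(method[5]) << 8) |
                              (static_cast<uint32_t>(method[6]) << 16) |
                              (static_cast<uint32_t>(method[7]) << 24);
    assert(method.size() >= headerBytes + codeSize);

    // Step 1: a bigger buffer holding header, patch, then the original IL.
    std::vector<uint8_t> out;
    out.reserve(headerBytes + patch.size() + codeSize);
    out.insert(out.end(), method.begin(), method.begin() + headerBytes);
    out.insert(out.end(), patch.begin(), patch.end());
    out.insert(out.end(), method.begin() + headerBytes,
               method.begin() + headerBytes + codeSize);

    // Step 2: write the new CodeSize back into the copied header.
    const uint32_t newSize = codeSize + static_cast<uint32_t>(patch.size());
    for (int i = 0; i < 4; ++i) {
        out[4 + i] = static_cast<uint8_t>((newSize >> (8 * i)) & 0xFF);
    }
    return out;
}
```

In the real profiler the new buffer has to come from the allocator returned by GetILFunctionBodyAllocator, as in the full listing below; the vector here only stands in for that memory.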

The method call that we insert is pretty simple, as it doesn't require any parameters and doesn't return any value. In a future iteration of this blog, we'll look at more complex method calls.


Reflection via metadata
In the long run, we want to avoid hardcoding the method token, so we want to explore the available methods at runtime to figure out which method token we should use.

Getting an IMetaDataImport object (via ICorProfilerInfo::GetModuleMetaData method) will enable us to query the metadata for this module.
The documentation for IMetaDataImport describes EnumMethods as a way of enumerating all the methods on a class. You can then get details on each method using the GetMethodProps call.

The only trick is that the "mdTypeDef cl" parameter on EnumMethods isn't very well documented, and it wasn't obvious to me what should be passed in there. It turns out you need to pass in the token for the class you want to look at. One way of acquiring this token is by calling ICorProfilerInfo::GetClassIDInfo.
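
Once the tokens and names have been collected, picking the right token can be as simple as a lookup by name. The sketch below is only an illustration of that idea: MethodEntry and findMethodToken are hypothetical names, and the list is assumed to have been filled from EnumMethods and GetMethodProps. A name alone does not distinguish overloads, which is why comparing signatures is the more robust approach.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: one entry per token returned by EnumMethods,
// with the name filled in from GetMethodProps.
struct MethodEntry {
    uint32_t     token;
    std::wstring name;
};

// Return the token of the first method named `wanted`,
// or 0 (the nil token) if no method has that name.
inline uint32_t findMethodToken(const std::vector<MethodEntry>& methods,
                                const std::wstring& wanted) {
    for (const MethodEntry& m : methods) {
        if (m.name == wanted) {
            return m.token;
        }
    }
    return 0;
}
```

With the three methods of the Hello class in the list, looking up L"Log" would return 0x06000002, the value that is currently hardcoded in the inserted IL.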


The code

HRESULT CProfilerCallback::JITCompilationStarted(UINT functionId,
      BOOL fIsSafeToBlock)
{
  wchar_t wszClass[512];
  wchar_t wszMethod[512];

  HRESULT hr = S_OK;

  ClassID classId = 0;
  ModuleID moduleId = 0;
  mdToken tkMethod = 0;
  LPCBYTE pMethodHeader = NULL;
  ULONG iMethodSize = 0;


  if ( GetMethodNameFromFunctionId( functionId, wszClass, wszMethod ) )
  {
   ProfilerPrintf("JITCompilationStarted: %ls::%ls\n",wszClass,wszMethod);
  } else {
   ProfilerPrintf( "JITCompilationStarted\n" );
   goto exit;
  }
  if (wcscmp(wszClass, L"Hello") != 0 || wcscmp(wszMethod, L"Main") != 0) {
   goto exit;
  }

  //
  // Get the existing IL
  //
  hr = m_pICorProfilerInfo->GetFunctionInfo(functionId, &classId, &moduleId, &tkMethod );
  if (FAILED(hr))
   { goto exit; }

  hr = m_pICorProfilerInfo->GetILFunctionBody(moduleId, tkMethod, &pMethodHeader, &iMethodSize);
  if (FAILED(hr))
   { goto exit; }

  //
  // Print the existing IL
  //
  IMAGE_COR_ILMETHOD* pMethod = (IMAGE_COR_ILMETHOD*)pMethodHeader;
  COR_ILMETHOD_FAT* fatImage = (COR_ILMETHOD_FAT*)&pMethod->Fat;

  if(!fatImage->IsFat()) {
   goto exit;
  }

  ProfilerPrintf("\n");
  ProfilerPrintf("Flags: %X\n", fatImage->Flags);
  ProfilerPrintf("Size: %X\n", fatImage->Size);
  ProfilerPrintf("MaxStack: %X\n", fatImage->MaxStack);
  ProfilerPrintf ("CodeSize: %X\n", fatImage->CodeSize);
  ProfilerPrintf("LocalVarSigTok: %X\n", fatImage->LocalVarSigTok);
  ProfilerPrintIL(fatImage->GetCode(), fatImage->CodeSize);

  //
  // Get the "Log" method token reference
  //
  IMetaDataImport* pMetaDataImport = NULL;
  hr = m_pICorProfilerInfo->GetModuleMetaData(moduleId, ofRead, IID_IMetaDataImport,
      (IUnknown** )&pMetaDataImport);
  if (FAILED(hr))
   { goto exit; }

  // Get the typeDef token for the class
  mdToken tkClass = 0;
  hr = m_pICorProfilerInfo->GetClassIDInfo(classId, &moduleId, &tkClass );
  if (FAILED(hr))
   { goto exit; }

  // Get all methods tokens for the class
  HCORENUM enr = 0; // enumerator
  const int siz = 5; // size of arrays
  mdMethodDef rToks[siz]; // array to hold returned method tokens
  ULONG count; // count of tokens returned

  hr = pMetaDataImport->EnumMethods(&enr, tkClass, rToks, siz, &count);
  if (FAILED(hr)) { goto exit; }

  while(count > 0) {
   for(ULONG i = 0; i < count; i++) {
    ProfilerPrintf("tok:%X\n", rToks[i]);

    // Get metadata for this method
    mdTypeDef mdClassTok;
    wchar_t wszFunctionName[512];
    ULONG cchName = 0; // length of the returned name (renamed so it no longer shadows the enumeration count)
    DWORD dwAttr;
    PCCOR_SIGNATURE signature;
    ULONG signatureLen;
    ULONG ulCodeRVA;
    DWORD dwImplFlags;
    hr = pMetaDataImport->GetMethodProps(rToks[i], &mdClassTok, wszFunctionName, 512, &cchName,
      &dwAttr,
      &signature, &signatureLen, &ulCodeRVA, &dwImplFlags);
    if (FAILED(hr)) { goto exit; }

    fwprintf(m_pOutFile, L"function name: %ls\n", wszFunctionName);
   }

   hr = pMetaDataImport->EnumMethods(&enr, tkClass, rToks, siz, &count);
   if (FAILED(hr)) { goto exit; }
  }
  pMetaDataImport->CloseEnum(enr);
  pMetaDataImport->Release();

  //
  // Get the IL Allocator
  //
  IMethodMalloc* pIMethodMalloc = NULL;
  IMAGE_COR_ILMETHOD* pNewMethod = NULL;
  hr = m_pICorProfilerInfo->GetILFunctionBodyAllocator(moduleId, &pIMethodMalloc);
  if (FAILED(hr))
   { goto exit; }

  //
  // Inserted IL code
  //
#include <pshpack1.h>
  struct {
   BYTE call; DWORD method_token;
  } ILCode;
#include <poppack.h>

  ILCode.call = 0x28;
  ILCode.method_token = 0x06000002;


  //
  // Allocate IL space and copy the IL in it
  //
  pNewMethod = (IMAGE_COR_ILMETHOD*) pIMethodMalloc->Alloc(iMethodSize+sizeof(ILCode));
  if (pNewMethod == NULL)
   { goto exit; }
  COR_ILMETHOD_FAT* newFatImage = (COR_ILMETHOD_FAT*)&pNewMethod->Fat;


  //
  // Modify IL
  //
  // Copy the header
  memcpy((BYTE*)newFatImage, (BYTE*)fatImage, fatImage->Size * sizeof(DWORD));

  // Add a call to "Log"
  memcpy(newFatImage->GetCode(), &ILCode, sizeof(ILCode));

  // Copy the remaining of the method
  memcpy(newFatImage->GetCode() + sizeof(ILCode),
    fatImage->GetCode(),
    fatImage->CodeSize);


  // Update the code size
  newFatImage->CodeSize += sizeof(ILCode);


  // Print modified IL
  ProfilerPrintf("\n");
  ProfilerPrintf("Modified Flags: %X\n", newFatImage->Flags);
  ProfilerPrintf("Modified Size: %X\n", newFatImage->Size);
  ProfilerPrintf("Modified MaxStack: %X\n", newFatImage->MaxStack);
  ProfilerPrintf ("Modified CodeSize: %X\n", newFatImage->CodeSize);
  ProfilerPrintf("Modified LocalVarSigTok: %X\n", newFatImage->LocalVarSigTok);
  ProfilerPrintIL(newFatImage->GetCode(), newFatImage->CodeSize);

  // Push IL back in
  hr = m_pICorProfilerInfo->SetILFunctionBody(moduleId, tkMethod, (LPCBYTE) pNewMethod);
  if (FAILED(hr))
   { goto exit; }

  pIMethodMalloc->Release();

exit:
  return hr;
}

void CProfilerCallback::ProfilerPrintIL(byte* codeBytes, ULONG codeSize)
{
  // Print each IL byte as two hex digits; %02X pads values below 0x10 with a leading zero
  for(ULONG i = 0; i < codeSize; i++) {
   ProfilerPrintf("codeBytes[%u] = 0x%02X;\n", i, codeBytes[i]);
  }
}

If everything worked, calling Hello.exe in a console (with the profiler attached) should output "log!" then "test". Success !!

Here is a copy of the output in DNProfiler.out:

Initialize
JITCompilationStarted: Hello::Main

Flags: 13
Size: 3
MaxStack: 1
CodeSize: B
LocalVarSigTok: 0
codeBytes[0] = 0x72;
codeBytes[1] = 0x01;
codeBytes[2] = 0x00;
codeBytes[3] = 0x00;
codeBytes[4] = 0x70;
codeBytes[5] = 0x28;
codeBytes[6] = 0x02;
codeBytes[7] = 0x00;
codeBytes[8] = 0x00;
codeBytes[9] = 0x0A;
codeBytes[10] = 0x2A;

tok:6000001
function name: Main
tok:6000002
function name: Log
tok:6000003
function name: .ctor

Modified Flags: 13
Modified Size: 3
Modified MaxStack: 1
Modified CodeSize: 10
Modified LocalVarSigTok: 0
codeBytes[0] = 0x28;
codeBytes[1] = 0x02;
codeBytes[2] = 0x00;
codeBytes[3] = 0x00;
codeBytes[4] = 0x06;

codeBytes[5] = 0x72;
codeBytes[6] = 0x01;
codeBytes[7] = 0x00;
codeBytes[8] = 0x00;
codeBytes[9] = 0x70;
codeBytes[10] = 0x28;
codeBytes[11] = 0x02;
codeBytes[12] = 0x00;
codeBytes[13] = 0x00;
codeBytes[14] = 0x0A;
codeBytes[15] = 0x2A;
JITCompilationStarted: Hello::Log
Shutdown

You can see the inserted IL as well as the list of three methods available on the class Hello.
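
To read a dump like this without ildasm, a tiny decoder is enough. The sketch below only knows the three opcodes that appear in this post (ldstr 0x72, call 0x28, ret 0x2A) and stops at anything else, so treat it as an illustration of the opcode-plus-token layout rather than a real disassembler. The function name disassemble is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Decode a byte stream made of ldstr, call and ret instructions.
// Returns one text line per instruction; "???" marks the point where
// an unknown opcode or a truncated operand was found.
inline std::vector<std::string> disassemble(const std::vector<uint8_t>& il) {
    std::vector<std::string> lines;
    std::size_t i = 0;
    while (i < il.size()) {
        const uint8_t op = il[i];
        if (op == 0x2A) {  // ret has no operand
            lines.push_back("ret");
            i += 1;
            continue;
        }
        const char* name = nullptr;
        if (op == 0x72) name = "ldstr";
        if (op == 0x28) name = "call";
        if (name == nullptr || i + 4 >= il.size()) {
            lines.push_back("???");
            break;
        }
        // The 4-byte token operand is stored low byte first.
        const uint32_t token = static_cast<uint32_t>(il[i + 1]) |
                               (static_cast<uint32_t>(il[i + 2]) << 8) |
                               (static_cast<uint32_t>(il[i + 3]) << 16) |
                               (static_cast<uint32_t>(il[i + 4]) << 24);
        char buf[32];
        std::snprintf(buf, sizeof buf, "%s 0x%08X", name, static_cast<unsigned>(token));
        lines.push_back(buf);
        i += 5;
    }
    return lines;
}
```

Fed the 16 modified bytes above, this gives four instructions: a call to token 0x06000002 (Log), the ldstr of string token 0x70000001, the call to member reference 0x0A000002 (Console.WriteLine) and the final ret.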


A little problem
One problem I ran into is that while debugging in VS.net I would often get a "No source code is available for this location" warning message, after which I could no longer watch variables.
Let me know if you have any idea why this is occurring.
My workaround was to rely more on the strings written to DNProfiler.out.


To be continued...
In the next iteration, I'll try to analyze the signatures of the methods that are listed and find the token for the Log method by recognizing its signature.

All comments and suggestions are kindly welcome.

______________________________________

If you just need to get the name for a given token, you don't necessarily need to use the Get*Props call.
You can use IMetaDataImport::GetNameFromToken(token, name_pointer), as below:

MDUTF8CSTR name;
hr = pMetaDataImport->GetNameFromToken(tkMethod, &name);
ProfilerPrintf("function name: %s\n", name);

But you'll notice that instead of filling an array of wide characters (like GetMethodProps did), this call gives you a reference to the UTF8 representation of the name inside the runtime.

Posted by: Dumky (June 19, 2003 08:26 AM)

______________________________________

"No source code is available for this location"

You can get around this by bringing up the disassembly window. That's what I do here...
http://weblogs.asp.net/nunitaddin/posts/8580.aspx

Posted by: Jamie Cansdale (July 27, 2003 05:08 AM)

______________________________________

Actually, in my case, the error occurs while debugging the C++ dll in native mode.

The Reflector is cool stuff though ;-)

Posted by: Dumky (July 27, 2003 11:58 AM)