Thursday, July 2, 2009

.NET corrupts heap when marshalling LPSTR

(Hi Lizzie)

I've just spent the best part of a week trying to track down why my bridge from C# to C++ was working on Windows XP but crashing sporadically on Windows 7, and the answer is that .NET marshalling is trickier than you think for strings.

Essentially, I had this:

[DllImport("mydll.dll", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.Cdecl)]
[return: MarshalAs(UnmanagedType.LPStr)]
private static extern String LookupCorrespondingString(Int32 key);


and in the DLL, I had

__declspec(dllexport) const char *LookupCorrespondingString(int key);

Whenever I called this, it would get all the way into my DLL and I could tell it was about to return a value, but it would crash during the return. When I ran it in the debugger, I got output messages about how memory was being freed into the wrong heap.

Eventually I found this article:
https://blogs.msdn.com/dsvc/archive/2009/06/22/troubleshooting-pinvoke-related-issues.aspx
which contained the useful quote (emphasis mine):
"When a string buffer allocated by native code is marshaled to managed code, CLR Interop marshaller will allocate a managed string object and copy the contents of native buffer to the managed string object. Now in order to prevent a potential memory leak, CLR Interop Marshaller will try to free allocated native memory. It does so by calling CoTaskMemFree. The decision to call CoTaskMemFree is by-design. This can at times lead to crash, if memory was allocated by the called-native function using any API other than CoTaskMemAlloc family of API’s as custom allocators may allocate on different heaps."

And there was the answer. .NET was freeing the block of memory I had passed to it to help me "prevent a potential leak". The underlying problem is that the MarshalAs() attribute has no way to express the 'const-ness' of the DLL entry point's return value, so the marshaller assumes it owns the buffer.

The solution was to declare the entry point differently:

[DllImport("mydll.dll", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.Cdecl)]
private static extern IntPtr LookupCorrespondingString(Int32 key);


and then when I call it, do the marshalling explicitly.

String s = Marshal.PtrToStringAnsi( LookupCorrespondingString(k) );

After I bitched and moaned about how this stuff isn't documented anywhere, Manish Jawa kindly pointed out that it is, in fact, documented in the very first sentence on this page: http://msdn.microsoft.com/en-us/library/f1cf4kkz.aspx - you can't ask for more than that.

Actually, you can and they give it to you here: http://msdn.microsoft.com/en-us/library/x3txb6xc.aspx - the problem I was experiencing and its solution spelled out.

So, the moral of the story is: if you pass native strings to .NET via the marshalling interface, make sure you use IntPtr and PtrToStringAnsi() unless you want them to be freed for you.

Wednesday, May 13, 2009

Embedding .NET in an otherwise native application

For reasons best known to the company I work for, we embed the .NET CLR inside our otherwise native C++ MFC-based application, and for the last two weeks I've been trying to nail down why pretty much everything seems to work except the windows where we embed the WebBrowser control via COM. No matter what I tried, IOleObject::SetClientSite() would fail and the web browser would display as an empty white rectangle on the desktop instead of in our window.

This guy seemed to have a very similar symptom, but no answer; however, it was a fair guess he had the same problem and it was something to do with threading.

One of the guys at Autodesk Developer Support (thanks again) pointed out that the CLR initialises the main thread to use MTA (multi-threaded apartment) mode, whereas the WebBrowser control really only works in STA (single-threaded apartment) mode. So, the solution was to ensure that I called CoInitialize(NULL) on our main thread before the CLR had a chance to mess things up. CoInitialize(NULL) puts the calling thread into an STA, and once a thread's apartment has been chosen, a later attempt to pick a different one fails rather than switching it.

Tuesday, March 31, 2009

".PDB files cannot be linked due to incompatible versions"

The message ".PDB files cannot be linked due to incompatible versions" can be displayed during builds, even if the .PDB file has just been created by the current compilation (i.e. it's just not possible that it was built by a different version of the compiler).

The problem is that the mspdbsrv.exe process (which Microsoft uses to create PDB files) does not always terminate after linking. If you have a mixed environment where you compile and link some modules with VC7.0 and then some with VC8.0, sometimes the 8.0 mspdbsrv.exe process hangs around and interferes with subsequent 7.0 compiles.

Solution: use the Task Manager to terminate it.
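
If clicking through the Task Manager gets tedious, the same step can be scripted. What follows is only a sketch, not something from the original build setup: the helper names kill_command and kill_mspdbsrv are made up for illustration, and it assumes the standard Windows taskkill utility is on the PATH.

```python
import subprocess
import sys

PDB_SERVER = "mspdbsrv.exe"  # the lingering PDB server process named above


def kill_command(image_name=PDB_SERVER):
    """Build the taskkill command line (hypothetical helper)."""
    # /F = force termination, /IM = select processes by image (executable) name
    return ["taskkill", "/F", "/IM", image_name]


def kill_mspdbsrv():
    """Terminate any running mspdbsrv.exe and return taskkill's exit code."""
    if not sys.platform.startswith("win"):
        return None  # taskkill only exists on Windows; do nothing elsewhere
    # A nonzero exit code usually just means no such process was running
    return subprocess.call(kill_command())


if __name__ == "__main__":
    kill_mspdbsrv()
```

Running this as a pre-build step before the VC7.0 compiles would be one way to use it, assuming nothing else on the machine is in the middle of an 8.0 link at the time.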

Building MFC applications with Visual Studio .NET 2008 Express

For reasons best known to themselves, Microsoft really really don't want you to build traditional MFC-based applications with their new Developer Studio tools - if you want the free stuff, you gotta trade in those testicles and do things with .NET or .nothing.  None of the MFC libraries come in the latest Express (read: free) versions of Developer Studio, nor do such useful tools as RC and MIDL.

However...

You can get RC and MIDL from the Platform SDK.

And you can get an old version of the MFC and ATL libraries from the Driver Development Kit, though in true Microsoft style, those headers contain compile errors.

Lines 1034 and 1036 of …\include\mfc42\afxwin1.inl will probably contain the following:
_AFXWIN_INLINE CMenu::operator==(const CMenu& menu) const
{ return ((HMENU) menu) == m_hMenu; }
_AFXWIN_INLINE CMenu::operator!=(const CMenu& menu) const
{ return ((HMENU) menu) != m_hMenu; }

which is incorrect: neither operator declares a return type, so they generate error C4430 (missing type specifier). It should be:

_AFXWIN_INLINE BOOL CMenu::operator==(const CMenu& menu) const
{ return ((HMENU) menu) == m_hMenu; }
_AFXWIN_INLINE BOOL CMenu::operator!=(const CMenu& menu) const
{ return ((HMENU) menu) != m_hMenu; }

(And people blame Windows crashes on "bad drivers". How hard must it be to write a driver when the DDK has such blatant errors? They literally cannot have tried to compile with these headers before shipping them.)

Sadly, what's discussed here only works up to VC8.0. At 9.0 the ATL changes are too drastic, as I found out the hard way. If you want to use MFC or ATL, you can't use Visual Studio 2008 Express.

Friday, January 9, 2009

PDF to JPEG conversion

Various PDFs collected from around the net would be better off as individual image files. You'd think there'd be a standard tool to convert them, but I couldn't find any at a price point I was interested in. Fortunately, the Python that ships with OS X has access to CoreGraphics, which can do the heavy lifting.

#!/usr/bin/python
# Python 2 script; needs the CoreGraphics module that comes with the OS X system Python.
import sys, re, os, os.path
from CoreGraphics import *

def doit(pdfname):
    # Only handle names ending in ".pdf" (dot escaped so it matches literally)
    if not re.search(r"\.pdf$", pdfname): return
    print pdfname
    # Pages go into a directory named after the PDF, minus the extension
    dirname = re.sub(r"\.pdf$", "", pdfname)
    try:
        os.mkdir(dirname)
    except OSError:
        print "Can't create directory '%s'" % (dirname)
        return
    pdf = CGPDFDocumentCreateWithProvider(CGDataProviderCreateWithFilename(pdfname))
    cs = CGColorSpaceCreateDeviceRGB()
    bg = CGFloatArray(5)       # creates an array of 5 0's which is good enough for me
    for i in range(1, pdf.getNumberOfPages() + 1):   # pages are numbered from 1
        page = pdf.getPage(i)
        r = page.getBoxRect(kCGPDFMediaBox)
        h = r.getHeight()
        w = r.getWidth()
        del page

        # Older call that passed a plain tuple as the background colour:
        #c = CGBitmapContextCreateWithColor(int(w), int(h), cs, (0,0,0,0))
        c = CGBitmapContextCreateWithColor(int(w), int(h), cs, bg)
        c.saveGState()
        c.setInterpolationQuality(kCGInterpolationHigh)
        c.drawPDFDocument(r, pdf, i)
        c.restoreGState()
        # One JPEG per page: page0001.jpg, page0002.jpg, ...
        c.writeToFile(os.path.join(dirname, "page%04d.jpg" % i), kCGImageFormatJPEG)
        del c
    del cs
    del pdf

if __name__ == '__main__':
    for a in sys.argv[1:]: doit(a)

The original version of this script was broken by Snow Leopard (which upgraded Python to 2.6.1). The call to CGBitmapContextCreateWithColor() failed with an error message about the 4th argument which it seems to think shouldn't be a 'const float[5]'.

The solution is to pass in a CGFloatArray() object instead. I haven't been able to modify one of those, but the default that's produced when you use 'bg = CGFloatArray(5)' appears to be good enough. Those objects still look leaky as hell but what are ya gonna do?

Squirrel:~ jeff$ python
Python 2.6.1 (r261:67515, Jul  7 2009, 23:51:51)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from CoreGraphics import CGFloatArray
>>> a = CGFloatArray(5)
>>> print repr(a)
<CoreGraphics.CGFloatArray; proxy of <Swig Object of type 'CGFloatArray *' at 0x2287a0> >
>>> print repr(a[0])
swig/python detected a memory leak of type 'CGFloat *', no destructor found.
<Swig Object of type 'CGFloat *' at 0x224d10>
>>>
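
The file-naming half of the conversion (a directory named after the PDF, with pages written as page0001.jpg, page0002.jpg and so on) can be tried out without CoreGraphics at all. The sketch below is an illustration rather than part of the script: page_paths is a made-up helper name, it uses os.path.splitext where the script uses regular expressions, and it should run under both Python 2 and Python 3.

```python
import os.path


def page_paths(pdfname, page_count):
    """Return (dirname, [page file paths]) for a PDF, or None for other files.

    Hypothetical helper: it only computes names and touches no files.
    """
    root, ext = os.path.splitext(pdfname)
    if ext != ".pdf":
        return None  # skip anything without a .pdf extension
    # Pages are numbered from 1 and zero-padded to four digits
    pages = [os.path.join(root, "page%04d.jpg" % i)
             for i in range(1, page_count + 1)]
    return root, pages
```

For example, page_paths("manual.pdf", 2) gives the directory name "manual" and two paths inside it, ending in page0001.jpg and page0002.jpg, while page_paths("notes.txt", 3) gives None.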

Brave New World

Everyone else seems to be getting on to this bandwagon, so why shouldn't I?

I doubt I'll have anything interesting to say to the world - I'm far more likely to use this to record notes to myself that may conceivably be useful to others.  And unlike most bloggers, I'm going to go back and edit anything that needs fixing rather than posting new articles.