<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Justin&#039;s Code Haus</title>
	<atom:link href="http://jholewinski.org/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://jholewinski.org/blog</link>
	<description>... Compilers, Graphics, Games, Hardware</description>
	<lastBuildDate>Sat, 17 Mar 2012 04:22:35 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>UnrealScript: Brace Placement Matters!</title>
		<link>http://jholewinski.org/blog/unrealscript-brace-placement-matters/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=unrealscript-brace-placement-matters</link>
		<comments>http://jholewinski.org/blog/unrealscript-brace-placement-matters/#comments</comments>
		<pubDate>Sat, 17 Mar 2012 04:22:35 +0000</pubDate>
		<dc:creator>jholewinski</dc:creator>
				<category><![CDATA[Unreal]]></category>

		<guid isPermaLink="false">http://jholewinski.org/blog/?p=136</guid>
		<description><![CDATA[I was playing around with the Unreal Development Kit this evening, and discovered a rather interesting quirk in the handling of braces within UnrealScript.  All of the sample code I read uses a syntax style that places opening braces on &#8230;<p class="read-more"><a href="http://jholewinski.org/blog/unrealscript-brace-placement-matters/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<p>I was playing around with the Unreal Development Kit this evening, and discovered a rather interesting quirk in the handling of braces within UnrealScript.  All of the sample code I read uses a syntax style that places opening braces on the following line:</p>
<pre class="crayon-plain-tag"><code>event PostBeginPlay()
{
  // Do something
}</code></pre>
<p>However, my typical style places the opening brace on the current line:</p>
<pre class="crayon-plain-tag"><code>event PostBeginPlay() {
  // Do something
}</code></pre>
<p>Unfortunately, this does not seem to work for <code>defaultproperties</code> blocks. If you place the brace on the same line, the compiler emits no warnings or errors, but it silently ignores the entire <code>defaultproperties</code> block!</p>
<p>So this code works:</p>
<pre class="crayon-plain-tag"><code>defaultproperties
{
  PlayerControllerClass=class'MyPlayerController'
}</code></pre>
<p>while the following code compiles but silently ignores all of the contained settings:</p>
<pre class="crayon-plain-tag"><code>defaultproperties {
  PlayerControllerClass=class'MyPlayerController'
}</code></pre>
<p>I was banging my head on the wall for at least an hour figuring this one out!</p>
<p>I hope this can help prevent someone else from repeating my mistake.</p>
]]></content:encoded>
			<wfw:commentRss>http://jholewinski.org/blog/unrealscript-brace-placement-matters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Space Hogs Binary Release</title>
		<link>http://jholewinski.org/blog/space-hogs-binary-release/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=space-hogs-binary-release</link>
		<comments>http://jholewinski.org/blog/space-hogs-binary-release/#comments</comments>
		<pubDate>Fri, 17 Feb 2012 17:35:30 +0000</pubDate>
		<dc:creator>jholewinski</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://jholewinski.org/blog/?p=120</guid>
		<description><![CDATA[I&#8217;ve converted my old Space Hogs game project to XNA 4.0 (it was originally written in XNA 1.0). There were enough API changes to make it a pain, but I think I have everything working now. This game was developed &#8230;<p class="read-more"><a href="http://jholewinski.org/blog/space-hogs-binary-release/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve converted my old Space Hogs game project to XNA 4.0 (it was originally written in XNA 1.0).  There were enough API changes to make it a pain, but I think I have everything working now.</p>
<p>This game was developed by Jason Kim, Joseph Ahn, Vjekoslav Kovacevic, Daniel Guinn, and me for a computer animation class during Winter Quarter 2007.</p>
<p><a href="http://jholewinski.org/blog/wp-content/uploads/2012/02/spacehogs-screen1.png"><img src="http://jholewinski.org/blog/wp-content/uploads/2012/02/spacehogs-screen1.png" alt="" title="spacehogs-screen1" width="1280" height="720" class="aligncenter size-full wp-image-80" /></a></p>
<p>You can find a zip file <a href="http://jholewinski.org/static/SpaceHogs.zip">here</a>.  This requires XNA 4.0 and the February 2010 DX packages to be installed on your machine.  For convenience, I&#8217;ve included both of the redistributable packages in the zip file.</p>
<p>The source can be found on <a href="https://bitbucket.org/jholewinski/space-hogs" target="_blank">BitBucket</a>.</p>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://jholewinski.org/blog/space-hogs-binary-release/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Direct3D 11 with Qt 4</title>
		<link>http://jholewinski.org/blog/direct3d-11-with-qt-4/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=direct3d-11-with-qt-4</link>
		<comments>http://jholewinski.org/blog/direct3d-11-with-qt-4/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 20:12:30 +0000</pubDate>
		<dc:creator>jholewinski</dc:creator>
				<category><![CDATA[Direct3D]]></category>
		<category><![CDATA[Qt]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://jholewinski.org/blog/?p=99</guid>
		<description><![CDATA[(If you&#8217;re in a hurry, the full source can be found on my BitBucket account) When it comes to GUI frameworks for C++, it&#8217;s very hard to beat Qt.  It&#8217;s modular, easy to use, and available on practically any desktop &#8230;<p class="read-more"><a href="http://jholewinski.org/blog/direct3d-11-with-qt-4/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<p>(If you&#8217;re in a hurry, the full source can be found on my <a href="https://bitbucket.org/jholewinski/qt4-d3d11" target="_blank">BitBucket</a> account)</p>
<p>When it comes to GUI frameworks for C++, it&#8217;s very hard to beat <a href="http://qt.nokia.com" target="_blank">Qt</a>.  It&#8217;s modular, easy to use, and available on practically any desktop system (and even a few mobile systems).  The MOC&#8217;ing can get a bit annoying, but IDE and command-line support is very mature at this point.  However, only OpenGL is supported currently for real-time 3D rendering. If you want to render to a Qt widget from a Direct3D 11 device, you end up having to do a lot of setup yourself.</p>
<p>Unfortunately, there is not a lot of information on the internet about setting up Direct3D to play nicely with Qt.  Most of it is either outdated or only applies to Direct3D 9.  Lately, I&#8217;ve been playing around with this and I want to share my method for combining Direct3D 11 and Qt.</p>
<p><a href="http://jholewinski.org/blog/wp-content/uploads/2012/02/qtd3d11-screen1.png"><img src="http://jholewinski.org/blog/wp-content/uploads/2012/02/qtd3d11-screen1.png" alt="" title="qtd3d11-screen1" width="906" height="766" class="aligncenter size-full wp-image-87" /></a></p>
<p>&nbsp;</p>
<h3>Creating a Widget</h3>
<p>To start, we define a new widget sub-class specifically for Direct3D 11 rendering. On the Qt side, the key to eliminating flickering or UI artifacts is the <code>paintEngine()</code> method.  We need a way to tell Qt that we want complete control over drawing for our widget, so we can override <code>paintEngine()</code> in our widget definition:</p>
<pre class="crayon-plain-tag"><code>class D3DRenderWidget : public QWidget {
  Q_OBJECT
  Q_DISABLE_COPY(D3DRenderWidget)
public:
  D3DRenderWidget(QWidget* parent = NULL);
  virtual ~D3DRenderWidget();
  virtual QPaintEngine* paintEngine() const { return NULL; }
protected:
  virtual void resizeEvent(QResizeEvent* evt);
  virtual void paintEvent(QPaintEvent* evt);
};</code></pre>
<p>(Note that for ease of viewing, all of the fields have been removed from this code snippet)</p>
<p>We also need to set a few attributes on our widget, as shown in the constructor:</p>
<pre class="crayon-plain-tag"><code>D3DRenderWidget::D3DRenderWidget(QWidget* parent)
: QWidget(parent) {
  setAttribute(Qt::WA_PaintOnScreen, true);
  setAttribute(Qt::WA_NativeWindow, true);

  // Create Device
  createDevice();
}</code></pre>
<p>First, we tell Qt that we do not want it to do any draw buffering for us. Second, we require a native window handle for our widget. Otherwise, Qt may re-use the same native handle for multiple widgets and cause problems for our Direct3D rendering. You may have also noticed the <code>createDevice()</code> method call; this will be explained in a bit.</p>
<p>&nbsp;</p>
<h3>Creating the Direct3D 11 Device</h3>
<p>Now that we have a basic widget that can support Direct3D rendering, we can initialize the Direct3D 11 device we want. This procedure is mostly identical to setting up Direct3D in a raw window. The only difference is that we use the <code>width()</code> and <code>height()</code> methods to obtain the widget size, and the <code>winId()</code> method to obtain the native window handle:</p>
<pre class="crayon-plain-tag"><code>swapChainDesc_.BufferCount = 1;
swapChainDesc_.BufferDesc.Width = width();
swapChainDesc_.BufferDesc.Height = height();
swapChainDesc_.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
swapChainDesc_.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
swapChainDesc_.SampleDesc.Count = 4;
swapChainDesc_.SampleDesc.Quality = 0;
swapChainDesc_.Windowed = true;
swapChainDesc_.OutputWindow = winId();
swapChainDesc_.BufferDesc.RefreshRate.Numerator = 60;
swapChainDesc_.BufferDesc.RefreshRate.Denominator = 1;</code></pre>
<p>Everything else remains the same&#8230; pretty easy, huh? <img src='http://jholewinski.org/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
<p>&nbsp;</p>
<h3>Handling Paint Events</h3>
<p>Remember the <code>paintEvent</code> override from the widget class definition? We can simply implement it with a call to some rendering function:</p>
<pre class="crayon-plain-tag"><code>void D3DRenderWidget::paintEvent(QPaintEvent* evt) {
  render();
}</code></pre>
<p>Here, <code>render()</code> is just some arbitrary method that uses the Direct3D 11 device to render something to the primary swap chain.</p>
<p>&nbsp;</p>
<h3>Handling Resize Events</h3>
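<p>Resize events are perhaps the hardest events to handle when integrating Direct3D 11 and Qt. Before a swap chain can be resized, every outstanding reference to its back buffers (for example, render target views created on them) must be released. The sample takes the simple route of releasing all device-allocated resources and reallocating them afterwards. The procedure I follow is:</p>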
<pre class="crayon-plain-tag"><code>void D3DRenderWidget::resizeEvent(QResizeEvent* evt) {
  releaseBuffers();
  swapChain_-&gt;ResizeBuffers(1, width(), height(),
                            swapChainDesc_.BufferDesc.Format, 0);
  swapChain_-&gt;GetDesc(&amp;swapChainDesc_);
  viewport_.Width = width();
  viewport_.Height = height();
  createBuffers();
}</code></pre>
<p>We start by releasing all of the buffers we had allocated (vertex buffers, index buffers, shaders, textures, etc.).  We then issue a resize request to the swap chain, resize our rendering viewport, and then recreate all of our needed buffers. In this snippet, <code>releaseBuffers()</code> will call <code>Release()</code> on all buffers, and <code>createBuffers()</code> will create all of the needed resources (again).</p>
<p>It would probably be easier to let the swap chain only grow, and simply shrink the viewport when the widget shrinks, but this method shows how to keep the swap chain exactly the same size as the widget.</p>
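<p>As a rough illustration of that grow-only alternative, the sizing decision could be sketched as below. This is not code from the sample project: <code>BufferSize</code>, <code>ResizePlan</code>, and <code>planResize()</code> are hypothetical names, and the sketch only decides sizes. It does not touch the swap chain, and how to present only the viewport region of an oversized back buffer is left open.</p>

```cpp
#include <algorithm>

// Hypothetical helper types for illustration; not part of the sample project.
struct BufferSize {
  unsigned width;
  unsigned height;
};

struct ResizePlan {
  BufferSize buffers;   // size the swap chain buffers should have
  BufferSize viewport;  // size of the visible rendering viewport
  bool reallocate;      // true if the swap chain buffers must be resized
};

// Grow-only policy: the buffers never shrink, while the viewport always
// matches the current widget size.
ResizePlan planResize(BufferSize current, BufferSize widget) {
  ResizePlan plan;
  plan.buffers.width = std::max(current.width, widget.width);
  plan.buffers.height = std::max(current.height, widget.height);
  plan.viewport = widget;
  plan.reallocate = plan.buffers.width != current.width ||
                    plan.buffers.height != current.height;
  return plan;
}
```

<p>Under this assumption, <code>resizeEvent()</code> would only need the release/resize/recreate sequence when <code>reallocate</code> is true; otherwise it would just update the viewport.</p>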
<p>&nbsp;</p>
<h3>Conclusion</h3>
<p>At this point, you should have a functional Direct3D 11 rendering context for a Qt widget. For brevity, I have omitted most of the Direct3D initialization code (this can be found in many places on the web).</p>
<p>If you want to check out the complete sample program, it is located on my <a href="https://bitbucket.org/jholewinski/qt4-d3d11" target="_blank">BitBucket</a> account. To build it, you need a relatively recent Qt release, the DirectX SDK, and the Qt Visual Studio Add-in.</p>
]]></content:encoded>
			<wfw:commentRss>http://jholewinski.org/blog/direct3d-11-with-qt-4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>AMD APP: Getting Device Assembly</title>
		<link>http://jholewinski.org/blog/amd-app-getting-device-assembly/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=amd-app-getting-device-assembly</link>
		<comments>http://jholewinski.org/blog/amd-app-getting-device-assembly/#comments</comments>
		<pubDate>Thu, 09 Feb 2012 18:01:39 +0000</pubDate>
		<dc:creator>jholewinski</dc:creator>
				<category><![CDATA[GPU]]></category>
		<category><![CDATA[OpenCL]]></category>

		<guid isPermaLink="false">https://jholewinski.org/wordpress/?p=48</guid>
		<description><![CDATA[Sometimes it is useful to look at the intermediate and assembly code for GPU programs.  This can lead to some interesting performance insights, especially for compiler writers.  Unfortunately, the AMD APP SDK is a bit limited on Linux, and the &#8230;<p class="read-more"><a href="http://jholewinski.org/blog/amd-app-getting-device-assembly/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<p>Sometimes it is useful to look at the intermediate and assembly code for GPU programs.  This can lead to some interesting performance insights, especially for compiler writers.  Unfortunately, the AMD APP SDK is a bit limited on Linux, and the AMD APP KernelAnalyzer, which conveniently dumps the AMDIL and Device ISA for an OpenCL kernel, is not available on Linux.  However, digging through the AMD APP OpenCL Programming Guide, one finds an environment variable that can be used for the same purpose: <code>GPU_DUMP_DEVICE_KERNEL</code>.</p>
<p>According to the programming guide, this environment variable can take one of three values:</p>
<table style="width: 500px;" border="1" cellspacing="1" cellpadding="1">
<thead>
<tr>
<th>Value</th>
<th>Effect</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Save intermediate IL files in local directory.</td>
</tr>
<tr>
<td>2</td>
<td>Disassemble ISA file and save in local directory.</td>
</tr>
<tr>
<td>3</td>
<td>Save both the IL and ISA files in local directory.</td>
</tr>
</tbody>
</table>
<p>Therefore, if you run your OpenCL program with:</p>
<pre class="crayon-plain-tag"><code>$ GPU_DUMP_DEVICE_KERNEL=3 ./my-program</code></pre>
<p>you will get two files in your local directory: <code>[kernel-name]_[device-name].il</code> and <code>[kernel-name]_[device-name].isa</code>, which contain AMDIL and Device ISA disassembly, respectively.</p>
]]></content:encoded>
			<wfw:commentRss>http://jholewinski.org/blog/amd-app-getting-device-assembly/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LLVM 3.0: PTX Backend</title>
		<link>http://jholewinski.org/blog/llvm-3-0-ptx-backend/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=llvm-3-0-ptx-backend</link>
		<comments>http://jholewinski.org/blog/llvm-3-0-ptx-backend/#comments</comments>
		<pubDate>Fri, 02 Dec 2011 16:17:48 +0000</pubDate>
		<dc:creator>jholewinski</dc:creator>
				<category><![CDATA[GPU]]></category>
		<category><![CDATA[LLVM]]></category>
		<category><![CDATA[OpenCL]]></category>

		<guid isPermaLink="false">http://jholewinski.wordpress.com/?p=38</guid>
		<description><![CDATA[With the release of LLVM 3.0, the PTX back-end is now in a fairly usable state.  It even integrates with the Clang OpenCL front-end to produce correct PTX code usable by the nVidia OpenCL run-time.  However, please note that the &#8230;<p class="read-more"><a href="http://jholewinski.org/blog/llvm-3-0-ptx-backend/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<p>With the release of LLVM 3.0, the PTX back-end is now in a fairly usable state.  It even integrates with the Clang OpenCL front-end to produce correct PTX code usable by the nVidia OpenCL run-time.  However, please note that the back-end is still experimental and there are unimplemented features.  As always, please post any questions to the llvm-dev mailing list.</p>
<p>In this post, I aim to give a quick overview of how to use the back-end to compile OpenCL kernels.</p>
<p>As an example, consider the following matrix multiplication routine written in OpenCL:</p>
<pre class="crayon-plain-tag"><code>#define BLOCK_SIZE 16

__kernel
void matmul(__global float* A, __global float* B, __global float* C) {
  __local float scratchA[BLOCK_SIZE][BLOCK_SIZE];
  __local float scratchB[BLOCK_SIZE][BLOCK_SIZE];

  int globalX = get_global_id(0);
  int globalY = get_global_id(1);
  int size = get_global_size(0);
  int k;
  float sum = 0.0f;
  int numBlocks = size / BLOCK_SIZE;
  int b;

  int tidX = get_local_id(0);
  int tidY = get_local_id(1);

  for(b = 0; b &lt; numBlocks; ++b)
  {
    // Populate a cache for A/B
    int x;
    int y;

    x = b * BLOCK_SIZE + tidX;
    y = globalY;

    scratchA[tidY][tidX] = A[y * size + x];

    x = globalX;
    y = b * BLOCK_SIZE + tidY;

    scratchB[tidY][tidX] = B[y * size + x];

    barrier(CLK_LOCAL_MEM_FENCE);

    for(k = 0; k &lt; BLOCK_SIZE; ++k)
    {
      float myA;
      float myB;

      myA = scratchA[tidY][k];
      myB = scratchB[k][tidX];

      sum += myA * myB;
    }

    barrier(CLK_LOCAL_MEM_FENCE);
  }

  C[globalY * size + globalX] = sum;
}</code></pre>
<p>We can use the <a href="http://www.pcc.me.uk/~peter/libclc">libclc</a> library, written by Peter Collingbourne, to provide the OpenCL built-in functions for Clang.  This library will map OpenCL built-in functions to target-specific functions in the LLVM IR that the PTX back-end knows how to handle.  If <code>$LIBCLC</code> points to the download of libclc, then you can invoke Clang with:</p>
<pre class="crayon-plain-tag"><code>clang -ccc-host-triple ptx32 \
  -Xclang -target-feature -Xclang +ptx23 \
  -Xclang -target-feature -Xclang +sm20 \
  -I$LIBCLC/include/generic -I$LIBCLC/include/ptx \
  -include clc/clc.h -Dcl_clang_storage_class_specifiers \
  -O3 matmul_kernel.cl -S -o matmul_kernel.ptx</code></pre>
<p>The options can be a bit verbose at the moment, but practically all of them can be placed in a wrapper script.  Clang will compile the kernel and emit the generated PTX code to <code>matmul_kernel.ptx</code>.  This code can then be loaded as an OpenCL binary kernel with the nVidia OpenCL SDK, using the <code>clCreateProgramWithBinary</code> function.  As an added bonus, the performance is about the same as if the kernel were compiled using the nVidia OpenCL compiler!</p>
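<p>When trying this out, it can help to have a host-side reference to compare the kernel output against. The following is a minimal sketch, not part of the original experiment: <code>matmulReference()</code> and <code>kBlockSize</code> are hypothetical names, and it assumes square, row-major matrices whose size is a multiple of the block size, just as the kernel does. It walks the inner dimension block by block in the same order as the kernel, although small floating-point differences (for example from fused multiply-add) are still possible.</p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference; mirrors the kernel's BLOCK_SIZE tiling.
const std::size_t kBlockSize = 16;

// Computes C = A * B for square, row-major size x size matrices.
// Like the kernel, this assumes size is a multiple of kBlockSize.
std::vector<float> matmulReference(const std::vector<float>& A,
                                   const std::vector<float>& B,
                                   std::size_t size) {
  std::vector<float> C(size * size, 0.0f);
  const std::size_t numBlocks = size / kBlockSize;
  for (std::size_t y = 0; y < size; ++y) {
    for (std::size_t x = 0; x < size; ++x) {
      float sum = 0.0f;
      // Accumulate one block of the inner dimension at a time, in the
      // same order as the kernel's outer b loop and inner k loop.
      for (std::size_t b = 0; b < numBlocks; ++b) {
        for (std::size_t k = 0; k < kBlockSize; ++k) {
          const std::size_t idx = b * kBlockSize + k;
          sum += A[y * size + idx] * B[idx * size + x];
        }
      }
      C[y * size + x] = sum;
    }
  }
  return C;
}
```

<p>Comparing each element of the device result against this reference with a small tolerance is usually enough to catch indexing mistakes.</p>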
]]></content:encoded>
			<wfw:commentRss>http://jholewinski.org/blog/llvm-3-0-ptx-backend/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Installing Matplotlib on OS X 10.7 with Homebrew</title>
		<link>http://jholewinski.org/blog/installing-matplotlib-on-os-x-10-7-with-homebrew/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=installing-matplotlib-on-os-x-10-7-with-homebrew</link>
		<comments>http://jholewinski.org/blog/installing-matplotlib-on-os-x-10-7-with-homebrew/#comments</comments>
		<pubDate>Thu, 21 Jul 2011 18:32:16 +0000</pubDate>
		<dc:creator>jholewinski</dc:creator>
				<category><![CDATA[Homebrew]]></category>
		<category><![CDATA[Mac OS X]]></category>
		<category><![CDATA[matplotlib]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Mac OS X Lion]]></category>
		<category><![CDATA[MacOs]]></category>
		<category><![CDATA[Matplotlib]]></category>

		<guid isPermaLink="false">http://jholewinski.wordpress.com/?p=26</guid>
		<description><![CDATA[[edit: It looks like things have changed a bit since the release of 10.7, so your mileage may vary with this method.  This was written when 10.7 was brand new and most software was not yet updated for it.] For &#8230;<p class="read-more"><a href="http://jholewinski.org/blog/installing-matplotlib-on-os-x-10-7-with-homebrew/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<p>[edit: It looks like things have changed a bit since the release of 10.7, so your mileage may vary with this method.  This was written when 10.7 was brand new and most software was not yet updated for it.]</p>
<p>For those of you who do not know, <a href="http://matplotlib.sourceforge.net">Matplotlib</a> is an excellent Python plotting library that allows you to create professional-quality plots for inclusion on web pages, LaTeX documents, Beamer presentations, Keynote presentations, and any other software that can import SVG, EPS, PNG, or virtually any graphic format.</p>
<p>However, getting matplotlib installed on Mac OS X 10.7 can be a bit tricky, especially if you are using <a href="http://mxcl.github.com/homebrew/">Homebrew</a> as your &#8220;package manager.&#8221;  First off, Homebrew does not have packages for matplotlib or for some of its dependencies.  Additionally, the current Matplotlib release version (1.0.1 as of this post) does not compile out-of-the-box against libpng 1.5, which is included in the X11 distribution shipped with Mac OS X 10.7.</p>
<p>For previous versions of Mac OS X (10.6, 10.5), the usual way to install matplotlib was to install python, pkg-config, and gfortran with Homebrew, then install numpy and matplotlib through pip, like so:</p>
<pre class="crayon-plain-tag"><code>$ brew install python
$ brew install gfortran
$ brew install pkg-config
$ easy_install pip
$ pip install numpy
$ pip install matplotlib</code></pre>
<p>Unfortunately, as previously mentioned, all is not so easy in the world of Mac OS X 10.7, and the difficulty lies with libpng 1.5, installed with Mac OS X 10.7&#8217;s version of X11. Briefly put, Matplotlib 1.0.1 is not compatible with libpng 1.5 due to a change in the API. Fortunately, the fix is already applied upstream and will probably be a part of Matplotlib 1.0.2, or 1.1.0, or whatever the next released version is.</p>
<p>Until the next release, the Matplotlib sources in Git can be used. Instead of pulling the sources from the Matplotlib SourceForge site, you need to pull them from the Matplotlib GitHub site. I&#8217;m not sure if this GitHub site is &#8220;official,&#8221; but it looks to be.</p>
<p>All that is needed is to build Matplotlib from source instead of using pip, so the installation procedure is now:</p>
<pre class="crayon-plain-tag"><code>$ brew install python
$ brew install gfortran
$ brew install pkg-config
$ easy_install pip
$ pip install numpy
$ cd $HOME
$ git clone https://github.com/matplotlib/matplotlib.git
$ cd matplotlib
$ python setup.py build
$ python setup.py install</code></pre>
<p>And now you&#8217;re good to go! Hopefully this will become much easier with the next official release of Matplotlib.</p>
]]></content:encoded>
			<wfw:commentRss>http://jholewinski.org/blog/installing-matplotlib-on-os-x-10-7-with-homebrew/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>The Beauty of C++ Templates</title>
		<link>http://jholewinski.org/blog/the-beauty-of-c-templates/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-beauty-of-c-templates</link>
		<comments>http://jholewinski.org/blog/the-beauty-of-c-templates/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 16:50:08 +0000</pubDate>
		<dc:creator>jholewinski</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[Languages]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Templates]]></category>

		<guid isPermaLink="false">http://jholewinski.wordpress.com/?p=5</guid>
		<description><![CDATA[Every so often, I&#8217;ll get a random C++ question from a friend or colleague.  Most of the time the answers are trivial, at least for someone who has a history with the language.  Other questions make me stop and ponder, &#8230;<p class="read-more"><a href="http://jholewinski.org/blog/the-beauty-of-c-templates/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<p>Every so often, I&#8217;ll get a random C++ question from a friend or colleague.  Most of the time the answers are trivial, at least for someone who has a history with the language.  Other questions make me stop and ponder, searching for the best &#8220;C++&#8221; way to do something.  Yesterday, the question was simple and the solution turned out to be equally simple, but getting to the solution made me stop and appreciate some of the cool things one can do with C++ templates.</p>
<h3>The Problem</h3>
<p>The problem was simple.  Suppose you have a C++ template class/struct that is parameterized by a single type, e.g.</p>
<pre class="crayon-plain-tag"><code>template &lt;typename T&gt;
class my_data {
// ...
private:
  T element_;
};</code></pre>
<p>Now, the question is, &#8220;how do I write a method for this class/struct that maps the type of T to an enumeration value?&#8221;  For context, the real problem involved mapping T to an MPI data type, e.g. (float -&gt; MPI_FLOAT), (double -&gt; MPI_DOUBLE), etc.</p>
<h3>The Solution</h3>
<p>The first thought for anyone familiar with containers may be to explicitly generate a map, e.g. std::map in this case, to hold all possible mappings from the C++ type (via typeid()) to the MPI type (really just an integer).  Such a solution is certainly valid and may be the best way to approach the problem in another language such as C# or Java.  After pondering the &#8220;C++&#8221; solution to the problem for a few minutes, my colleague and I came up with a fairly elegant solution involving templates.  Or, at least I found it quite elegant.</p>
<pre class="crayon-plain-tag"><code>/**
 * This struct wraps the MPI data type value for the given C++ type.
 *
 * Any valid MPI data type value must have a corresponding explicit template
 * specialization of the constructor below.
 */
template &lt;typename T&gt;
struct mpi_type_wrapper {
  int mpi_type;
  mpi_type_wrapper();
};

// Explicit specialization for `float'
template &lt;&gt;
mpi_type_wrapper&lt;float&gt;::mpi_type_wrapper()
: mpi_type(MPI_FLOAT) {}

// Explicit specialization for `double'
template &lt;&gt;
mpi_type_wrapper&lt;double&gt;::mpi_type_wrapper()
: mpi_type(MPI_DOUBLE) {}</code></pre>
<p>The mpi_type_wrapper struct is a convenient way to convert an arbitrary C++ type to an equivalent MPI type.  All one has to do is declare a local variable of type mpi_type_wrapper&lt;T&gt; (with appropriate T) and read the value of its mpi_type field.  Of course, none of this is specific to MPI in any way.  The only requirement is that an explicit specialization of the constructor must be provided for any C++ types that are to be converted.</p>
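<p>One possible variation on the same idea, shown here only as a sketch and not as the approach from this post, is to specialize the whole struct and expose the value through a static function, so no object has to be constructed. To keep the sketch compilable without <code>mpi.h</code>, it uses stand-in names (<code>demo_datatype</code>, <code>DEMO_FLOAT</code>, <code>DEMO_DOUBLE</code>) in place of the real MPI handles; <code>mpi_type_of</code> and <code>datatype_for()</code> are likewise hypothetical.</p>

```cpp
// Stand-ins for MPI_Datatype, MPI_FLOAT and MPI_DOUBLE so that the sketch
// compiles without mpi.h; these names and values are hypothetical.
typedef int demo_datatype;
const demo_datatype DEMO_FLOAT = 0;
const demo_datatype DEMO_DOUBLE = 1;

// The primary template is declared but never defined, so using an unmapped
// type is rejected by the compiler rather than surfacing as a link error.
template <typename T>
struct mpi_type_of;

// Explicit specialization for float
template <>
struct mpi_type_of<float> {
  static demo_datatype value() { return DEMO_FLOAT; }
};

// Explicit specialization for double
template <>
struct mpi_type_of<double> {
  static demo_datatype value() { return DEMO_DOUBLE; }
};

// Example consumer: selects the constant from the element type alone.
template <typename T>
demo_datatype datatype_for(const T*) {
  return mpi_type_of<T>::value();
}
```

<p>A caller would write <code>mpi_type_of&lt;T&gt;::value()</code> wherever the wrapper&#8217;s <code>mpi_type</code> field is read. A function is used instead of a static constant because some MPI implementations define their data type handles as pointers rather than integer constants.</p>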
<h3>Why This Solution?</h3>
<p>This solution strikes me as elegant for two reasons.  First, it is a solution that would be difficult, if not impossible, to express in many other languages.  Second, and most interesting to me, there is <em>no</em> run-time overhead associated with this solution.  You can even compile this with RTTI turned off.  With optimizations enabled, any reasonable compiler inlines the appropriate constructor, then constant propagation replaces any uses of the mpi_type field with the appropriate MPI_* enumeration value.  There is no memory overhead associated with explicitly keeping a map at run-time, nor any time overhead of performing a map look-up.  The final code just uses the constant value!  If you do not believe me, check out this example:</p>
<pre class="crayon-plain-tag"><code>#include &lt;cstdio&gt;

/**
 * Some template class that needs to know the MPI_Datatype value for its
 * template parameter type.
 */
template &lt;typename T&gt;
struct some_type {
  void printType() {
    // The constructor specialization selects the MPI constant for T
    mpi_type_wrapper&lt;T&gt; wrap;

    printf(&quot;My Type: %d&quot;, wrap.mpi_type);
  }
};

int main() {

  some_type&lt;float&gt;  floatClass;
  some_type&lt;double&gt; doubleClass;

  floatClass.printType();
  doubleClass.printType();

  return 0;
}</code></pre>
<p>And the generated code?</p>
<pre class="crayon-plain-tag"><code>_main:                                  ## @main
## BB#0:
	pushq	%rbx
	leaq	L_.str(%rip), %rbx
	movq	%rbx, %rdi
	xorl	%esi, %esi
	xorb	%al, %al
	callq	_printf
	movl	$1, %esi
	movq	%rbx, %rdi
	xorb	%al, %al
	callq	_printf
	xorl	%eax, %eax
	popq	%rbx
	ret</code></pre>
<h3>Conclusion</h3>
<p>While this example is probably trivial for most experienced C++ programmers out there, including myself, I always find myself stopping and appreciating such solutions.  In this case, C++ templates provide such an elegant and efficient solution that I cannot help feeling giddy.</p>
]]></content:encoded>
			<wfw:commentRss>http://jholewinski.org/blog/the-beauty-of-c-templates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

