<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.2" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>Lars at the OSL</title>
	<link>http://staff.osuosl.org/~lohnk/blog</link>
	<description>Confessions of a Python Nut Case</description>
	<pubDate>Mon, 14 Apr 2008 02:49:18 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2</generator>
	<language>en</language>
			<item>
		<title>it&#8217;s a geeky meme</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=30</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=30#comments</comments>
		<pubDate>Mon, 14 Apr 2008 02:46:07 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Python]]></category>

		<category><![CDATA[OSUOSL]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=30</guid>
		<description><![CDATA[lars@bozeman:~$ history|awk '{a[$2]++} END{for(i in a){printf "%5d\t%s\n",a[i],i}}'|sort -rn|head
133   cd
114   ls
44   svn
31   vi
28   python
24   ssh
21   ./ConfigurationManager.py
17   make
13   rsync
It looks to me like I spend too much time moving around the file system.  I should try to type [...]]]></description>
			<content:encoded><![CDATA[<p>lars@bozeman:~$ history|awk '{a[$2]++} END{for(i in a){printf "%5d\t%s\n",a[i],i}}'|sort -rn|head<br />
133   cd<br />
114   ls<br />
44   svn<br />
31   vi<br />
28   python<br />
24   ssh<br />
21   ./ConfigurationManager.py<br />
17   make<br />
13   rsync</p>
<p>It looks to me like I spend too much time moving around the file system.  I should try to type more pathnames and stick around in one place&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=30</wfw:commentRss>
		</item>
		<item>
		<title>Sanity Compromised by Firefox and ssh X Forwarding</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=29</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=29#comments</comments>
		<pubDate>Fri, 07 Mar 2008 03:48:08 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Mozilla]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[OSUOSL]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=29</guid>
		<description><![CDATA[Try this in Linux: open Firefox on your local machine.  Then open a terminal window and ssh to another machine using the -X option for X forwarding.  On the remote machine, start Firefox.  The behavior I get is so bizarre that it cannot be a bug &#8212; somehow this looks intentional.  [...]]]></description>
			<content:encoded><![CDATA[<p>Try this in Linux: open Firefox on your local machine.  Then open a terminal window and ssh to another machine using the -X option for X forwarding.  On the remote machine, start Firefox.  The behavior I get is so bizarre that it cannot be a bug &#8212; somehow this looks intentional.  The Firefox process on the remote machine sits for a few moments and then <em>dies.</em>  Then a new <em>local</em> Firefox window opens.  WTF?</p>
<p>I thought I was going insane.  The people at the OSL that I told about this thought I was insane.  The Mozilla developers that I work with and tried to explain this to thought I was insane.</p>
<p>Some research shows this: the remote Firefox actually starts and communicates with the X server running on the local machine.  The X server tells the remote Firefox that there is already a process called Firefox running.  The remote Firefox then sends a message to the local one to open a new window and then the remote Firefox dies.  This protects a user from creating too many instances of a Firefox process on their machine. Clever, huh?  But totally <em>WRONG</em> and counterintuitive!</p>
<p>Apparently you can stop this behavior if you start the remote Firefox with the intuitively named &#8220;-no-remote&#8221; switch.  That prevents the remote Firefox from &#8220;connecting&#8221; to the local Firefox.</p>
<p>Sigh, there goes an hour of my life that I&#8217;d like to have back&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=29</wfw:commentRss>
		</item>
		<item>
		<title>Python Generators: Searching Java Jar Files</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=26</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=26#comments</comments>
		<pubDate>Mon, 03 Mar 2008 02:02:56 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<category><![CDATA[OSUOSL]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=26</guid>
		<description><![CDATA[Here is an example of a utility that uses a recursive generator.  It is a command line utility that assists Java programmers in finding missing classes.  I wrote this script several years ago when I was dragged kicking and screaming into a Java project.  The script recursively searches a directory tree for jar [...]]]></description>
			<content:encoded><![CDATA[<p>Here is an example of a utility that uses a recursive generator.  It is a command line utility that assists Java programmers in finding missing classes.  I wrote this script several years ago when I was dragged kicking and screaming into a Java project.  The script recursively searches a directory tree for jar files.  When it finds a jar file, it scans the file&#8217;s directory for the target Java class.</p>
<pre>
#!/usr/bin/python

import sys, os, os.path
import fnmatch

def findFileGenerator(rootDirectory, acceptanceFunction):
  for aCurrentDirectoryItem in [ os.path.join(rootDirectory, x) for x in os.listdir(rootDirectory) ]:
    if acceptanceFunction(aCurrentDirectoryItem):
      yield aCurrentDirectoryItem
    if os.path.isdir(aCurrentDirectoryItem):
      for aSubdirectoryItem in findFileGenerator(aCurrentDirectoryItem, acceptanceFunction):
        yield aSubdirectoryItem

if __name__ == "__main__":
  rootOfSearch = '.'
  if sys.argv[1:]:
    rootOfSearch = sys.argv[1]
  if sys.argv[2:]:
    classnameFragment = sys.argv[2].replace('.', '/')
    def anAcceptanceFunction (itemToTest):
      return (not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar') and
              classnameFragment in os.popen('jar -tf %s' % itemToTest).read())
  else:
    def anAcceptanceFunction (itemToTest):
      return not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar')

  try:
    for x in findFileGenerator(rootOfSearch, anAcceptanceFunction):
      print x
  except Exception, anException:
    print anException</pre>
<p>The focus is on the generator function findFileGenerator.  It creates an iterator for the results of a recursive search through a directory tree.  It accepts as parameters a path to begin the search and a function to determine if a given file satisfies the search parameters.</p>
<p>Generators can be kind of confusing because even though they look like a function, they do not execute immediately when called.  They return a reference to an object that works like an iterator.  The code defined in the generator function is executed by that iterator object.  The first time that the iterator&#8217;s &#8216;next&#8217; function is called, execution begins at the beginning of the code and goes until it encounters a &#8216;yield&#8217; statement.  The &#8216;yield&#8217; statement returns the next value of the iterator.  The next time the &#8216;next&#8217; function is called, execution resumes at the next statement after the &#8216;yield&#8217;.</p>
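<p>A tiny experiment makes the lazy start visible.  This is only an illustrative sketch, written in Python 3 syntax; the names &#8216;countToTwo&#8217; and &#8216;trace&#8217; are made up for the demonstration and are not part of the utility above.</p>
<pre>
trace = []

def countToTwo():
  trace.append('started')       # runs only when next() is called the first time
  yield 1
  trace.append('resumed')       # runs only when next() is called the second time
  yield 2

anIterator = countToTwo()       # nothing in the body has executed yet
assert trace == []
firstValue = next(anIterator)   # executes up to the first yield
assert trace == ['started']
secondValue = next(anIterator)  # resumes just after the first yield
assert trace == ['started', 'resumed']
print(firstValue, secondValue)
</pre>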
<p>Let&#8217;s examine this example closely.  Imagine that the call to the generator function has happened and we&#8217;ve got the resultant iterator-like object.  The first call on that object to &#8216;next&#8217; starts execution at this line:</p>
<pre>
for aCurrentDirectoryItem in [ os.path.join(rootDirectory, x) for x in os.listdir(rootDirectory) ]:</pre>
<p>Here we&#8217;re getting our first directory listing of all the files in the current directory.  Because the call to &#8216;os.listdir(rootDirectory)&#8217; returns a list of file names with their paths stripped off, we&#8217;re going to have to re-attach them.  The list comprehension (the code between the [ &#8230; ]) welds the current directory path to each of the files in the list and returns a new list.  The for loop then sets us up to iterate through that list.</p>
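<p>To see the bare names and the re-attached paths side by side, here is a small sketch in Python 3 syntax.  The temporary directory and the file name &#8216;example.jar&#8217; are invented for the demonstration.</p>
<pre>
import os, tempfile

with tempfile.TemporaryDirectory() as rootDirectory:
  # create one empty file so the listing has something in it
  open(os.path.join(rootDirectory, 'example.jar'), 'w').close()
  bareNames = os.listdir(rootDirectory)                             # names only, no path
  fullPaths = [ os.path.join(rootDirectory, x) for x in bareNames ] # path welded back on
  print(bareNames)
  print(fullPaths)
</pre>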
<pre>
    if acceptanceFunction(aCurrentDirectoryItem):
      yield aCurrentDirectoryItem</pre>
<p>Here&#8217;s where we decide if the current entry in this directory is interesting or not.  We call the acceptance function on the item.  Since the acceptance function is passed in when we originally called this generator, it could be anything the programmer desired.  In the case of this particular utility, we&#8217;re looking for Java Jar files that meet certain criteria.  But it really could have been anything at all: find all files that have vowels in their name, or all files that have a specific type or content.<br />
If the acceptance function returns &#8216;True&#8217;, then we yield.  The current file is returned by the iterator and execution stops until the &#8216;next&#8217; function is called.</p>
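<p>As a sketch of how interchangeable the acceptance function is, here are two alternatives in Python 3 syntax.  The names &#8216;hasVowelInName&#8217; and &#8216;isPythonSource&#8217; and the sample pathnames are hypothetical; either function could be handed to the generator in place of the jar test.</p>
<pre>
import fnmatch, os.path

def hasVowelInName(itemToTest):
  # look only at the last path component, not at the directories above it
  return any(c in 'aeiouAEIOU' for c in os.path.basename(itemToTest))

def isPythonSource(itemToTest):
  return fnmatch.fnmatch(itemToTest, '*.py')

candidates = ['src/main.py', 'src/xyz', 'lib/tool.jar']
withVowels = [ x for x in candidates if hasVowelInName(x) ]
pythonFiles = [ x for x in candidates if isPythonSource(x) ]
print(withVowels)
print(pythonFiles)
</pre>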
<pre>
    if os.path.isdir(aCurrentDirectoryItem):</pre>
<p>If the acceptance function rejected the item, this is immediately the next line to execute.  If the acceptance function accepted the item, this line won&#8217;t be called until after the next call to &#8216;next&#8217;.  In either case, our goal is to find the next item for the iterator to return.</p>
<p>Since we&#8217;re iterating through a list of entries in a directory, some of those will be directories themselves.  The item that we sent to the acceptance function could have been a subdirectory.  Regardless of the outcome of the acceptance function, we need to recurse into subdirectories.</p>
<pre>
      for aSubdirectoryItem in findFileGenerator(aCurrentDirectoryItem, acceptanceFunction):
        yield aSubdirectoryItem</pre>
<p>Hang onto your hat, here&#8217;s where your brain may explode.  We&#8217;ve got a sub-directory and we need to recurse into it and iterate through its entries.  Well, we&#8217;ve got this handy generator that does exactly that: it returns an iterator that will cycle through the contents of a directory. &#8216;for&#8217; statements in Python have a special relationship with iterators.  You can provide one instead of a list and the &#8216;for&#8217; loop will dutifully iterate through it for you.  We recursively call the generator, passing in the subdirectory and the acceptance function.  The generator returns an iterator to us and the for statement starts the iteration by silently calling the &#8216;next&#8217; function.  Remember that the iterator returns only items that have passed the acceptance function, so each item that we get here we&#8217;re just going to pass on as the next item in our iterator.  Hence, we yield every item that we get in this loop.</p>
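<p>For readers on newer Pythons, the inner for/yield pair can be written as a single &#8216;yield from&#8217; statement (available from Python 3.3).  The sketch below is a variation on the generator, not the script itself: the &#8216;sorted&#8217; call is my addition to make the order predictable, and the little directory tree is invented for the demonstration.</p>
<pre>
import os, tempfile

def findFileGenerator(rootDirectory, acceptanceFunction):
  for aCurrentDirectoryItem in [ os.path.join(rootDirectory, x) for x in sorted(os.listdir(rootDirectory)) ]:
    if acceptanceFunction(aCurrentDirectoryItem):
      yield aCurrentDirectoryItem
    if os.path.isdir(aCurrentDirectoryItem):
      # pass along everything the recursive generator yields
      yield from findFileGenerator(aCurrentDirectoryItem, acceptanceFunction)

with tempfile.TemporaryDirectory() as root:
  os.makedirs(os.path.join(root, 'a', 'b'))
  for relativePath in ['top.jar', os.path.join('a', 'mid.txt'), os.path.join('a', 'b', 'deep.jar')]:
    open(os.path.join(root, relativePath), 'w').close()
  found = [ os.path.relpath(x, root) for x in findFileGenerator(root, lambda x: x.endswith('.jar')) ]
print(found)
</pre>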
<p>The rest of the file is in the problem domain: a command line utility that will find Java Jar files with certain classes in them.</p>
<pre>
if __name__ == "__main__":</pre>
<p>Perhaps someday in the future, we&#8217;ll want to use the generator in another application.  By putting the code of the command line utility under this &#8216;if&#8217;, we&#8217;ll prevent it from executing when we use the &#8216;import&#8217; statement on this file.</p>
<pre>
  rootOfSearch = '.'
  if sys.argv[1:]:
    rootOfSearch = sys.argv[1]</pre>
<p>The root of the path that we&#8217;re to search is optional on the command line.  If no path is specified, we&#8217;ll assume that we&#8217;re to start in the current working directory.</p>
<pre>
  if sys.argv[2:]:
    classnameFragment = sys.argv[2].replace('.', '/')
    def anAcceptanceFunction (itemToTest):
      return (not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar') and
              classnameFragment in os.popen('jar -tf %s' % itemToTest).read())</pre>
<p>The name of the class that we&#8217;re to search for is also optional.  If the user does not provide one, then we&#8217;ll assume that we&#8217;re to just find all jar files regardless of their content.</p>
<p>This code fragment is the other case: a fragment of a class name has been given.  It is our task here to create an acceptance function that meets the criterion.</p>
<p>First thing to do is cook the class name a bit.  In Java, class names are qualified with package paths.  Inside Java code, &#8216;.&#8217; is used as a separator.  However, inside jar files, &#8216;/&#8217; is the separator.  To be friendly, we want Java programmers to be able to use either notation.  We make sure the command line argument is converted to the &#8216;/&#8217; notation and stored in &#8216;classnameFragment&#8217;.   Next we define an acceptance function that receives a pathname as a parameter. All we have to do is subject that pathname to some tests and give it either a thumbs up or down.  In this case, we test that the pathname is not a directory, then test that it is a jar file and finally we run the command line tool &#8216;jar -tf&#8217; to give us a listing of the jar to see if our class name fragment is in there.  Since Python does &#8220;short-circuit&#8221; expression evaluation, if any of the earlier tests fail in the boolean expression, the later tests do not get executed.</p>
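<p>Since a jar is an ordinary zip archive, the same test could probably be made without launching an external &#8216;jar&#8217; process.  The sketch below swaps in the standard &#8216;zipfile&#8217; module; that is my substitution, not what the script does.  It is in Python 3 syntax, and the names &#8216;makeAcceptanceFunction&#8217; and &#8216;jarMentions&#8217; and the sample class &#8216;org.example.Widget&#8217; are hypothetical.</p>
<pre>
import os, tempfile, zipfile

def jarMentions(jarPath, classnameFragment):
  # read the archive's table of contents and look for the fragment in any entry
  with zipfile.ZipFile(jarPath) as aJar:
    return any(classnameFragment in name for name in aJar.namelist())

def makeAcceptanceFunction(classnameFragment):
  classnameFragment = classnameFragment.replace('.', '/')
  def anAcceptanceFunction(itemToTest):
    # short-circuit evaluation: the archive is opened only if the cheap tests pass
    return (not os.path.isdir(itemToTest)
            and itemToTest.endswith('.jar')
            and zipfile.is_zipfile(itemToTest)
            and jarMentions(itemToTest, classnameFragment))
  return anAcceptanceFunction

with tempfile.TemporaryDirectory() as root:
  jarPath = os.path.join(root, 'example.jar')
  with zipfile.ZipFile(jarPath, 'w') as aJar:
    aJar.writestr('org/example/Widget.class', b'')
  dottedResult = makeAcceptanceFunction('org.example.Widget')(jarPath)
  missingResult = makeAcceptanceFunction('org.example.Gadget')(jarPath)
  directoryResult = makeAcceptanceFunction('Widget')(root)
print(dottedResult, missingResult, directoryResult)
</pre>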
<pre>
  else:
    def anAcceptanceFunction (itemToTest):
      return not os.path.isdir(itemToTest) and fnmatch.fnmatch(itemToTest, '*.jar')</pre>
<p>In the case where the user did not provide a class name fragment, we assume that we&#8217;re looking for all jar files.  The acceptance function here just drops the additional criterion where we look into the content of the jar file.</p>
<pre>
  try:
    for x in findFileGenerator(rootOfSearch, anAcceptanceFunction):
      print x
  except Exception, anException:
    print anException</pre>
<p>Finally, we&#8217;re ready to actually use the tools.  We call the generator function with the path from which to start the search and our acceptance function.  That returns an iterator that we loop through, printing the matching jar files.</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=26</wfw:commentRss>
		</item>
		<item>
		<title>a Pythonic Ospid</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=23</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=23#comments</comments>
		<pubDate>Mon, 11 Feb 2008 22:35:52 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Python]]></category>

		<category><![CDATA[OSUOSL]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=23</guid>
		<description><![CDATA[
 	 	
I&#8217;m suffering an ospid: I wrote some code last weekend that I keep looking at over and over again because I like it so much.
I&#8217;ve got a relational database schema that looks like this:

For this blog posting, I am only interested in the first six tables of the top cascade of tables and the [...]]]></description>
			<content:encoded><![CDATA[
<p style="margin-bottom: 0in">I&#8217;m suffering an <a href="http://www.rdrop.com/%7Ehalf/Creations/Writings/Notebook/Notebook2004.html#Ospids">ospid</a>: I wrote some code last weekend that I keep looking at over and over again because I like it so much.</p>
<p style="margin-bottom: 0in">I&#8217;ve got a relational database schema that looks like this:</p>
<p style="margin-bottom: 0in"><img src="http://staff.osuosl.org/~lohnk/images/mozilla.diagram.5.png" /></p>
<p style="margin-bottom: 0in">For this blog posting, I am only interested in the first six tables of the top cascade of tables and the &#8216;updateParameters&#8217; table just below them.</p>
<p style="margin-bottom: 0in">I&#8217;m trying to populate this schema with its initial data by walking a filesystem tree.  I search for files within the filesystem, fetching each file&#8217;s pathname.  The directories in a pathname correspond to values in the cascading tables.</p>
<pre>listOfTables = ['product','version','buildTarget','buildId','locale','channel']</pre>
<p style="margin-bottom: 0in">I wrote a function that takes the name of a directory as an argument.  The function&#8217;s objective is to put the directory name into an appropriate table whenever the value isn&#8217;t already there.  I could have written the function such that the target table name is also a parameter to the function, but I took a different path instead.  I decided that each table should have its own function.  This didn&#8217;t mean that I had to individually write the function for each table; I could get Python to do that for me.</p>
<pre>
def getInsertFunctionForTable(tableName,  databaseConnection, cache,
                              insertSqlTemplate = genericInsertSql,
                              fetchSqlTemplate = genericFetchIdSql):
  insertSql = insertSqlTemplate.replace('TABLENAME', tableName)
  fetchSql = fetchSqlTemplate.replace('TABLENAME', tableName)
  def insertIntoTable(value):
    try:
      return cache[tableName][value]
    except KeyError:
      databaseConnection.executeSql(insertSql % value)
      id = databaseConnection.singleValueSql(fetchSql % value)
      cache[tableName][value] = id
      return id
  return insertIntoTable</pre>
<p style="margin-bottom: 0in">In this code, I define a function that, when given the name of a table, will return another function.  This second returned function is the one that I described earlier.  I take my list of table names and use a list comprehension to create a second list of functions, one for handling each of the directories in a pathname.</p>
<pre>
import collections

databaseConnection = ...
cache = collections.defaultdict(dict)
databaseInsertFunctions = [ getInsertFunctionForTable(x, databaseConnection, cache) for x in listOfTables ]</pre>
<p style="margin-bottom: 0in">Now I can take a pathname and my list of functions and use another list comprehension to process them:</p>
<pre>
pathname = 'firefox/2.00.12/linux-gcc3.1/2008020101/en/somechannel/file.txt'
idForPathname = [x[0](x[1]) for x in zip(databaseInsertFunctions, pathname.split('/'))]</pre>
<p style="margin-bottom: 0in">The result is a list of the database&#8217;s id for each of the directory names in their respective tables.</p>
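<p style="margin-bottom: 0in">To try the idea end to end without my database layer, here is a self-contained sketch that stands in an in-memory sqlite3 database.  Everything about the schema here is an assumption made for the demonstration: each table gets just an &#8216;id&#8217; and a &#8216;name&#8217; column, the SQL templates are invented, and the helper &#8216;idsForPathname&#8217; is hypothetical.  It is written in Python 3 syntax and uses &#8216;?&#8217; placeholders for the values.</p>
<pre>
import collections, sqlite3

listOfTables = ['product', 'version', 'buildTarget', 'buildId', 'locale', 'channel']

def getInsertFunctionForTable(tableName, databaseConnection, cache):
  # table names cannot be bound as parameters, so they are substituted into the SQL text
  insertSql = 'insert into TABLENAME (name) values (?)'.replace('TABLENAME', tableName)
  fetchSql = 'select id from TABLENAME where name = ?'.replace('TABLENAME', tableName)
  def insertIntoTable(value):
    try:
      return cache[tableName][value]          # already seen: no database traffic
    except KeyError:
      databaseConnection.execute(insertSql, (value,))
      anId = databaseConnection.execute(fetchSql, (value,)).fetchone()[0]
      cache[tableName][value] = anId
      return anId
  return insertIntoTable

databaseConnection = sqlite3.connect(':memory:')
for aTableName in listOfTables:
  databaseConnection.execute('create table %s (id integer primary key, name text unique)' % aTableName)
cache = collections.defaultdict(dict)
databaseInsertFunctions = [ getInsertFunctionForTable(x, databaseConnection, cache) for x in listOfTables ]

def idsForPathname(pathname):
  # zip stops at the shorter list, so the trailing file name is ignored
  return [ f(v) for f, v in zip(databaseInsertFunctions, pathname.split('/')) ]

firstIds = idsForPathname('firefox/2.00.12/linux-gcc3.1/2008020101/en/somechannel/file.txt')
secondIds = idsForPathname('firefox/2.00.12/linux-gcc3.1/2008020101/de/somechannel/file.txt')
print(firstIds)
print(secondIds)
</pre>
<p style="margin-bottom: 0in">Only the locale differs between the two pathnames, so only the &#8216;locale&#8217; table receives a second row; the other five functions answer from the cache.</p>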
<p style="margin-bottom: 0in">As it happens, this is the value that I need to populate the next table in my diagram.  Now I can use the same function again for this next table:</p>
<pre>
updateParametersInsertFunction = getInsertFunctionForTable('updateParameters', databaseConnection, cache,
                       updateParametersInsertSql, updateParametersFetchIdSql)</pre>
<p style="margin-bottom: 0in">Using that idea, I can process an entire tree of data, inserting all the values into all the tables with this loop:</p>
<pre>
for path, name, pathname in cse.FileSystem.findFileGenerator(root,lambda a: a[1] == 'complete.txt' ):
  updateParametersId = updateParametersInsertFunction(tuple([x[0](x[1]) for x in zip(databaseInsertFunctions, path.split('/'))]))</pre>
<p style="margin-bottom: 0in">I keep looking at this over and over again.  I really like it.</p>
<p style="margin-bottom: 0in">The actual software that I wrote was a touch more complicated.  I added the capability to translate values in the tables with a reference to a translation function.  I also took into account the rest of the tables that I&#8217;ve not mentioned in this posting.</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=23</wfw:commentRss>
		</item>
		<item>
		<title>Old dog, new tricks</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=20</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=20#comments</comments>
		<pubDate>Sat, 02 Feb 2008 04:05:48 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=20</guid>
		<description><![CDATA[Last Fall, I decided to learn the dvorak keyboard layout.  I didn&#8217;t take it very seriously and didn&#8217;t do so well.  After yet another bout of wrist and shoulder pain, I splurged and bought one of those expensive Kinesis &#8220;two scoops&#8221; keyboards with the switchable qwerty/dvorak layouts.  It&#8217;s been a month and I&#8217;m still really [...]]]></description>
			<content:encoded><![CDATA[<p>Last Fall, I decided to learn the dvorak keyboard layout.  I didn&#8217;t take it very seriously and didn&#8217;t do so well.  After yet another bout of wrist and shoulder pain, I splurged and bought one of those expensive Kinesis &#8220;two scoops&#8221; keyboards with the switchable qwerty/dvorak layouts.  It&#8217;s been a month and I&#8217;m still really rotten at dvorak.  I use dvorak as the default, and only switch to qwerty when I&#8217;m desperate to get something done.</p>
<p>The toughest part is the letters o, e, i, d, c and r.  I&#8217;m always swapping those in pairs: o for e, i for d and c for r.  I run exercises to get them into muscle memory, but they just won&#8217;t stick.</p>
<p>This posting has taken an inordinate amount of time to write.</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=20</wfw:commentRss>
		</item>
		<item>
		<title>A Left Shifted Zero is Still a Zero</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=17</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=17#comments</comments>
		<pubDate>Fri, 12 Oct 2007 03:58:23 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Helix Producer]]></category>

		<category><![CDATA[C++]]></category>

		<category><![CDATA[OSUOSL]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=17</guid>
		<description><![CDATA[I had to look at the line twice because I couldn&#8217;t believe what I was seeing.  On the second inspection, it was clear that I would need to look several more times: I just couldn&#8217;t actually be seeing what it looked like I was seeing.

#3  0xb7992978 in HXAssertFailedLine (
pszExpression=0xb7a97e60 "((HRESULT) (((unsigned long)(0)&#60;&#60;31) &#124; [...]]]></description>
			<content:encoded><![CDATA[<p>I had to look at the line twice because I couldn&#8217;t believe what I was seeing.  On the second inspection, it was clear that I would need to look several more times: I just couldn&#8217;t actually be seeing what it looked like I was seeing.<br />
<code><br />
#3  0xb7992978 in HXAssertFailedLine (<br />
pszExpression=0xb7a97e60 "((HRESULT) (((unsigned long)(0)&lt;&lt;31) | ((unsigned long)(0)&lt;&lt;16) | ((unsigned long)(0))) ) == pContext-&gt;QueryInterface(IID_IHXScheduler, (void**) &amp;m_pScheduler)", pszFileName=0xb7a97e3e "hxpref.cpp", nLine=172) at hxassert.cpp:471</code></p>
<p>This is a line from a debugger.  The code that I&#8217;m debugging stopped with an assertion failure on that line.  While the code that interested me was further on in the stack trace, this line was so absurd that I had to investigate it.</p>
<p>The code takes the constant zero, casts it to an unsigned long and then left shifts it 31 bits.  It then takes another zero cast to an unsigned long and shifts it 16 bits to the left.  These two results are bitwise or&#8217;d together with a third zero cast to an unsigned long.  The result, which is 0 as an unsigned long, is then cast to some type called HRESULT.  Digging into the nested macro definitions, I see that HRESULT is defined as LONG32.  LONG32 is a typedef for FIXED32.  FIXED32 is a typedef for INT32.  INT32 is a typedef for int.  We&#8217;re taking an unsigned long zero, and then unceremoniously truncating it to be a signed integer.  It&#8217;s a miracle, three signed integer zeros have magically become one signed integer zero.</p>
<p>The big question is why?  How could code like this get written?</p>
<p>First I must say that any optimizing C++ compiler is going to resolve this absurdity at compile time.  The compiler will throw all that crap away and just replace it with a zero.  It may even go one step further and get rid of that, too.  So ultimately, it doesn&#8217;t matter to the compiler, nor does it compromise runtime efficiency.  This essay is just an entertaining excursion into what seems like a comic farce.</p>
<p>The original line in the source code looks like this:<br />
<code><br />
HX_VERIFY(HXR_OK ==   pContext-&gt;QueryInterface(IID_IHXScheduler, (void**) &amp;m_pScheduler));<br />
</code></p>
<p>The offending token is the HXR_OK.  This is a macro invocation.  The macro is defined as:<br />
<code><br />
#define HXR_OK                          MAKE_HRESULT(0,0,0)<br />
</code></p>
<p>The  MAKE_HRESULT macro is defined as:<br />
<code><br />
#define MAKE_HRESULT(sev,fac,code)                                           \<br />
((HRESULT) (((unsigned long)(sev)&lt;&lt;31) | ((unsigned long)(fac)&lt;&lt;16) |   \<br />
((unsigned long)(code))) )<br />
</code></p>
<p>This whole absurdity is the result of bit-packing.  The coders wanted to take three values that described an error, the sev (severity), fac (facility) and code (error code), and pack them into one integer.  In the header file that defines both HXR_OK and MAKE_HRESULT, I can see a number of other constants defined:<br />
<code><br />
#define HXR_ABORT                       MAKE_HRESULT(1,0,0x4004)                    // 80004004<br />
#define HXR_FAIL                        MAKE_HRESULT(1,0,0x4005)                    // 80004005<br />
#define HXR_ACCESSDENIED                MAKE_HRESULT(1,7,0x0005)                    // 80070005<br />
</code></p>
<p>So in each case, the code defines a constant in such an obfuscated manner that the programmer saw fit to add a comment to each line to show the hexadecimal value that he really intended and hopefully, what the macro eventually expresses.</p>
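<p>The arithmetic is easy to check outside of C++.  Here is a sketch in Python 3 syntax; the function name &#8216;makeHresult&#8217; is mine, and I multiply by powers of two where the macro shifts, which gives the same result.  Unlike the C++ macro, there is no cast to a signed type here, so the values print as plain unsigned hexadecimal.</p>
<pre>
def makeHresult(sev, fac, code):
  # multiplying by 2**n is the same as shifting left by n bits
  return (sev * 2**31) | (fac * 2**16) | code

HXR_OK = makeHresult(0, 0, 0)
HXR_ABORT = makeHresult(1, 0, 0x4004)
HXR_FAIL = makeHresult(1, 0, 0x4005)
HXR_ACCESSDENIED = makeHresult(1, 7, 0x0005)

# print each constant as eight hexadecimal digits, like the comments in the header
for aValue in (HXR_OK, HXR_ABORT, HXR_FAIL, HXR_ACCESSDENIED):
  print('%08X' % aValue)
</pre>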
<p>There are several hundred constants defined in this manner.  I see no facility to take these return codes apart to access any of the three values so painstakingly pieced together.  Why go to all the trouble to tightly pack an encoding of these three values if you never unpack them?  Is this simply a case of over-engineering?</p>
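<p>For what it&#8217;s worth, taking one of these values apart would be just as short.  This is a hypothetical sketch in Python 3 syntax, not something found in the header; the name &#8216;splitHresult&#8217; is made up, and I&#8217;m assuming the facility occupies all fifteen bits between the severity bit and the sixteen-bit code, which is simply what the shift amounts in the macro leave room for.</p>
<pre>
def splitHresult(value):
  sev = value // 2**31 % 2        # the top bit
  fac = value // 2**16 % 2**15    # the fifteen bits below it (assumed width)
  code = value % 2**16            # the low sixteen bits
  return sev, fac, code

accessDeniedParts = splitHresult(0x80070005)
print(accessDeniedParts)
</pre>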
<p>There is a clue in the same header file in which these abominations are defined.  It turns out that the macro definition of MAKE_HRESULT is actually in one branch of a conditional compilation unit.  The other branch does not define MAKE_HRESULT.  Instead, it includes &lt;winerror.h&gt;.  Ah, it all becomes clearer now.  This is an artifact of compatibility with Microsoft Windows.  Microsoft must have defined a system of function return status codes that follows this form.  So here we are with an Open Source project to be used on the OLPC, Linux systems, OS X and numerous mobile platforms bound to a convention required by Microsoft.  Is there no escape from this evil octopus?</p>
<p>Looking back to my debugger, I see now the purpose of the shifted zeros.  It just happens to be the degenerate case of what is supposed to be a flexible system to stuff multiple pieces of information into one scalar variable.  Someday a future programmer can change how these status codes are assembled simply by changing the definition of the macro.  Of course, that will make the comments incorrect.  I will wager that the refactoring day will never come and this piece of flexible engineering will never get the chance to flex.</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=17</wfw:commentRss>
		</item>
		<item>
		<title>No matter how much I wash, my hands are unclean</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=14</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=14#comments</comments>
		<pubDate>Tue, 10 Jul 2007 00:06:43 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Helix Producer]]></category>

		<category><![CDATA[C++]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=14</guid>
		<description><![CDATA[Here&#8217;s a C++ issue that&#8217;s been stumping me all morning.  I&#8217;ve got a compiler error in a header file that states that a particular symbol is not declared within the current scope.  What I&#8217;ve been able to figure out is that the compiler is right - I certainly cannot find any way that this [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a C++ issue that&#8217;s been stumping me all morning.  I&#8217;ve got a compiler error in a header file that states that a particular symbol is not declared within the current scope.  What I&#8217;ve been able to figure out is that the compiler is right - I certainly cannot find any way that this symbol is available anywhere that any compiler would be able to find it.  A mitigating factor is that the missing symbol is within the declaration of a template.</p>
<p>Grepping across all files in the project, I can find the symbol repeated in many files as a local static constant.</p>
<p><code>static const char kValueDescription[] = "Writes Null Files";</code></p>
<p>This same definition appears in many files; each file has its own unique string.  So these definitions of my missing symbol are available only after the inclusion of my header file.  Just to test, I moved the include of the header from the top of the cpp file to a point below the alleged definition of my missing symbol: the missing symbol error goes away.  It just seems plain wrong to require something to be defined prior to inclusion of a header file.  A header file should be self-contained: it should forward declare classes it can&#8217;t know details about or include other headers to get the declarations that it needs.</p>
<p>So here&#8217;s what I think is going on: the code is banking on the assumption that the compiler will blindly treat the template as if it were a macro.  The original author wanted the compiler to not process the template until it is actually instantiated (coincidentally immediately after the declaration/definition of the missing symbol). In my case, the compiler is not cooperating and, as far as the source file is concerned, is compiling the template prematurely.</p>
<p>I know that the GNU C++ compiler has had a bug in the past where templates were instantiated prematurely. I&#8217;m not yet sure if this is an instance of this problem.  I&#8217;ve seen the premature template instantiation problem manifest as an unlawful symbol redefinition, not a missing symbol.  The question is, should a compiler fully check a template prior to instantiation?  I would say &#8220;yes&#8221;; unfortunately, this code says, &#8220;no.&#8221;</p>
<p>I need to find a solution.  Either I have to make the new location of the include permanent, an option I find distasteful, or I need to find a way to forward declare the static const char array in the header file.  Unfortunately, forward declarations are for classes, not data items.  How do you forward declare data?  </p>
<p>So I am forced to be unclean.  I&#8217;ve moved the header inclusion from the top of the file where it belongs down into the middle of the source file.  This makes me ill with foreboding that there will be more trouble in the future from this.</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=14</wfw:commentRss>
		</item>
		<item>
		<title>The return of the language lawyer</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=13</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=13#comments</comments>
		<pubDate>Fri, 06 Jul 2007 06:26:34 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Helix Producer]]></category>

		<category><![CDATA[C++]]></category>

		<category><![CDATA[OSUOSL]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=13</guid>
		<description><![CDATA[Now that the nursery work is virtually complete for me, I can reallocate that part of my brain to my real career.  Today, I reconnected with a part of my past: I woke the C++ language lawyer in me.  I haven&#8217;t seen him in ten years.
I have this big pile of C++ code [...]]]></description>
			<content:encoded><![CDATA[<p>Now that the nursery work is virtually complete for me, I can reallocate that part of my brain to my real career.  Today, I reconnected with a part of my past: I woke the C++ language lawyer in me.  I haven&#8217;t seen him in ten years.</p>
<p>I have this big pile of C++ code in front of me representing a development project from the late nineties.  It is a wild mixture of coding styles embracing complex preprocessor systems for class declarations simulating templates, real templates, multiple inheritance, private inheritance, a roll-your-own runtime type identification system, all wrapped around a chewy Microsoft COM API.  There are several hundred classes in several hundred files riddled with conditional compilation directives switching on everything from Win16 to Solaris.  Few declarations are not wrapped in defines, while most types are typedef&#8217;d or macro&#8217;d within an inch of their lives.  Oh yeah, it&#8217;s got some threading, too.</p>
<p>While a branch of this code allegedly compiles on Linux under GNU 3.x, I am trying to use version 4.1.x to compile it.  I am told that the 4.x GNU compilers are significantly pickier than the version 3.x compilers.  The actual code hasn&#8217;t been touched in at least four years.  Honestly, it looks as if the team of programmers assigned to this code were unexpectedly escorted from the building during a mass layoff.  Several files look to be halfway through a refactoring effort.  I find undeclared variables, misspelled enumerations, missing and ambiguous scoping, unused parameters and many other problems.</p>
<p>My task is to make it all compile because it must be drafted into use.  I&#8217;m all for recycling and have embraced the object oriented code reuse credo since 1988, but I am taken aback by the complexity of this task.  It may be that the original coders were too clever several times over.  There are some nether regions of the C++ standard that are terrifyingly beautiful with fractal complexity: but I would think twice about using them in production code.  I must say that the coder&#8217;s intent is rendered rather opaque by the language.</p>
<p>I&#8217;ve been here before.  However, I was on the other side: I have written opaque code while enthralled with the brilliant yet twisted beauty of the underlying structure.  I wrote it in the same era from which this code originates.  No documentation would ever be needed for it; it&#8217;s so obvious, &#8220;a child could do it&#8221;.</p>
<p>With age comes a modicum of wisdom.  I know where these people and, indeed, I myself, have gone wrong in the past.  Code must be written knowing that the high priest will pass on.  It is time for me to pay for the follies of my youth. I walk to my task of atonement willingly and with my eyes open.  When my delayed penance is served and I am free once more, I will return to Python coding with an eye for those who will come after me.</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=13</wfw:commentRss>
		</item>
		<item>
		<title>Seeking Code That Might Not Exist</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=11</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=11#comments</comments>
		<pubDate>Mon, 18 Jun 2007 17:08:18 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Helix Producer]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=11</guid>
		<description><![CDATA[I am a master of poorly defined goals.  In this project, I am to locate some specific functionality within the Helix Producer&#8217;s massive pile of source code.  The specific functionality, though, is not well defined.  At a meeting last week, I was told to look for code that actually captures video.
So I [...]]]></description>
			<content:encoded><![CDATA[<p>I am a master of poorly defined goals.  In this project, I am to locate some specific functionality within the Helix Producer&#8217;s massive pile of source code.  The specific functionality, though, is not well defined.  At a meeting last week, I was told to look for code that actually captures video.</p>
<p>So I thought to myself, &#8220;how does a program communicate with a device?&#8221;  The answer is, of course, through device drivers.  How do you talk to device drivers?  My shaky memory dredges up the word &#8220;ioctl&#8221;.  I grep for &#8220;ioctl&#8221; in the ProducerSDK code.  It appears only once, in code dedicated to output filtering.  That looked like a dead end.</p>
<p>The <a href="https://producersdk.helixcommunity.org/docs/ProducerSDK11.pdf">Helix Producer SDK Developer&#8217;s Guide</a> on page 37 states: &#8220;Capture devices are plug-ins that wrap operating-specific capture subsystems,<br />
such as DirectShow or Video4Linux.&#8221;  This states directly that the ProducerSDK delegates capturing to an underlying system.  DirectShow is a Microsoft thing, so I can disregard it.  Video4Linux (V4L) is my target.</p>
<p>Somewhere in the ProducerSDK there is code that interacts with V4L - it should be simple enough to find.  I downloaded an example of C code that uses V4L (specifically: <a href="http://v4l2spec.bytesex.org/v4l2spec/capture.c">capture.c</a>).  Looking inside, I see lots of calls to &#8220;ioctl&#8221; to interact with the driver.  I see no code in ProducerSDK that looks like this code.</p>
<p>Hmmm, I&#8217;m not sure what this means.  I wonder if I have the right version of the ProducerSDK.</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=11</wfw:commentRss>
		</item>
		<item>
		<title>Is This Thread Safe?</title>
		<link>http://staff.osuosl.org/~lohnk/blog/?p=10</link>
		<comments>http://staff.osuosl.org/~lohnk/blog/?p=10#comments</comments>
		<pubDate>Mon, 11 Jun 2007 21:02:40 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[Helix Producer]]></category>

		<category><![CDATA[C++]]></category>

		<guid isPermaLink="false">http://staff.osuosl.org/~lohnk/blog/?p=10</guid>
		<description><![CDATA[In my quest to understand the structure and functioning of the Helix Producer SDK code, I am looking at it one tiny piece at a time.  Since it is our goal to extract useful code from Helix Producer to include in our project, someone needs to understand the dependencies that we might encounter.  [...]]]></description>
			<content:encoded><![CDATA[<p>In my quest to understand the structure and functioning of the Helix Producer SDK code, I am looking at it one tiny piece at a time.  Since it is our goal to extract useful code from Helix Producer to include in our project, someone needs to understand the dependencies that we might encounter.    Today I started examining the system of smart pointers.</p>
<p><strong>Digression - what is a smart pointer?</strong></p>
<blockquote><p> Smart pointers, sometimes referred to as reference counted pointers, are a memory management technique used by C++ programmers.  Unlike Java, C# or Python, C++ does not automatically garbage collect unused memory.  By adding a layer of indirection, an instance of a special pointer class wraps a traditional C style pointer.  This class instance has special copy constructor and assignment operator semantics: when it is copied, a counter on the thing it is pointing to is incremented.  When one of these pointer wrappers is destructed, the counter is decremented.  If the counter goes to zero, then it is known that the thing pointed to is no longer referenced by anything and can itself be deallocated.</p></blockquote>
<p align="left">Smart pointers must be very careful in multi-threaded environments.  The act of copying itself must be atomic - <a href="http://www.imdb.com/title/tt0045464/quotes">very atomic</a>.  How does a smart pointer copy itself?  Memory is first set aside for the new instance of the smart pointer class; the copy constructor then initializes the instance variables.  The only instance variable is the pointer to the thing of interest.  Once the pointer is copied, the constructor increments the reference counter in the pointed-to object.  What would happen if the original smart pointer&#8217;s destructor were called in the instant between the copy of the pointer and the increment of the reference counter?  It is possible that the referenced object could be destructed before the copy gets the chance to increment the reference counter.</p>
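<p align="left">To make the mechanics concrete, here is a minimal sketch of an intrusive reference counted pointer in portable C++11.  It is not the Helix code: <code>RefCounted</code>, <code>IntrusivePtr</code>, <code>Widget</code> and <code>count_after_copy</code> are hypothetical names, and <code>std::atomic</code> stands in for whatever atomic counter a real implementation uses.  Note that the copy constructor still has the same two steps (copy the raw pointer, then increment), so the counter being atomic does not by itself close the window described above.</p>

```cpp
#include <atomic>
#include <cassert>

// Hypothetical base class: the counter lives inside the pointed-to object.
class RefCounted {
public:
    RefCounted() : count_(0) {}
    virtual ~RefCounted() {}
    void AddRef() { count_.fetch_add(1, std::memory_order_relaxed); }
    void Release() {
        // fetch_sub returns the previous value; 1 means this was the last owner.
        if (count_.fetch_sub(1, std::memory_order_acq_rel) == 1) delete this;
    }
    long use_count() const { return count_.load(); }
private:
    std::atomic<long> count_;
};

// Hypothetical wrapper: one raw pointer, counted on copy and on destruction.
template <class T>
class IntrusivePtr {
public:
    explicit IntrusivePtr(T* p = nullptr) : ptr_(p) { if (ptr_) ptr_->AddRef(); }
    IntrusivePtr(const IntrusivePtr& other) : ptr_(other.ptr_) {  // step 1: copy pointer
        if (ptr_) ptr_->AddRef();                                 // step 2: increment
    }
    ~IntrusivePtr() { if (ptr_) ptr_->Release(); }
    IntrusivePtr& operator=(const IntrusivePtr&) = delete;  // kept out of this sketch
    T* get() const { return ptr_; }
private:
    T* ptr_;
};

int g_live_widgets = 0;  // counts Widget objects that currently exist

struct Widget : RefCounted {
    Widget() { ++g_live_widgets; }
    ~Widget() override { --g_live_widgets; }
};

// Two owners of one Widget: the count is 2 while both are alive.
long count_after_copy() {
    IntrusivePtr<Widget> a(new Widget);
    IntrusivePtr<Widget> b(a);
    return a.get()->use_count();
}
```

<p align="left">When both wrappers go out of scope at the end of <code>count_after_copy</code>, the counter drops to zero and the <code>Widget</code> deletes itself.</p>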
<p align="left"> <strong>Helix Producer Smart Pointers Appear to be Vulnerable</strong></p>
<p align="left"> Check out the copy constructor from Helix Producer&#8217;s smart pointer code in <a href="http://staff.osuosl.org/~lohnk/helixProducerSDK/hxtsmartpointer_8h-source.html">producersdk/common/include/hxtsmartpointer.h</a>:</p>
<blockquote>
<pre>
00113 template&lt;class T&gt;
00114 inline CHXTSmartPtr&lt;T&gt;::CHXTSmartPtr( const CHXTSmartPtr&lt;T&gt; &amp;spCopy ) :
00115         m_ptr( spCopy.m_ptr )
00116 {
00117         if ( m_ptr )
00118         {
00119                 m_ptr-&gt;AddRef();
00120         }
00121 }</pre>
</blockquote>
<p align="left">The actual pointer is copied in the initializer on line 00115.  Then, in the body of the copy constructor, if the pointer is not zero, the referenced object&#8217;s counter is incremented.  The member function &#8220;AddRef&#8221; is implemented in <a href="http://staff.osuosl.org/~lohnk/helixProducerSDK/hxtunknown_8h-source.html">producersdk/common/include/hxtunknown.h</a> as:</p>
<blockquote>
<pre>
00084         STDMETHOD_(ULONG32, AddRef) (void)
00085         {
00086                 return InterlockedIncrement( &amp;m_lCount );
00087         }</pre>
</blockquote>
<p align="left">A little spelunking reveals that &#8220;InterlockedIncrement&#8221; is an atomic operation defined in a Microsoft API.  This function call is properly protected in a multi-threaded environment.</p>
<p align="left"> As I stated above, though, I believe that both the copy of the referenced pointer and the increment must be protected together.  How is data shared between threads?  It doesn&#8217;t seem to matter which thread invokes the copy constructor; the code is vulnerable either way.</p>
<p align="left">Scenario 1: If a reference to a smart pointer is given to a thread and that thread invokes the copy constructor to make its own copy, then it is vulnerable to accessing deallocated memory.  The original smart pointer could have been destructed between the time of the internal pointer&#8217;s assignment and incrementing the reference counter.</p>
<p align="left">Scenario 2: If a master thread makes its own copy of the smart pointer object for use by a child thread, then the master thread must explicitly take care not to allow the copy to go out of scope while the child thread still lives.  If the master thread lets the smart pointer destructor be invoked, then the child thread has an invalid copy.  If the child thread were to make its own copy, then we&#8217;re right back to scenario 1.</p>
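<p align="left">One common way to sidestep both scenarios is to give the child thread its own copy that is made in the parent thread, while the parent&#8217;s pointer is still guaranteed to be alive.  The sketch below shows that handoff with the standard library&#8217;s <code>std::shared_ptr</code> and <code>std::thread</code> rather than CHXTSmartPtr, so it is an analogy and not a statement about Helix; <code>Frame</code> and <code>read_id_in_worker</code> are hypothetical names.</p>

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical payload type, for illustration only.
struct Frame { int id; };

// Hands a Frame to a worker thread and returns the id the worker saw.
int read_id_in_worker(std::shared_ptr<Frame> parent_copy) {
    int seen = -1;
    // std::thread copies its arguments in the launching thread, before the
    // worker starts running, so the count is incremented while parent_copy
    // is still alive. The worker then owns `mine` outright.
    std::thread worker(
        [&seen](std::shared_ptr<Frame> mine) { seen = mine->id; },
        parent_copy);
    // The parent may now drop its reference at any time; the worker's own
    // copy keeps the Frame alive until the worker is done with it.
    parent_copy.reset();
    worker.join();
    return seen;
}
```

<p align="left">A call such as <code>read_id_in_worker(std::make_shared&lt;Frame&gt;(Frame{7}))</code> returns 7 even though the parent releases its reference before the worker is joined.  What this does not cover is two threads touching the very same smart pointer instance; that still needs external locking.</p>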
<p align="left"><strong>Does This Really Happen in Helix Producer?</strong></p>
<p align="left">I do not know.  I will need to study how the threading model works in our target application.  Perhaps we&#8217;ll be lucky and reference counted objects are never shared between threads, but I doubt it.</p>
<p align="left"><strong>Further problems with CHXTSmartPtr </strong></p>
<p align="left">CHXTSmartPtr&#8217;s assignment operator fails to take into account the tautological &#8220;<a href="http://www.parashift.com/c++-faq-lite/assignment-operators.html#faq-12.1">self assignment</a>&#8220;.</p>
<blockquote>
<pre>
CHXTSmartPtr&lt;SomeCHTXUnknownDerivative&gt; p(new SomeCHTXUnknownDerivative);
CHXTSmartPtr&lt;SomeCHTXUnknownDerivative&gt;&amp; q(p);
p = q;  //deallocated memory referenced</pre>
</blockquote>
<p align="left">Granted, code as direct as this would not be written.  However, it is not unfathomable for the provenance of the R-value to be unknown.  In such a situation, a self assignment could be completely inadvertent but ultimately disastrous.</p>
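<p align="left">A common remedy is to take the new reference before giving up the old one, which makes self assignment harmless without an explicit identity test.  The following is a minimal sketch of that ordering, not a patch against the Helix sources; <code>Counted</code>, <code>SafePtr</code> and <code>survives_self_assignment</code> are hypothetical names, and the plain <code>long</code> counter is deliberately single-threaded to keep the example short.</p>

```cpp
#include <cassert>

// Hypothetical stand-in for a reference counted object. It records its own
// destruction in a flag owned by the caller so the sketch can be checked.
struct Counted {
    long refs;
    bool* destroyed;
    explicit Counted(bool* flag) : refs(0), destroyed(flag) {}
    void AddRef() { ++refs; }
    void Release() {
        if (--refs == 0) { *destroyed = true; delete this; }
    }
};

template <class T>
class SafePtr {
public:
    explicit SafePtr(T* p) : m_ptr(p) { if (m_ptr) m_ptr->AddRef(); }
    SafePtr(const SafePtr& o) : m_ptr(o.m_ptr) { if (m_ptr) m_ptr->AddRef(); }
    ~SafePtr() { if (m_ptr) m_ptr->Release(); }

    SafePtr& operator=(const SafePtr& rhs) {
        if (rhs.m_ptr) rhs.m_ptr->AddRef();  // acquire the new reference first
        T* old = m_ptr;
        m_ptr = rhs.m_ptr;
        if (old) old->Release();             // release the old reference last
        return *this;
    }

    T* get() const { return m_ptr; }
private:
    T* m_ptr;
};

// Mirrors the p = q example above: q is a reference to p.
bool survives_self_assignment() {
    bool destroyed = false;
    {
        SafePtr<Counted> p(new Counted(&destroyed));
        SafePtr<Counted>& q = p;
        p = q;  // count goes 1 -> 2 -> 1, never 0
        if (destroyed || p.get()->refs != 1) return false;
    }
    return destroyed;  // the object is freed only when p leaves scope
}
```

<p align="left">An operator that released the old pointer first would drive the count to zero in the middle of <code>p = q</code> and then increment a counter inside freed memory.</p>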
<p align="left">&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://staff.osuosl.org/~lohnk/blog/?feed=rss2&amp;p=10</wfw:commentRss>
		</item>
	</channel>
</rss>
