Monday, February 02, 2009

Test Driven Design examples

I had a post planned out to cover TDD in some detail, but I got about halfway into it when I realized that since I don't get to do enough of it, I'd be better off (actually, you'd be better off) if I pointed you at someplace where it gets done more often, recent events notwithstanding.

Two of the big problems I have with TDD are
  1. How to get started on a project with it, and
  2. How to get it going if you've got a mess of code already

Now, some languages make it a little easier - Java and Python, for example, allow you to add a main() to every file, so that you can invoke each class with a set of test parameters from the start.  This will serve to bootstrap your test scaffolding in most cases, since you can make all your classes callable from their own main(), and have some tests there.  As your application grows, you can eliminate the test main() from more of the classes as you pull methods into patterns, allowing you to move the tests from the original file to the new one.

If you're using C++. however, you're in a bind.  You can't have more than one main, and you need it for your actual program.  A way around this is to make a class that is the core of the process, and have the main() be a shell, calling the constructor of that class, and possibly a start method, as the whole of the function.  Again, as your codebase grows, you can move the test code around as necessary.

If you've got a mess of code already, with now tests, your best bet is to start with the interfaces.  Anywhere there is a API - RPC, SOAP, TCP/IP, etc - you can write a program to provide either end of the connection, and start building a test harness from that.  Use scripts to run the basics, testing every message with good and bad data.  If you have an API at each end of the process, you can connect both end to a single program to allow it to confirm success/failure of any given message.  A tool like netcat can be useful for this - your individual test programs can have a socket sending text datagrams to each other with results.

A caution - this can get interesting enough to consume all your spare time - it's a lot more fun to make things that you can see working than to be wading through spaghetti code that needs  refactoring but lacks tests for checking functionality, so enforce some limits on your test scaffold building, unless you have a mandate from your management to build the test sets.




Technorati Tags --
, , ,
HTTP

Sunday, February 01, 2009

How DO You Get To Carnegie Hall?

I've mentioned Code Katas before, and practice, and I found this recently, from ORA, and while it wanders into general learning theory, it covers the idea that you need to practice code to get better, and that you need to be reviewing your code to see where you went wrong (rather like sports teams reviewing the game footage for pointers, or fighter pilots reviewing dogfight radar records).

So when you have a moment, try rewriting a small module of your current project, using a new data structure, or with only library functions.

(Also, a milestone for the blog - 100 posts!)


Technorati Tags --
, , ,
HTTP

Pithy Sayings

Here's a few good quotes about software.  I especially like #s 1, 3, and 6, notwithstanding #6 is from Bill Gates

Given Microsoft's code bloat, I wonder if he's changed his mind?



Technorati Tags --
, , ,
HTTP

Saturday, January 31, 2009

Playing Hurt - Can you sprain your corpus callosum?

As s long-time reader of the C2 wiki, I was pleasantly surprised to find another blogger who referenced it in a post about Playing Hurt.  Now, in comparison to, say, pro football players (American football), who play in freezing weather with sprained ankles, cracked ribs, and minor concussions, the "hurt" a programmer experiences is small potatoes.  On the other hand, since what we do is nearly 100% mental
("95% of this game is half mental" -- Yogi Berra), our "hurt" is all mental. 

The leading issues are burnout and overload.  Burnout is the end result of chronic overload, but can also be caused by other things, like continual dismissal of complaints about bad process, lots of time working on projects that go nowhere, lack of recognition, the host of typical workplace issues, and
conflicting priorities (aka matrix management)

Now, there is not anything conclusive that shows that playing hurt in Software Development leads to drastically worse software.  It's fairly clear that your product will be lower than your best, but nothing that shows it will be in the bottom half of your ability.  However, you will find that over time, your lower quality will snowball into larger problems.

Possible solutions are:
  1. discuss better work conditions with your boss - more input into schedules, process improvements, better recognition
  2. find another job - not that it's any better anywhere else, but the change in environment and the learning you have to do will make you feel better.
  3. a side project - pick something you want to fix at your job, and start carving out an hour or two a week to work on it.  Maybe it's test harnesses for your systems, or automating a build, or just learning about another part of the system.  Try to find something fun to do.





Technorati Tags --
, , ,
HTTP

Friday, January 30, 2009

Probabilities and Innumeracy

As part of the National Too Much Information Week, I'm explaining some of my blogging habits.  I have a few searches looking for interesting posts in the blogosphere, and a feed reader across a number of programming-relevant sites, and I glean interesting posts from them as fodder for my posts.  It's my small part of making sure the blogs are a self-referential recursive rabbit hole.

I read this a few weeks ago, and having taken a number of math and biology classes in my years, immediately saw that the correct answer was 2/3s.  But reading the comments really depressed me.  Not because so many people jumped to the naive conclusion of 1/2, but that they are seemingly resistant to learning why the naive answer is wrong.

I'm not talking about the people who take the "commonsense" approach that the father saying "One of them is a girl" meaning "only one" is a girl - most of them seem to agree that is a semantic issue.  But those that continue to insist that with 4 outcomes, the odds of 2 of them don't add is just dumbfounding.

Just for those not wanting to just through the long discussion, here's the short form correct answer:

Dad says "I have 2 kids, and one of them is a girl.  What are the odds that I have one son?"

the grid:
             1st B     1st G
2nd B     BB         GB
2nd G     BG        GG

That is the sum total of possibilities for the family makeup

Now, the odds of a given child being male is 50% (overall, ignoring the slight differences of survival, etc), so each "box" of the grid has a 25% chance of being the "real" family makeup.  But Dad says he's got one girl, so we can exclude the upper left box, leaving 3 boxes each with 25% chance.  Now, note that there are 2 boxes with mixed kids, and one with only girls, so the odds of having a boy are 2/3s.

[Edit]
I think I have found the problem that everyone who says 50% is running into - they are flipping the meaning of "one of them is a girl" - they are taking this to mean they get to pick one of the BG or GG cells, but they can't do that with an even probability - all they can really do is exclude the BB cell.  In other words "one of them is a girl" only uniquely identifies a single cell for exclusion.

[later edit]
Here's a small Python program that details the logic, and shows how the bad assumptions work.  It's especially useful because the bad assumptions play directly to the numbers of boys and girls in the generated families. (some lines got folded, so if you cut&paste this, be awa




# 2kids - run simulations to test
# generate 1000 random 2-kid families
# 0 = boy, 1 = girl

import random

# generate child
def genChild():
     sex = random.random()
     if sex < 0.5:
         return 0
         return 1

# define Family class
class Family:
     def __init__(self):
         self.genFamily()

     # generate family
     def genFamily(self):
         self.child1 = genChild()
        self.child2 = genChild()

     # see if there is at least one girl
     def hasAGirl(f):
         if(f.child1 == 1 or f.child2 == 1):
             return 1
         return 0

     # see if there is a boy
     def hasABoy(f):
         if(f.child1 == 0 or f.child2 == 0):
             return 1
         return 0

random.seed()
familyList = []
for f in range(1000):
familyList.append(Family())
allBoys = 0
allGirls = 0
for family in familyList:
     if(family.hasAGirl()):
         if(family.hasABoy()):
             pass
         else:
             allGirls += 1
     else:
         if(family.hasAGirl()):
             pass
         else:
             allBoys +=1
print "Family breakdown: BB = ", allBoys, " GG = ", allGirls," mixed = ", 1000 - (allBoys + allGirls)

brothers1 = 0
for family in familyList:
     if(family.hasAGirl()):
         if(family.hasABoy()):
             brothers1 += 1
print "filter by hasAGirl first, then hasABoy, number of familes with boys = ", brothers1

brothers2 = 0
for family in familyList:
    if(family.hasAGirl() == 0):
         pass
    if(family.hasABoy()):
         brothers2 += 1

print "filter by !hasAGirl first, then hasABoy, number of familes with boys = ", brothers2

print "filtering by eliminating all-boy families first"
brothers3 = 0
familiesWithGirls = []
for family in familyList:
     if(family.hasAGirl()):
         familiesWithGirls.append(family)
print "# of families that have 1 or more girls = ", len(familiesWithGirls)

for family in familiesWithGirls:
     if(family.hasABoy()):
         brothers3 +=1
print "# of families with 1 or more girls that have one boy = ",brothers3
print "% of families with 1 or more girls that have one boy = ", ((1.0 * brothers3) / en(familiesWithGirls)) * 100






Technorati Tags --
, , ,
HTTP

Friday, January 23, 2009

Why We Need Programmers

Every few years, it seems, there's another push to obsolete programmers.  Sometimes it's overt, like COBOL was ("We'll make it so the businessmen can just say what they want done!"), and sometimes it's more subtle, but it always gets attention from the industries that depend on programmers.

The problem with all these efforts is that we're still a long way from being able to take a fuzzy natural language description of the desired solution and turn that into a working program.  That means that someone who is not a programmer has to write the code for the program in some programming language.  And that leads to all sorts of issues:

  1. Naive code - someone who has not learned about programming practices will write code that is difficult to learn and modify.
  2. Incomplete or inefficient programs - someone who has not studied programming will lack knowledge of algorithms, and will lack experience of gap analysis.
  3. Just plain shoddy work - the erstwhile programmers will not want to be programming, because they want to be solving their real problems.
The explosion of spreadsheets and database query languages exposed many domain experts to a small facet of programming, and as studies have shown, most people think they are more competent at things than they really are, and the least competent are really bad at seeing their cluelessness.

Programmers will have the background to recognize O(n2) algorithms; they will be familiar with the tools like code versioning systems and parsers; they will have passion for the code itself, so that it will be easy to maintain and modify.

I guess it's rather like plumbing.  It's not hard for a homeowner to run a little length of drainpipe, but when it gets to plumbing a whole house, or handling a sewer  connection, they prefer to use a professional.


Technorati Tags --
, , ,
HTTP

Sunday, January 18, 2009

More on Offshoring and Its Perils

As my reader(s) know(s) [Hi Brian!], I'm way behind on my "interesting idea for a blog post" list, so I may be pumping a lot of relatively short posts out to try and get back up to the current set of items.

Anyway, I saw this post about offshoring, written by a developer in India, and I felt it worthy of mentioning, as it rings true - contractors never feel as close to project as full-time employees do, and the more remote the home office, the less involved even the employees feel. (look for the letter from the guy from London)

Adding to this is that nowadays, most of the Indian developers are available through the big consulting houses, so you get even less identification with the project, and more opportunities for corporate shenanigans that derail projects.

Of course, there is always the option of hiring directly, but that runs into the problem of scale.  In the US, a job posting might get 100 resumes; 50 can be dumped right away for totally wrong qualifications, 25 for inadequate skills, 15 weeded out in the phone interviews/standardized tests, and the final 10 interviewd face-to-face and the best picked.  However, in India, that same job posting might bring in 1000 candidates, and unless you have 10X the interview staff, you cannot get it winnowed down to the best 10 people - you might get it down to 100, interview 50, and hope for the best.






Technorati Tags --
, , ,
HTTP