Ninja Software Development

Wednesday, February 08, 2006

Developer Tip - One Code Stream To Rule Them All

Here's something that I recommend that might fly in the face of conventional wisdom regarding the use of your CVS to manage multiple branches of code in the corporate world - if at all possible, DON'T!

It's not because the tools can't handle them - most modern CVS tools are adequate for managing separate code streams. It's the overhead in managing the streams that will kill you. Every bug found will need to be tracked once for each stream, fixed once for each stream, and tested once for each stream. If you have different code in modules along the branches, the fixes are more complex, since you can't just plug in the same lines of code.

If you have a fluid feature set for a given release, as is unfortunately common, you'll find that features jump between releases depending on:

Progress - if the work can't be done in time, the feature moves to the next release.
Contractual obligations - someone signed papers promising it earlier than previously scheduled.
Hail Mary business deals - the future of the company depends on closing this deal, and this feature is the only way to do that.
Customer release cycles - a common problem. The customer wants the feature by time N, and the schedules are drawn up, and suddenly the customer decides they need 6 weeks of certification testing to accept the new code you give them, meaning that you have to pull the important features back into the current release to get it to the customer before the certification deadline.
Customer A doesn't want Feature X, but Customer B does. Several releases later, Customer A wants Feature X, after a lot of divergence in the code streams.

So how to solve this?

For every feature in the product, have an activation flag. You may need more than one flavor of this, to account for the differences between optional features that everyone gets to choose to activate or not, and bespoke features that only some customers will be allowed to activate, or even know about. In any case, build a library that will have a set of simple functions that will determine if a given feature is in use at runtime. Don't forget multi-valued options, like a staged conversion between file formats - you'll have a value for pre-conversion, mid-conversion [when both old and new formats are valid], and post-conversion.
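
A minimal sketch of such a library, in Python. Every name here (FeatureRegistry, ConversionStage, the "file_conversion" option) is made up for illustration, and it assumes the set of activated and licensed features has already been loaded from wherever your configuration lives:

```python
from enum import Enum


class ConversionStage(Enum):
    """Multi-valued option for a staged file-format conversion."""
    PRE = "pre"    # only the old format is valid
    MID = "mid"    # both old and new formats are valid
    POST = "post"  # only the new format is valid


class FeatureRegistry:
    """Answers 'is this feature in use?' at runtime."""

    def __init__(self, licensed, activated, options=None):
        self._licensed = set(licensed)    # bespoke features this customer may use
        self._activated = set(activated)  # features that have been switched on
        self._options = dict(options or {})

    def is_active(self, feature, bespoke=False):
        # A bespoke feature must be licensed as well as activated.
        if bespoke and feature not in self._licensed:
            return False
        return feature in self._activated

    def option(self, name, default):
        # Multi-valued options fall back to a default when not configured.
        return self._options.get(name, default)


def accepted_formats(registry):
    """Map the conversion stage to the file formats the code must accept."""
    stage = registry.option("file_conversion", ConversionStage.PRE)
    if stage is ConversionStage.PRE:
        return {"old"}
    if stage is ConversionStage.MID:
        return {"old", "new"}
    return {"new"}
```

The two flavors of flag share one lookup call, so the rest of the code only ever asks one question and never needs to know which customer it is running for.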

Once a feature has been defined, write all the code to handle that feature in all of its incarnations, so that there is no need to branch your code stream. You may need to branch for customer-specific configurations to present the options in a desired manner, but that code should be very small, and managing it on a per-customer basis will not be taxing.

Another important facet of this is to concentrate the decision-making for a feature in as few places as possible, and have the rest of the code handle all possibilities without checking for feature activation. An example of this would be a system that talks to some specific type of equipment only if licensed. The communication code should be written to handle messages from all types of equipment, and the next layer up would be the code that decides to send messages to that type only if licensed, and to respond to any message from that type only if licensed. The fact that messages from an unlicensed type of equipment are passed up to the higher layer instead of being rejected at the lower layer is not a problem - the logic of licensing has no business in the communications layer of a system.
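
The equipment example might be sketched like this in Python. The "type:payload" wire format and the names decode_message and Dispatcher are assumptions made for the illustration, not a real protocol:

```python
def decode_message(raw):
    """Communications layer: decodes messages from every equipment type.

    It knows nothing about licensing.
    """
    equipment_type, _, payload = raw.partition(":")
    return {"equipment_type": equipment_type, "payload": payload}


class Dispatcher:
    """Next layer up: the only place where the licensing decision is made."""

    def __init__(self, licensed_types):
        self._licensed_types = set(licensed_types)

    def handle_incoming(self, raw):
        message = decode_message(raw)
        if message["equipment_type"] not in self._licensed_types:
            return None  # unlicensed equipment: decoded, but not answered
        return "ack:" + message["payload"]

    def send(self, equipment_type, payload):
        if equipment_type not in self._licensed_types:
            return None  # never talk to equipment we are not licensed for
        return equipment_type + ":" + payload
```

Licensing a new equipment type then means changing the set handed to Dispatcher; the decoding code is not touched.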

This will develop your system as both flexible and simple, and spare you the headaches of trying to port features into radically different code streams. Your CVS tool will present simple trees instead of briar patches of inheritance. If you use a database, the schema will be easier to upgrade, since there will be no releases that have different parts of a later schema - they will all be the same, or newer ones will be supersets of older ones (barring removal of tables, in which case they will be subsets, but functionally complete subsets).

Technorati Tags --
Software, SoftwareDevelopment, Computers, Programming

Tuesday, February 07, 2006

In on the ground floor

Let's talk for a minute about the ideal software developer situation - when you are there before the product exists. Getting in on the ground floor is exhilarating; there is no cruft in the code, there is no backwards compatibility to maintain, there is no build-up of sludge in the process. But it's also a challenge - there is no structure to work from, no library of existing code that you know fits your problem space. So what do you do?

If you've got a team already, then you probably have a work style that the team is comfortable with. Take a few days to meet and analyze this, to see if there are issues that need to be handled, and if there is anything the team really wants to change. Get consensus, but you don't need unanimity - as long as the dissenters are recognized and taken seriously, things can proceed.

From that point, you have a number of things to decide:

Code management. Pick your code management system. Make sure it fits your development process, and your company's release style. If you don't plan to have more than one code stream, things like RCS and SCCS may be perfectly suitable. Otherwise, something like CVS or Subversion may be more suitable. For large companies, something like Telelogic's Synergy or IBM/Rational's ClearCase might be mandatory. In any case, settle the question and set up the project in the system.
System startup and shutdown. Decide how the system will start and stop. If the product is a single program, this may be a no-brainer - a command line or desktop shortcut may be all that is needed. If the product is a full suite of programs that need to be running all the time, you may need to interact with the operating system in some way - inittab for UNIX-style OS's, for example. Don't forget to examine the need to stop the system for upgrades!
Configuration. Determine how your applications will get their configuration. The two obvious choices are a database or a configuration file. If the application does not have a DB, then file(s) are your only choice. If you have a DB, then you can choose. The factors influencing you will be the type of data you need for configuration, and the amount of data needed. Be sure to make your programs capable of re-reading configuration on demand, to make runtime changes possible and easy.
Logging. Your programs will need to log abnormal conditions, errors, and other information. Decide how you will need to report this data. On UNIX-like systems, syslog is a good choice if you want to piggyback on the OS facilities. One thing to consider with your logging system is rollover - you do not want to fill up the disk with one large file. Design the logging system to allow the log file to be moved out and replaced, and manage the saved files so you don't fill the disk. Another consideration is whether or not you have multiple copies of a single program running - your file naming scheme should make allowances for this, as should the rest of your logging scheme, for example if you have a single logging process that connects to the work processes by name.
Interprocess communication. You've got a lot of options - RPC, CORBA, plain sockets, shared memory. Given the versatile libraries of modern languages, don't overlook newer options such as email and instant messaging; also don't discount old-school options such as using the database or files on the disk as communications channels. Build your libraries using the minimum number of options, but keep the interfaces clear (see my previous entries on this).
Coding Standards. I'll address this more some other time, but work out a coding standard for your project with the developers, and if possible, make that format automatic - use hooks in the CMS to convert files into that standard upon check-in. This will prevent disagreements from stubborn developers - they can code their own way if they insist, and the code gets dropped into the common area in the standard form.
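
For the configuration item, re-reading on demand could look something like the following Python sketch. It assumes a JSON configuration file and uses SIGHUP as the trigger on UNIX-style systems; both are choices made for the example, and the Config class and demo function are hypothetical names:

```python
import json
import os
import signal
import tempfile


class Config:
    """Holds settings loaded from a JSON file and can re-read them on demand."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        self.reload()

    def reload(self, signum=None, frame=None):
        # The signature also fits signal handlers, which pass (signum, frame).
        with open(self.path) as handle:
            self.values = json.load(handle)

    def get(self, key, default=None):
        return self.values.get(key, default)

    def reload_on_sighup(self):
        # UNIX-style systems only; SIGHUP does not exist on Windows.
        if hasattr(signal, "SIGHUP"):
            signal.signal(signal.SIGHUP, self.reload)


def demo():
    """Write a config file, change it, and re-read it without restarting."""
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump({"log_level": "info"}, handle)
        config = Config(path)
        before = config.get("log_level")
        with open(path, "w") as handle:
            json.dump({"log_level": "debug"}, handle)
        config.reload()  # a running program could be told to do this via SIGHUP
        return before, config.get("log_level")
    finally:
        os.remove(path)
```

A long-running program would call reload_on_sighup() once at startup from its main thread, after which sending it SIGHUP picks up the edited file.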

That's at least a few of the things that I feel are important when starting a project.
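
The logging advice above (rollover, plus file names that allow for multiple copies of a program) can be sketched with Python's standard logging module. The size limits, the "program.pid.log" naming scheme, and the make_logger and demo names are assumptions for the example:

```python
import logging
import logging.handlers
import os
import tempfile


def make_logger(program, log_dir, max_bytes=1_000_000, backups=5):
    """Create a logger whose file rolls over and whose name is per-process."""
    # Including the PID keeps multiple copies of one program from sharing a file.
    path = os.path.join(log_dir, f"{program}.{os.getpid()}.log")
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(f"{program}.{os.getpid()}")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep these records out of the root logger
    return logger, path


def demo():
    """Log enough to force rollover and report which files are kept."""
    with tempfile.TemporaryDirectory() as log_dir:
        logger, path = make_logger("worker", log_dir, max_bytes=200, backups=2)
        for i in range(50):
            logger.info("abnormal condition number %d", i)
        for handler in list(logger.handlers):
            handler.close()
            logger.removeHandler(handler)
        return sorted(os.listdir(log_dir)), os.path.basename(path)
```

With backups=2 the handler keeps the live file plus at most two saved files, so the disk usage for one process is bounded no matter how much it logs.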


Thursday, February 02, 2006

Developer Tip: Defining an API

Ok, enough blather about theory, company politics, and other crap. How about something useful to an everyday developer?

When defining an API, make it as simple and as obvious as possible what data is actually being passed across the API.

This will make the API much easier to document - the names of the fields in the structures/objects/messages will be mnemonic, and you can document them as comments in the IDL file.
You can use an IDL translator to convert from one IDL to another, like RPC to CORBA, or from one language to another, like C to Python.
You can automate the argument marshalling code if you want to write one or both sides of the API in another language.

An example I have seen that violates this was an API that wrapped all the arguments inside functor objects that were templates typed by the number of arguments they took. It's a great utility mechanism, because all your calls across the API are compact, taking one argument, and no return value because the functor inside will handle all that when the callback is made.

HOWEVER, this API was absolutely miserable to develop with, because you had to go deep into the code on both client and server sides to discover what the arguments being passed really were. We wanted to build a test framework to exercise a server, but we had to manually build the structures and populate the arguments, instead of being able to build them automatically from the IDL files.

A counterexample is a typical RPC interface to C programs. It will have a number of structures defined in the .x file, and it's a small matter of programming to build, buy, or Google a program to parse the .x file and produce the other files to build a test scaffold from in the language of your choice.
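
To show why explicit structures make this kind of automation cheap, here is a small Python sketch. It uses a dataclass as a stand-in for a structure declared in an IDL or .x file; TransferRequest, its fields, and the helper names are all hypothetical:

```python
from dataclasses import dataclass, fields


@dataclass
class TransferRequest:
    """Hypothetical API message; every field crossing the API is named and typed."""
    account_id: int      # account to debit
    amount_cents: int    # amount to move, in cents
    memo: str            # free-text note shown to the user


# Sample values per declared type, used to populate test messages automatically.
SAMPLE_VALUES = {int: 1, str: "test", float: 1.0, bool: True}


def build_sample(message_type):
    """Build a test instance purely from the declared fields of the message."""
    kwargs = {f.name: SAMPLE_VALUES[f.type] for f in fields(message_type)}
    return message_type(**kwargs)


def describe(message_type):
    """List 'name: type' for each field, as a starting point for documentation."""
    return [f"{f.name}: {f.type.__name__}" for f in fields(message_type)]
```

Because the fields are visible in the declaration, a test scaffold can populate any message without anyone reading the client or server code. An API that hides its arguments inside opaque functors gives a tool like this nothing to work with.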


Monday, January 30, 2006

A bit of explanation

Why is this blog called "Ninja Software Development", you may ask?
(Well, you might ask. Someone might. Ok, nobody would, but I'm going to explain it anyway)

It's not from a fannish infatuation with the Ninja as super-stealthy assassins, or an anime fixation. The key elements that I wished to convey were the secrecy of the ninja, and their resistance to a stronger force. The forces of business are arrayed against the development of good software, and the developers need to work behind the scenes to counter the greater army.

A colleague of mine once commented that it was a sad state of affairs when developers had to resort to skunkworks methods to improve software quality. He was right, in that the business risk management practices often require the reduction of change, which means that quality changes will be delayed, and the business case for quality improvements is hard to make. And that, given the current business climate of making the numbers for the current quarter, means that nobody wants to spend time developing things with no new features. So the only way to improve software quality is to do so stealthily, since process improvements are equally difficult to make a business case for these days. Refactoring is done as part of a new feature; algorithmic changes are slipped in as a bug fix. And thus are we able to keep entropy at bay.


Sunday, January 29, 2006

Lots of programmers == lots of trouble!

Yesterday I backranted about offshoring, particularly to India. I discussed a number of things that I feel are difficulties for corporations offshoring to India. I forgot a more fundamental difficulty for the company wanting to hire Indian programmers.

If they are intending to hire directly, they will be running into a problem of scale.

For a typical American job opening, a company may get (for example) 100 applications. About 50 will be dropped immediately for various reasons. Around 30 will be dropped at first resume review for insufficient capabilities. From the remaining 20, 10 will be picked as the best qualified, and get phone interviews. The field will be narrowed to 4, and the interviews will pick the best 2 of them to make offers to. When all is done, the company can feel reasonably certain that they have made a good choice in hiring the best qualified candidate.

Now consider this in India - where the number of incoming resumes is an order of magnitude larger. Instead of 100 applications, there are 1000. Assuming a typical bell curve (India may have a lot of good programmers, but statistics says that they will have an equivalent number of bad ones!), 500 of these will be dropped immediately, for the same sort of reasons as the previous example. Another 300 will fail to pass a resume review. This leaves 200, which will be sorted to 100 "best fit" candidates. Now, phone interviews for 100 people will be much less effective at narrowing down the field, compared to interviews for 10 people (assuming that phone interviews are even done in India; if they are not, this point is even more valid), so it will be a much more random selection out of the final 100 to get the 10 to interview. This boils down to a less-good fit for the selected candidate. And this does not even consider the amount of time to interview the larger number of "final" candidates!

Now, some would suggest that this can be countered by using local firms to do the hiring. I don't think that this will improve the situation much, because if the people hiring are not those who know what the position requires, they will not find people who match the need.


Saturday, January 28, 2006

Software Infrastructure and You

In a recent rant, Rockford Lhotka tears into developers as spending too much time on the software infrastructure and not enough on delivering business value. He uses a Visual Basic application written some time ago as an example of something that worked just fine then, and is even better now that PCs have increased speed by at least one order of magnitude.

As a developer, I quite naturally take offense to this. We are not, by and large, avoiding improving the product by playing with the background stuff. Here are a few counterpoints:

Business apps are rather pedestrian to write, but the bigger question is why are we still writing them? Because someone wants something a little bit different from anything else out there, and instead of living with the existing applications, they want someone to build an application that fits their precise needs. IBM had a small business suite back in 1985 that covered everything, and I mean everything, so all businesses should be using that app now, right? But instead, a business-type wants to be creative, and requests new software.
He says that the customer doesn't care about infrastructure. That doesn't mean infrastructure is not important. The customer may not care about the condition of the roads when he asks for his supplies to be delivered to him and his products to be delivered to the buyers, but bad roads will damage trucks, old narrow roads will delay trucks, and low overpasses will stop trucks, and the customer will lose money. The infrastructure of the dialup days will not handle today's DSL/cable modem traffic levels; today's high-performance network no longer needs the heavy error-correction protocols required by the slow and noisy networks of old. A prime example of this is email - remember when an email address was a series of machine names separated by "!"s, with other odd characters in the last part? It was no worse an email system then than it is now (as far as getting a message from one computer to another), but the infrastructure changes (DNS, SMTP/IMAP/POP everywhere, and generally universal TCP/IP connectivity) have made email much better for the end-user. [And due to some original decisions about this infrastructure, worse, from spam, because the infrastructure was not developed with expectations of anti-social use. To fix this, there is ongoing infrastructure work in a number of projects.]
He bemoans the trends to web services and similar things away from the older methods. In my experience, one of the biggest forces in driving changes in infrastructure is the business press. I recall one boss asking me to investigate moving our application to a CORBA interface, because he had read about CORBA in the business press and thought to jump on the bandwagon. There was no reason to move to CORBA at that point - our application was only talking to other applications we wrote, and the socket-based protocol was adequate for our expected needs. Other times, the move to new technology is mandated by the government, like the demise of analog TV signals.
Another case is that sometimes the customer drives these changes. The big draw of web-based services is that the customer does not need to install something on every PC in the organization in order to use the application - each user has a PC with a web browser already on it. Or the customer favors something else. I know of one vendor whose application was chosen over another vendor's because it had drop shadows on the buttons, while the application not chosen used Motif widgets that "only" had bevels on its buttons.

Sure, given the choice, I'd rather develop something cool, instead of Yet Another Accounts Payable system. But plenty of customer-value things are cool - Firefox/Thunderbird, flickr, del.icio.us, blogging, P2P software, VOIP, and so on - we developers hardly need to play with infrastructure to be doing cool stuff.


Offshoring Software - a Developer's Opinion

During a brief scan of Google's Blog Search for other blogs on software development, I happened upon a post by a Rockford Lhotka about how software is too hard, and that the developers have brought offshoring down on themselves because they are always re-inventing things that already work well enough for the business purposes.

Well, I have an opinion about that (quelle surprise).

Offshoring is driven almost totally by costs. Companies look at what it costs to develop software, and it's significantly cheaper to do it offshore in India (or sometimes China), because of the much lower cost of living over there. The additional expenses of getting a reliable development center set up, sending managers over there to hire and spin up the offshore team, and communicating with that team are smaller than the savings on salaries and benefits. So the company has a lower total cost to develop, and that looks good to them.

There are 2 flies in this profitable ointment, however: the offshoring boom is driving Indian developer salaries up fast; and dealing with a remote team from a different culture, with significant communication barriers, makes it difficult to develop good software.

Let's take a quick look at the first point - salaries. Indian cities like Bangalore, Hyderabad, and Chennai (formerly Madras) all compete for the local developers. The boom has brought many US companies to the country, where they have lots of jobs to fill. So they offer better salaries to attract people. Then the next big company arrives, and has to offer bigger salaries to draw people their way. Sure, there's a long way to go before they cost as much as Americans, but they are cutting into the "big savings" with every raise. A corollary to this is turnover - as salaries rise in Bangalore, the Hyderabad developers see a chance to better their lot by finding a job over there; 2 months later, Chennai salaries create envy in Bangalore developers.

Now for the second point - developing with a remote team. I often find it ironic that American managers who won't let their staff telecommute "because if I can't see them, how do I know they're working?" are willing to create an entire department half a world away where they will meet face-to-face no more than once a month at best. Indians in software all speak English fairly well, in theory. But quite often the combination of accent and speech patterns conspires to make them difficult for Americans to understand, particularly over the phone. This makes collaboration less efficient, and misunderstandings truly harder to resolve.

The time difference is another factor. If the finishing touches to a product must be made in the US, the Indian team will need to have finished them the previous workday, meaning 36 hours ago, to allow for their changes to percolate through most CMS systems. Even if your CMS system can take changes rapidly, the 10.5 hour time difference means that the morning is spent pulling in the India work and rebuilding the application, which for local development could have been done in the overnight hours. And if there are bugs, the Indian team has gone home for the night, making it harder to get their input to fix things. So the Indian team has a deadline that may be 2 days earlier than the local team.

Another issue is the development culture. American developers, especially after the dot-com boom, are wont to question everything about a design. While this can be very frustrating for the designers, it does tend to flush out any problems with the design, the design documents, and sometimes even with the project itself. I have found a fair number of errors in the documentation of a design by questioning the intent. From anecdotal evidence, significant numbers of Indian developers do not question such things, for whatever reason, leading to incorrect software. It may be that they see the inconsistency, but do not feel comfortable bringing it to the attention of the designer, or that they assume that the designer intends it for a reason, but does not explicitly state the reason. Whatever the cause, it does not further the project. Add to this the reluctance the remote team may have for reporting difficulties they are having back to the home office - few people like to report bad news, and the cultural divide makes this reluctance greater.

A large proportion of Indian developers have come out of longer academic careers than is typical for American developers. This is a problem in a production environment, regardless of the location of the developer, because the rules of software development in academia are quite different than in business. Many former academic programmers have difficulty transitioning to an environment where the projects are usually quite detailed on how the product looks, and less detailed on how it works, yet require high-performance output. I've seen fresh-out developers struggle over what to name database columns, because their boss did not give them a detailed naming plan. Other common academically-induced quirks are oddball variable naming schemes (sports teams, etc), and a really rigid view of commenting style.

[I hasten to add here, that I'm not in the least saying that Indian developers are not smart, or capable of developing killer software, but rather that in the context of developing software as part of an American company, they have some hurdles to reaching peak performance, most of which are purely part of the distance and differences in language and culture. Such issues arise between American and European development teams, as well, but the greater commonality of history makes the cultural issues a little less complex.]

And another thing - one of the key issues with a remote team is being able to provide them with good documentation - requirements, architecture, design, and so forth. Many (most?) American development shops, frankly, suck at development documentation. So, no matter how great the Indian team may be, they may not have any chance of developing the desired product because they don't know what features to develop!

Part 2 of this will address the other side of Rockford's rant - that developers spend too much time fiddling with the internals of software and not enough time working on business value.
