May 11, 2005

The influence of organizational structure on software

It isn't a stretch to say that the high-level structure of software is influenced by the structure of the organization that produces it. Most successful software today is organized around several high-level components or modules. Each of these modules displays the following characteristics:

1. It has a published, well-defined interface for other modules and processes to access its functionality.
2. It tries to minimize external dependencies so that as an application evolves, changes occur in few sharply defined parts rather than diffusively through the application.
3. It hides internal implementation details so that changes and extensions can occur without breaking code that already depends on it.
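To make these three characteristics concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration (the `SpellChecker` interface, the `_WordListChecker` class, the `make_spell_checker` factory and the `count_errors` caller); it is not taken from any real product, and it is only one of many ways such a module could be laid out.

```python
from __future__ import annotations

from typing import Protocol


class SpellChecker(Protocol):
    """Characteristic 1: the published interface other modules code against."""

    def misspelled(self, text: str) -> list[str]:
        ...


class _WordListChecker:
    """Characteristic 3: a hidden implementation (hypothetical).

    The leading underscore signals that other teams should not rely on it.
    """

    def __init__(self, known_words: set[str]) -> None:
        self._known = {word.lower() for word in known_words}

    def misspelled(self, text: str) -> list[str]:
        # Return the words that are not in the known-word list.
        return [word for word in text.split() if word.lower() not in self._known]


def make_spell_checker() -> SpellChecker:
    """Factory so that callers never name the concrete class."""
    return _WordListChecker({"the", "cat", "sat"})


def count_errors(checker: SpellChecker, text: str) -> int:
    """Characteristic 2: code owned by another team depends on the interface alone."""
    return len(checker.misspelled(text))
```

A caller on another team would write something like `count_errors(make_spell_checker(), "The cat szat")` and never touch `_WordListChecker` directly. If the owning team later swapped the word list for something smarter, `count_errors` would not need to change.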

When the initial design for an application is created, it makes sense for an independent programming team to take on bringing a module from idea to product. Note that this doesn't always mean there are as many teams as there are modules. All it means is that every module typically has one team working on it and that not every team needs to know the internal details of modules it does not own. Thus, in early stages of the application, the design of the application influences the structure of the teams that build it. As the application evolves, this organizational structure becomes more entrenched and the incentive to reorganize the application development team grows smaller because things are working well presently. The cause-effect relationship begins to change and the structure of the application begins to reflect who talks to whom (more on this below).

The question this essay tries to address is how close this organizationally-influenced module structure comes to the technically optimal structure for an application.

(Before you read any further, I would like to note that the idea of organizational structure influencing application structure is not an original idea from me. I had heard it being mentioned in my software engineering class at MIT, but when I looked for it recently on Google, I was unable to find a good attribution for it. If anybody could point me to a good formal statement of this principle, I'd greatly appreciate it.)

Inherent in a company's organizational structure is the organization of its knowledge and intellectual capital. Despite working within the same company, ostensibly towards the same goals, it is hard to eliminate redundant work entirely and make every team perform tasks that perfectly complement other teams' tasks. Personal and team competitiveness ensure that some amount of work gets duplicated between teams that have similar goals. Furthermore, the same competitive spirit leads teams working on similar projects to be secretive and not share details with each other. Several examples of this occurred when HP merged with Compaq under the watch of Carly Fiorina. According to some accounts, for a while, teams from HP and Compaq that were in the same field competed against each other to ensure their survival.

Let's consider how exactly a company's organization can affect the organization of its software products over the course of time. In companies that have a product-centric organization, such as Microsoft, the structure of their software is related to who talks to whom. We have seen some products that worked well together because they were designed for each other, such as Windows 3.1 and MS-DOS 5. Other well-integrated technologies like the Office suite work well together because there were direct lines of communication between those development teams. The depth of communication between the various Office teams led to technologies like Object Linking and Embedding, and indirectly influenced the later development of COM, COM+ and the .NET runtime.

Yet, from the same company, we have seen two data access technologies -- DAO and RDO -- appear around the same time. I am not familiar with the intimate details of their programming models, but at first glance, it would seem that there is significant overlap in their functions. (Of course, we are not mentioning ADO here, which was another data access library that came later.) Why should all this redundancy in data access libraries have occurred in the first place? My best guess is that the development teams didn't talk to each other all that much. Even today, when I peruse MSDN Blogs, I see comments from Microsoft developers about how they don't know enough about what other groups are doing in such a large software company. When components from teams who don't communicate are brought together to form an application, the components won't always fit well together. In other words, such an application would not always have a technically optimal design. Once an application is released, opportunities to fix these chinks in the proverbial armor become scarce. Kludges and hacks are introduced that fix the problem while maintaining compatibility with the status quo. The evolution and technical optimality of an application are thus influenced by the communication (or lack thereof) between development teams within a software company -- a factor that is often influenced by contingencies having nothing to do with the technical aspects of an application.

Another factor influencing the evolution of an application is its present state. The way an application currently behaves is surely the most obvious source from which to harvest ideas for what can be added, removed or improved in it. Looking closely at an application and analyzing how closely its behavior approximates its idealized requirements may lead to refactoring and code-level interface enhancements. Adding new features to the application may involve clever reuse of code from other components. Taken together, these activities may boost the technical optimality of an application's design. There isn't much need to belabor the point that an application's evolution is highly influenced by its present design.

So far, we have concluded that the technical optimality of an application's design is influenced by some combination of its present design and the communications structure of the organization that built it. Let us assume for a moment that there exists a truly ideal technical design for an application. It is worth taking a moment to consider which of the two factors we have considered is more likely to push the application's current design towards this truly ideal design. Development teams not talking to each other may lead to some missed opportunities in improving a design technically. Some ideas may never make it through the pipeline. As market dynamics and organizational priorities change, these missed opportunities 'stored' somewhere in the collective brain of a team may fall out of focus and never be realized in future iterations of the application. On the other hand, technical shortcomings in the current design of an application will always be present and are governed by such contingencies to a lesser degree. In other words, the current source code of an application and associated documentation act as a viable and sufficiently stable medium of communication between developers and among development teams, without being adversely influenced by communications shenanigans within and among development teams.

The above question becomes relevant in the case of open source software, where there are usually no well-defined product teams. Control is decentralized among multiple developers, who may be spread across multiple timezones. Far from even a semblance of a team working in a development cycle, all the developers have to work from is the current source code snapshot of the application and associated documentation. With such a development model, intellectual capital that isn't written out isn't intellectual capital at all. As a result, hardly anything goes without being written down somewhere, where other developers might read it and act upon it. All communication between developers happens in only two ways: explicitly, in writing that other developers can read, or implicitly, through the current state of the application as expressed by its design and source code. Because there is no intellectual capital lost to organizational communications structure, a decentralized, open source software development model is more likely to result in a technically optimal software design.

Posted by Vishy at May 11, 2005 12:44 AM
