We’ve been talking to government IT departments at many levels about how to use open technologies, and how to release their in-house code for other jurisdictions to use. In that process, we’ve made a couple of profound discoveries:
Everyone would love to open up their code, as long as it doesn’t cost anything to do so.
That is usually followed by an even more profound thought:
But we all know that there is a cost to going open. So the real question is: what are the benefits? What’s our return on investment going to be?
The answer is “Possibly significant — if you do it right.”
To explain why, I’ll start with a quantitative example, then look at the harder-to-quantify picture behind it.
One of the collaborative projects I’ve worked the most on is Subversion (a system for tracking changes, or “versions”, made to files and folders; hence the name). Subversion was started by my then-employer, CollabNet. They needed a better version control system for their customers, as part of a larger hosted online collaboration service, and realized that ubiquity and clear lack of lock-in would be strong assets. So CollabNet decided to release Subversion as open source software from the very beginning. They knew, from past experience with open source projects, that they’d need to put some effort into drawing a community around the code and making it easy to collaborate on the project.
At different points in the project — the first time starting three or four years in, if I recall correctly — various people at CollabNet have tried to measure the “amount” of contribution coming from outside. I put “amount” in quotes because this is a very tricky thing to quantify, and before I give the numbers, I want to put a huge warning label on them: evaluating software productivity is hard, your mileage may vary, past performance is no guarantee of future gains, etc.
But I do think what they found is meaningful (I’ll explain why in a moment):
As near as we could tell, 75% of the code came from contributors who were not paid by CollabNet; the remaining 25% came from CollabNet’s own developers — including the ones who spent part of their time on engagement with the development community, e.g., evaluating contributed changes, prioritizing bug reports, etc.
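To make the arithmetic behind that split explicit, here is a minimal sketch. The function name and its interface are my own illustration, not anything CollabNet used for its measurements; the only input taken from the figures above is the 75% outside share.

```python
from fractions import Fraction


def contribution_ratios(outside_share: Fraction) -> tuple[Fraction, Fraction]:
    """Turn an outside-contribution share into two ratios.

    Returns (outside-to-inside, total-to-inside), where "inside" means
    code written by the sponsoring organization's own paid developers.
    Hypothetical helper, for illustration only.
    """
    if not 0 <= outside_share < 1:
        raise ValueError("outside_share must be in [0, 1)")
    inside_share = 1 - outside_share
    outside_to_inside = outside_share / inside_share
    total_to_inside = 1 / inside_share
    return outside_to_inside, total_to_inside


# The 75% / 25% split reported above.
outside_to_inside, total_to_inside = contribution_ratios(Fraction(75, 100))
print(f"outside : inside = {outside_to_inside} : 1")
print(f"total   : inside = {total_to_inside} : 1")
```

With a 75% outside share, outside contributors supplied three units of code for every one the sponsor paid for, so the project as a whole holds four units for every one paid for. Which of those two ratios you quote depends on whether you count the sponsor's own code as part of the return.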
If you take that at face value, it’s a 4-to-1 return on investment for open-sourcing their code and stewarding the community with care: for every line CollabNet paid for, the project ended up with roughly four. Not bad; cash-strapped civic IT departments take note! But should we take the figure at face value? Are lines of code a reasonable thing to measure here?
In this circumstance, I think they are. While lines of code are not recommended for measuring individual programmer productivity, they are a decent way to measure relative amounts of code, in aggregate, in a mature project. The code in Subversion is pretty well vetted, precisely because it’s an active open source project. There aren’t a lot of unnecessary lines in it — anything that makes it into the codebase and stays there has passed the sniff test of a developer collective.
And actually, that’s the real story here. The quantifiable contribution ratio — 3-to-1, 2-to-1, 4-to-1, whatever — might vary based on a lot of factors. The true “RoI of open” usually shows itself before a given piece of code makes it into the project. Many times one of us, the CollabNet-salaried developers, would post a proposal for a feature design, or even post a concrete implementation, and the non-CollabNet community would find bugs and potential improvements in it. They would also contribute new features themselves, in some cases quite major ones. They contributed documentation, and frequently handled user feedback, often integrating it into the project in the form of actionable bug reports. All of this directly benefitted the software and should be counted as part of the return on investment too. (In fact, a great measurement would be to look at which developers respond to bug reports — that is, who has the necessary conversation with the original reporter to turn the report into something useful — and then correlate that with which bugs actually get fixed. I’ve not yet had a chance to do that kind of study, but my educated guess is that the wider development community’s constructive role would loom pretty large.)
These sorts of effects are ones you can see in almost every open source project, as long as it’s active and has users. You can’t always easily figure out exactly what percentage of the code was contributed by whom, but you can see, just from watching the project logs and forums and bug database, that having more people join the project usually benefits all the participants. And because development communities tend to evolve mechanisms for absorbing their own communications overhead, it is normal in a healthy project that the costs of collaboration grow more slowly than the benefits do. This is true for every participant, not just the originator of the technology.
What does this mean for a tax-funded, budget-conscious civic IT department trying to figure out whether it’s worthwhile to open up some in-house code?
It means that if your jurisdiction
- plans to use the software for a long time, and
- therefore plans to maintain the software anyway, and
- has software that other jurisdictions might need too
…then there’s a good chance that opening it up could be a responsible decision. We can’t say it always will be: every project can have idiosyncrasies and should be looked at on a case-by-case basis. But if you’re considering opening something up, and are examining the costs of doing so, maybe the above will help clarify the potential benefits. When someone from another jurisdiction finds and fixes a bug, your users benefit as well as theirs. There are well-established techniques for forming these development communities and making sure that the bugfixes and feature enhancements flow into the core code and eventually back out to all the users. Part of Civic Commons’ mission is to make sure those techniques are widely known and properly adapted to civic technology projects.
In the best case, the more jurisdictions use the technology, the more each of them wins, as maintenance and development are amortized over all of them. The benefits of that can be great enough to outweigh the costs incurred by setting up the collaboration in the first place.
So there can be real return-on-investment in releasing your code. With projects like the Enterprise Addressing System and the IT Dashboard, we’re trying to show how in detail, and to adapt open collaboration to the particular circumstances of civic IT projects.
-Karl Fogel