Web 2.0 in software development

On 14 April 2010 I gave an expert talk on this topic at the Digitaalinen Suomi seminar in Jyväskylä. Here is a 45-minute slidecast recording of the talk. I cover the themes both from the perspective of building web 2.0 applications and from the perspective of using web 2.0 services in software development in general. Some technical topics are included as well, such as cloud architecture and NoSQL databases. It was fun to speak about a purely technical subject for a change; the inner nerd needs to be fed every now and then. Oh, and I am available to speak at other events as well. [Read more…]

Failures of the scientific method and the waterfall method (again)

I originally discussed the waterfall method in a previous blog post. Pranav just made a comment that I feel needs some further discussion:

I disagree with the following statement: “This is how science (unfortunately) often works – researchers just cite something, because everyone else does so as well, and don’t really read the publications that they refer to. So eventually an often cited claim becomes “fact”.”

I would like to say that this is generally not how science works. Claims(predictions) made by any scientific theory is verified by multiple laboratories/people over a period of time and they are always falsifiable.

The problem with software development is that it is not based on a scientific theory yet (I am not talking about computer science here, only software development). It has been said many times that it is more of an art than science.

Problems with science

Pranav, thanks for your comment. Of course you are right that science should normally work by verifying findings and falsifying those that do not hold up to the evidence. However, my experience as a researcher has shown that, unfortunately, this is often not what happens. I'll present a few examples.

First, the convention of citing another article. If you follow the rules strictly, you should only cite someone when you are referring to their empirical findings, or to the conclusions they draw from them. Nothing else. Nothing. Unfortunately, most scientific articles do not follow these rules, but liberally make citations like "X used a similar research design (X, 2006)". Wrong! While X may very well have done so, you're not supposed to cite that. You only cite empirical results and conclusions drawn from them. But this is what happens. Because citation practice has degenerated across all fields, it is quite difficult to judge the quality of a citation: is the claim that the citation backs based on empirical evidence, or is it just some fluff? It has become too easy to build "castles in the sky": researchers cite each other and build up a common understanding, but there is no empirical evidence behind what everyone is citing.

This is what has happened to the waterfall model. Researchers across the whole field have cited Royce (1970) because that's what everyone else has done. And the citation isn't valid: there is no empirical evidence to back the claim that a linear process works, and there isn't even a fluffy "I think so" claim to that effect; Royce actually considers the idea unworkable. This hasn't stopped hundreds of researchers from citing Royce. I consider this an excellent example of the failure of the scientific method. The method is still solid in theory, but degenerate practices have made it prone to failure.

In my field of educational technology I see this only too often. Cloud castles being built as researchers cite each other in a merry-go-round, with no-one realizing that whoever started the idea did not in fact have empirical evidence for it, but just a fluffy idea. And shooting down these established cloud castles is not easy, because the whole scientific publishing industry is also skewed:

  • You usually can't get an article published unless you cite the editors and reviewers of the journal and support their findings. Therefore controversial results are hard to publish in the very journal where they would be most useful.
  • Researchers often do not publish zero-findings. That is, when they try to find a certain pattern and fail, they just scrap the whole thing and move on. But the scientific method actually needs these zero-findings too, because many fluffy theories occasionally pick up statistically significant empirical evidence (at a 95% confidence level, 5% of studies will find a connection even when there is none; see the sketch right after this list), yet cannot easily be proved wrong. The numerous studies that fail to find a connection would show that the theory isn't very strong. But these almost never get published.
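
To make the false-positive mechanism concrete, here is a minimal simulation sketch (purely illustrative, not from the original discussion): it runs many studies on data with no real effect and counts how many still reach p < 0.05. It assumes NumPy and SciPy are available.

```python
# Minimal sketch: how often a study "finds" an effect that is not there.
# Purely illustrative; the sample sizes and study count are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_studies = 10_000
false_positives = 0

for _ in range(n_studies):
    # Two groups drawn from the *same* distribution: there is no true effect.
    group_a = rng.normal(size=30)
    group_b = rng.normal(size=30)
    _, p_value = stats.ttest_ind(group_a, group_b)
    if p_value < 0.05:
        false_positives += 1

print(false_positives / n_studies)  # roughly 0.05, i.e. about 5% of the studies
```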

And let's make it clear that this is not a problem of "soft science" alone. Take statistics, for example. Those who do statistics will know Cronbach's alpha as a reliability estimator. It's the most commonly used estimator, having been in wide use for decades. Unfortunately, it doesn't work for multidimensional data, which most data actually is. It is still being used for such data, because "that's what everybody is using". Here's the article where professor Tarkkonen proves that Cronbach's alpha is invalid for multidimensional data, and in fact for most data (pdf). You'll notice that it is not published in a peer-reviewed journal. I'm told it was submitted to a journal years ago, and the editor replied something to this effect:

“Look, your article is sound and I cannot find any fault in it, but we can’t publish this. I mean, we’ve been using Cronbach’s alpha for decades. Publishing this would make us all look silly.”
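
For readers who have not used the estimator, here is a minimal sketch of how Cronbach's alpha is computed. The multidimensionality critique above is Tarkkonen's argument; the code below (with simulated data) merely shows that alpha can look respectable even when the items measure two unrelated constructs.

```python
# Minimal sketch: Cronbach's alpha computed from an items matrix.
# Data is simulated; here six items measure two unrelated constructs.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, k_items) matrix of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
# Three items loading on construct A, three on an unrelated construct B.
dim_a = rng.normal(size=(200, 1)) + rng.normal(scale=0.5, size=(200, 3))
dim_b = rng.normal(size=(200, 1)) + rng.normal(scale=0.5, size=(200, 3))
print(cronbach_alpha(np.hstack([dim_a, dim_b])))  # roughly 0.7 despite two dimensions
```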

Waterfall and software engineering

OK, back to the waterfall method and software engineering. I like Pranav's comment that making software is more of an art than a science. And I agree. Creating new software is not like producing another car off the assembly line; it's like designing a new car. Making copies of the bits is easy and cheap, since the engineering of computers is good enough for that. But making the software itself is more of an art, a design process.

In fact I have a problem with the term "software engineering", because software isn't engineered; it's envisioned, explored, prototyped, tried out, iterated, redone, and designed. Researchers and managers have been trying to make software engineering work for several decades now, and what are the outcomes? Can we build software as reliably as we can build a new bridge? No. But if the bridge builder were given requirements like "use rubber as the construction material" for each new bridge, with the requirements, tools and crew changing every time, maybe building bridges would not be so predictable either.

But this doesn't mean that software development can't be studied with the scientific method. If we can gather empirical data (such as metrics of source code, and budget or schedule overruns) and find patterns in it, then there is a science. There is science in the arts as well: compatible colour combinations, the golden ratio and other rules (of thumb) are based on experience, and by now many of them have proper studies to back them up, not just folklore.
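
As a trivial illustration of the kind of data gathering meant here, the following sketch collects one crude source-code metric, lines of code per file, which could then be correlated with defect counts or schedule data. The directory path is hypothetical.

```python
# Minimal sketch: gathering one crude source-code metric (lines of code per file).
# The "src" directory is hypothetical; real studies would use richer metrics.
from pathlib import Path

def count_loc(path: Path) -> int:
    """Count non-empty, non-comment lines in a Python source file."""
    lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
    return sum(1 for line in lines
               if line.strip() and not line.strip().startswith("#"))

metrics = {str(p): count_loc(p) for p in Path("src").rglob("*.py")}
for filename, loc in sorted(metrics.items(), key=lambda kv: -kv[1]):
    print(f"{loc:6d}  {filename}")
```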

So software development, too, can be studied scientifically, even though much of it is unpredictable and challenging. My understanding is that most of the studies published in the last decade quite uniformly find that practices such as iterative life cycle models, lean management, intuitive expert estimation instead of automated model estimation, and plain common sense correlate well with successful software development projects. So there are some rules of thumb with real empirical backing.

Even the waterfall method has empirical studies showing that it is a really horrible way to build software except under some extremely specific circumstances, which almost never coincide in a single project. If they did, it would be the most boring project in the world, and I would not want any part in it. Remember, NASA flew to the moon in the 1960s with iteratively developed software and hardware. And I'd bet the guidance software for the Patriot missiles (which never hit anything in the 1990s) was based on a linear model.

GPL v3 released: what it means for free software

The new GPL protects software from patent wars and from new forms of abuse, guaranteeing the users and developers of the software both fair rights and sufficient freedoms.

70% of all free software is licensed under the GNU General Public License v2, i.e. the GPL. Version 1 of the GPL was written by Richard Stallman, the pioneer of the free software movement, in 1989. He wrote version 2 in 1991. Both were the work of Stallman himself (and his lawyers).

Now, after a 16-year break, the Free Software Foundation has produced version 3 as the result of 1.5 years of work. Considering how much technology has advanced in 15 years, GPL v2 has withstood the test of time surprisingly well.

Strictly speaking, most software is licensed under "GPL v2 or later", which means that anyone distributing the software may decide whether to follow version 2 or version 3. This means that the majority of current free software moves semi-automatically under the new license.

A small portion of software, however, is licensed specifically under version 2 only. The most significant such exception is the Linux kernel, maintained by Linus Torvalds. For the past year, nerds around the world have been treated to a kind of geek soap opera (The Bold and the Beautiful for nerds) as Linus and Richard have debated the issue.

In general, drafting GPL v3 has been a truly open process in which anyone interested could participate on equal terms with comments and proposals. And the results show: practically no one (except Microsoft) has anything negative to say about the new GPL. It has been accepted by most large software and hardware vendors, even though many of them were skeptical during the past year.

GPL v3 contains three very significant changes:

  1. Many consumer devices, such as mobile phones, set-top boxes and GPS navigators, contain open source software. Some device manufacturers have prevented the modification of the software in their devices, with the devices explicitly checking that they are running the manufacturer's own software. GPL v3 forbids this practice. If the manufacturer has the freedom to put free software components into its device, the user must also have the freedom to remove them or replace them with others.
  2. GPL v3 states that software under it can never constitute an "effective technological protection measure", so circumventing or removing it cannot be prohibited under the US DMCA or the Finnish Lex Karpela (which has already produced its first court ruling).
  3. Microsoft started open patent warfare against free software in May of this year. Quieter threats have been seen for years, and when Microsoft made its first moves this spring, clauses against this kind of abuse were added to the GPL. The old GPL v2 does state that the author of the software grants the user all four freedoms (the freedom to use, study, distribute and improve), but it does not explicitly say that you may not later come back and hit people over the head with your patent portfolio. So GPL v3 explicitly states that if you distribute software under the GPL, you at the same time promise not to attack any users of that software with your patents. Microsoft's shady deal with Novell and other companies aims precisely at fragmenting the open source community on the divide-and-conquer principle, by granting patent protection, for example, only to Novell's customers and not to others. For software under GPL v3 such unfair patent protection is not acceptable: the protection is granted to everyone.

The new GPL also contains a batch of smaller improvements, such as explicitly allowing the use of peer-to-peer networks for distributing software (in P2P, anyone copying a program is simultaneously sharing it with others, so under GPL v2 they would strictly speaking also have to make the source code available themselves; a good example of how new technologies can easily violate the letter of a license or a law even when the spirit is not violated).

As a postscript, let it be noted that Microsoft's claims of 235 patent violations in open source software will never withstand the light of day. If the patents were actually any good, they would already have been named and lawsuits launched against the big companies. Since Microsoft settles for mere scare tactics and will not even say which patents are supposedly being violated, the situation is classic FUD. Those developing open source software cannot start digging through Microsoft's patents themselves, even though the patents are public, because doing so creates the risk that they could more readily be accused of willfully infringing a patent if they can be shown to have known about it. And let us also remember that in 1991 Bill Gates himself was an opponent of software patents. But it seems Bill can no longer control the colossus he created.

Take for example patent 7016055, which Microsoft was granted on 21 March 2006. The patent is titled "Synchronization of Plugins", but the abstract states that

“A system and process for ensuring the smooth flow of electronic ink is described. Dynamic rendering is give priority over other event handlers. Priority may be the use of one or more queues to order when events occur and may be performing dynamic rendering prior to other steps.” (sic)

Sounds fancy: how to make virtual ink flow naturally, for example in a drawing application. You could come up with all sorts of clever techniques and algorithms for that, but the abstract mostly refers to a priority queue. Reading the patent shows that it really contains nothing more. In other words, Microsoft has been granted a patent on the idea that digital ink flows more smoothly when it is rendered at a higher priority than other system tasks. The Windows mouse cursor has been moved in the same way for over 10 years, and the priority queue is presumably a familiar concept to anyone who has studied programming, perhaps because it has been basic fare for decades. So if and when Microsoft's other patents are of the same caliber, it will never dare take them all the way to court, where they would be laughed out in no time.
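
As a reminder of how basic that building block is, here is a minimal sketch of priority-ordered event handling using Python's standard library heapq module. The event names and priorities are made up for illustration.

```python
# Minimal sketch: handling events in priority order with the standard library.
# Event names and priorities are made up; a lower number is handled first.
import heapq

events = []
heapq.heappush(events, (2, "repaint toolbar"))
heapq.heappush(events, (0, "render ink stroke"))  # "dynamic rendering" goes first
heapq.heappush(events, (1, "run spell checker"))

while events:
    priority, event = heapq.heappop(events)
    print(priority, event)
# Prints: render ink stroke, run spell checker, repaint toolbar (in that order)
```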

More than anything, this patent example speaks of the sorry state of the US patent office. Fortunately we do not yet have software patents in the EU, although Microsoft has a long queue of them waiting to get through in the EU as well. More information about software patents can be found, for example, on EFFI's website.

The agile way of starting a company

I just listened to Greg Gianforte’s presentation on bootstrapping a company. What he says is that the way of starting a company that’s taught in all business schools is not the right way. Of the hundreds of thousands of startups every year, only one percent follow the business school dogma of writing a business plan, finding investors, and then starting to develop a product.

Greg's message is also available in his book Bootstrapping Your Business: Start And Grow a Successful Company With Almost No Money. The basic message is to start by contacting potential customers and asking them whether they would be interested in the product or service, and if not, what would make them interested. After a good number of these inquiries you'll have a very good idea of what you should develop to get interested customers. And you get there a lot sooner than with the traditional method, where the first contact with potential customers can come 1-2 years after you start drafting the business plan. At that point it's a lot harder to accommodate customer requests that you haven't thought of. With bootstrapping you can respond to all requests, because you have no product yet.

This whole approach seems quite reasonable, and it has a good analogy in agile software development methods. The method taught in software engineering schools is to start with requirements gathering, followed by analysis and design, followed by implementation, then testing, and finally delivery. The agile approach is to start with a minimal set of requirements, write the tests first, then implement the code to make them pass, do analysis and design as necessary, refactoring the code while you're at it, deliver every 2-4 weeks, and gather more requirements after each delivery. The benefits are pretty much the same: you have customer contact with every release, not just at the end of the project, you'll have a working product after 1-2 iterations, and you can respond to customer requests more easily.

The essence of web 2.0

Web 2.0 has many names and a lot of hype: social web, community web, interactive web, and what have you. Nick Gall cut through the hype quite nicely in his OSCON keynote (see a summary by ZDNet), which can be listened to on IT Conversations. Not exactly news to me, but Nick boiled down the essence of web 2.0 very nicely:

  1. It's read-write, instead of read-only. OK, Tim Berners-Lee had the POST method in his HTTP protocol and meant to make the second version of his browser read-write, but others started cloning his original browser, and the read-write functionality was pretty much forgotten for a decade. Now, with the help of the POST method and Ajax UI gimmicks, read-write is here to stay: wikis are becoming even wikier (faster) to use.
  2. Links aren’t one-way, but symmetric. This is what trackbacks are all about – seeing the links from both directions.
  3. After going global, the web is now extending to processes. So in addition to serving requests made by people, the web can now accommodate calls made by applications and processes (think of web services and xmlrpc; a minimal sketch of such a call follows this list). This allows new services to be built as mashups of existing services (like the all-too-popular Google Maps mashups).
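
As a concrete illustration of point 3, here is a minimal sketch of a program (rather than a person) calling a web service over XML-RPC with Python's standard library. The endpoint URL and method name are hypothetical.

```python
# Minimal sketch: an application, not a person, making a web request.
# The endpoint URL and method name are hypothetical.
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://example.org/xmlrpc")
# The call looks like a local function call but travels over HTTP as XML.
results = proxy.search_learning_objects("geometry", 10)
print(results)
```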

What does this mean for software development? Use the hourglass metaphor: a wide base (build on or accept anything), a narrow waist (do one simple thing), a wide top (provide results in a generic way so others may take advantage of them). In practice:

  1. Extend existing services using the hooks they provide via the web. Design your software so that it accepts generic standard formats (like xml, xmlrpc).
  2. Do something valuable to the data your service has been given. Do only one thing at a time, so that the results can be reused after any phase. Simple is beautiful.
  3. Provide hooks in your services so others can extend the results you’re producing. Provide the results using standard formats (like xml, xmlrpc).
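
To illustrate the "provide hooks" side of the hourglass, here is a minimal sketch that exposes one simple result over XML-RPC using Python's standard library. The service function, host, and port are made up for illustration.

```python
# Minimal sketch: exposing one simple, reusable operation over XML-RPC.
# The function, host, and port are made up for illustration.
from xmlrpc.server import SimpleXMLRPCServer

def word_count(text: str) -> int:
    """Do one simple thing to the data the service is given."""
    return len(text.split())

server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
server.register_function(word_count)           # others can build on this result
server.register_introspection_functions()      # lets clients discover the hook
print("Serving XML-RPC on http://localhost:8000 ...")
server.serve_forever()
```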

Corrections to the Joel test of software development quality

I just listened to an interview with Joel Spolsky on IT Conversations, and checked out his Joel Test. While Joel argues his opinions well, I think a different perspective is needed for a few of the steps in his test, especially concerning testing and user involvement.

Here's my version of the Joel test. It starts with the same 12 steps (although more or less modified), but continues with a few additional steps that I consider important.

  1. Do you use source control?
    This is too obvious to go into detail: even a one-man team needs source control. It gives you history traceability, undo, coordination of simultaneous work, and backups. I personally use it even for all the documents I write.
  2. Do you have automatic builds? (was: Can you make a build in one step?)
    The one-step build that Joel wants is a good start, but I'd rather see an automatic build done every night, and a big fat red warning on the front page of the development site whenever something breaks.
  3. Do you make daily builds?
    If you can’t build the software every day, you won’t build it until you need to. That means a convergence period before releasing, which is in fact just fixing technical bugs that could have been fixed when you wrote the code. So build automatically, and daily. And make each developer build and verify his code before he commits to source control.
  4. Do you use a ticket system? (was: Do you have a bug database?)
    Again, an obvious necessity. In addition to bugs, all enhancement and auxiliary tasks should be tracked in the same ticket system, especially if you're doing distributed and/or dispersed development.
  5. Do you prioritize bugs? (was: Do you fix bugs before writing new code?)
    As Joel says, bugs shouldn't be left lying around, since they get more expensive to fix as time goes by. But a black-and-white rule of "fix all bugs before doing features" won't work: most bugs can be traced to one developer, who should be the one to fix them (since he's most likely the most efficient on that bug). The others shouldn't wait, but continue with other tasks. Related to this, if you do frequent releases, you usually have a one-day convergence period in which the remaining bugs are squashed.
  6. Do you have an up-to-date schedule?
    No disagreement here. Limits on what gets done are needed; they discourage feature creep, for one. So timebox everything: two weeks between releases, for example.
  7. Do you have a plan that you can change? (was: Do you have a spec?)
    Joel wants a spec. I don’t. In the majority of development projects, the customer cannot exactly specify what he wants. Thus instead of a spec, use a format that the customer also understands, like user stories. The collection of accepted user stories is the spec. But it’s a spec that can be changed by the customer, and that is refined as needed. Of course, you need development practices that support a changing spec, like automated unit testing.
  8. Do you provide developers with conducive working conditions? (was: Do programmers have quiet working conditions?)
    Here I completely disagree with Joel. Most problems are best solved by more than one mind. The flow that Joel talks about isn't necessarily an individual experience; it can just as easily be experienced by a pair or a team of developers. Therefore: have all developers in a single room, preferably facing the walls, with an open area in the center, so they can turn around at any time and talk to whoever they need to. Of course, there are people who need 15 minutes to regain their focus after being interrupted, as Joel says, but I wouldn't want those people on my team.
  9. Do you use appropriate tools? (was: Do you use the best tools money can buy?)
    Not the best money can buy, as Joel says, but appropriate. There is no added value in getting the compile done in 5 seconds instead of a couple of minutes. Having time to relax your mind and get some distance from the code generally improves the result, and it's critical to have time to reflect on the work you've done. Waiting a few minutes for the compilation or the unit test suite to run is a good time for all this. It also makes the code that much better, since the developers actually need to think their code through instead of just trying quick fixes and seeing whether the compiler likes them or not.
  10. Do you do quality assurance? (was: Do you have testers?)
    Not dedicated testers, as Joel says, but quality assurance. This can come in many forms, including dedicated testers, or automated unit tests written by the programmers themselves. Even Joel admits that good testers are hard to find and harder to keep, since testing is not very rewarding work. He argues that it's still cheaper to use testers than to have developers do the testing, since testers have lower wages. I, on the other hand, argue that having developers write unit tests is cost-effective. Automated unit tests have several advantages: they make sure that any defect can only be created once (since once found, a test is written for it), they improve the quality of the code (not allowing laziness to cause bugs), and they allow you to respond to a changing spec by changing your architecture as needed (see the minimal example after this list). After the initial phase (once you have the first 50-100 tests done), the unit tests actually save developer time instead of consuming it. Plus you don't need as many testers as you otherwise would.
  11. Do you only hire competent team members? (was: Do new candidates write code during their interview?)
    In addition to making potential new recruits write code, as Joel suggests (which is an excellent idea), it's also important to let the development team meet the candidates and see how the chemistry works. Since development is a team effort, it's best to let the team have their say in who gets hired.
  12. Do you do usability testing? (was: Do you do hallway usability testing?)
    Joel's suggestion of hallway testing is not enough if the company is filled with developers. What you need are representatives of the focus group. No developer can say whether a primary school teacher can understand a certain functionality, and few developers actually know the general usability guidelines that do exist. So you need usability experts to iron out the general usability issues, followed by a focus group test in which the vocabulary, approach, and priorities of the software can be tuned to the focus group.
  13. Do you do user-centric design?
    One reason Joel doesn't like XP is the need for an on-site customer. In Joel's opinion, customers can't provide you with good specs: they either state the obvious, or ask for features they don't really need. This is where user-centric design comes into play. Instead of asking the customer what he wants, find out what the customer needs. This is not an easy task, and it involves a number of practices, one of which is having a representative of the customer who has the authority to make decisions and the expertise to find things out. This representative alone is not enough, of course; you also need to do participatory design, for example.
  14. Do you have a heterogeneous team?
    Contrary to what Joel claims, I don't believe that a software developer can learn to understand the customer's field in just a few hours. You need the customer's expertise in some form. But you also need expertise in the theoretical backgrounds and advanced technologies that are to be applied in development. It's not enough to have just developers. You need user interface logic specialists, graphic designers, and usability professionals, and most importantly, instead of separate code drones and architects you need people who are excellent programmers, great architects, and who have some expertise in the field being worked on, all wrapped into a single human. Most failures in software development are due to miscommunication between the customer and the programmer. Misunderstandings are cut radically when more roles are packed into a single mind. Not all of your developers need problem domain expertise, mind you; it's usually enough to have just one, because he can translate between customer lingo and programmer jargon.
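
As referenced in point 10, here is a minimal sketch of the kind of automated unit test meant there, using Python's built-in unittest module. The function under test and its rules are made up for illustration; the point is that once a defect is found, a test like these pins it down so it can only be created once.

```python
# Minimal sketch: automated unit tests of the kind discussed in point 10.
# The function under test and its rules are made up for illustration.
import unittest

def ticket_priority(severity: str, age_days: int) -> int:
    """Toy rule: critical bugs always come first, otherwise older bugs rise."""
    if severity == "critical":
        return 0
    return max(1, 10 - age_days)

class TicketPriorityTest(unittest.TestCase):
    def test_critical_always_first(self):
        self.assertEqual(ticket_priority("critical", 3), 0)

    def test_old_bugs_rise_in_priority(self):
        # Regression-style check: a 9-day-old bug outranks a 1-day-old one.
        self.assertLess(ticket_priority("minor", 9), ticket_priority("minor", 1))

if __name__ == "__main__":
    unittest.main()
```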

In summary: this is just my experience of how people and code work. Joel's test is an excellent resource for job seekers and managers, but on some points I disagreed so much that I had to write this response; the Joel test as it stands is biased away from some very good new practices that I consider a competitive advantage. Comments are welcome. And thanks to Joel for sharing his thoughts with us all.

Distributed design and development using agile methods and Trac (XP2006 presentation)

I’m hosting a session at the XP2006 conference and here’s the material for that session.

Also, the title should probably say "dispersed", not "distributed", since that's what we're doing (not just having teams in separate locations, but having team members in separate locations).

Making money with open source

I had a conversation with a new media student, and was surprised to hear that the curriculum of the Medialab of the University of Art and Design of Helsinki does not include anything about business and open source.

So here’s the 101 of open source business models:

Provide something that has a demand: This applies to business in general, in all fields. Do something that has enough value for others, and make them aware of what you’re offering.

Bits are free: Charging for digital data doesn’t work, since most people consider bits to be free (as in beer). This means that you can’t make your business model about selling software packages, or selling digital content (like music, images, or whatever). Many companies of course try, and do so with mixed success. It is, however, a battle they’re losing, since more and more software and content is becoming free (as in beer and as in speech), so as customers find their needs satisfied with freer alternatives, they stop coming to you. Of course, it’s a very lucrative thought to just develop software once and then keep on selling it – free money, you may think. And yes, it is. This is what Microsoft is doing with its software. But now that the free alternatives are becoming good enough (a disruptive innovation), Microsoft’s business model is in trouble.

Nobody owns the software: Since the software is open source, no-one owns it. Well, of course someone holds the copyright on it (or parts of it), but since it is free/libre open source software, you (or anyone else) have the right to make any changes to it (with the possible exception of the license and the list of authors) and to publish and distribute the changed version. This has some wide-ranging consequences, which open up new possibilities for businesses.

Marketing by word-of-mouth: In addition to, or instead of, traditional marketing, those who provide valuable contributions around an open source product gain fame. This is why Linus Torvalds gets paid (quite well, I imagine) just for hacking the Linux kernel, even though much of what Linus does doesn't directly benefit his employer. This is why open source gurus don't need other jobs besides hacking the code and travelling around the world to conferences (and, again, getting paid quite nicely). The catch is that they have shown they know what they're doing, by providing open and free stuff that anyone can see. If they provided closed, commercial stuff, no-one would bother paying just to see whether the content is any good. But free content gets you a huge audience, and if you do a good job, everyone will see it. And suddenly the door is open to all those "expert" opportunities, like conference keynotes and consulting.

Services are always needed: While the software may be free, people still need services around it. The traditional services are things like installation, tech support, and maintenance. But open source (since you also have the source, and the license to change it) provides other possibilities as well:

  • Localization: Some customer may be ready to pay you to provide a translation of the software's user interface and/or manual into their language. Of course, after the localization is complete, it will be available to anyone else for free. But whoever wants it the most needs to do it themselves, or pay someone else to do it. That someone else could be you.
  • Documentation: Most open source projects don't have very good documentation, but customers need good documentation. Someone could make a business out of providing it. Of course, you don't need to license your manual (if you've written it from scratch) as open, so you get to charge everyone who wants to use it. Or you could operate in the spirit of open content and license it under a Creative Commons license, meaning you get paid only once. There are other, fairer ways of getting extra revenue from a good manual: you can print it as a book and sell that; it's surprising how many people actually want a printed book in addition to a free online version. But maybe the best reason for making all "finished bits" freely available is this: as a known producer of high-quality content (software patches, documentation, whatever), you gain fame. And fame gets you places.
  • Training: Even printed manuals aren't enough; most companies want training for their staff in using a new system. You need some fame to be considered a valid trainer (read: provide quality content that builds up your fame as an expert on the system in question). Training is a good business, since even if the training material is freely available, the trainer is always needed.
  • Software development / consulting: If customers need to integrate their new open system with other systems, or make major customizations to it, they need a consultant for their development team, or an entire software development team, to do the required modifications. This work brings very good revenue, but you need lots of fame to be considered a worthy partner.

Summary: It’s about the services, not the bits. Provide services around the software that the customers need. Take advantage of the freedoms that open source grants you: you’re not limited to just providing services around the software, but you can modify, localise, customize, and improve the software to fit your customers’ needs. Build up fame by putting everything you do openly available, and do only high-quality stuff. Leverage that fame as a free marketing tool to get even more opportunities to use your expertise in your business.

Finally, a nice example: www.openoffice.fi. It's a Finnish company built around providing services for OpenOffice.org (OpenOffice in Finnish). They have a good domain, so customers can find them. They provide manuals, FAQs, and a help desk. All the manuals, templates, patches, and instructions they've produced are openly available to anyone (which earns the company a lot of good karma in the eyes of all the Finns who want to use OpenOffice at home), but commercial support with e-mail and phone help desks is also available, and that's what produces the revenue. Many people are familiar with the company's templates and manuals, including CEOs (and families of CEOs) of companies that are considering switching from MS Office to OpenOffice.org. So when the switch becomes imminent, there's a high probability that the company will at least consider OpenOffice.fi as a partner. Fame gives you business opportunities.

Managing an agile software team in 4 countries

We’re now two months into CALIBRATE, a huge software development project funded by the European Schoolnet. It has close to 20 partners in different EU countries, and our team is coordinating one of the work packages, where we have 30 months to build a jazzy web portal for creating, browsing and using learning material in the EU. We’re calling it the “Toolbox” for now.

From the software development point of view, the most interesting aspect of this project is that we're attempting to do agile development (using a modification of Scrum and XP) with a team of 8 developers who reside in four countries (Estonia, Finland, Hungary, and Norway). One of the most important principles in agile is to keep the team together; then again, splitting the team apart is an issue that needs addressing in a waterfall process too.

So the problem is that there are a few people in each country, and no location has enough people to form a working development team on its own. Somehow we need to create a team that is present in all four locations simultaneously. Having attended Agile Finland's seminar in Helsinki, I was convinced that one of the key make-or-break issues in this project will be communication and team dynamics.

A good book to read was Organizational Patterns of Agile Software Development by James O. Coplien and Neil B. Harrison, which is basically a collection of best practices (or patterns) for software development, from the point of view of the people doing the work.

A plan was then set: geographically distributed work can work, as long as no-one is left out of the loop. This means that no important decisions should be made by people in one location alone, and no information should be held in a place where others cannot access it. So everything should be done as openly as possible. After a quick review of available software, we decided to go with Trac, which is an open-source development platform that integrates a wiki, a ticket system and version control into one browsable, interlinked web interface. You can take a look at our project's Trac site at http://goedel.uiah.fi/projects/calibrate/. The burndown chart and time tracking are additions to plain Trac, which allow some measure of time estimation (the burndown chart is a Scrum practice).

When browsing our Trac, you'll notice that everything is done completely in the open. All meeting minutes are there, and all discussions about concepts and features are there. Every patch committed into the version control system (Subversion, by the way) is connected to a defect or enhancement ticket, which in turn is connected to a user story ticket, so the reason for everything we do can be traced. Our team also has a mailing list, to which Trac automatically sends all ticket changes and Subversion sends all commit diffs.

And yes, we did start with a "Face to Face Before Working Remotely" by inviting everybody to Helsinki for a week for an intensive workshop, sauna, and swimming in the sea (in November). I'll post another entry discussing in more detail the organizational patterns we use.

Software design

Thoughts on “Bringing design to software”.

5 core processes of interaction design: understand, abstract, structure, represent, detail.

Design languages are used to create products and help in learning to use them. An effective language brings coherence, relevance, and quality.

Development: characterization (of current languages), reregistration (of a new assumption set), development and demonstration (of language elements in scenarios, sketches or prototypes), evaluation (in context), and evolution.

Black-box designs can be foolproof, but lead to impoverished communities of practice. Transparent-box design reveals the functionality (as appropriate), allowing comprehension and skill development.

The more important a product is for the user, the more trouble he is ready to accept (threshold of indignation).

Action-centered design focuses on linguistic distinctions, standard practices, actions that need coordination among people, tools (usable without thought), breakdowns, and ongoing concerns.

Innovation demands that prototypes drive specifications, not vice versa. Rapid prototyping cycles are also necessary for quickly answering questions about a design. Prototypes need to be community property (public) to be useful in, e.g., facilitating communication and redesign.