Don't draw diagrams of wrong practices – or: Why people still believe in the Waterfall model

The Waterfall model was originally invented by Winston W. Royce in 1970. He wrote a scientific article that contained his personal views on software development. In the first half of the article, he discusses a process that he calls “grandiose”. He even drew a figure of the model, and another showing why it doesn’t work (because the requirements always change). This model is the waterfall. He used it as an example of a process that simply does not work. In the latter half of the article he describes an iterative process that he deems much better.

OK, so why do people still advocate the waterfall? If you look at the scientific articles on software engineering that discuss the waterfall, they all cite Royce’s article. In other words, they’re saying something like “The waterfall is a proven method (Royce, 1970).” So they base their claims on an article that actually says the opposite: that the model does not work.

This is how science (unfortunately) often works – researchers just cite something, because everyone else does so as well, and don’t really read the publications that they refer to. So eventually an often cited claim becomes “fact”.

Agile methods and the linear method were both already in use in the 1950s. The scientific mis-citing boosted the waterfall’s popularity, but in the 1980s it was being phased out – because people had found out (the hard way) that it does not work.

But things change when a busy engineer in the US defense organization is asked to come up with a standard for military grade software projects. He doesn’t know what a good process would be, and he’s told that “Hey, Royce already came up with the correct method: the waterfall. Use it.” So the waterfall becomes the US military standard DoD-2167.

A bit later the NATO countries notice that they too need a software engineering standard. And they ask the US for their standard – no need to reinvent the wheel, right? And thus the waterfall is again used everywhere, since if the military uses it, it must be good. Right?

Here in Finland we had already dropped the waterfall in the 1980s, but after the DoD-2167 was sent over the Atlantic like a weapon of mass destruction, it became the standard, again.

What have we learned here: The agile (iterative, incremental) methods are not new. They’re as old as the waterfall. There is no scientific evidence that the waterfall works – on the contrary, most projects fail. There is, however, massive evidence that agile methods improve productivity, quality and success rates.

So what should we really have learned: People remember images. And people often read the beginning of an article, but not the end.

So: Don’t draw figures or diagrams of wrong models, because people will remember them. And in the worst case that could cause hundreds of thousands of failed projects and billions of euros and dollars wasted for nothing.

Update (February 2010): Commenters have been asking for references to back up my claims, so I added links to the most central documents referred to in this blog post.

Object orientation

Object orientation is an interesting and powerful way of designing, modeling and creating not just applications but also business models and just about anything. Here I’ll explain the basics.

Classes and objects

Instead of creating insubstantial and vague procedures and functions with extensive flow control to create the best spaghetti meal in town, objects allow the creation of concrete, close-to-the-real-world things (called objects) which are classified according to what they know and how they behave.

A Bicycle, for example, can change its gear, accelerate, brake and steer. It has information on how many gears it has and what its speed and heading are.

Everything in an object oriented world is an object and each object is classified and therefore is an instance of a certain class. A class describes what behaviour and information an object of said class has, and each object of that class then has that information and behaviour.

Continuing with the Bicycle example, we could create two instances of class Bicycle, which can be called objects of class Bicycle. The first object could be heading north at a speed of 10 km/h on its 3rd gear, while the second one is stopped, facing east, on gear 1. Both of these objects have the same information (heading, speed, gear), but the individual values are different and can change over time. As the second bike starts moving, its speed increases, and at some point it will probably change to a higher gear. An object can also have information that is specified when the object is created. The color of the bicycle, for example, doesn’t change very often.

Object
A single entity that is a collection of (depending on your viewpoint):

  • data and code (binary perspective)
  • variables and methods (source code perspective)
  • information and services (design perspective)
  • state and behaviour (analysis perspective)
Class
A definition of what information and services a certain type of object has. Every object is an instance of some class.
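The Bicycle example can be sketched in code. This is only a minimal illustration in Python; the attribute and method names (color, gears, change_gear and so on) are my own choices for the sketch, not part of any real library.

```python
class Bicycle:
    """A class: describes what every Bicycle object knows and can do."""

    def __init__(self, color, gears=3):
        # Information specified when the object is created.
        self.color = color
        self.gears = gears
        # Information that changes over time (the object's state).
        self.gear = 1
        self.speed = 0.0          # km/h
        self.heading = "north"

    def change_gear(self, gear):
        if not 1 <= gear <= self.gears:
            raise ValueError(f"gear must be between 1 and {self.gears}")
        self.gear = gear

    def accelerate(self, delta):
        self.speed += delta

    def brake(self):
        self.speed = 0.0

    def steer(self, heading):
        self.heading = heading


# Two objects (instances) of the same class, each with its own values.
first = Bicycle("red")
first.accelerate(10)
first.change_gear(3)

second = Bicycle("blue")
second.steer("east")
```

Both objects have the same kinds of information, but first is now moving north at 10 km/h on gear 3 while second stands still, facing east, on gear 1.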

Generalization and inheritance

When you start defining classes for your system, you’ll quickly find out that you have several classes that are slightly different but have many things in common. In that case you can take advantage of class inheritance and define a superclass that has the common information and several subclasses that add to the superclass’s definition.

If we add motorcycles to the system in addition to the bicycles, we have two classes that are different, yet in many ways similar. Let’s define a superclass Vehicle that has behaviour such as accelerate, brake and steer, and information on its speed and heading. We’ll then specify that class Bicycle inherits or extends Vehicle, and has additional behaviour such as changing the gear. The class Motorcycle will also inherit Vehicle and has additional behaviour like changing the gear, starting the engine and shutting down the engine, and it will also have additional information, such as the amount of petrol left in the tank.

Superclass
A class that has subclasses.
Subclass
A class that inherits all the information and behaviour from one or more superclasses. A subclass can have additional information and can have additional behaviour, or override some of the superclasses’ behaviour.
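Here is a rough Python sketch of the Vehicle hierarchy described above. The names and default values are assumptions made for the example, and the overridden accelerate method in Motorcycle is my own addition to show what overriding looks like.

```python
class Vehicle:
    """Superclass: information and behaviour common to all vehicles."""

    def __init__(self):
        self.speed = 0.0          # km/h
        self.heading = "north"

    def accelerate(self, delta):
        self.speed += delta

    def brake(self):
        self.speed = 0.0

    def steer(self, heading):
        self.heading = heading


class Bicycle(Vehicle):
    """Subclass: inherits everything from Vehicle and adds gears."""

    def __init__(self, gears=3):
        super().__init__()
        self.gears = gears
        self.gear = 1

    def change_gear(self, gear):
        self.gear = gear


class Motorcycle(Vehicle):
    """Subclass: adds gears, an engine and a petrol tank."""

    def __init__(self, petrol_litres=10.0):
        super().__init__()
        self.gear = 1
        self.petrol_litres = petrol_litres
        self.engine_running = False

    def change_gear(self, gear):
        self.gear = gear

    def start_engine(self):
        self.engine_running = True

    def stop_engine(self):
        self.engine_running = False

    def accelerate(self, delta):
        # Override: a motorcycle only accelerates with the engine running.
        if not self.engine_running:
            raise RuntimeError("start the engine first")
        super().accelerate(delta)
```

Neither subclass repeats the code for braking or steering; they get it from Vehicle and only define what is different.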

Encapsulation

One of the key features of OO is information hiding and encapsulation. Information hiding means that the information an object has is hidden from the outside, so it cannot be changed or even looked at. The only way to get the value of a piece of information is to call a method that gives out that information. Similarly, to change a value, you have to call a method that will do that for you. A method is another (more programming oriented) name for a service. An object’s behaviour is made up of its services, or methods.

Encapsulation concerns the object’s methods. Only the signature of the method is visible to the outside. The signature shows how that particular method can be used. However, what the method actually does when it is invoked is not shown. This has quite powerful ramifications. The greatest advantage is that the method’s contents can be changed freely at any time, and as long as the signature stays the same, the change will not affect any other part of the system. In traditional programming styles a change somewhere can easily break the system in a dozen other places. Encapsulation makes sure that no part of the system is dependent on a class’s inner implementation, which makes updates and bug corrections significantly easier.

Do not confuse information hiding with security. Information hiding does not protect the data in any way, but is simply there at compile time to make sure that the developers don’t directly use an object’s internal data from the outside, but do it via appropriate method calls.

Variable
Storage place for one piece of information. A variable has a type, which tells what the information is (a number, a string, an array, a GearBox…) and a name, which is used at the source code level to handle the variable.
Method
A piece of code that can be called (invoked). Calling a method executes the code it contains. A method can accept parameters which it can then use in its code. A method can return a single value of some type, which the caller can use after the call is complete.
Signature
Every method has a signature, which defines the method’s name, its visibility (who is allowed to call it), the parameters it requires and the type of value it returns. The signature contains all the information needed to make sure that the method is called in an appropriate way, but contains no indication of what the method actually does.
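As a small sketch of information hiding and encapsulation, here is a Python class whose speed can only be read and changed through methods. The method names get_speed and set_speed and the internal variable _speed_ms are hypothetical. Note that Python has no compile-time visibility check: the leading underscore is only a convention that tells other developers to keep out, which fits the point that information hiding is not security.

```python
class Bicycle:
    def __init__(self) -> None:
        # Internal detail: the speed is stored in metres per second.
        # The leading underscore marks it as "do not touch from outside".
        self._speed_ms = 0.0

    def get_speed(self) -> float:
        """Return the current speed in km/h."""
        return self._speed_ms * 3.6

    def set_speed(self, kmh: float) -> None:
        """Set the current speed, given in km/h."""
        if kmh < 0:
            raise ValueError("speed cannot be negative")
        self._speed_ms = kmh / 3.6


bike = Bicycle()
bike.set_speed(36.0)
current = bike.get_speed()   # callers only ever see km/h
```

If the class later switched to storing km/h directly, the two signatures would stay the same and no calling code would need to change.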

Polymorphism

Polymorphism is another powerful feature of object orientation. By definition it means that any object of a certain class C can be treated as if it were an object of any superclass of class C. This is quite natural, since a subclass always inherits all the information and behaviour of its superclasses. This means that an object of that class will have all the methods and variables that the superclasses’ objects need to have, and therefore it can be thought of and treated as an instance of any of those superclasses.

If we create several Bicycle objects and Motorcycle objects, they are different and have different services, but they all contain the methods defined in the superclass Vehicle. Therefore we can treat these objects as Vehicles if needed. We can, for example, create a Vehicle array and store all the Bicycles and Motorcycles there. We can further iterate through the array and tell all the Vehicles to brake and turn to the right. If we later want to differentiate between Bicycles and Motorcycles, we can check the exact class of each object.
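The Vehicle array example might look like this in Python. The classes are stripped down to what the example needs, and turn_right with its list of compass headings is an assumption of the sketch.

```python
class Vehicle:
    HEADINGS = ["north", "east", "south", "west"]

    def __init__(self):
        self.speed = 10.0
        self.heading = "north"

    def brake(self):
        self.speed = 0.0

    def turn_right(self):
        # Move one step clockwise around the compass.
        i = self.HEADINGS.index(self.heading)
        self.heading = self.HEADINGS[(i + 1) % len(self.HEADINGS)]


class Bicycle(Vehicle):
    pass


class Motorcycle(Vehicle):
    pass


# Bicycles and Motorcycles stored together and treated as Vehicles.
vehicles = [Bicycle(), Motorcycle(), Bicycle()]
for vehicle in vehicles:
    vehicle.brake()
    vehicle.turn_right()

# Later, check the exact class of each object to tell them apart.
bicycles = [v for v in vehicles if type(v) is Bicycle]
```

The loop never needs to know which kind of vehicle it is handling; it only relies on the methods that Vehicle defines.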

If a subclass overrides some of its superclass’s methods, then the new methods (defined in the subclass) are used instead of the superclass’s methods. However, when the object is being treated as an instance of the superclass (as in the example) and an overridden method is called, two things could happen: 1) the subclass’s method is called, or 2) the superclass’s method is called. So which will happen? Object oriented languages have a term for this: virtual methods. If a method is declared virtual, then all overrides are effective, and whenever a method is called, the actual class of the object is checked and the subclass’s version of the method is called. If a method is non-virtual, then the method defined in the class that the object is treated as will be called. Using non-virtual methods is a potential source of bugs, but virtual methods have a higher overhead, which makes method virtualization a performance issue. C++ methods are non-virtual by default, while ordinary instance methods in Java are always virtual.
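To make the difference concrete, here is a small Python sketch. Python methods behave like virtual methods, so the normal call picks the subclass’s version. Python has no non-virtual methods; the explicit call through the class name at the end is only my way of imitating what a non-virtual call would do. The describe method is a hypothetical example.

```python
class Vehicle:
    def describe(self):
        return "a vehicle"


class Bicycle(Vehicle):
    def describe(self):          # overrides Vehicle.describe
        return "a bicycle"


def describe_all(vehicles):
    # The caller treats every object as a Vehicle, yet each call is
    # dispatched on the actual class of the object (virtual behaviour).
    return [v.describe() for v in vehicles]


virtual_result = describe_all([Vehicle(), Bicycle()])

# Imitating a non-virtual call: the superclass's version is used even
# though the object is really a Bicycle.
non_virtual_result = Vehicle.describe(Bicycle())
```

Here virtual_result holds one description per actual class, while non_virtual_result is the superclass’s answer, which is the kind of surprise that makes non-virtual methods a source of bugs.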

Abstract classes and methods

To be added later.

Conclusion

This concludes my short introduction to object orientation. Hopefully it has been useful to you. I would definitely like to hear any comments you have.

Introduction to XP

This is an introduction to the XP (Extreme Programming) method of developing software. It is based loosely on Kent Beck’s book “Extreme Programming Explained” and mainly on my own experiences in using XP. This introduction briefly explains the basics behind XP, how each XP practice relies on other practices, with heavy testing as the foundation, and how the XP process works. This intro does not cover the motivation, that is, why XP is useful. I assume that since you’re already here, you’ve found out that traditional processes have their problems.

Behind the scenes

Values

XP is founded on 4 values, which are:

Communication
All interested parties must understand each other and communicate regularly.
Simplicity
A simple design cannot contain as many bugs as a complex one, so the system needs to be kept as simple as possible. A complex system is less easy to understand, which also adds problems.
Feedback
The domain experts must be able to continually steer the development in the right direction, while the technical staff provides them with the information on technical restrictions and possibilities.
Courage
Make bold decisions – make simple designs (they will work), code fast (rely on the automatic tests), make changes (refactor as needed, throw away dysfunctional code).

Principles

XP has 5 basic principles, which are:

Rapid feedback
The process needs to be self-steering, so feedback from all parties to others must be given quickly, and reacted to quickly.
Assume simplicity
Every problem should be solved as simply as possible, bordering on ridiculous. Most of the time simple solutions will work, and the few truly complex issues can be tackled when the simple solution doesn’t work. Time will be saved.
Incremental change
Make small incremental changes, not huge additions that will cause integration problems and slow down the feedback cycle.
Embracing change
Keep your options open and be ready to change the strategy at any time.
Quality work
While working fast, produce high quality results. Quality can only be “excellent”, no lower.

Practices

XP consists of 12 practices, which are briefly summarized here. Each practice supports others, making up a working system. Some practices may be left out without the system breaking up while others are mandatory.

The Planning Game
Quickly determine the scope of the next release by combining business priorities and technical estimates. As reality overtakes the plan, update the plan.
Small Releases
Put a simple system into production quickly, then release new versions on a very short cycle.
Metaphor
Guide all development with a simple shared story of how the whole system works.
Simple design
The system should be designed as simply as possible at any given moment. Extra complexity is removed as soon as it is discovered.
Testing
Programmers continually write unit tests, which must run flawlessly for development to continue. Customers write tests demonstrating that features are finished.
Refactoring
Programmers restructure the system without changing its behaviour to remove duplication, improve communication, simplify, or add flexibility.
Pair programming
All production code is written with two programmers at one machine.
Collective ownership
Anyone can change any code written anywhere in the system at any time.
Continuous integration
Integrate and build the system many times a day, every time a task is completed.
40-hour week
Work no more than 40 hours a week as a rule. Never work overtime a second week in a row.
On-site customer
Include a real, live user on the team, available full-time to answer questions.
Coding standards
Programmers write all code in accordance with rules emphasizing communication through the code.

Synergy of the XP practices

The most critical part of XP is the testing – every feature and non-trivial function of the software must be covered by automated unit tests.

Comprehensive tests allow us to do major refactoring as needed. Whenever a feature needs more or different functionality, we can recode it the way it’s needed. The automated unit tests will report if the refactoring breaks anything else in the system. We can thus expand and evolve the system and its design incrementally.
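A tiny illustration of how the tests protect a refactoring, using Python’s standard unittest module. The turn_right function and its two versions are made up for this sketch: the first is a crude version, the second a refactored one, and the same tests must pass for whichever version is in use.

```python
import unittest


def turn_right_v1(heading):
    # Crude first version: does what is needed and no more.
    if heading == "north":
        return "east"
    if heading == "east":
        return "south"
    if heading == "south":
        return "west"
    if heading == "west":
        return "north"
    raise ValueError(f"unknown heading: {heading}")


def turn_right_v2(heading):
    # Refactored version: same behaviour, no duplicated branches.
    order = ["north", "east", "south", "west"]
    return order[(order.index(heading) + 1) % len(order)]


# The version currently used by the rest of the system.
turn_right = turn_right_v2


class TurnRightTest(unittest.TestCase):
    def test_quarter_turns(self):
        self.assertEqual(turn_right("north"), "east")
        self.assertEqual(turn_right("west"), "north")

    def test_full_circle(self):
        heading = "north"
        for _ in range(4):
            heading = turn_right(heading)
        self.assertEqual(heading, "north")

    def test_unknown_heading(self):
        with self.assertRaises(ValueError):
            turn_right("up")


def tests_pass():
    """Run the suite and report whether every test succeeded."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TurnRightTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```

If the refactored version had changed the behaviour by accident, tests_pass() would return False and the change would not be ready to commit.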

Our ability to refactor when needed removes the need for detailed system design beforehand. With a simple design, changing the design is easier than when heavy design work has already been invested in it. When implementation starts, you always have a high level design, which is followed, but details are kept as simple as possible. As the system evolves, the low level design will mature on its own along with the actual code. When implementing a new feature, one shouldn’t try to build it perfect and complete right away, but rather do a crude first version that does what is needed, and no more. Keeping the grand design in mind, though, will steer the solutions in the correct direction, making further changes less severe.

To make the actual coding process as flexible as possible, everyone should be able to change everything. Collective ownership means that no-one owns any piece of code, but rather that everyone can change any part of the system. This also requires that coding standards are in place so everyone writes code the same way. We also need a code repository that everyone can access and commit to.

The product in the repository must be functional at all times if different developers are to work on it simultaneously. This means that no erroneous code can be committed. All unit tests must pass before new changes can be committed to the repository. Each change to the system (task) should be doable in less than one day. Continuous integration means that programmers make small incremental changes, run the automated tests and make a commit when the tests run flawlessly. As a result, the product in the repository changes several times a day, but is always in working condition.

Pair programming is an excellent method to improve code quality. One programmer types the code and concentrates on the details, while the other monitors, makes sure that unit tests are written, spots errors and concentrates on the larger design. Several studies have shown that two programmers working as a pair are more productive than two separate programmers. As part of code quality, no overtime should be used – a tired programmer isn’t creative, and it often requires creativity to find the most elegant, simple solution to a given problem. Tired programmers make easy but bad solutions.

Keeping the product always functional allows us to make small releases in a quick cycle – a new version can be released virtually daily, if needed.

The production team needs to have domain experts (customers) in it. The customer knows how the system is going to be used and can steer the development in the right direction. He can also spot mistakes and bad design quickly (since we have a short release cycle and the product is always functional). To make sure that the programmers and the customer understand each other, the system has to be described using a simple metaphor that everyone understands.

The planning game is the meeting where the customer and the programmers meet and plan the next cycle. The metaphor is used to make sure everyone understands the concepts and can communicate.

High level planning in XP

The milestones in the XP process are the Planning Game meetings, where Development and Business meet. Development is represented by the programmers and designers, while Business is represented by management, domain experts (customers) and marketing.

The Planning Game cycle can vary from a few weeks to several months. At each meeting the project plan is extended to cover the time until the next meeting. After each cycle, a new release is made.

A release cycle is divided into three phases that follow each other.

Exploration phase

Business writes story cards that clearly tell what features are needed in the system. Story cards can describe new features or changes to existing ones.

Development estimates how long each story will take to implement. This estimate is in IET (Ideal Engineering Time), meaning coding with nothing else to do and no interruptions or problems.

If a story can’t be estimated, it is split into smaller stories. Also if one part of a story is more important than others, the story can be split.

Commitment phase

Business sorts the stories into those that are vital, those that are significant and those that would be nice to have.

Development sorts the stories into those that can be estimated precisely, reasonably well or not at all.

Development tells Business the velocity, that is, how much IET the team can complete per calendar month.

Business selects the cards that are to be included in the next release, either by first choosing the date and selecting appropriate cards, or first choosing the cards and calculating the release date from their estimates.
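The two ways of fixing the release can be written out as simple arithmetic. This is only a sketch in Python: the story names, the estimates (in ideal weeks) and the velocity are invented, and the rule of skipping a story that doesn’t fit and trying the next one is my own simplification, since in practice Business makes that choice.

```python
def release_length_months(estimates, velocity):
    """Cards first: how many calendar months the chosen stories need.

    estimates -- IET estimates of the chosen stories (ideal weeks)
    velocity  -- ideal weeks the team completes per calendar month
    """
    return sum(estimates) / velocity


def select_stories(stories, velocity, months):
    """Date first: pick stories, in priority order, that fit the budget.

    stories -- list of (name, estimate) pairs, most valuable first
    """
    budget = velocity * months
    chosen = []
    for name, estimate in stories:
        if estimate <= budget:
            chosen.append(name)
            budget -= estimate
    return chosen


# Hypothetical story cards, sorted by Business from most to least valuable.
stories = [("login", 2), ("search", 3), ("reports", 4), ("export", 1)]
velocity = 3  # ideal weeks per calendar month

months_needed = release_length_months([e for _, e in stories], velocity)
two_month_release = select_stories(stories, velocity, months=2)
```

With these numbers, all four stories need a little over three calendar months, while a release fixed at two months has room for six ideal weeks of stories.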

Steering phase

Steering is divided into iterations of 1-3 weeks in length. At the beginning of each iteration, Business selects one iteration’s worth of the most valuable stories that should be implemented. The very first iteration should include the stories that make up a working system.

If Development has overestimated its velocity, it can recover by asking the Business to leave out the least valuable stories from the release.

If Business needs a new story in the middle of development, it can write the new story, have Development estimate it and remove other stories to make room for it.

If Development feels that the plan is no longer accurate, it can reestimate all remaining stories and recalculate its velocity.

Low level planning in XP

Iteration planning is done within one iteration and only involves Development. It also contains three phases.

Exploration phase

Turn the stories of the iteration into tasks. Normally a story produces several tasks and sometimes one task can support several stories.

If a task takes longer than a few days, it needs to be split into smaller tasks. Also, several one-hour tasks should be combined into larger tasks.

Commitment phase

Each programmer accepts responsibility for a task.

The responsible programmer estimates the task in IET days.

Each programmer selects his load factor – how many IET days he can complete in an iteration. This data comes from previous iterations. The load factor can range from 2 to 8 days in a three week iteration when half the time of each programmer is used in pair programming.

Each programmer adds up the estimates of the tasks he has committed to and compares the total with his load factor. Overcommitted programmers must give up tasks. If the entire team is overcommitted, they must recover in the Planning Game Steering phase.
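The balancing step can be sketched as follows. I am reading the load factor the way it is defined above, as the number of IET days a programmer can complete in one iteration, so a programmer is overcommitted when his task estimates add up to more than that. The programmer names and numbers are invented for the example.

```python
def is_overcommitted(task_estimates, load_factor):
    """True if the committed tasks exceed what fits in the iteration.

    task_estimates -- estimates of the accepted tasks, in IET days
    load_factor    -- IET days the programmer can complete per iteration
    """
    return sum(task_estimates) > load_factor


def team_is_overcommitted(commitments):
    """True if the whole team has accepted more than it can complete.

    commitments -- dict mapping programmer to (task_estimates, load_factor)
    """
    total_estimates = sum(sum(tasks) for tasks, _ in commitments.values())
    total_capacity = sum(load for _, load in commitments.values())
    return total_estimates > total_capacity


# Hypothetical commitments for one three week iteration.
commitments = {
    "anna": ([2, 3, 1.5], 6),   # 6.5 IET days against a load factor of 6
    "ben": ([1, 2], 5),         # 3 IET days against a load factor of 5
}

overloaded = [name for name, (tasks, load) in commitments.items()
              if is_overcommitted(tasks, load)]
team_overloaded = team_is_overcommitted(commitments)
```

In this example one programmer has to hand over a task, but the team as a whole still fits within the iteration, so nothing needs to go back to the Planning Game.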

Steering phase

Each programmer takes a task, finds a partner, writes the test cases, codes until they all work, integrates until all tests run and then commits.

One team member should collect information on progress – how much time each programmer has spent on each task and how much time is left.

If a programmer is overcommitted, he can reduce the scope of some tasks, ask the customer to reduce the scope of some stories, leave out nonessential tasks, get more help, or ask the customer to leave some stories to a later iteration.

More information

www.extremeprogramming.org
contains good reference information on the practices of XP and helpful
diagrams of the process.