Using Story Points Correctly

TL;DR

Your sprints will be successful if you use points as relative work and if you base your sprints on your team’s average point completion rate. Otherwise, you will either not complete as much as you expect or you will complete stories faster than expected, setting you up for failure in future sprints.

Points are relative work

Points are used to compare stories against the smallest amount of work (e.g., changing CSS colors, fonts or some other simple thing). I wrote more about that previously in How Not To Use Story Points.

If a story is pointed at 3, asking the developer to “point it a 5” will not make the story result in higher quality, include more features or be done faster (or slower). The work drives the points not the other way around.

Pointing Stories

How stories are Pointed is just as important. It is easy for the developer to hem and haw and pull a number out of the air, but that leads to poor pointing. Instead we want to mix in some rubber ducking.

English: A rubber duck assisting with debuggin...
A rubber duck assisting with debugging code. (Wikipedia)

To do this, the developer needs to explain the story in enough detail so that someone not familiar with the work can understand the level of effort that it requires to complete the story.

For example, a story to add a log in screen to a page might be described like this:

  1. The screen has the following elements:
    • email and password fields
    • a “login” button
    • a “forgot password” link, and
    • a “signup” link
  2. Both fields are required and appropriate error messages will display if either are not filled out when the login button is pressed.
  3. If the credentials are incorrect, a notice will be displayed with the text “email and/or password incorrect”, the password field is blanked, but the email field retains whatever value was previously entered
  4. Clicking the forgot password link will go to the password recovery screen
  5. Clicking the signup link will go to the registration screen
  6. If the user successfully signs in, they will go to the dashboard screen

Voting

Once the story is adequately explained, then the developers must all vote on it. This serves as a “reality check” and prevents skewing by any one person. Whereas one person may see the work at one number, others may view it at another. The rough average of votes is taken (sometimes throwing out the high and low, depending upon the group) and fit in the point scale: if the team used the fibonacci scale and a vote average was, for example, 4, it would be pointed as a 5 to fit in the scale.

The most effective voting system is done where no one can see how the others voted until all the votes are cast. This prevents issues of groupthink.

The Sprint

Keeping track of the average number of points delivered in prior sprints is a good way of estimating the number of points one can expect the same team can deliver in the current sprint. Remember, the work drives the points, not the other way around, so insisting the team deliver Widgets 1 through 5, when the points say they should only be able to deliver 1 through 3, is the way to deliver even less.

Things that can affect delivery are team members’ absences (due to planned vacation or sickness), bugs and mispointing.

If a team member is absent, they obviously cannot deliver the points they normally could. If the absence is planned, then reducing the expected points for that sprint is one way to mitigate the risk.

Bugs should never be pointed, because they do not add value and cannot be sized against stories. Bugs are bugs, so time spent on bugs is expected to cut into time available for building features. This will naturally reduce the expected point delivery over time. The average point delivery will reflect bugs the team encounters.

Mispointing is less of a concern when there are enough (3-5) developers present to vote on each story. If the story is sufficiently described, and outliers are discarded, better accuracy can result.

DRYing up tasks makes life easier

When writing scripts that perform the same task over and over again with different parameters, it is tempting to just cut and paste. Here’s an example of a bash script:


#!/usr/bin/env bash
grunt make_magic --dir=/foo
grunt make_magic --dir=/bar
grunt make_magic --dir=/lorem
grunt make_magic --dir=/ipsum
grunt make_magic --dir=/magnum
...

Imagine that goes on for hundreds of directories!

This isn’t very DRY. If we wanted to add an extra parameter to the grunt command, we’d have a lot of editing to do.

If we needed to do another command on those very same directories, we would have to enter the directory names again. As you can see, it would be very easy to forget to add all of them.

DRY

DRY stands for “Don’t Repeat Yourself”. If some code or commands are duplicated elsewhere, that’s a sign that there are inefficiencies in the code.

The idea behind this is to pull out common code into functions/methods. When a change to that code is required, it only needs to be made once. If a bug is identified in the code, once it is fixed, it is fixed “everywhere”.

DRYing

Let’s take the example above and dry it up. We have to run the same command with different parameters (directories), so we will create an array of directories:

DIRS=( foo bar lorem ipsum magnum )

Now we write a “for…each” loop to do something for each element in the array:


for thedir in "${DIRS[@]}"; do
grunt make_magic --dir=/$thedir
done

This works on each element of the DIRS array and assigns it to thedir variable, which is now available inside the block.

Using this technique, it is very easy to add additional commands and/or directories.

How (not) to use Story Points

Abstract (tl;dr)

Story points should never be used to represent hours, but simply relative size of effort to complete a story. Over time, the team will tend to complete a consistent range of story points each sprint. Trying to tie story points to duration breaks the model and leads to inaccurate forecasting. Points should not be used to compare teams, nor should they be used to compare bugs.

Introduction

In Agile projects, each work packet is called a Story. Each story has a point value assigned to it.

I prefer to use the Fibonacci scale for story points.

1, 2, 3, 5, 8, 13

Each number is the sum of the previous two numbers (3 = 2 +1; 8 = 5 + 3; etc).

But what do these points mean? We will get to that in a minute, but first let’s examine how difficult it is to estimate sizes.

Glass of Water

English: Glass of water sitting on a coaster.
(Wikipedia)

How many ounces of water is in the glass? If you’re like most people, that is not an easy thing to guess.

On the other hand, if I compare it to the glass below, I might say is has roughly twice as much water, and I would be mostly correct.

en: A glass of water / de: Ein Glas Wasser / t...
(Wikipedia)

This is the concept of story points.

1 is the baseline amount of work.

2 is twice as much effort

3 is three times as much, and so on.

Either 8 or 13 are “too big to do”—otherwise known as EPICs—and are slated to being broken up into smaller stories later.

The reason this is done is because it is easier for humans to judge a relative size than an absolute size.

Smallest amount of work

The effort of the smallest amount of work is considered a “1”. For web projects, this is often the effort required to change some element’s CSS style (color, font, size, etc).

Every other story is compared to this task.

Count the points

During the course of the sprint, the team completes the stories. And the end of the sprint, all completed story points are summed and that is the number of points completed for that sprint. After 3-4 sprints are completed, the average number of completed points for the prior three sprints is a good indicator of the number of points the team will complete in the next sprint.

Sprint 1 Sprint 2 Sprint 3 Average
14 18 14 15 (rounded down)

As you can see, points cannot be hours because the number of points varies, based upon many factors:

  • Team members’ skills
  • Team leader’s leadership
  • Distractions/work environment
  • Complexity of work
  • Tools/equipment quality

Points are not fungible

fungible (fŭnˈjə-bəl)

adj. Interchangeable.

Points are not fungible, that is, they are tied to the team. One can’t judge one team’s performance against another’s by counting points, because of the factors that cause variability in points. One team might complete 15 points in a sprint, while another might complete 40. The first team is not worse than the second team; points mean different things for the two teams.

Bugs don’t Point

Donald Trump enters the Oscar De LA Renta Fash...
HUUUUUGE! ( Wikipedia)

You can’t point bugs. Well, you can, but you’re making a huge (huuuuuge) mistake. When pointing stories, one needs to explicitly lay out the tasks required to complete the work. Then, that work is compared to the Smallest Amount of Work and given a point.

Bug HAVE NO defined tasks required to complete the work because no one knows what is causing the bug. What are the exact steps required to fix it? There aren’t any; one simply works the problem until it is fixed.

For example, let’s take the “bug” of me losing my car keys. How late will I be? If I estimate the tasks to find my keys, it will be something like:

I will look:

  1. In my backpack
  2. On the table
  3. In my pants
  4. On the kitchen counter
  5. On the bathroom counter
  6. On the nightstand next to my bed
  7. Under the nightstand next to my bed

Given all that, when will I find my keys? After #1? After #8? Later?

Misunderstanding Cookies

Example of Cookie Notification

In a recent column, Tracey Capen posits the reason so many sites are displaying the “This site uses cookies. Click ‘okay’ to continue” banner is because of Advertising—specifically because many people use ad-blockers so the advertisers want to still be able to track you.

 

Cookies in the EU

He almost accidentally mentions the actual reason for these banners: EU Regulations on privacy. Specifically, the regs state that websites must notify visitors before the site places cookies on the visitor’s browser, AND give them an option to opt out before that happens.

English: Tor Logo
(credit: Wikipedia)

So why do we in the non-EU part of the world see these banners? Because it is simply easier to display the banner to everyone, rather than by attempting to determine if the visitor is in the EU and displaying the banner IFF they are.

There are a myriad of ways to mask one’s location, such as using the TOR network, so the risks of failing to show the banner (and getting fined) versus just displaying it to everyone is a no-brainer.

But what about Advertisements?

Ads—and their ad-delivery network—use cookies in a variety of ways. Advertisers are charged by the networks for impressions (number of people to whom the network delivers the ad or CPM) and for the number of people who click on the ad (clickthroughs). The latter is more valuable to the advertiser and thus the networks charge more.

The ads aren’t directly loaded on the page without the network because there needs to be a system that counts the number of impressions, rotates multiple ads in the page location, and stops showing an ad when the ad’s CPM is exhausted. If an advertiser pays for 5,000 CPM, once 5,000,000 visitors visit the page, the next visitor will not see that ad.

When an adblocker runs it does at least one of two things: it blocks the ad code or (more likely) it blocks the ad delivery network code. What this means is the visitor’s browser doesn’t even request the code from the server; NOTHING from that server is requested. No code means the ad network can’t run and deliver the ads. Also, without a request, there is no cookie.

Even if somehow the site displayed the banner (“please accept our marketing/advertising cookie” Capen imagines them saying), the cookie can’t be added because there is no cookie. There is no ad network that loads. There is no ad to display.

If I’m going to hire you, I don’t want to read your resume

Well, I take it back. I want to remember your name, and I’m terrible at names. So it helps if I have your name in front of me.

Dead-Tree Resume
Dead-Tree Resume

“But,” you ask, “How will you know if the candidate has the skills and experience you need?”

Good question. Let’s look at the purpose of an interview.

The Two Questions

An interview must answer two questions:

  1. Can the candidate do the job now and in the future?
  2. Will the candidate be a good fit for the team?

Neither of these questions are answered by the candidate’s resume.

Wait a minute

Unless the candidate worked for a company doing the exact work, for the exact clients you have, their experience will not tell you if they can do the current job at your company.

Put it this way: would it make sense to compare your company to another in a different industry, with different customers and a different product, and then make a judgement about performance based upon that comparison?

Company Industry # Customers # Employees Product
MyCorp Pharmaceutical 3 7 Blue Pills
Acme, Inc Energy 150 70 Batteries
Initech Accounting Software 3,200 1,500 TPS Reports

We will investigate the 2 questions in a future post.

Adding defensive sanity checks

I recently needed to make a set of several favicons, so I went to the web to see if anyone had a script I could borrow steal.

Sure enough, I found one written by Joshua McGee: “Create a favicon with ImageMagick” (not reproduced here for copyright reasons).

It was a simple enough script, just a series of escaped commands. I noticed, however, that it assumed a few things:

  • An image file was specified on the command line,
  • The image existed, and
  • Imagemagik was installed.

In other words, the script was not developed defensively. This makes sense: it was just a bang-out.

The script had no inline documentation, and if a favicon file that already existed in the current directory would be silently overwritten—not good.

I’m clumsy: I delete and overwrite files all the time, so I could use a little help. Maybe I can tidy up the script? (more…)

Standalone CRUD

CRUD: Create, Read, Update, Delete; actions for managing data usually stored in a database.

Data model diagram picture of an EMPLOYEE data...
Data model (credit: Wikipedia)

A system I maintain has a very unusual quirk: when adding a new element (“blub”) to a list of elements (“blubs”), the system crashes with a generic error.

What this tells me is there is an unmet dependency, probably a join to another database table. I suspect the original developer (OD) created all the blubs manually, then later added in CRUD screens to manage them. For some reason OD never tested creating a new blub.

For any list of things, unless they are constants that will never be CRUDded (okay, reading is okay), the app should be able to Create/Update/Delete them without any manual steps.

Why? Because someday, some schlub who maintains your code is going to have to CRUD. And they will be screwed.

The importance of turning on debugging

Debugging a php script with emacs in geben mode.
Debugging a php script with emacs in geben mode. (Photo credit: Wikipedia)

I was recently testing some wordpress plugin code for an upgrade. As part of my testing, I turn on debugging to see if any errors or warnings show up.

Imagine my surprise when several errors appeared, many of them deprecation warnings. After troubleshooting to determine the source of the errors, I discovered it was coming from the theme I was using.

Without turning on debugging, the site appears to behave normally, which is problematic. One of my objections to many languages and frameworks is that they hide problems from the developer. In many cases, for example, deprecation warnings don’t mean much—except they’re a ticking time bomb. At some point in the future, the code is going to break. Better to fix it now while it is easier to do, than scramble to diagnose and fix lots of code just after an upgrade.

In the case of WordPress (or PHP in general), it would be helpful if the admin area showed everything – bugs and warnings. For novice admins, of course, this would be a support nightmare, as they would have little clue what was going on. The upside is (hopefully) sloppy coders would fix their stuff promptly. Nothing will get a plugin/theme pulled faster than getting an error right after installing it.

Development Languages are not Just Syntax

I recall a conversation I had with a coworker years ago. This fellow mentioned he thought differences between languages (e.g., Perl and PHP) were simple syntax.

example of Python language
example of Python language (Photo credit: Wikipedia)

This is true, sort of. The nuts and bolts for each language (displaying a variable’s value and the like) are same in concept and similar in construct.

<?php =someVar ?>

or

<% puts someVar %>

You get the idea.

The difference (and where the coworker was wrong), is the concept of how the data flows through the system and which pattern(s) are encouraged/discouraged.

Some languages encourage MVC, whereas others encourage spaghetti and still others encourage a hybrid (or something else entirely, like modules).

What is a language best at?

Going beyond this, what is the language best at doing? Is it installed in places you want to host your code? Do you have data structures/stores that fit better with one language than another?

Picking the best tool for the job is important in delivering a quality product. Choosing which language to use isn’t trivial and can make it easier—or harder—to deliver the solution.