Clean Code & programming principles – The ultimate beginner’s guide

Programming principles are very important for writing clean code.

In turn, clean code is very important for writing:

  • Correct programs.
  • Programs that are easy to change.

In this article, we’ll start by covering the requirements of software. What is it that we’re trying to achieve when writing code?

Afterwards, we’ll briefly examine the kind of code that is difficult for us to work with.

Finally, from those two (requirements of software + the things we need to avoid in code), we are left with how to write good code. We derive the programming principles, the guidelines, to help us achieve that.

Audience

I believe that this article is suitable for all audiences.

If you’re a beginner, it probably won’t help you immediately. However, I believe that by "planting the seed", so to speak, it will really help you as you progress throughout your career. You’ll hopefully gain a surface understanding of what clean code is, why it’s important and how to achieve it. Hopefully, as you keep learning, that understanding will grow and help you more than if you had never read this article.

Further, when you revisit programming principles and clean code later you’ll hopefully be able to learn them quicker, as you’ll already have somewhat of a base from this article.

I think you’ll gain the most benefit from this article if you’re at the intermediate level. That’s the level where you can handle the basic syntax well and perhaps you can even write basic, complete programs. One of the next important steps in your journey is to learn how to write clean code that scales, so that you can proceed to write bigger and more advanced programs without issues.

If you’re at the advanced level, you’ll probably get the least benefit from this article. You’ll probably already know all of the things that this article will cover. However, this article features some unique points of view that you may not have heard elsewhere. This may be beneficial for you.

Requirements of software

Consider, what are the requirements of software? There are many ways to answer this question, but consider it from the angle of clean code and programming principles. What are the goals of clean code? What are the benefits of clean code? Conversely what are the downsides of bad code?

See if you can come up with some answers on your own before continuing.

My personal answer is this:

The requirements of software are:

  • Code should work as intended, with no bugs.
  • Code should be easy to change.

The fact that it should work as intended should be obvious.

However, why does it need to be easy to change?

It’s for a few reasons.

The first reason is that experience has shown us that we work best in iterations. That’s when we code a small feature and submit it. Afterwards, we code the next feature, modifying our old code in the process.

Another reason is that requirements change all the time, which results in even more code changes.

Finally, sometimes our code has bugs. To fix them, again, we have to modify the code.

Therefore, code should be easy to change, because a large part of what we do is modifying code. If we can’t modify it easily, then everything will take longer.

Clean code, good code, is code that optimises these core requirements. Programming principles are the guidelines that we have created to help us write good code in our day to day coding.

For much more detail on this, please see the post requirements of software.


Human limitations and bad code

As already mentioned, programming principles arise from both the requirements of software and the need to avoid the kind of code that’s difficult to work with.

Code can be difficult to work with because of our limitations.

Here are some examples:

  • We can’t remember too much at any one time.

    The quote about short term memory and the magical number 7 plus or minus 2 comes to mind. To counter that, we need code to be sufficiently independent (decoupled) and without hidden dependencies. That way when we’re modifying code, we won’t accidentally break it due to forgetting to also update a dependency that we didn’t remember existed.

  • We get impatient, skim things often, have bad days and get bored.

    To counter that, we should make code simple, easy to understand and easy to work with.

  • We like things simple.

    Complicated things are disproportionally more difficult for us, partly because we need to keep in mind many things about them at once. Therefore, we should make code simple and easy to work with.

  • We make mistakes, everywhere, all the time, often, and in all areas in life, whether it’s computers, mathematics, engineering, art, design, or anything at all.

    Therefore we always need to double check our work. As a result, we use practices like pair programming and automated testing. We also use tools to statically analyse our code.

  • We are bad with repetitive work, particularly if there are subtle differences between the repetitions.

    Repetitive work means more chances to make an error. Also, probably due to impatience and lack of focus, we’re more likely to rush this type of work, rather than provide the necessary care and attention to every single change. To help, we should minimise repetitive work.

Additionally, here is how we work:

We create software by composing code together.

This doesn’t happen by accident; it’s deliberate. If we just tried things at random until something worked, then by definition we would have little idea of what’s going on.

This means that we could have broken anything in the codebase without realising, except for anything that our tests guarded us against (if we even have tests).

To minimise the chance of error, it’s important to understand as much about what we’re doing as possible and to be as certain as possible that we’re doing the right thing. The best way to do that is to make code simple, easy to understand and easy to work with.

In summary, all of these things point to a certain way for how our code should be:

  • It should be simple.
  • It should be easy to understand.
  • It should be organised in a way that makes sense, so we can easily find what we’re looking for and so we can easily understand what’s going on and how the program is structured.
  • Different concerns (functionality) should be independent, without hidden dependencies, so that nothing randomly breaks when we’re making changes to seemingly unrelated files. Making independent changes should be easy.
  • We should minimise the need to make repetitive changes.
  • Etc.

Be pragmatic – The most important principle

Being pragmatic is essential, not just in programming but in pretty much everything in life.

It means to remember the true goal of what you’re trying to accomplish, maximise that, and not get side-tracked.

In programming terms, it also serves as a cautionary principle. It means that you shouldn’t dogmatically apply programming principles unless you truly believe that it is the best thing to do in your project.

Here are some example cases where you might want to be careful:

1. Don’t make code shorter if it will make it more complicated

Code that’s short and concise is usually good. It has many benefits and it makes code easier to work with.

However, I’m sure that you can understand that playing code golf (literally trying to make your code as short as possible using various mathematical and syntactic tricks) would be very detrimental.

Nobody would be able to understand how the code works without some serious time examining it and analysing it. Sometimes they wouldn’t be able to understand how the code works at all, especially if it uses unfamiliar mathematical tricks and such.

So use common sense. The goal is clean code, not to dogmatically apply programming principles. Always aim to maximise the true goal of making code easy to understand, easy to change, easy to test, easy to reuse. Short code just happens to be a good way of achieving all this, but only if applied sensibly and with those goals in mind.
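As a small illustration, here is a sketch that compares a golfed one-liner with a plainer version of the same logic. The FizzBuzz task and both function names are assumptions chosen only for this example.

```javascript
// Golfed: short, but the nested ternaries take real effort to decode.
const f = n => n % 15 ? n % 5 ? n % 3 ? n : 'Fizz' : 'Buzz' : 'FizzBuzz';

// Plain: longer, but each rule can be read on its own line.
function fizzBuzz(number) {
  if (number % 15 === 0) return 'FizzBuzz';
  if (number % 5 === 0) return 'Buzz';
  if (number % 3 === 0) return 'Fizz';
  return number;
}
```

Both functions return the same results, but only the second one can be understood at a glance.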

2. Don’t repeat yourself (DRY) and the rule of three

The rule of three says that you shouldn’t refactor code into a single abstraction until you find 3 instances (or more) of similar code.

This rule is very useful because if you refactor on the second occurrence, you might create an "incorrect abstraction". The two pieces of code you combined into a single abstraction might each change differently in the future. In this case, your abstraction will be more difficult to work with. In fact, things will be more difficult than if you never combined them into a single abstraction in the first place.
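To make the idea of an "incorrect abstraction" more concrete, here is a hypothetical sketch. All of the names and the user shape are invented for illustration. Two similar name formatters were merged early, and each later requirement added another flag.

```javascript
// Merged too early: every new requirement adds another option to check.
function formatName(user, options = {}) {
  if (options.forInvoice) {
    return `${user.lastName.toUpperCase()}, ${user.firstName}`;
  }
  if (options.forGreeting && options.informal) {
    return user.firstName;
  }
  return `${user.firstName} ${user.lastName}`;
}

// Kept separate: each function can change without affecting the other.
function formatInvoiceName(user) {
  return `${user.lastName.toUpperCase()}, ${user.firstName}`;
}

function formatGreetingName(user, informal = false) {
  return informal ? user.firstName : `${user.firstName} ${user.lastName}`;
}
```

Anyone changing the merged version has to understand every flag, even the ones that are irrelevant to their change.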

So that’s why, the majority of the time, you should follow the rule of three.

However, be pragmatic. In rare cases, it may be far better to refactor on the second occurrence of similar code.

For example, you may have two instances of similar code. Additionally, they may be quite long and complex, error-prone to change, and also happen to change in very similar ways every time a change is required. Overall, working with those instances of code may be quite difficult. Having to make the change to one of them might be complex enough, and having to make a similar change to both may be much worse.

In this case, it may be better for the health of the project to refactor these into a single abstraction, even though there are only two occurrences of similar code.

You would have to weigh the pros and cons. Consider if the benefits of refactoring the code into a single abstraction (faster and less error-prone changes) outweigh the cost that you might have to separate the abstraction again in the future (for example if they change too independently and working with them as a single abstraction proves difficult).

In a particular case in a project that I worked in, refactoring on the second occurrence was very worthwhile. The increase in ease of work and reduction of bugs far outweighed the potential cost in the future.

In summary:

  • Always be pragmatic.
  • Do what’s best for your software and project.
  • Don’t dogmatically apply programming principles unless they will improve your project.

KISS (keep it simple stupid) and the principle of least astonishment

KISS (keep it simple stupid) is another principle that’s universal to most things in life. It means that your code should be very simple and easy to understand.

The principle of least astonishment is also very important. It means that things should work exactly as you expect them to, without surprises. It’s a cousin to KISS.

If you don’t keep things simple and easy to understand, you could encounter many problems:

  • Everything takes longer to understand.
  • Sometimes you might not understand how things work, even after spending a lot of time on them.
  • You might misunderstand how things work. Then, if you modify the software, you will create bugs by definition (when something doesn’t work as you expect, it’s a bug).
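As a rough sketch of what "surprising" code can look like, consider the hypothetical cart functions below. The names and the cart shape are assumptions made up for this example.

```javascript
// Surprising: the name suggests a read-only lookup, but it also empties the cart.
function getTotal(cart) {
  const total = cart.items.reduce((sum, item) => sum + item.price, 0);
  cart.items = []; // hidden side effect that a caller would not expect
  return total;
}

// Unsurprising: each function does exactly what its name says.
function calculateTotal(cart) {
  return cart.items.reduce((sum, item) => sum + item.price, 0);
}

function clearCart(cart) {
  cart.items = [];
}
```

Someone who calls getTotal twice would get two different answers, which is exactly the kind of misunderstanding that leads to bugs.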

How to apply KISS and the principle of least astonishment

Default to writing dumb code, avoid writing clever code

Dumb code is simple code. Clever code is probably not simple code.

Really clever code is not simple, it’s difficult to understand, and it’s tricky. People will misunderstand it and create bugs as a result.

Keep code short and concise

Shorter code is more likely to be simple.

Shorter code also means that individual units of code (such as functions or classes) are easier to understand.

Use good names

If you name something well, you can understand what it’s doing just from the name.

This makes code much faster to read and much easier to understand.

It also means that you don’t have to read as much code. For example, if a function is well-named, you can understand what it does just from the name, without having to read the function body.
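Here is a minimal sketch of the difference that names make. The functions and the shape of the person objects are hypothetical.

```javascript
// Poor names: you have to read the body to learn what this does.
function proc(a) {
  return a.filter(x => x.age >= 18);
}

// Good names: the names alone tell you what happens.
function selectAdults(people) {
  return people.filter(isAdult);
}

function isAdult(person) {
  return person.age >= 18;
}
```

A call such as selectAdults(users) needs no further explanation, whereas proc(users) forces the reader to open the function.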

Always consider the programmer reading the code for the first time

This is the person you’re trying to optimise the code for. The colleague who has never worked on this code before, or even yourself, 6 months from now, when you’ve forgotten what this code does and how it works.

Consider that when you’re writing the code, you know what you’re trying to accomplish and you know what the code is doing. However, when you’re reading the code back 6 months later, not only do you have to decipher the complicated code but you have no idea what you were trying to do either, which makes things even harder to understand.

Consider immutability (never reassigning the values of variables)

Immutability provides a guarantee that a value will never change.

This can make the code simpler to understand, because you don’t have to trace through the code for the history of the variable, just in case it happened to change anywhere in your codebase.
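The sketch below contrasts the two styles. The pricing rules (a 10% member discount and a flat shipping cost of 5) are assumptions invented for the example.

```javascript
// Mutable: to know the final price you have to trace every reassignment.
function getPriceMutable(basePrice, isMember) {
  let price = basePrice;
  if (isMember) {
    price = price * 0.9;
  }
  price = price + 5; // shipping
  return price;
}

// Immutable: each name is assigned once and always means the same thing.
function getPriceImmutable(basePrice, isMember) {
  const discountedPrice = isMember ? basePrice * 0.9 : basePrice;
  const shippingCost = 5;
  return discountedPrice + shippingCost;
}
```

In the second version, discountedPrice means the same thing on every line where it appears, so there is no history to trace.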

Follow existing conventions

Code that follows existing conventions is unsurprising. Code that breaks conventions can be very unexpected. Someone who skims the code may not realise it doesn’t follow the convention, so they may misunderstand how it works.

Try to follow conventions which already exist in your codebase. Conventions which exist in your language or framework are less essential to follow, but also recommended.


Separation of concerns

Separation of concerns means to organise functionality well in code.

Code should be separated into sensible units (modules, classes, functions and methods). Someone looking at the code should immediately understand what the particular unit does.

For example, if you have a Circle class, an Enumerable interface or a Math object or module, you tend to have a pretty good idea of what each does and contains. You would expect to find Math.PI, or Math.pow(base, exponent) (both exist on the JavaScript Math object). However, you wouldn’t expect to find Math.printHelloToTheScreen() or Math.produceAccountingReport(). The methods in the latter example would be unexpected, which would break both KISS and the principle of least astonishment.

In addition, units should be small and only do one thing (also known as the single responsibility principle). Another way of thinking about this is that different concerns should be separated at a granular level.

For example, you shouldn’t have a god-class called Shape that has functionality for all possible shapes within it. Instead, you should have a small class for each shape.

This code is the bad version:

class Shape {
  constructor(typeOfShape, length1, length2 = null) { // length2 is an optional parameter
    this.type = typeOfShape;
    if (this.type === 'circle') {
      this.radius = length1;
    } else if (this.type === 'square') {
      this.width = length1;
    } else if (this.type === 'rectangle') {
      this.width = length1;
      this.length = length2;
    }
    // And so on for many more shapes
  }

  getArea() {
    if (this.type === 'circle') {
      return Math.PI * this.radius ** 2;
    } else if (this.type === 'square') {
      return this.width * this.width;
    } else if (this.type === 'rectangle') {
      return this.width * this.length;
    }
    // And so on for many more shapes
  }
}

This is the good version:

class Circle {
  constructor(radius) {
    this.radius = radius;
  }
  getArea() {
    // Area of a circle: pi * r^2
    return Math.PI * this.radius ** 2;
  }
}

class Rectangle {
  constructor(width, length) {
    this.width = width;
    this.length = length;
  }
  getArea() {
    // Area of a rectangle: width * length
    return this.width * this.length;
  }
}

Here is another example.

This code is the bad version:

function sendData(data) {
  const formattedData = data
    .map(x => x ** 2)
    .filter(Boolean)
    .filter(x => x > 5);

  if (formattedData.every(Number.isInteger) && formattedData.every(x => x < 1000)) {
    fetch('foo.com', { method: 'POST', body: JSON.stringify(formattedData) });
  } else {
    // code to submit error
  }
}

This code is the better version:

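function sendData(data) {
  const formattedData = format(data);

  if (isValid(formattedData)) {
    // A request with a body needs a non-GET method; POST is assumed here
    fetch('foo.com', { method: 'POST', body: JSON.stringify(formattedData) });
  } else {
    sendError();
  }
}

function format(data) {
  return data
    .map(square)
    .filter(Boolean)
    .filter(isGreaterThan5);
}

function isValid(data) {
  return data.every(Number.isInteger) && data.every(isLessThan1000);
}

function sendError() {
  // code to submit error
}

// Small helper functions used above
function square(x) {
  return x ** 2;
}

function isGreaterThan5(x) {
  return x > 5;
}

function isLessThan1000(x) {
  return x < 1000;
}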

The idea that you should have small, specific units applies to all code.

Smaller, more specific units provide multiple advantages.

Advantages of small units

Better code organisation

Technically, with the god-class Shape, you know where to go to find the circle functionality, so the organisation is not too bad.

But obviously, with the more specific units of Circle and Rectangle, you can find functionality faster and easier.

It’s less obvious with the sendData example, but the same thing applies. Say you want to find the functionality for validating the data. You can find that instantly in the second version. There is a function clearly named isValid, so you can look at that. sendData also calls isValid(formattedData), which obviously validates the data.

However, in the first version, you’ll have to spend more time reading through the details of sendData to find it. Then, since the functionality isn’t clearly labelled, you’ll have to also recognise the line which does the data validation. If you’re not familiar with the code, this may be difficult.

In summary, smaller units provide better organisation.

Simplicity and understandability

If you examine the Shape example, you’ll see that the code there is quite long and complex. It’s difficult to follow. In comparison, the code for Circle and Rectangle is super simple. As a result, it’s much easier to understand.

In the sendData example, understanding what sendData does is easier in the second version. It almost reads like English:

  1. Format data
  2. If the data is valid: fetch (although the fetch line may be better in its own function too)
  3. Else: sendError

You also don’t have to read the implementation of the separate functions, such as isValid, because their names tell you what they do.

All of the smaller functions are simpler too. If you want to understand the implementation for checking if the data is valid, it’s in a single line in the isValid function, without unrelated functionality around it to distract you. Additionally, even if you didn’t understand the implementation inside one of the functions, the name gives you a big hint, which is very helpful. Imagine how much worse it would be if you didn’t understand the implementation and you also had no clue what the code was trying to do.

In general, smaller units have less code and do fewer things. This applies the KISS principle, which makes code easier to read and understand.

Additionally, smaller, well-named units make code much more descriptive and easier to understand.

Easier changes

Code that does fewer things is easier to change than code which does many things.

Consider the god-class Shape example. The code for the functionality of all the shapes is entangled together. Then, if you try to change the code for the circle functionality, you could accidentally modify something else and create a bug. This is made worse by the fact that the functionality for circle exists in multiple different methods inside Shape, so you’ll have to jump around and change multiple different things.

On the other hand, Circle and Rectangle are very easy to change:

  • All of the changes that you need to make are in one place.
  • Unrelated code is nowhere to be found, so you can’t break anything else in the codebase by accident.

The same applies in the sendData example.

In the second version, if you want to change the data validation, you only have to change the code inside the isValid function. You can’t break any unrelated code, because there isn’t any.

However, in the first version, since a lot of unrelated code is placed together, you might change something else by accident.

Easier to test

If a unit does less stuff, it’s easier to test than if it does more stuff.
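As a sketch of how little set-up a small unit needs, here are some checks for the isValid function from the sendData example. The isLessThan1000 helper is an assumption about how that function might be defined, and console.assert stands in for whatever test framework your project uses.

```javascript
// Hypothetical helper, assumed to match the one the sendData example relies on.
function isLessThan1000(x) {
  return x < 1000;
}

function isValid(data) {
  return data.every(Number.isInteger) && data.every(isLessThan1000);
}

// Because isValid does one thing, each check is a single line
// and needs no network access or other set-up.
console.assert(isValid([6, 25, 999]) === true);
console.assert(isValid([6, 2.5]) === false);  // 2.5 is not an integer
console.assert(isValid([6, 1000]) === false); // 1000 is not less than 1000
```

Testing the same validation through the original, larger sendData would require faking the network request and inspecting what was sent.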

Easier to reuse

If a unit does one specific thing, it’s immediately reusable any time you need that one thing. However, if a unit does 10 things, or even 2 things, it’s generally not reusable unless you need all of those things.

How to apply separation of concerns

Applying separation of concerns is fairly simple. All you have to do is extract functionality.

For example, with Shape, if you extract all of the relevant code for the circle functionality into its own class, you end up with Circle.

Here is a more step-by-step process, starting with Shape again for reference:

class Shape {
  constructor(typeOfShape, length1, length2 = null) { // length2 is an optional parameter
    this.type = typeOfShape;
    if (this.type === 'circle') {
      this.radius = length1;
    } else if (this.type === 'square') {
      this.width = length1;
    } else if (this.type === 'rectangle') {
      this.width = length1;
      this.length = length2;
    }
    // And so on for many more shapes
  }

  getArea() {
    if (this.type === 'circle') {
      return Math.PI * this.radius ** 2;
    } else if (this.type === 'square') {
      return this.width * this.width;
    } else if (this.type === 'rectangle') {
      return this.width * this.length;
    }
    // And so on for many more shapes
  }
}

Let’s create a class called Circle that doesn’t do anything yet.

class Circle {}

From Shape, let’s extract only the constructor functionality that’s relevant to circle. That’s the part inside constructor and inside the if (this.type === 'circle') conditional.

class Circle {
  constructor(radius) {
    this.radius = radius;
  }
}

Repeat for the getArea function:

class Circle {
  constructor(radius) {
    this.radius = radius;
  }

  getArea() {
    return Math.PI * this.radius ** 2;
  }
}

And so on for all the other methods which might be in Shape, as well as for all the other shapes.

The same process applies for sendData, although in this case we’re not completely replacing sendData like we did with Shape and Circle. Instead, we’re extracting functionality into separate functions and calling them inside sendData.

For example, the code to format data was moved into the format function and the code to check if the data is valid was moved into the isValid function.

When to apply separation of concerns

Now that you understand the "why" and "how" of separation of concerns, when should you apply it?

Generally, the difficult part to understand is "small, specific units that only do one thing".

That’s because the definition of "one thing" varies depending on context.

If you were to show the god-class Shape to a beginner, they would rightfully say that it only does one thing. "It handles shapes".

Someone else may say that Shape does a lot of things. "It handles circles, rectangles and so on. That’s multiple things".

Both points of view are correct. It all depends on how you break it down.

In general, it’s good to break it down quite a lot. You want units that do small, specific things.

That’s because, as already examined, smaller units give you more benefits than larger units.

So, here are some guidelines.

When code feels large and complicated

If you feel that some code is difficult to understand, or too large, try extracting some units out of it.

Can you keep extracting?

Robert Martin has a technique that he calls "extract till you drop".

In short, you keep extracting functionality until there is no reasonable way of extracting any more.

As you write code, consider: "Can I extract some more functionality from this unit, into a separate unit?"

If it’s possible to extract further, then consider doing so.

Also, please see Robert Martin’s blog post on extract till you drop for more information on this technique.

Reasons to change

Consider, what reasons does this code have to change?

Code that is placed together but has different reasons to change (different parts may change at different times) is harder to work with, as we already examined with the Shape and sendData examples.

The solution is to move code with different reasons to change into separate units.

If you consider the Shape example, Shape will change every time the functionality for circle, rectangle or any other shape changes. It will also change any time a new shape is added.

In the sendData example, sendData could change for these reasons:

  • The formatting of the data may need to change.
  • The validation of the data may need to change.
  • The data in the error request may need to change.
  • The endpoint (URL) of the error request may need to change.
  • The data in the sendData request may need to change.
  • The endpoint (URL) of the sendData request may need to change.
  • Etc.

All of these reasons are indicators that you may want to extract and isolate those different pieces of functionality.
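One possible way to isolate some of those reasons is sketched below. The ENDPOINTS object, the example.com URLs, the POST method and the function names are all assumptions made for illustration.

```javascript
// Hypothetical configuration: the endpoints change for different reasons
// than the formatting or validation logic, so they live on their own.
const ENDPOINTS = {
  data: 'https://example.com/data',
  error: 'https://example.com/errors',
};

// Building a request is kept separate from deciding when to send it.
function buildRequest(url, payload) {
  return {
    url,
    options: { method: 'POST', body: JSON.stringify(payload) },
  };
}

function buildDataRequest(formattedData) {
  return buildRequest(ENDPOINTS.data, formattedData);
}

function buildErrorRequest(message) {
  return buildRequest(ENDPOINTS.error, { message });
}
```

With this split, a change to a URL only touches ENDPOINTS, and a change to the error payload only touches buildErrorRequest. sendData would pass the resulting url and options to fetch.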

Who (which role in the company) may want to change this code

This is another flavour of trying to figure out the reasons that some code may have to change.

It asks who (which role in the company) may want to change the code.

In the sendData example:

  • Tech administrators may want to change something about the URL endpoints of the requests or the bodies of the requests.
  • Accountants may want to change the data validation in the future.
  • A product owner who uses the submitted data to generate reports could want to format the data differently in the future.

Both of these questions (what could change and who may want changes) try to point out different concerns in the code, that may benefit from separation.

Be pragmatic

As mentioned, small units are good. In fact, even very small units can be good.

Especially as a beginner, you should try making units that feel very small. One reason for this is that your units may not actually be small enough, but you may not know that until you practise some more and experience the difference.

However, after you gain some more experience, just be pragmatic. You don’t have to make minuscule units if you don’t feel that they’re helpful.

The goal is always to write good code that is easy to work with, not to apply programming principles to the point where they’re detrimental to your code.


Principle of least knowledge

In software, it’s beneficial to minimise knowledge. This includes the knowledge that code has of other code (dependencies), as well as the knowledge you need to be aware of to work with particular areas of code.

In other words, you want software to be decoupled and easy to work with. Making changes shouldn’t break seemingly unrelated code.

Knowledge in code

In programming, knowledge means dependencies.

If some code (call it module A), knows about some other code (call it module B), it means that it uses that other code.

If some code is being used elsewhere, that means that there are limitations on how you can change it, otherwise you would break the code that uses it.

Without discipline and control, this is where you can get into a chain reaction of propagating changes. The situation where you just wanted to make a small change and had to modify every file in the system to do so. You changed A, which was used by B and C so you had to change both of those to accommodate your changes to A. In turn B and C were used in other places which you also had to change, and so on.

Every change is error-prone, and multiple cascading changes are much worse.

Additionally, you need to actually be aware of these dependencies. If you’re not, and you don’t update things correctly, you’ll introduce bugs. Staying aware of every dependency is quite difficult, especially when dependencies propagate far and wide throughout your code.

That’s why you need to minimise knowledge in your code.

Modifications to code

So what can you modify if you want to make changes?

No change to contract

The only change you can make with no propagating changes is one that doesn’t affect anything else in the codebase in any way.

For example:

// Original
function greet(name) {
  return 'Hello ' + name;
}

// After change
function greet(name) {
  return `Hello ${name}`;
}

These two functions are exactly equivalent from a caller’s point of view. They have the same contract. If you change from one version to the other, nothing else in the codebase needs to change, because nothing could possibly be affected by this change.

Changing the contract of a "private" function

The next best case is when you change the contract of a private function, something that’s not public to the majority of the codebase. In this case, if you change the contract, the amount of code that is affected is very small.

For example, consider this Circle class:

// Circle.js
class Circle {
  constructor(radius) {
    this.radius = radius;
  }

  getArea() {
    return _privateCalculation(this.radius);
  }
}

function _privateCalculation(radius) {
  return Math.PI * radius ** 2;
}

export default Circle;

Next, consider that we want to delete _privateCalculation. Here is the code after the change:

// Circle.js
class Circle {
  constructor(radius) {
    this.radius = radius;
  }

  getArea() {
    return Math.PI * this.radius ** 2;
  }
}

export default Circle;

When we deleted _privateCalculation, getArea was affected. As a result, we also had to modify getArea to accommodate the changes. However, since _privateCalculation wasn’t used anywhere else in the codebase and since getArea didn’t change its contract, we’re finished. Nothing else in the codebase needs to be modified to accommodate the changes.

Changing the contract of a public function

The pattern continues in the same way. If you change the contract of anything, you’ll have to modify everything that uses it to accommodate. If you change more contracts as a result, you’ll have to modify more things and so on.

For example, if you delete getArea, you’ll have to update all of the code in the codebase that uses it.
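To picture the propagation, here is a sketch with two hypothetical callers of getArea. The functions describeCircle and totalArea are invented for this example.

```javascript
class Circle {
  constructor(radius) {
    this.radius = radius;
  }

  getArea() {
    return Math.PI * this.radius ** 2;
  }
}

// Two hypothetical callers elsewhere in the codebase.
function describeCircle(circle) {
  return `Area: ${circle.getArea().toFixed(2)}`;
}

function totalArea(circles) {
  return circles.reduce((sum, circle) => sum + circle.getArea(), 0);
}

// If getArea were deleted or renamed, both callers would throw a
// TypeError at runtime until they were updated as well.
```

Every additional caller is one more place that has to be found and changed correctly.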

In real code, you sometimes need to make changes like these. However, as much as possible, you need to prevent them.

The only real way to prevent them is to separate concerns properly. You need to organise your code into sensible units that make sense for your project. If done well, that minimises the chance that you’ll need to change those units in the future.

For example, what is the chance that the Circle class needs to change the contract of one of its methods? It’s very low.

Other than that, keep everything you can private, so that very little is affected if you change something.
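Besides keeping helper functions unexported from a module, JavaScript classes can also mark members as private with the # prefix. The sketch below is one possible variation on Circle, assuming a runtime that supports private class members. The private radius field and the #square helper are additions made only for this example.

```javascript
class Circle {
  #radius; // private field: not accessible outside the class

  constructor(radius) {
    this.#radius = radius;
  }

  getArea() {
    return Math.PI * this.#square(this.#radius);
  }

  // Private method: it can be renamed or removed without affecting callers,
  // because nothing outside the class is able to call it.
  #square(x) {
    return x ** 2;
  }
}
```

Here getArea is the only contract that the rest of the codebase can depend on, so everything else is free to change.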

More tips

When minimising knowledge, you have to do so with everything, not just "public" and "private" functionality.

Other applications of this principle include:

  • Interface segregation principle (which keeps interfaces small to apply separation of concerns and minimise knowledge required by the user code).
  • Law of Demeter (so your code doesn’t know too much and couple things too tightly).
  • Immutability (so you always know the values of variables and don’t have to worry about how changing them may affect other code)
  • Only accessing / modifying variables in the local-scope, or perhaps up to class / instance scope. This is a broad topic known as "side effects". In summary, if many things can access and modify variables, then it’s really difficult to track their values over time. If they are not what we expect, at any time, we have a bug.
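As an illustration of the Law of Demeter from the list above, here is a minimal sketch. The order object, its shape and the function names are hypothetical.

```javascript
// Hypothetical data shape, used only for this illustration.
const order = {
  customer: {
    address: { city: 'Leeds' },
  },
  // The order exposes what callers need and hides how it is stored.
  getShippingCity() {
    return this.customer.address.city;
  },
};

// Knows too much: this breaks if customer or address is restructured.
function printLabelCoupled(order) {
  return `Ship to ${order.customer.address.city}`;
}

// Knows less: it only depends on one method of the object it was given.
function printLabel(order) {
  return `Ship to ${order.getShippingCity()}`;
}
```

If the address later moves somewhere else inside the order, only getShippingCity has to change, not every function that prints a label.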

Abstraction and don’t repeat yourself (DRY)

DRY (don’t repeat yourself) is a core principle in programming.

It says that if you have multiple instances of similar code, you should refactor them into a single abstraction. That way you’ll end up with just one instance of the code, rather than multiple.

To accommodate the differences, the resulting abstraction accepts arguments.
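For instance, here is a small sketch with two hypothetical greeting functions that differ in a single word, followed by one abstraction that takes the difference as an argument.

```javascript
// Before: two near-identical functions.
function greetInEnglish(name) {
  return 'Hello, ' + name + '!';
}

function greetInSpanish(name) {
  return 'Hola, ' + name + '!';
}

// After: one abstraction, with the difference passed in as an argument.
function formatGreeting(greeting, name) {
  return `${greeting}, ${name}!`;
}
```

Calling formatGreeting('Hello', 'Ana') gives the same result as greetInEnglish('Ana'), and supporting another language no longer requires another function.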

Motivation for DRY

One of the reasons for DRY is to cut down the time it takes you to write code.

If you already have an abstraction for X functionality, then you can import it and use it, rather than re-code it from scratch every time you need it.

Another reason is to make changes easier. As already mentioned, we’re bad with repetitive work. If code is DRY, then you only have to make a specific change in one place. If code isn’t DRY then you have to make a similar change in multiple places. Making a single change is much better than making multiple similar changes.

Additionally, keeping code DRY is better for separation of concerns and code organisation.

Multiple instances of code, all of which do the same thing, are much harder to track and organise in your codebase than having just a single instance of that code. How do you organise 10 functions, all of which calculate the area of a circle? The obvious answer is to keep just one of them and delete the rest.

How to apply abstraction and DRY

Combine similar code into a single abstraction

Whenever you find multiple instances of the same or similar code, refactor (combine) it into a single abstraction.

You’ve probably done this a vast number of times throughout your career.

To illustrate the point, let’s use map, the functional programming utility.

The process that map abstracts is fairly common:

  1. Create a new, empty, array.
  2. Iterate over an array.
  3. Do something to every value.
  4. Push the result of every value to the new array.
  5. Return the new array.

Other than "do something to every value" and the array that you operate on, every other step is the same every single time.

Instead of writing that common code multiple times, you can use the map utility.

Here is some example code without map:

function double(x) {
  return x * 2;
}

function doubleArray(arr) {
  const result = [];
  for (let i = 0; i < arr.length; i++) {
    result.push(double(arr[i]));
  }
  return result;
}
const arr = [1, 2, 3, 4];
const result = doubleArray(arr);

Here is the same code with map:

function double(x) {
  return x * 2;
}
function doubleArray(arr) {
  return arr.map(double);
}
const arr = [1, 2, 3, 4];
const result = doubleArray(arr);

Here’s another, more concise version using arrow functions (JavaScript’s lambda functions):

const double = x => x * 2;
const doubleArray = arr => arr.map(double);
const arr = [1, 2, 3, 4];
const result = doubleArray(arr);

You can imagine that without map you would have for-loops everywhere in the codebase, a lot of them doing almost exactly the same thing. map makes things much better. It completely abstracts away all of the common functionality into a single method.
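
To make the five steps concrete, here is a rough sketch of how a map-like utility could be written by hand. The name myMap is hypothetical, and the built-in map handles more cases (such as passing the index to the callback), so treat this as an illustration only.

```javascript
function myMap(array, fn) {
  const result = []; // 1. Create a new, empty array.
  for (let i = 0; i < array.length; i++) { // 2. Iterate over the array.
    const value = fn(array[i]); // 3. Do something to every value.
    result.push(value); // 4. Push the result to the new array.
  }
  return result; // 5. Return the new array.
}

const double = x => x * 2;
console.log(myMap([1, 2, 3, 4], double)); // [ 2, 4, 6, 8 ]
```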

You can do the same with your code. Any time you encounter similar code, refactor (combine) it into a single function. You can also accept arguments for the differences. Then, use that abstraction wherever you need it.

Rule of three

The rule of three is a precaution against combining functionality too early.

It states that you should generally combine functionality into a single abstraction on the third occurrence, rather than the second occurrence.

The reason for this is that the instances of code you’re considering combining may change independently in the future.

For example, consider this code:

function formatUsername(str) {
  return str.toLowerCase();
}

function formatUserMessage(str) {
  return str.toLowerCase();
}

It would probably be a mistake to combine the common functionality into its own abstraction, like so:

function formatUsername(str) {
  return format(str);
}

function formatUserMessage(str) {
  return format(str);
}

function format(str) {
  return str.toLowerCase();
}

The problem is that, in the future, formatUsername and formatUserMessage may change independently.

For example, in the future, formatUsername may need to be stripped of whitespace at the start and end. Also, formatUserMessage may need to insert a newline character after each sentence.

Obviously we could make both scenarios work in the format function, but it would be more difficult than if we had kept the functionality of formatUsername and formatUserMessage separate from the start.
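
As a rough sketch of that difficulty, suppose both future requirements arrive. The options object and its property names below are assumptions made up for this illustration.

```javascript
// Combined version: the shared function grows a flag for every difference.
function format(str, options = {}) {
  let result = str.toLowerCase();
  if (options.trim) {
    result = result.trim();
  }
  if (options.newlineAfterSentence) {
    result = result.replace(/([.!?])\s+/g, '$1\n');
  }
  return result;
}

// Separate versions: each function changes without touching the other.
function formatUsername(str) {
  return str.toLowerCase().trim();
}

function formatUserMessage(str) {
  return str.toLowerCase().replace(/([.!?])\s+/g, '$1\n');
}

console.log(formatUsername('  Alice ')); // alice
console.log(format('  Alice ', { trim: true })); // alice
```

Both approaches give the same results, but every caller of the combined format function now has to know which flags apply to it.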

This is why we use the rule of three. Waiting until the third occurrence makes it more likely that the similar functionality is significant rather than coincidental, so things are less likely to change independently in the future.

It also makes it so that if one of the three instances of similar code changes independently, you can separate it and still keep the combined abstraction for the other two. On the other hand, if you combined functionality on the second occurrence, then had to separate them out again, you would have to revert both of them.

Overall, it means that there is probably less work for you to do in the future and less risk, if you follow the rule of three.

Of course, the rule of three is just a guideline. Remember to be pragmatic. If some similar instances of code are changing in the same way every time, or you judge that they are very likely to change in similar ways in the future, and the changes are error prone, it may be best to combine them into a single abstraction immediately. At the end of the day, project health is most important, not blind adherence to principles.


Side effects

In programming, the general definition of a side effect is anything that changes the state of the system. This includes:

  • Changing the value of a variable.
  • Logging to the console.
  • Modifying the DOM.
  • Modifying the database.
  • Any mutation whatsoever.

It also includes "actions" that may not be viewed as mutations, such as sending data over the network.

Personally, I also consider accessing non-local scope to be a side effect, or at least just as unsafe as a side effect. This is especially so if the variable you’re trying to access is mutable. After all, if you access a global variable whose value isn’t what you expect, you have a bug, even if the code in question doesn’t modify it.

Of course, in all code, you need "side effects" for the program to do anything useful. For example, in a web application, the server must send HTML to the client.

The danger of side effects

Side effects are not directly harmful, but they can be indirectly harmful.

For example, code A and B might both depend on the value of a global variable. You might change the value of the global variable, because you want to influence code A, while you don’t remember that code B will be affected as well. As a result, you now have a bug.

These hidden dependencies, where you change one thing and break something else, can be very difficult to remember, track and manage.
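
Here is a small sketch of such a hidden dependency. All names are hypothetical. Two unrelated functions read the same global variable, so changing it for one of them silently changes the other.

```javascript
let pageSize = 10; // global state that both functions silently depend on

// Code A: how many pages of search results are there?
function countPages(totalResults) {
  return Math.ceil(totalResults / pageSize);
}

// Code B: is this item the last one on its page?
function isLastOnPage(index) {
  return (index + 1) % pageSize === 0;
}

console.log(isLastOnPage(9)); // true

// We change the global because we want fewer, longer pages (code A)...
pageSize = 25;
console.log(countPages(100)); // 4

// ...but code B now gives a different answer for the same input.
console.log(isLastOnPage(9)); // false
```

Whether the new answer from code B is wanted or a bug depends on what code B was meant to do. The point is that nothing at the place where pageSize changes warns you that code B is affected.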

Another example is changing the DOM. The DOM can be thought of as just a global object with state. The problem is that, if different pieces of code affect the DOM at different times, in non-compatible ways, there can be bugs. Maybe code A depends on element X to be there, but code B deleted that entire section altogether just before code A ran.

Perhaps you’ve encountered bugs like these in your work as well.

Additionally, side effects break most of the principles we’ve covered so far:

  • KISS and the principle of least astonishment.
  • Principle of least knowledge (because code affects other, seemingly unrelated code).
  • Separation of concerns (because concerns are not necessarily self-contained or well-organised).

One important thing to understand, however, is that side effects are not inherently harmful. They only cause bugs if we code them incorrectly. They are code we write which happens to be incompatible with other code we write. We write code A and then we write code B which breaks code A under certain circumstances.

The main danger of side effects is that they’re generally very difficult to track.

The reason is that tracking global state, which anything can modify at any time, is very difficult. If uncontrolled, how could you possibly track changes made to the DOM over time? You may have to track so many things that it just wouldn’t be feasible.

Asynchronicity and race conditions also add to the complexity and difficulty of tracking side effects.

Another downside of side effects is that code with side effects is generally harder to test.

Handling side effects

Even though side effects are dangerous, they can be handled effectively.

Be pragmatic

The most important point, yet again, is to be pragmatic.

You don’t have to avoid all side effects to the extreme. You are only required to be careful with potentially incompatible code.

For example, immutability is a good way to avoid many types of side effects. However, immutability makes little difference in the local scope of functions.

function sumRange1(n) {
  let result = 0;
  for (let i = 1; i <= n; i++) {
    result += i;
  }
  return result;
}

function sumRange2(n) {
  if (n <= 0) {
    return 0;
  }
  return n + sumRange2(n - 1);
}

In the example, sumRange1 uses mutation. The values of result and i both change during execution.

sumRange2 uses immutable variables. The values of the variables inside it never change during function execution.

But it makes no difference. Other than some language limitations of recursion (which we’ll ignore for this example), for all intents and purposes, sumRange1 and sumRange2 are exactly the same from the perspective of the caller and the rest of the code in the system.

In fact, people tend to be less comfortable with recursion than loops, so sumRange2 could actually be the worse choice depending on your team.

So be pragmatic and do what’s best for your project.

Immutability

Having said that, immutability is an easy way to avoid a large portion of side effects.

By never modifying values in your code unnecessarily, you remove the large problem of values changing unexpectedly and having to track the lifecycle of variables to know the values they contain.

When starting with immutability, start simple and over time try to make as many things immutable in your work as possible.

With variables containing primitive data types, just create a new variable instead of modifying the old one. With mutable objects, instead of modifying the old one directly, create a new variable with a new object, put the results in that, and then return the new object.

For example:

// Example 1 - Don't do this
function doubleArray(array) {
  for (let i = 0; i < array.length; i++) {
    array[i] = array[i] * 2; // mutates the original array
  }
}
const arr = [0, 1, 2, 3];
doubleArray(arr);

// Example 2 - Do this
function double(x) {
  return x * 2;
}
function doubleArray(array) {
  return array.map(double); // returns a new array, without modifying the original
}
const arr = [0, 1, 2, 3];
const result = doubleArray(arr);

In example 1, the original array is modified.

In example 2 the original array is not modified. doubleArray creates a new array with the doubled values, and we create a new variable called result to hold the new array.
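
The same idea applies to plain objects. As a minimal sketch (the user object and the withEmail function are hypothetical), you can copy the object with the spread syntax and change the copy instead of the original. Note that the spread syntax makes a shallow copy, so nested objects are still shared between the original and the copy.

```javascript
// Returns a new object instead of modifying the one that was passed in.
function withEmail(user, email) {
  return { ...user, email };
}

const user = { name: 'Sam', email: 'old@example.com' };
const updatedUser = withEmail(user, 'new@example.com');

console.log(user.email); // old@example.com
console.log(updatedUser.email); // new@example.com
```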

Avoid non-local scope

Avoid accessing or modifying things that are not exclusively in the local scope of your functions or methods. This means that it’s probably okay to modify variables that originated in your local scope, but not variables which were passed in as arguments (originated outside of the local scope).

If necessary, it’s alright to mutate things up to instance or module scope.

The further up from local scope you go, the more dangerous it gets, because it gets harder to track the values of variables.

Wherever possible:

  • Pass things explicitly as arguments, rather than accessing things outside of the local scope.
  • Stick as close to local-scope as possible.

For example:

// Example 1 - Don't do this
function doubleResult() {
  result *= 2; // Accesses and mutates a variable outside the local scope
}
let result = 5;
doubleResult();

// Example 2 - Do this
function double(n) {
  return n * 2; // Accesses parameter which is in local scope. Doesn't mutate anything
}
const initialValue = 5;
const result = double(initialValue);

In example 1, doubleResult accesses result, which is a variable outside its local scope. It also mutates it, changing the value of result. Now, if any other code in the codebase accesses result, it will see the new value.

In example 2, double only accesses its parameter, which is part of its local scope. It doesn’t mutate any values outside of its local scope.

In a real codebase, something resembling example 1 could be very difficult to track. The result variable may be much further away from both the doubleResult function as well as the doubleResult function call.

Then, if at some point result is not exactly what you expect, for example, because you’ve already called doubleResult 3 times but don’t remember, you have a bug.

Overall, in example 1, you can’t predict what a function that uses result will do unless you know the exact value of result at that time. To do this, you’ll need to search and trace through the entire codebase to keep track of result at all times.

In the second example, initialValue is always 5, so there are never any surprises. Also you can see what the function is doing immediately and can easily predict what will happen.

Be extremely careful

Sometimes you can’t just rely on immutability. For example, at some point, you must mutate the DOM or the database, or make a call to a third party API, or run some sort of side effect. As already mentioned, asynchronicity only adds to the problem.

In this case, you just have to be extremely careful.

Side effects are probably where the majority of the bugs in your codebase reside or could reside in the future, because they’re the hardest code to understand and track.

Regardless of what you do to try and manage side effects, you must always invest the required time and attention to them.

Separate pure and impure functionality

For the most part, try to separate code with side effects and code without side effects into separate functions. Your functions shouldn’t both perform side effects and have "pure" code; they should do one or the other (within reason). This is closely related to the command-query separation principle.

Functions without side effects (pure functions) are generally easy to understand, reuse and test. On the other hand, functions with side effects are the opposite. They can be harder to understand, reuse and test. Additionally, they are more likely to contain bugs.

Separating the two can be helpful due to the principles we’ve already examined. For example, since side-effect code is more likely to have bugs and therefore you may have to modify it more often, it should be separate from other code (pure code) which won’t need modification as often. Additionally, in terms of code organisation, things like calculations can be considered vastly different to things like updating the database. Those different concerns should be separated. And so on…

For example, instead of this:

function double(x) {
  return x * 2;
}

function doubleArrayAndDisplayInDOM(array) { // this function does a non-trivial calculation / operation and performs a side effect
  const doubled = array.map(double); // (pretend this is a non-trivial calculation / operation)
  document.querySelector('#foo').textContent = doubled;
}

function main() {
  doubleArrayAndDisplayInDOM([1, 2, 3, 4]);
}

Do this:

function double(x) { // this function only does a calculation
  return x * 2;
}

function doubleArray(array) { // this function only does a calculation / operation
  return array.map(double);
}

function displayInDom(content) { // this function only performs a side effect
  document.querySelector('#foo').textContent = content;
}

function main() {
  const doubled = doubleArray([1, 2, 3, 4]);
  displayInDom(doubled);
}
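
One practical benefit of the separated version is that the calculation can be checked without a browser or a DOM. The following sketch uses a hand-rolled comparison rather than a real test framework, purely for illustration.

```javascript
function double(x) {
  return x * 2;
}

function doubleArray(array) {
  return array.map(double);
}

// A tiny check that needs no DOM: same input, same output, every time.
const actual = doubleArray([1, 2, 3, 4]);
const expected = [2, 4, 6, 8];
const passed = JSON.stringify(actual) === JSON.stringify(expected);
console.log(passed ? 'doubleArray works' : 'doubleArray is broken'); // doubleArray works
```

Checking the combined doubleArrayAndDisplayInDOM function in the same way would first require a page with a #foo element.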

Clear areas of responsibility

Apply separation of concerns and good code organisation.

As much as you can, try to make sure that code which performs side effects doesn’t conflict with other code performing other side effects at different times.

For example, if code A modifies element X in the DOM, then it should ideally be the only code which modifies that part of the DOM. That way tracking changes to element X is as easy as possible.

Additionally, try to organise code dependencies well. For example, code A shouldn’t run if any other code runs which would conflict with it. Also, code A shouldn’t run if the state that it depends on isn’t there or isn’t what code A expects.
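
Here is one way these ideas could look in code, sketched with a plain in-memory value instead of the DOM so that it runs anywhere. The createCounterOwner function and its method names are hypothetical. Only this one unit may change the value, and it refuses to act when the state isn’t what it expects.

```javascript
function createCounterOwner() {
  let count = 0; // only the functions below can touch this value

  return {
    increment() {
      count += 1;
    },
    decrement() {
      // Guard: don't run if the state this code depends on isn't valid.
      if (count === 0) {
        throw new Error('Cannot decrement: count is already 0');
      }
      count -= 1;
    },
    getCount() {
      return count;
    },
  };
}

const counter = createCounterOwner();
counter.increment();
counter.increment();
counter.decrement();
console.log(counter.getCount()); // 1
```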

Side effects in pairs

For side effects which come in pairs (e.g. open / close file), the function that opened the side effect should ideally also close it.

For example, instead of this:

/* Note, this is pseudocode */

function openFile(fileName) {
  const file = open(fileName);
  return file;
}
const file = openFile('foo.txt');

/* Lots of other code in-between */

doStuffToFile(file);
close(file);

Do this:

/* Note, this is pseudocode */

function useFile(fileName, fn) {
  const file = open(fileName);
  fn(file);
  close(file);
}
useFile('foo.txt', doStuffToFile);

Robert Martin calls this technique "passing a block". The function useFile both opens and closes the file, so it doesn’t leave an open file pointer in the system. It accepts a function as an argument to execute on the file. At the end, it closes the file.

This ensures that you won’t forget to close the side effect later. It also provides good code organisation and applies KISS, separation of concerns and the principle of least knowledge, because the entire side effect is fully handled in one place.
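
For a runnable variant, here is a sketch that assumes Node.js and its built-in fs module. It adds one thing the pseudocode leaves out: a try / finally block, so that the file is closed even if the function passed in throws an error. The scratch file name is made up for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Opens the file, hands the file descriptor to fn, and always closes it,
// even if fn throws.
function useFile(fileName, fn) {
  const fd = fs.openSync(fileName, 'r');
  try {
    return fn(fd);
  } finally {
    fs.closeSync(fd);
  }
}

// Hypothetical usage: create a scratch file, then read it through useFile.
const filePath = path.join(os.tmpdir(), 'use-file-example.txt');
fs.writeFileSync(filePath, 'hello');
const contents = useFile(filePath, (fd) => fs.readFileSync(fd, 'utf8'));
console.log(contents); // hello
```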

Consider using a framework or functional programming language

As mentioned before, the best option might be to avoid side effects as much as possible.

To help with this, you can consider delegating some of them to a framework / library, or a functional programming language.

For example, for working with the DOM, you can use a library such as React (or one of the many alternatives).

Something like React handles all the DOM-related side effects. Then, in your application, you just write pure functions. Instead of modifying the DOM directly, your pure functions return a JavaScript object of what the DOM should look like. React then uses this object and makes the required changes to the DOM.

This can be good because pure functions are much easier to work with, track and test than functions with side effects. You just delegate the side effects to React to deal with.

Additionally, the parent / child hierarchy of React ensures that your DOM manipulations won’t conflict with each other and cause problems. For example, React code involving element X won’t run if element X doesn’t actually exist. This is an example of good organisation and structure in your code to prevent conflicts with other side effects.

Of course, all of this requires that you follow the React conventions and not bypass React to work with the DOM directly.

Disclaimer: There are many more things to consider before choosing to use a framework / library like React or an alternative. In this case, I’m just presenting it as an example for you to consider, not as a recommendation.


Further reading

That was a quick (quicker than a book at least) high-level overview of what I consider to be the most important concepts for writing good code. I hope that this article helped you understand the reasoning, motivation and overview behind clean code and programming principles. Hopefully, when you go on to learn more programming principles, or find more examples of how to apply programming principles, the insights you gained will help you understand them and apply them better.

For the next step, I recommend learning clean code and programming principles more practically. Use a resource that explains the concepts with many examples and applications in code.

I highly recommend looking into content created by Robert Martin. For the "quick", free version, I found his lectures Coding a better world together part 1 and Coding a better world together part 2 to be some of the best programming videos I’ve ever watched. For more detail you might want to check out his book Clean Code or his videos Clean Coders (start with the fundamentals series and the SOLID principles). I’ve learned a lot from Robert Martin’s resources. I especially like that he explains the principles very practically, giving many practical examples of each one and a lot of information in general.

I also found the book The Pragmatic Programmer very good. Some of the details are outdated, but the concepts are not and that book truly hammers in the concept of being pragmatic with your code and your work. If anyone reads the 20th anniversary edition of The Pragmatic Programmer please let me know what you thought. It’s on my list but I haven’t read it yet.

I’m sure there are other amazing resources as well, but these are the ones I’m familiar with and can personally recommend.

Finally, I recommend thinking deeply about programming principles yourself and challenging them. Spend time on your own and consider everything that this article discussed.

Alright, I hope this post was useful to you in some way. If you have any feedback or even counter-arguments, please let me know in the comments. I’m always happy for a discussion. See you next time.
