Clean Code & Programming principles

So you want to learn programming principles and how to write clean code?

The goal

Programming principles are very important when it comes to writing good code.

In turn, good code is important because:

  • Good code means a higher chance that the code is correct and has fewer bugs.
  • Good code is easy to change. It makes it easy to add new features, fix bugs, etc.
  • Good code is easier to read, understand and work with.

Following programming principles is an attempt at achieving those goals.

Unless someone is going to test your knowledge of programming principles, your true goal is probably not to learn the principles themselves, but to learn how to write better code. You hope that learning programming principles will help you write better code.

This distinction is very important, so here it is again. Programming principles are not the goal, writing better code is the goal. Programming principles are just best practices to help you achieve your goal of writing better code.

The distinction is important because it encompasses the first and most important principle of all:

Be pragmatic. A.k.a. always remember the true goal and work towards that. A.k.a. don’t dogmatically follow "best practices" if you judge that they will make your code worse rather than better in particular cases.

If you’re still with me and agree, then the most important thing to do is to examine the requirements of software: what it is that clean code and programming principles are trying to help us with.

Otherwise, please feel free to skip the next few sections and jump straight to the programming principles further down.


Requirements of software

Consider: what are the requirements of software? Of course, there are many ways to answer this question, but consider it from the angle of clean code and programming principles. What are the goals of clean code? What are the benefits of clean code? Conversely, what are the downsides of bad code?

See if you can come up with some answers on your own before continuing.

My personal answer is this.

The requirements of software are:

  • Code should work as intended, and have no bugs.
  • Code should be easy to change. The reasons why it needs to be easy to change are that we develop new features all the time, we fix bugs and we have to maintain code even after it’s accepted by the client. All of these actions require modifying the code repeatedly, therefore it should be easy to change.

For much more detail on this, please see the post Requirements of Software.

So to summarize, this is why we write clean code and apply programming principles. Cleaner code helps us fulfil the requirements of software: Code should work as intended and it should be easy to change.


Human limitations

Our current standards for what we consider to be "clean code" are largely shaped by our human limitations.

For example, we tend to be bad with repetitive work. Additionally, regardless of how hard we try, we always make mistakes, even with simple things. Also, we can’t handle infinite complexity, and unfortunately, software can grow to be very complex, with interwoven dependencies reaching far and wide into the system.

To overcome these limitations, we have established some principles and practices to make it easier for us to work with code.

Here are some of our limitations, and what we should do to counter them:

  • We can’t remember too much at any one time. The quote about short term memory and the magical number 7 plus or minus 2 comes to mind. To counter that, we need code to be sufficiently independent (decoupled) and without hidden dependencies. That way when we’re modifying code, we won’t accidentally break it due to forgetting to also update a dependency that we didn’t remember existed.
  • We get impatient, skim things often, have bad days and get bored. To counter that, we should make code simple and quick to understand and easy to work with.
  • We like things simple. Complicated things are disproportionately more difficult for us to work with, partly because we need to keep many things about them in mind at once. Therefore, we should make code simple and easy to work with.
  • We make mistakes everywhere, all the time, in all areas of life, whether it’s computers, mathematics, engineering, art, design, or anything else. Therefore we always need to double-check that our code works. To counter that, we use practices like pair programming and automated testing. We also use tools to statically analyse our code, whether they’re built into the language, as with a statically typed language like C#, or separate tools like JavaScript’s ESLint.

Additionally, here is how we work:

We create software by composing code together. This doesn’t happen by accident, it’s deliberate. If we just tried things at random until something worked, then by definition we would have no idea what’s going on, which means we might have broken anything anywhere and we would have no idea, except maybe for what our tests happened to flag. Therefore, necessarily, we have to compose code while understanding as much as possible about what we’re doing, so we can be as certain as possible that we’re doing the right thing. To help facilitate that, we should make code as easy to understand and work with as possible.

All of these qualities, among others, promote a certain way for how our code should be:

  • It should be simple.
  • It should be easy to understand.
  • It should be neatly organised in a way that makes sense. Different concerns (functionality) should be appropriately separated and independent (separation of concerns).
  • It should be easy to make independent changes, so that we don’t have to change every file in the system to modify how something works.
  • The number of similar changes we have to make should be minimal, because we’re bad with repetitive work.
  • Etc.

Be pragmatic – The most important principle

Not just in programming, but pretty much everything in life, being pragmatic is essential.

It means to remember the true goal of what you’re trying to accomplish, maximise that, and not get side-tracked.

In programming terms, it also serves as a cautionary principle against the others. It means that you shouldn’t dogmatically apply programming principles unless you truly believe that it is the best thing to do in your project.

Here are some example cases where you might want to be careful:

1. Don’t make code shorter if it will make it more complicated

Code that’s short and concise is usually good.

It means that the code is not doing too much, therefore it’s easy to understand, easy to change, easy to test, and easy to reuse.

Conversely, if code does a lot of things, it’s generally harder to understand, more difficult to change (because when trying to change one of the things you might break the rest), harder to test and more difficult to reuse (because how would you reuse code that does 10 things when you only need 1 of those things?).

But hopefully you can understand that playing code golf, literally trying to make your code as short as possible using various mathematical and syntactic tricks, would be very detrimental.

Nobody would be able to understand how the code works without some serious time examining it and analysing it. Sometimes they wouldn’t be able to understand how the code works at all, especially if it uses unfamiliar mathematical tricks and such.
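
For illustration, here is a hypothetical sketch (the function and its purpose are invented for this example). Both versions count the positive numbers in an array, but one is golfed and one states its intent:

// "Clever" golfed version: short, but you have to decode it
const c = a => a.reduce((n, x) => n + (x > 0), 0);

// Clearer version: longer, but the intent is obvious at a glance
function countPositiveNumbers(numbers) {
  let count = 0;
  for (const number of numbers) {
    if (number > 0) {
      count += 1;
    }
  }
  return count;
}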

The point is, don’t dogmatically apply the practice of keeping code short to the extreme. Always aim to maximise the true goal of making code easy to understand, easy to change, easy to test, easy to reuse. Short code just happens to be a good way of achieving all this, but only if applied sensibly and with those goals in mind.

2. Don’t repeat yourself (DRY) and the rule of three

The rule of three says that we shouldn’t refactor code into a single abstraction until the third time that we encounter a similar piece of code.

This rule is very useful because if we refactor on the second occurrence, we might create an "incorrect abstraction". The two pieces of code we combined into a single abstraction might each change differently in the future. In this case, our single abstraction will be more difficult to work with. In fact, things will be more difficult than if we had never combined them into a single abstraction in the first place.

So that’s why, the majority of the time, we should follow the rule of three.

However, be pragmatic. In rare cases, it may be far better to refactor on the second occurrence of similar code. Remember that the rule of three exists to provide a certain benefit: to make the code easiest to work with in the long run. So the goal is to maximise that benefit, not to dogmatically follow the rule of three or DRY.

For example, we may have a situation where we have two instances of code which are both similar. Additionally, they may be quite long and complex, error prone to change, and also happen to change in very similar ways every time a change is required. Overall, working with those instances of code may be quite difficult. Having to make the change to one of them might be complex enough, having to make a similar change to both may be much worse.

In this case, it may be better for the health of the project to refactor these into a single abstraction, even though there are only two occurrences of similar code.

You would have to weigh the pros and cons. Consider if the benefits of refactoring the code into a single abstraction (faster and less error-prone changes), outweigh the cost that you might have to separate the abstraction again in the future (for example if they change too independently and working with them as a single abstraction proves difficult).

In a particular case in a project I worked on, refactoring on the second occurrence was very worthwhile. The increase in ease of work and the reduction in bugs far outweighed the potential cost in the future.

In summary: Always be pragmatic. Do what’s best for your software and project. Don’t dogmatically apply programming principles unless they will improve your project.


KISS (keep it simple stupid) and the principle of least astonishment

Description of KISS and principle of least astonishment

KISS is another principle that’s universal to most things in life, not just programming.

The principle of least astonishment means that things should work exactly as you expect them to, and shouldn’t be surprising. It is a cousin to KISS, although personally I view them as the same thing.

If we don’t keep things simple and easy to understand, we could encounter many problems:

  • Everything takes longer for us to understand.
  • Sometimes we might end up not understanding how things work, even after spending a lot of time on them.
  • We might misunderstand how things work. If we then modify the software, we will have a bug by definition (when something doesn’t work as we expect, that’s a bug).

How to apply KISS and principle of least astonishment

Default to writing dumb code, avoid writing clever code

Dumb code is simple code. Clever code is probably not simple code.

Really clever code is not simple, it’s difficult to understand, and it’s tricky. People may misunderstand it and create bugs as a result.
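
As a hypothetical sketch (the values are made up), compare a "clever" XOR swap with the straightforward version. Both swap two numbers, but the first requires the reader to stop and verify a trick:

// Clever: swaps two numbers using XOR, no temporary variable
let a = 3;
let b = 7;
a ^= b;
b ^= a;
a ^= b;

// Dumb: the intent is immediately clear
let x = 3;
let y = 7;
[x, y] = [y, x];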

Keep code short and concise

Short code is more likely to be simple. It’s also more likely to have little functionality, which is a quality we want in software (separation of concerns).

Warning: But always keep in mind that the goal is to make software easy to work with. Don’t artificially try to make code shorter at the detriment of simplicity and understandability.

Use good names

If you name something well, you can understand what it’s doing just from the name. For example, if you have a function, you can understand what the function is doing from the name without having to read the function body.
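
For example, here is a minimal sketch (the data and names are invented for illustration). With a vague name you have to read the body; with descriptive names the signature alone tells you what the function does:

// Vague: what is "process"? What is "d"?
function process(d) {
  return d.filter(u => u.lastLoginDate > Date.now() - 30 * 24 * 60 * 60 * 1000);
}

// Descriptive: readable from the name and parameters alone
const THIRTY_DAYS_IN_MS = 30 * 24 * 60 * 60 * 1000;

function getUsersActiveInLastThirtyDays(users) {
  return users.filter(user => user.lastLoginDate > Date.now() - THIRTY_DAYS_IN_MS);
}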

Always consider the programmer reading the code for the first time

This is the person we’re trying to optimise the code for. The colleague who has never worked on this code before, or even yourself, 6 months from now, when you’ve forgotten what this code does and how it works.

Consider that when you’re writing the code, you know what you’re trying to accomplish and you know what the code is doing. When you’re reading the code back, not only do you have to decipher the complicated code but you have no idea what you were trying to do either, which makes things even harder to understand.

So keep it simple, use good names, and make it as easy to understand and work with as possible.

Consider immutability

Immutability provides a guarantee that a value will never change.

This can make the code simpler to understand, because you don’t have to trace through the code for the history of the variable, just in case it happened to change anywhere in your codebase.
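
As a minimal sketch (the settings object is made up for illustration), instead of mutating a value you can create an updated copy and leave the original untouched:

const originalSettings = { theme: 'dark', fontSize: 14 };

// Mutation: now every reader has to trace where fontSize might have changed
// originalSettings.fontSize = 16;

// Immutable update: the original never changes, so there is nothing to trace
const updatedSettings = { ...originalSettings, fontSize: 16 };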

Follow existing conventions

Code that follows existing conventions is unsurprising. Code that breaks conventions can be very unexpected. We always want simple and unsurprising code.

Try to follow conventions which already exist in your codebase. Conventions which exist in your language or framework are less essential to follow, but also recommended.


Separation of concerns

Separation of concerns means organising functionality well in code.

We split things into modules and classes that make sense to us. For example, if you have a Circle class, a Shape interface or a Math object or module, you tend to have a pretty good idea of what each does. You would expect to find Math.PI, or Math.pow(base, exponent) (both exist in the JavaScript Math object), and would not expect to find Math.PrintHelloToTheScreen().

In simpler words, it’s just common sense organisation of your code.

We also want code to do minimal things. This is commonly known as the single responsibility principle.

There are benefits to this.

Benefits of separation of concerns

Simplicity and understandability

Code that does fewer things is simpler than code which does many things.

Code that does fewer things is easier to understand than code which does many things.

Easier changes

Code that does fewer things is easier to change than code which does many things. At the very least, the code you’re trying to change is not entangled with other unrelated code which may distract you or which you have to avoid accidentally changing and breaking.

In other words, if a function does multiple things, and you modify it, for example to fix a bug, you might break any of the things the function does. But if a function only does one thing, and you modify it, you can’t break anything else in the system unless you change the contract of the function.

Easier to test

If a function does one thing, it’s easier to test than if it does multiple things.

Easier to reuse

If a function does one thing, it’s immediately reusable any time you need that one thing. But if a function does 10 things, or even 2 things, it’s generally not reusable unless you need all of those things.

Summary of benefits

In summary, separation of concerns:

  • Makes code simpler and easier to understand.
  • Organises functionality well so that the whole system makes more sense and is easier to work with.
  • Makes code easier to reuse.
  • Makes code easier to change and makes changes more independent.
  • Makes code easier to test.

How to apply separation of concerns

Extract functionality

To apply separation of concerns, we just extract functionality into separate modules / classes and functions / methods.

For example, this code:

function sendData(data) {
  const formattedData = data
    .map(x => x ** 2)
    .filter(Boolean)
    .filter(x => x > 5);

  if (formattedData.every(Number.isInteger) && formattedData.every(isLessThan1000)) {
    fetch('foo.com', { body: JSON.stringify(formattedData) });
  } else {
    // code to submit error
  }
}

Can be changed to this:

function sendData(data) {
  const formattedData = format(data);

  if (isValid(formattedData)) {
    fetch('foo.com', { body: JSON.stringify(formattedData) });
  } else {
    sendError();
  }
}

function format(data) {
  return data
    .map(square)
    .filter(Boolean)
    .filter(isGreaterThan5);
}

function isValid(data) {
  return data.every(Number.isInteger) && data.every(isLessThan1000);
}

function sendError() {
  // code to submit error
}

// Small helpers used by format and isValid above
const square = x => x ** 2;
const isGreaterThan5 = x => x > 5;
const isLessThan1000 = x => x < 1000;

The first and second examples have exactly the same functionality. The only difference is that in the second example, some of the functionality has been moved to separate functions.

Overall, even though the second example is longer, it should be simpler to read and understand, especially if you don’t need the full details of the entire code. In the first example, you would have to read the entire body of sendData to try and understand what it’s doing. But in the second example, you can understand what sendData is doing very quickly. If you trust that the code works correctly, you don’t even need to examine the format and sendError functions, because their names tell you what they do.

If for some reason you want to examine further and see what some of the other functions do, such as format or isValid, that’s easier as well. Each function is isolated in a small space, so you don’t have to search through a large amount of code to find it. Also, each function contains only its own relevant code, without unrelated code getting in the way and making you wonder where the relevant functionality begins and ends.

This also provides the other benefits mentioned earlier. Functions in the second example are:

  • Easier to reuse
  • Easier to test
  • Easier to change
  • Easier to read and understand
  • More logically organised

So in summary, you can extract functionality into separate modules / classes and functions / methods, give them a descriptive name, and that tends to make the code better.

When applied as much as possible, Robert Martin calls this technique "extract till you drop".

Be pragmatic

It’s possible to take this principle to the extreme and extract functionality until it’s almost impossible to extract any more. That way, you can end up with functions that average between 2 and 6 lines long.

But in-line with being pragmatic, my personal recommendation is to extract functions only as far as you feel is helpful.

Remember to be pragmatic and consider the pros and cons. Would the current code be easier to understand and use if you separated functionality even more? If you extracted to the extreme, would it be even better, would it make no real difference, or would it be more difficult to work with? (If you’re not sure, feel free to try it out. Some people prefer extreme extraction and some don’t.) Do the benefits outweigh the time costs? Etc.

When to extract functionality

Complexity and length of code

If some code is long and difficult to understand, it may be easier to work with if you extract and separate some of its functionality.

Reasons to change

Another thing to consider is: What reasons does this code have to change?

Reasons to change indicate separate concerns. This means that it may be better for the code to be extracted into separate modules / classes and functions / methods.

One reason for this is for code simplicity and better code organisation. If your code has multiple different concerns or reasons to change, and you separate it to better reflect the different concerns, it may become easier to understand and better organised.

An even more important reason is to minimize the scope of changes.

Every change made to code is error-prone.

Therefore it is important to minimise the scope of changes.

As such, if you separate code that has different reasons to change, you’ll be reducing the scope of each possible change in the future. All else being the same, this should make the code safer to work with.

Consider the original sendData example.

What reasons could that code have to change in the future?

  • The formatting of the data may need to change.
  • The validation of the data may need to change.
  • The data in the error request may need to change.
  • The endpoint (URL) of the error request may need to change.
  • The data in the sendData request may need to change.
  • The endpoint (URL) of the sendData request may need to change.
  • Etc.

All of these reasons are indicators that you may want to extract and isolate that functionality.

Another flavour of this question is: Who (which role in the company) may want to change this code?

For example:

  • Tech administrators may want to change something about the URL endpoints of the requests or the bodies of the requests.
  • Accountants may want to change the data validation in the future.
  • A product owner who uses the submitted data to generate reports could want to format the data differently in the future.

Both of these questions (what could change and who may want changes) try to point out different concerns in the code, that may benefit from separation.

From then on, it is up to you to be pragmatic and consider whether separating the code is worth it, and to what extent.

In summary:

  • Consider "what reasons does this code have to change".
  • Consider "who (which role in the company) may want to change this code".
  • Be pragmatic.

Further reading on separation of concerns

For more information on separation of concerns, please see the full post Separation of Concerns.


Principle of least knowledge

Description of principle of least knowledge

In software we always want to minimise knowledge. This includes the knowledge that code has of other code (dependencies), as well as the knowledge we need to be aware of to work with particular areas of code.

In other words, we want software to be decoupled and easy to work with. Making changes shouldn’t break seemingly unrelated code.

Knowledge in code

In programming, knowledge means dependency.

If some code (call it module A), knows about some other code (call it module B), it means that it uses that other code.

If some code is being used elsewhere, that means that there are limitations on how we can change it, otherwise we would break the code that uses it.

Without discipline and control, this is where we can get into a chain reaction of propagating changes: the situation where we just wanted to make a small change but had to modify every file in the system to do so. We changed A, which was used by B and C, so we had to change both of those to accommodate our changes to A. In turn, B and C were used in other places, which we also had to change, and so on.

Every change by itself is error-prone, multiple cascading changes are therefore much worse.

Additionally, we (the programmers) also need to actually be aware of these dependencies. If we’re not, and we don’t update things correctly, we’ll immediately introduce bugs. This is quite difficult to do, especially when dependencies propagate far and wide throughout our code.

That’s why we need to minimise knowledge in our code.

Modifications to code

So what can we modify if we want to make changes?

No change to contract

The only change we can make with no propagating changes, is a change that doesn’t affect anything else in the codebase in any way.

For example:

// Original
function greet(name) {
  return 'Hello ' + name;
}

// After change
function greet(name) {
  return `Hello ${name}`;
}

These two are exactly equivalent. Nothing else in the codebase needs to change after making this change.

That’s because the function is still doing the same exact thing, that is, still fulfilling the same contract.

Changing the contract of a "private" function

The next best case is when we change the contract of a private function. Something that’s not public to the majority of the codebase. In this case, if we change the contract, the code that is affected is very small.

For example, if we start with this:

// Circle.js
class Circle {
  constructor(radius) {
    this.radius = radius;
  }

  getArea() {
    return _privateCalculation(this.radius);
  }
}

function _privateCalculation(radius) {
  return Math.PI * radius ** 2;
}

export default Circle;

(Please don’t pay attention to _privateCalculation being a separate function outside of the Circle class. It’s written that way here only to keep it private to the module, since JavaScript class methods are otherwise public unless you use the newer # private method syntax.)

And change to this:

// Circle.js
class Circle {
  constructor(radius) {
    this.radius = radius;
  }

  getArea() {
    return Math.PI * this.radius ** 2;
  }
}

export default Circle;

We deleted the _privateCalculation function. As a result, getArea, which used _privateCalculation, was affected, so we also modified getArea to accommodate the change.

But because only getArea knew about (used) _privateCalculation, nothing else needed to change in the rest of the codebase.

Changing the contract of a public function

The principle continues in the same way. If we change the contract of anything, we’ll have to modify everything that uses it to accommodate. If we change more contracts, more things will have to be modified.

For example, if we delete getArea, we’ll have to update all the code in the codebase that uses it.
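
For example, a hypothetical caller elsewhere in the codebase would break and need updating:

// Somewhere else in the codebase (hypothetical caller)
import Circle from './Circle.js';

const circle = new Circle(5);
console.log(circle.getArea()); // breaks if getArea is deleted or its contract changes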

In real code, we sometimes need to make changes like these. But as much as possible, we need to prevent them.

The only real way to prevent them is to use separation of concerns effectively. We need to organise code effectively so that our modules / classes and functions / methods make sense for our project. If done well, we minimise the chance that we’ll need to change the contracts (publicly accessible functionality) in the future. We then keep everything we can private, to further minimise the scope of changes.

Summary

This concept is better known as using interfaces and encapsulation.

The parent concept is that we want to minimise all knowledge in code.

Organise code effectively in terms of public functionality. Keep everything else private.

More tips

When minimising knowledge, we have to do so with everything. "Public" and "private" functionality are just one example.

Other applications of this principle include:

  • Interface segregation principle (which keeps interfaces small, to apply separation of concerns and minimise the knowledge required by the code that uses them).
  • Law of Demeter (a brief sketch follows this list).
  • Immutability (so we don’t need to track how values might change over time, nor worry about how our changes might affect seemingly unrelated code).
  • Only accessing / modifying variables in the local scope, or perhaps up to class / instance scope. This is a broad topic known as "side effects". In summary, if many things can access and modify variables, then it’s really difficult to track their values over time. If they are not what we expect, at any time, we have a bug.
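
As a brief sketch of the Law of Demeter (the customer / wallet objects are invented for illustration), code should talk to its direct collaborators rather than reach through their internals:

// Don't do this: reaches through the customer's internals
function processPayment(customer, amount) {
  customer.getWallet().deduct(amount);
}

// Do this: ask the direct collaborator to do the work
function processPayment(customer, amount) {
  customer.pay(amount);
}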

Further reading on the principle of least knowledge

For more information on the principle of least knowledge, please see the full post Principle of least knowledge.


Abstraction and don’t repeat yourself (DRY)

Don’t repeat yourself is an integral principle in programming.

It says that if you have multiple instances of the same or similar code, you should refactor them into a single abstraction, so that you end up with just one instance of the code. To accommodate the differences, the resulting abstraction accepts arguments.

Motivation for DRY

The motivations for the principle are quite simple.

Firstly, it’s more efficient in the long run to create useful abstractions and reuse them, rather than writing similar code repeatedly over time. Some proof of this is libraries and high-level programming languages: if libraries and higher-level languages didn’t save us time, we would always code everything from scratch in Assembly. Alright, that argument might be a bit of a stretch, as these things are used worldwide and therefore save a disproportionate amount of time compared to the time it takes to create them. However, the point probably still applies in your codebase with your own abstractions.

The second and more important reason, is to make changes easier.

Having to make a change only once minimises the chance of error.

Any code change is prone to error. Multiple changes are disproportionately error-prone. When we have to make multiple, similar (but not exactly the same) changes, that’s even more error-prone and requires much more attention. Additionally, with multiple changes, we may forget one of the many places we need to change.

How to apply abstraction and DRY

Applying the principle is quite simple.

Combine similar code into a single abstraction

Whenever you find multiple instances of the same or similar code, refactor / combine it into a single abstraction.

You’ve probably done this a vast number of times throughout your career.

To illustrate the point, let’s use map, the functional programming utility.

Iterating over the values of an array, doing an operation on each, and returning a new array containing each result, is quite a common task.

Instead of writing that common code multiple times, language creators have provided us with the map utility, among others.

Here is some example code without map:

function doubleArray(arr) {
  const result = [];
  for (let i = 0; i < arr.length; i++) {
    result.push(arr[i] * 2);
  }
  return result;
}
const arr = [1, 2, 3, 4];
const result = doubleArray(arr);

Here is the same code with map:

function double(x) {
  return x * 2;
}
function doubleArray(arr) {
  return arr.map(double);
}
const arr = [1, 2, 3, 4];
const result = doubleArray(arr);

And another more succinct version using lambda functions:

const double = x => x * 2;
const doubleArray = arr => arr.map(double);
const arr = [1, 2, 3, 4];
const result = doubleArray(arr);

You can imagine that without map we would have for-loops everywhere in the codebase, a lot of them doing the same thing except for the array they operated on and the functionality they performed on each element. map makes things much better by placing all the common functionality (the creation of a new array, the for-loop, the addition of each element to the new array and returning the array) into a utility. That way we can replace all the common parts with map and pass in, explicitly or implicitly, just the unique parts: the array, and the function to run on each element.

So remember to do the same with your custom code. Any time you find multiple similar instances, combine them into a single abstraction which accepts arguments for any differences.

As a sidenote, this is also an example of separation of concerns. We’ve extracted the concern of looping through the array into a separate method called map (well, the language creators did), and the transformation of each element into a function called double. This way our function doesn’t have to worry about the low-level details, so it becomes easier to understand.

Rule of three

The rule of three is a precaution to combining functionality too early.

It states that you should generally combine functionality into a single abstraction on the third occurrence, rather than the second occurrence.

The reason for this is that the instances of code you’re considering combining may change independently in the future.

For example, consider this code:

function formatUsername(str) {
  return str.toLowerCase();
}

function formatUserMessage(str) {
  return str.toLowerCase();
}

It would probably be a mistake to combine the common functionality into its own abstraction, like so:

function formatUsername(str) {
  return format(str);
}

function formatUserMessage(str) {
  return format(str);
}

function format(str) {
  return str.toLowerCase();
}

The problem is that in the future, formatUsername and formatUserMessage may change independently. For example, what would we do if formatUsername also needed whitespace stripped from the start and end, while formatUserMessage needed a new line after each sentence? Obviously we could make it work, but it would definitely be messier than if we had never combined the format functionality in the first place.

Waiting until the third occurrence makes it more likely that the similarity is significant rather than coincidental, so things are less likely to change independently in the future.

It also makes it so that if one of the three instances of similar code changes independently, we can separate it out, and still keep the combined abstraction for the other two. But if we combined functionality on the second occurrence, and then had to separate them out again, we would have to revert both of them.

Overall, it means that there is probably less work for us to do in the future, and less risk, if we follow the rule of three.

Of course, the rule of three is just a guideline. Remember to be pragmatic. If some similar instances of code are changing in the same way every time, or you judge that they are very likely to change in similar ways in the future, and the changes are error prone, it may be best to combine them into a single abstraction immediately. At the end of the day, project health is most important, not blind adherence to principles.

Further reading on abstraction and DRY

For more information on abstraction and DRY, please see the full post Abstraction and DRY.


Side effects

A word of warning, my view of side effects is slightly different from anything else I’ve read elsewhere. Feel free to check other resources as well if you don’t agree.

What side effects are

In programming, the general definition of a side effect is anything that changes the state of the system. This includes:

  • Changing the value of a variable.
  • Logging to the console.
  • Modifying the DOM.
  • Modifying the database.
  • Any mutation whatsoever.

It also includes "actions" that may not be viewed as mutations, such as:

  • Sending data over the network.

Personally, I also consider accessing non-local scope to be a side effect, or at least just as unsafe as a side effect. This is especially so if the value we’re trying to access is mutable. After all, if we access a global variable whose value isn’t what we expect, we have a bug, even if the code in question doesn’t modify it.

Of course, in all code, we need "side effects" for the program to do anything useful. For example, in a web application, the server must send HTML to the client.

The danger of side effects

Side effects are not directly harmful, but they can be indirectly harmful.

For example, code A and code B might both depend on the value of a global variable. We might change the value of that global variable because we want to influence code A, forgetting that code B will be affected as well. As a result, we now have a bug.

These hidden dependencies, where you change one thing and break something else, can be very difficult to remember, track and manage. They also break the principle of least astonishment, principle of least knowledge and separation of concerns. Code shouldn’t be fragile and have hidden dependencies like that.

Another example is changing the DOM. The DOM can be thought of as just a global object with state. But if different pieces of code affect the DOM at different times, in non-compatible ways, there can be bugs. Maybe code A depends on element X to be there, but code B deleted that entire section altogether just before code A ran.
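
Here is a hypothetical sketch of that kind of conflict (the element IDs are made up):

// Code B: removes the notifications section at some point
document.querySelector('#notifications').remove();

// Code A: runs later and assumes the element still exists
const counter = document.querySelector('#notifications .count');
counter.textContent = '5'; // throws a TypeError, because counter is null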

Perhaps you’ve encountered bugs like these in your work as well.

One important thing to understand however, is that side effects are not inherently harmful. They are our own bugs. They are just code we write, which happens to be incompatible with some other code we write. We write code A, and then we write code B which breaks code A under certain circumstances.

The main danger of side effects is that they’re generally very difficult to track.

The reason for that is because tracking global state, which anything can modify at any time, is very difficult. If uncontrolled, how could we possibly track changes made to the DOM over time? We may have to track so many things that it just wouldn’t be feasible.

Asynchronicity and race conditions also add to the complexity and difficulty of tracking side effects.

Another downside of side effects is that code with side effects is generally harder to test.

Handling side effects

Even though side effects are dangerous, they can be handled effectively.

Be pragmatic

The most important point, yet again, is to be pragmatic.

You don’t have to avoid all side effects to the extreme. You are only required to be careful with potentially incompatible code.

For example, immutability is a good way to avoid many types of side effects. However, immutability makes little difference in the local scope of functions.

function sumRange1(n) {
  let result = 0;
  for (let i = 1; i <= n; i++) {
    result += i;
  }
  return result;
}

function sumRange2(n) {
  if (n <= 0) {
    return 0;
  }
  return n + sumRange2(n - 1);
}

In the example sumRange1 uses mutation. The values of result and i both change throughout execution.

sumRange2 uses immutable variables. The values of the variables inside it never change throughout function execution.

But it makes no difference. Other than some language limitations of recursion (which we’ll ignore for this example), for all intents and purposes, sumRange1 and sumRange2 are exactly the same from the perspective of the caller and the rest of the code in the system.

In fact, people tend to be less comfortable with recursion than loops, so sumRange2 could actually be the worse choice depending on your team.

So be pragmatic and do what’s best for your project.

Immutability

Having said that, immutability is an easy way to avoid a large portion of side effects.

By never modifying values in your code unnecessarily, you remove the large problem of values changing unexpectedly and having to track the lifecycle of variables to know the values they contain.

When starting with immutability, start simple and over time try to make as many things immutable in your work as possible.

With variables containing primitive data types, just create a new variable instead of modifying the old one. With mutable objects, instead of modifying the old one directly, create a new variable with a new object, put the results in that, and then return the new object.

For example:

// Example 1 - Don't do this
function doubleArray(array) {
  for (let i = 0; i < array.length; i++) {
    array[i] = array[i] * 2; // mutates the original array
  }
}
const arr = [0, 1, 2, 3];
doubleArray(arr);

// Example 2 - Do this
const double = x => x * 2;
const doubleArray = array => array.map(double);
const arr = [0, 1, 2, 3];
const result = doubleArray(arr);

In example 1, the original array is modified.

In example 2, the original array is not modified. doubleArray creates a new array with the doubled values, and we store that new array in the result variable.

Avoid non-local scope

Avoid accessing or modifying things that are not in the local scope of your functions or methods.

If necessary, it’s alright to go up to instance or module scope.

The further up from local scope you go, the more dangerous it gets, because it gets harder to track the values of variables.

Wherever possible:

  • Pass things around explicitly as arguments.
  • Stick as close to local-scope as possible.

For example:

// Example 1 - Don't do this
function doubleResult() {
  result *= 2; // Accesses and mutates a variable outside local scope
}
let result = 5;
doubleResult();

// Example 2 - Do this
function double(n) {
  return n * 2; // Accesses parameter which is local scope only. Doesn't mutate anything
}
const initialValue = 5;
const result = double(initialValue);

In example 1, doubleResult accesses result, which is a variable outside its local scope. It also mutates it, changing the value of result for everything else in the codebase.

In example 2, double only accesses its parameter, which is part of its local scope. It doesn’t mutate any values outside of its local scope.

In a real codebase, something resembling example 1 could be very difficult to track. The result variable may be far away from both the doubleResult function and the doubleResult function call.

Additionally, if result is not exactly what we expect, for example, because we’ve already called doubleResult 3 times but we don’t remember, we have a bug.

In the second example, initialValue is always 5, so there are never any surprises. Also we can see what the function is doing immediately, and can easily predict the result. In comparison, in example 1, we can’t predict the result unless we search and trace through the entire codebase to keep track of result and what its value is.

Be extremely careful

Sometimes you can’t just rely on immutability. For example, at some point, you must mutate the DOM or the database, or make a call to a third party API, or run some sort of side effect. As already mentioned, asynchronicity only adds to the problem.

In this case, you just have to be extremely careful.

Side effects are probably where the majority of the bugs in your codebase reside or could reside in the future, because they’re the hardest code to understand and track.

Regardless of what you do to try and manage side effects, you must always invest the required time and attention to them.

Clear areas of responsibility

Apply separation of concerns and good code organisation.

As much as you can, try to make sure that code which performs side effects doesn’t conflict with other code performing other side effects at different times.

Additionally, try to organise code performing side effects as well as possible.

For example, if code A modifies element X in the DOM, then it should ideally be the only code which modifies that part of the DOM. That way tracking changes to element X is as easy as possible.

Additionally, try to organise your code and dependencies well so that code A won’t run if any other code runs which would conflict with it.
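
For example, here is a minimal sketch (the module and element names are invented): one module owns every change to a particular element, so tracking changes to that element means looking in one place.

// statusBanner.js - hypothetical module, the only code allowed to touch #status-banner
const banner = document.querySelector('#status-banner');

export function showStatus(message) {
  banner.textContent = message;
  banner.hidden = false;
}

export function hideStatus() {
  banner.hidden = true;
}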

Side effects in pairs

For side effects which come in pairs (e.g. open / close file), the function that opened the side effect should ideally also close it.

For example, instead of this:

/* Note, this is pseudocode */

function openFile(fileName) {
  const file = open(fileName);
  return file;
}
const file = openFile('foo.txt');

/* Lots of other code runs here */

doStuffToFile(file);
close(file);

Do this:

/* Note, this is pseudocode */

function useFile(fileName, fn) {
  const file = open(fileName);
  fn(file); // in real code, wrap this in try / finally so the file still gets closed if fn throws
  close(file);
}
useFile('foo.txt', doStuffToFile);

Robert Martin calls this technique "passing a block". The function useFile both opens and closes the file, so it doesn’t leave an open file pointer in the system. It accepts a function as an argument to execute on the file. At the end, it closes the file.

Use a framework or functional programming language

As mentioned before, the best option might be to avoid side effects as much as possible.

To help, you can consider delegating some of them to a framework / library, or a functional programming language.

For example, for working with the DOM, you can use a library such as React (or many other alternatives).

Something like React handles all the DOM-related side effects. Then, in our application, we just write pure functions. Instead of modifying the DOM directly, our pure functions return a JavaScript object of what the DOM should look like. React then uses this object and makes the required changes to the DOM.
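
For example, a minimal sketch (the component and props are made up): the component is a pure function of its props that returns a description of the UI, and React performs the actual DOM updates.

// Greeting.js - a pure function of its props; it never touches the DOM itself
import React from 'react';

function Greeting({ name }) {
  return React.createElement('p', null, `Hello ${name}`);
}

export default Greeting;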

Additionally, the parent / child hierarchy of React ensures that our DOM code won’t conflict with itself. For example, our React code involving element X won’t run if element X doesn’t actually exist.

Of course, all of this is dependent on us following React conventions and not bypassing React to work with the DOM directly ourselves.

Disclaimer: I’m not encouraging or discouraging the use of React. There are many more considerations than what I’ve outlined here. I’m just presenting it as a possible option you could consider.

Further reading on side effects

For more information on side effects, please see the full post Side effects.


Further reading

That was a quick (quicker than a book at least) high-level overview of what I consider to be the most important concepts for writing good code. I hope that this article helped you understand the reasoning, motivation and overview behind clean code and programming principles. Hopefully, when you learn the practical versions of programming principles, these insights will guide you on how to apply them and also help you understand them better.

For the next step, I recommend learning clean code and programming principles more practically. Use a resource that explains concepts with many examples and applications in code.

For additional reading on clean code, with more practical examples of how to write clean code and apply programming principles, I recommend Robert Martin’s resources. For the quick, free version, I found his lectures Coding a better world together part 1 and Coding a better world together part 2 to be some of the best programming videos I’ve ever watched. For more detail you might want to check out his book Clean Code or his videos Clean Coders. I’ve learned a lot from Robert Martin’s resources. I especially like that he explains the principles very practically, giving more practical examples of each one and more information in general.

I also found the book The Pragmatic Programmer very good. Some of the details are outdated, but the concepts are not, and that book truly hammers in the concept of being pragmatic with your code and your work. If anyone reads the 20th anniversary edition of The Pragmatic Programmer, please let me know how it is; it’s on my list but I haven’t read it yet.

I’m sure there are other amazing resources as well, but these are the ones I’m familiar with and can personally recommend.

Finally, as I wrote in the post How to learn programming, I recommend thinking deeply about programming principles yourself and challenging them. Spend time on your own and consider for yourself everything that this article discusses. What are the goals of programming? What qualities do we want in our code and why? What principles are useful? If being pragmatic, when might you apply a principle or not? Etc.

Alright, I hope this post was useful to you in some way. If you have any feedback or even counter-arguments, please let me know in the comments. I’m always happy for a discussion. See you next time :).