Why automation matters

March 2021

When working in a small dev team, time is crucial. The time spent validating and reviewing what other team members have done has to be as short as possible. However, to keep code quality high without cutting corners, we need to take some of the workload off the reviewer.

This is where automation comes into play.

The issue

Imagine a big project with tens of thousands of lines of code, on which five people are currently working.

Among those five people, there will always be someone whose IDE formats the code differently. In addition, everybody has their own coding style, and some people do not see the need to write tests for their newly added functionality. And of course, there is always someone who does not see obvious bugs in their code.

At the end of the day, the developers commit their code and create a pull request for review. This is where the fun begins.

Now the reviewer has to review hundreds of lines of code, even if the author of the pull request only changed a PHPDoc comment. The cause: the author reformatted the code with their IDE before implementing their changes.

Having to review several PRs like this every day is not only annoying and time-consuming, it is also a trap. The reviewer will probably just glance over the code, thinking “I can only see formatting changes”, and risks letting new bugs into the code without even knowing.

A similar issue arises when changes are made to complex parts of the code. While changes can look minor, their impact can be huge.

Generally speaking, the less work the reviewer needs to do, the better. But how do we approach automation?

Preparing for automation

Whilst automating everything sounds easy, it should not be the first step. The first step is to tell your team to configure their IDE properly, to review their own code before committing, to write tests, and so on. In the end, though, there will always be someone who forgets or ignores these rules.

Before you can even start thinking about automating certain workflows, you need to prepare your project to be automatable.

Bringing automation into a project's workflow is not as easy as enabling a feature in some tool. It has to be planned beforehand, and some manual work is definitely required.

Code Style validation

The easiest way to maintain code quality is to have developers follow shared guidelines, above all on how the code should be formatted. For this, we use PHP_CodeSniffer (PHPCS).

PHPCS is a utility that automatically checks whether code matches a predefined code style. With it in place, the code is already consistently formatted before a developer starts on their changes.

Many IDEs support PHPCS directly, so the code is validated as it is written and issues are highlighted right in the editor.

Having PHPCS properly configured in a project already greatly improves the overall code quality in terms of formatting, and it cuts down on formatting-only line changes in a PR.
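To illustrate why formatting alone can blow up a diff, here is a small sketch. Both functions are hypothetical and behave identically; only the layout differs. The second one follows roughly what a standard such as PSR-12 asks for, which is an assumption here, since the style your project enforces is up to you.

```php
<?php

// Style A: what one developer's IDE might leave behind.
function sumLoose($items){
  $sum=0;
  foreach($items as $item){ $sum+=$item; }
  return $sum;
}

// Style B: the same logic, laid out along PSR-12 lines (assumed standard).
function sumFormatted($items)
{
    $sum = 0;
    foreach ($items as $item) {
        $sum += $item;
    }

    return $sum;
}

// Same result, yet a diff between the two styles touches every single line.
var_dump(sumLoose([1, 2, 3]) === sumFormatted([1, 2, 3]));
```

If the whole team's code is checked against one style, a reformat like this never ends up in a pull request in the first place.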

Code functionality validation

After we ensure that the code is properly formatted, we also want to make sure that it actually works as intended. There are several ways to do this. We will focus on two of the most common ones.

The first is static validation of the code. You might already know it from languages that are compiled ahead of time (C, C++, C#, etc.). In these languages, the compiler checks the code, for example for type errors, before it produces anything that can be run.

PHP has no such compile step during development, but a static analyser gives a similar result: the code is checked without actually being run.

For this, we use PHPStan. PHPStan is a great tool to evaluate code on a very basic logic level without having to write tests. It catches issues like unreachable statements or issues with expected vs. actual return types.
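As a rough illustration, the following snippet contains both kinds of issue mentioned above. The function findUserName() is made up for this example. Which rule level reports which issue, and the exact wording of the message, depends on your PHPStan version and configuration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, invented for this example.
 */
function findUserName(int $id): string
{
    $names = [1 => 'Alice', 2 => 'Bob'];

    if (!isset($names[$id])) {
        // The declared return type is string, but null is returned here.
        // A static analyser can flag this mismatch without running the code;
        // at runtime it only fails once an unknown id is actually passed in.
        return null;
    }

    return $names[$id];

    // Unreachable: nothing after an unconditional return is ever executed.
    echo 'done';
}
```

The file loads fine and findUserName(1) works, which is exactly why this kind of bug tends to slip through a quick manual review.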

However, in order to test the code further and on a more “business logic”-level, we need to create custom tests.

This is one of the most difficult tasks, as it requires writing additional code that tests the actual code.

In itself, this sounds easy, but it can get complicated very quickly. Aside from managing fixtures and the environment in general when writing and running tests, the content of the test itself also matters.

As an example, say we want to test a method that should return an array of values based on some arbitrary parameters. A very crude test might only check whether the return type of the method is an array, but is that enough?

It depends.

Is the array allowed to be empty?

What can/should the array contain?

What happens if the method cannot give a proper result using the given parameters?

Are there any exceptions that can be thrown?

In short, your test depends on what you are testing. If your test does not validate anything beyond what the static validation already covers, is there even a need for it?

If you choose to write tests, always consider the worst case. Don’t just test with known-good parameters for a correct result, but also consider testing to see what happens with the “worst” parameters possible. Only then can you be sure that your test is actually useful.
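The questions above can be turned into concrete checks. The sketch below uses a made-up function, evenNumbersBetween(), and a tiny hand-rolled expect() helper so that it runs without any dependencies. In a real project you would more likely use a test framework such as PHPUnit; that choice is an assumption on our side, not a requirement.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical method under test: the even numbers between $from and $to.
 */
function evenNumbersBetween(int $from, int $to): array
{
    if ($from > $to) {
        throw new InvalidArgumentException('$from must not be greater than $to.');
    }

    $even = array_filter(range($from, $to), static fn (int $n): bool => $n % 2 === 0);

    return array_values($even);
}

/**
 * Minimal stand-in for a test framework assertion.
 */
function expect(bool $condition, string $label): void
{
    if (!$condition) {
        throw new RuntimeException("Failed: {$label}");
    }
}

// Crude check: only the return type, which static analysis already covers.
expect(is_array(evenNumbersBetween(1, 10)), 'returns an array');

// Known-good parameters: check the actual content.
expect(evenNumbersBetween(1, 10) === [2, 4, 6, 8, 10], 'content for 1..10');

// Is the array allowed to be empty? Here the answer is yes.
expect(evenNumbersBetween(1, 1) === [], 'empty result when no even number exists');

// Less obvious input: negative bounds.
expect(evenNumbersBetween(-3, 0) === [-2, 0], 'negative bounds');

// Worst case: parameters the method cannot work with must throw.
$thrown = false;
try {
    evenNumbersBetween(10, 1);
} catch (InvalidArgumentException $e) {
    $thrown = true;
}
expect($thrown, 'reversed bounds throw InvalidArgumentException');

echo "All checks passed\n";
```

Only the first check could be dropped without losing anything, since the declared return type already guarantees it.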

The tools mentioned above are only there to reduce the workload of the reviewer. There is no need for a human to validate something if the computer already knows that there is a problem or that a format mismatch will cause an enormous diff. On the other hand, even if all the computerised tests pass, it doesn't mean that the code is working properly. Therefore, the final human validation remains as important as before.

Planning the use case

After putting in place tools that evaluate the code, we need to plan how we can leverage them in order to be productive. This ultimately depends on the project.

As an example, you might want to run your tests on a daily basis if your project depends on external libraries, to make sure that a newer version of a library won’t cause any issues. However, you don’t necessarily want to run code-style analysis daily if there are no changes to the code.

One of the most useful ways to build your tools into an automated workflow is to use Git hooks.

Git hooks can be used to execute certain scripts and tools before a commit or push is made. They are configured on a per-project basis and are run every time a user wants to commit something.
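A minimal sketch of such a hook, written in PHP and saved as an executable .git/hooks/pre-commit, could look like the following. It assumes that PHPCS and PHPStan are installed through Composer under vendor/bin and configured in the project root; the helper names selectPhpFiles(), buildCommand() and main() are made up for this example. Note that the tools look at the files as they are on disk, not at the staged snapshot.

```php
#!/usr/bin/env php
<?php

declare(strict_types=1);

/**
 * Keep only the PHP files from a list of paths.
 */
function selectPhpFiles(array $paths): array
{
    return array_values(array_filter(
        $paths,
        static fn (string $path): bool => strtolower(pathinfo($path, PATHINFO_EXTENSION)) === 'php'
    ));
}

/**
 * Build a shell command with every part escaped.
 */
function buildCommand(string $tool, array $args): string
{
    return implode(' ', array_map('escapeshellarg', array_merge([$tool], $args)));
}

function main(): int
{
    // Staged files that were added, copied or modified (deleted ones are skipped).
    exec('git diff --cached --name-only --diff-filter=ACM', $staged, $status);
    if ($status !== 0) {
        fwrite(STDERR, "Could not read the list of staged files.\n");
        return 1;
    }

    $files = selectPhpFiles($staged);
    if ($files === []) {
        return 0; // Nothing to check.
    }

    // Assumed tool locations; adjust to your project.
    $commands = [
        buildCommand('vendor/bin/phpcs', $files),
        buildCommand('vendor/bin/phpstan', array_merge(['analyse', '--no-progress'], $files)),
    ];

    foreach ($commands as $command) {
        passthru($command, $exitCode);
        if ($exitCode !== 0) {
            return 1; // A non-zero exit status makes Git abort the commit.
        }
    }

    return 0;
}

// Only act when invoked as the hook itself, so the helpers can be reused or tested.
if (basename($_SERVER['SCRIPT_FILENAME'] ?? '') === 'pre-commit') {
    exit(main());
}
```

Checking only the staged PHP files keeps the hook fast, which matters for the point made next.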

The downside of this workflow is that, depending on project size, running all the tools locally takes some time. On really large projects, this can grow to the point where waiting for the tools costs more time than they save.

This brings us to another way of integrating our toolset, but on a more distributed scale.

Locally, the user runs only the tools needed to cover their changes. Once the user has committed and pushed those changes, a separate server takes care of running the complete suite of tools without blocking the user.

Depending on which repository hosting service is used, it is also possible to configure these workflows so that the server-side validation only runs when the user creates a pull request. This also makes it possible to automatically reject a pull request if some criteria are not met, with the chosen reviewer not even being notified until it is worth doing so.

Conclusion

Implementing automated measures is a big task, but the reward is just as big. With the available tools, maintaining clean code is easier than ever before. By choosing the right amount and type of automation, we can greatly improve productivity among teams by handing over to the computer the tasks it can do for us.
