From a software delivery life cycle perspective, most projects today follow an Agile approach (typically Scrum or Kanban). During Agile sprints, developers and testers work together to deliver a functional increment of the application and tend to focus less on non-functional aspects. The most important non-functional quality attributes to validate are usually performance and security.
For these reasons we launched an initiative to bring performance and security testing into the sprint (‘Shifting Left’, i.e. earlier on a timeline that runs from left to right). The purpose is to widen the range of testing types covered by the team. Performance and security tests in a sprint should run on every codebase change to catch potential issues sooner rather than later.
As part of this initiative, performance and security tests should run inside the CI/CD pipeline as the last step after functional testing, adding a further set of validations from a non-functional point of view. The alternative is to run them before the release to production.
Usually, performance and security testing is left to the end of a project milestone, which increases the risk of discovering non-functional problems too late. By shifting performance and security testing left, the project can identify performance and security issues early. This increases the probability of delivering high-quality software with regard to security and performance.
Background and significance
Products are becoming more complex, not only because of functional complexity but also because of an increased focus on quality attributes such as security and performance. Products are now built incrementally, and it is impossible to add these quality attributes at the very end. We need to validate continuously throughout the delivery process to ensure that they are met. The people responsible for delivering this model are the non-functional testers. They work alongside the rest of the sprint team, because any performance and security defects found must be fixed within the sprint. In addition, during the estimation phase, the performance and security testers contribute their own estimates to the planning session. This means that the delivery effort for each feature must also account for the non-functional testing effort and for any defects that need to be fixed.
The main goal is to catch potential performance and security problems sooner rather than later in the sprint, at every codebase change. If these defects are found at the end of the release cycle, the deployment of the release candidate into production might be in jeopardy. The challenge, of course, is to keep track of the performance and security test results. This can be mitigated by automating the tests and improving communication flows with the other team members.
A secondary goal is to increase the performance and security awareness of the rest of the engineering team, who are usually focused on delivering from a functional point of view. Over time, the team will increase its knowledge of security and performance and will implement features with these quality attributes proactively in mind.
Tool set and methodology
We used the following set of tools for the initiative. For performance testing, Gatling is used to write the tests and measure the response times. The results are stored in a MySQL DB using a Python Performance module written to extract only the core performance data from the results. Logstash feeds the data into an Elasticsearch DB, where it is indexed by test name. Kibana is used for plotting the results, showing potential performance degradation from run to run compared to a benchmark. Everything is integrated in the CI/CD pipeline using a Jenkins job. All required environments are created and destroyed on AWS for every test. The comparison provides the pass/fail result to Jenkins without any manual intervention, giving a quick overview of the performance results.
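To illustrate the benchmark comparison described above, here is a minimal sketch in Python. It is not the module used in the initiative: the function names, the shape of the input (a mapping from test name to response time in milliseconds) and the 10% tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Comparison:
    test_name: str
    baseline_ms: float
    current_ms: Optional[float]  # None when the test is missing from the run
    passed: bool


def compare_to_benchmark(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.10,  # hypothetical default: allow 10% degradation
) -> List[Comparison]:
    """Compare per-test response times (ms) against a benchmark run.

    A test passes when its current response time is at most
    baseline * (1 + tolerance). A test missing from the current run fails.
    """
    results = []
    for name, base_ms in sorted(baseline.items()):
        cur_ms = current.get(name)
        passed = cur_ms is not None and cur_ms <= base_ms * (1 + tolerance)
        results.append(Comparison(name, base_ms, cur_ms, passed))
    return results


def build_passed(results: List[Comparison]) -> bool:
    """Collapse the per-test comparisons into one pass/fail flag."""
    return all(r.passed for r in results)
```

In a pipeline step, a script built on this sketch could end with `sys.exit(0 if build_passed(results) else 1)`, so that the CI server marks the build as failed when any test degrades beyond the tolerance.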
For security testing, OWASP ZAP is integrated in two ways: through a Jenkins plugin inserted in the CI/CD pipeline, and through a transparent mode instance installed in the environment to scan API requests and identify potential security problems. A transparent mode instance is a proxy placed between the user’s browser and the application on the server in order to scan the traffic that passes through.
One of the main challenges was integrating the non-functional test results into the CI/CD reporting, because there is no simple pass/fail mechanism for performance and security. For performance, response times must be analysed together with resource consumption, error rates, etc.; for security, false positives must be investigated. One solution is to develop an ‘assessment framework’ that manages the data from the test results and derives a quick pass/fail result for the stakeholders.
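The idea of such an assessment framework can be sketched as follows. This is an illustration under stated assumptions, not the framework built for the initiative: the data classes, the threshold parameters, the risk labels and the false-positive list keyed by (rule, URL) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List, Tuple


@dataclass
class PerformanceSummary:
    p95_ms: float      # 95th percentile response time in milliseconds
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


@dataclass
class SecurityFinding:
    rule_id: str
    risk: str  # "low", "medium" or "high"
    url: str


@dataclass
class Verdict:
    passed: bool
    reasons: List[str] = field(default_factory=list)


RISK_ORDER = {"low": 1, "medium": 2, "high": 3}


def assess(
    perf: PerformanceSummary,
    findings: Iterable[SecurityFinding],
    *,
    max_p95_ms: float,
    max_error_rate: float,
    known_false_positives: FrozenSet[Tuple[str, str]] = frozenset(),
    fail_on_risk: str = "high",
) -> Verdict:
    """Derive a single pass/fail verdict from performance and security data."""
    reasons = []
    if perf.p95_ms > max_p95_ms:
        reasons.append(f"p95 {perf.p95_ms} ms exceeds {max_p95_ms} ms")
    if perf.error_rate > max_error_rate:
        reasons.append(f"error rate {perf.error_rate} exceeds {max_error_rate}")
    for finding in findings:
        # Findings already investigated and judged false positives are skipped.
        if (finding.rule_id, finding.url) in known_false_positives:
            continue
        if RISK_ORDER[finding.risk] >= RISK_ORDER[fail_on_risk]:
            reasons.append(f"{finding.risk} risk finding {finding.rule_id} at {finding.url}")
    return Verdict(passed=not reasons, reasons=reasons)
```

Keeping the reasons alongside the verdict means stakeholders get the quick pass/fail answer while the team still sees which threshold or finding caused a failure.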
A further challenge was that knowledge of non-functional testing was not uniform across the team. We addressed this through several training sessions in which performance and security mechanisms were explained. This was considered a success.
In terms of duration, both performance and security tests can take quite some time to run. For instance, one can monitor performance over a weekend to identify a memory leak that shows up as steadily increasing memory consumption. To cope with the increasing cost of these kinds of tests when a cloud-based solution such as AWS is used, we could schedule the machines to be destroyed once the tests have finished and the results have been saved in a non-volatile environment.
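One simple way to turn such a long-running observation into an automated check is to fit a straight line through the memory samples and look at its slope. The sketch below assumes samples are given as (elapsed hours, memory in MB) pairs and uses a hypothetical threshold of 1 MB per hour; neither the sampling format nor the threshold comes from the initiative itself.

```python
from typing import List, Tuple


def memory_growth_mb_per_hour(samples: List[Tuple[float, float]]) -> float:
    """Return the least-squares slope of (elapsed_hours, memory_mb) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must cover more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t


def looks_like_leak(
    samples: List[Tuple[float, float]],
    max_growth_mb_per_hour: float = 1.0,  # hypothetical threshold
) -> bool:
    """Flag a run whose memory use trends upward faster than the threshold."""
    return memory_growth_mb_per_hour(samples) > max_growth_mb_per_hour
```

A linear trend is a deliberately crude signal: it ignores warm-up phases and garbage-collection sawtooth patterns, so a flagged run is a prompt for investigation rather than proof of a leak.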
We discovered potential problems very early. Based on the defect/cost graph, the savings are substantial, in the order of several days per release. In addition, thanks to the immediate feedback, the overall security and performance knowledge inside the sprint team increased, leading to more awareness during implementation. The team no longer sees non-functional defects as something it does not know how to interpret; instead, team members know how to start fixing them.
With regard to the long-term objectives, the ‘performance and security tests shifting left’ initiative has had a notable impact on the quality of the delivered software. Previously, all the issues were found very late, delaying the release of the product. Now performance and security issues are identified during the sprint and fixed sooner rather than later. We evaluated the success based on the rate of adoption by the engineering team across projects.
What we could have done differently is the way we integrated the tests into CI/CD: we did not have a clear plan from the start and had to adapt as the initiative progressed. Despite all of the challenges mentioned, the initiative is an ongoing success that gives all stakeholders a clearer view of the performance and security quality of the application.
In the future, the plan is to improve the model by creating a basic modular framework that could be easily ported to other projects. In doing so, the impact on the team and the estimates would decrease while maintaining the advantages of this approach of shifting non-functional tests left.