What is scalability testing?
The overall goal of scalability testing (a form of black-box testing) is to investigate issues that go beyond whether or not an application follows its functionality documents.
For example, we recently released Qucate, a functional testing and test management platform. Before we could ask users to sign up to our new system, we needed to make sure that our product would not collapse under the pressure of receiving too many requests at the same time, which is the guiding principle behind scalability testing.
When conducting our scalability testing, we also had to consider our system’s ability to shrink, as scale is bidirectional. That way, when scalability testing results were collated, our business could confidently plan for long-term growth, in any direction.
How is scalability testing different to load testing?
Load testing and scalability testing are both part of the performance testing methodology and are sometimes treated as the same thing. Although there are similarities, they measure different levels of load: load testing determines the point at which the application crashes, whereas scalability testing attempts to identify the cause of the crash and take corrective action.
Load testing assesses the application under test at the maximum load at which the system would fail. The main goal of load testing is to determine the maximum point beyond which users will be unable to use the system.
Scalability testing measures the system at both the minimum and maximum loads at all levels, including software, hardware, and databases. Once the maximum load has been determined, developers can respond appropriately to make sure that the system remains scalable.
For example, if scalability testing determines that the maximum load is 10,000 users, then developers will need to take corrective measures, such as keeping response times within acceptable limits as the 10,000-user mark is approached. They could also increase provisioned resources to accommodate the growing user data and keep the system scalable.
What metrics should I be testing?
Depending on the complexity of your application, you could have dozens of test points and metrics which you can include within your scalability testing, so how do you choose the right metrics?
For example, it would be pointless to test how a system performs when 1,000 users make concurrent resource requests if it is highly unlikely to ever reach that level. There would also be no grounds for testing database-related issues in the case of a static application. As such, there is no ‘one-size-fits-all’ checklist, so make sure that you’re including the metrics that matter for your system!
As a starting point, and putting aside the size and complexity of your application, you can consider the following metrics:
Number of users
As the number of users communicating with your system grows, more load is put on your infrastructure. Under higher load, the application can slow down, and if the slowdown continues unchecked, it will eventually freeze completely.
Understanding how your application behaves under increasing and decreasing load can allow for corrective action and allow the application to perform smoothly, at any load.
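One minimal way to generate this kind of concurrent load in a test harness is to fan requests out across worker threads. The sketch below uses a hypothetical `fake_request` function in place of a real network call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(_):
    """Stand-in for a real request (e.g. an HTTP call)."""
    time.sleep(0.01)  # simulate 10 ms of server work
    return True

def run_load(concurrent_users, requests_per_user=5):
    """Fire requests from many simulated users at once and time the batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        results = list(pool.map(fake_request,
                                range(concurrent_users * requests_per_user)))
    return time.perf_counter() - start, results

elapsed, results = run_load(concurrent_users=20)
print(f"{len(results)} requests completed in {elapsed:.2f}s")
```

Repeating the run at increasing `concurrent_users` values gives you the load curve the paragraph above describes.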
Response time
Response time is the time between a user’s request and the application’s response.
Generally, an application takes longer to respond to a user’s request when it is under heavier load, or when slow, long-running, or complicated tasks reduce throughput. Issues with user-volume-related performance can therefore also have an impact on response time.
Long response times often lead to a poor user experience as the user is forced to wait for extended periods of time before the information is presented and can give the impression that the application is sluggish and unresponsive.
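Response time is straightforward to measure around any callable; averages hide outliers, so percentiles such as the p95 are usually more informative. A minimal sketch, with a hypothetical `fake_request` standing in for a real call:

```python
import statistics
import time

def timed_call(fn):
    """Measure the wall-clock response time of a single request, in ms."""
    start = time.perf_counter()
    fn()
    return (time.perf_counter() - start) * 1000

def fake_request():
    """Hypothetical stand-in for a real request (e.g. an HTTP call)."""
    time.sleep(0.01)  # simulate 10 ms of server work

samples = [timed_call(fake_request) for _ in range(50)]
p95 = statistics.quantiles(samples, n=20)[-1]  # 95th percentile cut point
print(f"median={statistics.median(samples):.1f} ms, p95={p95:.1f} ms")
```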
CPU usage
CPU usage is a measure of the CPU utilisation for performing a task.
Typically, applications which are CPU intensive are poorly optimised and can have adverse effects on several other metrics.
Memory usage
Memory usage is a measure of the Random Access Memory (RAM) consumed to perform a task.
Some applications are intended to consume large amounts of RAM by their nature, for example, applications which hold an in-memory cache to quickly serve commonly requested resources. However, situations where large amounts of RAM are consumed for infrequently accessed resources can lead to adverse effects on several other metrics.
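For Python applications, the stdlib `tracemalloc` module can report the peak heap allocation of a task, which is a quick way to spot memory spent on rarely used data. The cache below is a hypothetical example workload:

```python
import tracemalloc

def peak_memory_bytes(fn):
    """Peak Python heap allocation while running fn, in bytes."""
    tracemalloc.start()
    fn()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

# Hypothetical in-memory cache: a million small objects.
build_cache = lambda: list(range(1_000_000))

peak = peak_memory_bytes(build_cache)
print(f"peak allocation: {peak / 1_048_576:.1f} MiB")
```

Note this tracks only Python-level allocations; native extensions and the process footprint as a whole need OS-level tools.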
Network usage
Network usage is the amount of bandwidth consumed by the application under test.
Understanding your application’s network usage is a good way to ensure that your infrastructure can withstand higher traffic volumes. Highly congested networks can be unable to fulfil requests, leading to a poor user experience.
Screen transition time
A good indicator of scaling is testing how long it takes your application to transition from one interface to the next.
Slow screen transitions frequently indicate inefficient code, slow network speeds, or slow server response times.
When hosted resources are requested to load a screen, you should optimise static assets through compression and serve the assets through a content-delivery network (CDN).
If you fail to optimise screen transitions, long transition times can leave the impression that your application is unresponsive and sluggish.
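The compression step mentioned above is easy to demonstrate with the stdlib `gzip` module; the asset here is a hypothetical, highly repetitive JavaScript bundle, so the ratio is better than you would see on real-world assets:

```python
import gzip

# Hypothetical static asset: a repetitive JavaScript bundle.
asset = b"function render(screen) { return screen; }\n" * 2000

compressed = gzip.compress(asset, compresslevel=9)
ratio = len(compressed) / len(asset)
print(f"{len(asset)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")
```

Smaller payloads mean less time on the wire per screen, which is exactly what compression and a CDN buy you.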
How do I test the scalability of my software?
Before designing your scalability tests, consider the following:
- Scope: What do you want to achieve? What parts of the application need to be tested at scale? Is it feasible to test all features of the application?
- Tools and Capabilities: What tools do you have available? What are the capabilities and limitations of the tools available? What is the cost of acquiring new tools?
- Environments: What servers, operating systems, and platforms are to be tested at scale?
- Infrastructure: Is the test infrastructure like-for-like equivalent to your production environment? Can the test infrastructure be scaled up (and down) to simulate varying loads?
Assuming you’ve answered all the above, the checklist below should guide you from concept to actionable findings in the form of condensed reports.
I’ve got my scalability results, now what?
The results of your scalability tests should often lead to improvements in your application and the infrastructure supporting it.
If you’ve determined that you need to scale your infrastructure, there are typically two approaches to do so.
Vertical scaling
Adding resources to your existing infrastructure is one way to improve capacity and processing speed to support additional load. You can install more powerful processors, add more memory, hard drives, and so on to the servers that host your applications.
To scale vertically, whether in the cloud or on-premises, you need to allocate more resources or upgrade the hardware to the existing servers.
Vertical scaling can be less expensive than horizontal scaling but does have its ceiling limits, for example, you may already be running the most powerful processor money can buy.
Horizontal scaling
In the case of horizontal scaling, a new server is added to the network to share the load with the existing infrastructure. This is a quick way to multiply your infrastructure’s performance and capacity.
The obvious disadvantage of this approach to scaling is the cost of duplicating your loadout each time you need to expand. Another consideration is that you will need to devote more time and effort to maintaining the new equipment that has been added to your current setup.
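Horizontal scaling depends on something distributing requests across the added servers. A minimal sketch of the simplest strategy, round-robin load balancing, with hypothetical host names:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Minimal sketch: spread incoming requests evenly across servers."""

    def __init__(self, servers):
        self._servers = cycle(servers)  # endless rotation over the pool

    def route(self, request):
        """Assign the next request to the next server in rotation."""
        server = next(self._servers)
        return server, request

balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical hosts
assignments = [balancer.route(f"req-{i}")[0] for i in range(6)]
print(assignments)
```

Production balancers add health checks, session affinity, and weighting, but the core idea is this rotation.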
As a top tip for scaling quickly: leverage the cloud. Scalability testing results can sometimes offer insights into implementation issues or poorly optimised code. Often, though, the results you encounter will point to the need for infrastructure improvements and upgrades. As a result, it makes sense to host your applications in an environment with virtually infinite amounts of processing power. Where possible, you should utilise the cloud in some way to cut costs without sacrificing performance.
If you’ve determined that you’re over-provisioned and using fewer resources than you’ve allocated, you could save a lot of money by scaling down your infrastructure and removing unnecessary resources. You can then invest in improving your applications or products and growing your user base. Just remember to allow for a little breathing room!
There’s a lot to consider when doing scalability testing, and a lot more to actually implement.
While large corporations such as Google, Apple, and Amazon may perform scalability tests on a regular basis, smaller businesses may only do so when absolutely necessary. Typically, smaller businesses do not have the tools, staff, or budget to conduct scalability testing at the same scale. Make use of tools that simplify the entire process!
At Koderly, we use Qucate to plan, coordinate and execute all manner of testing throughout our software development lifecycle. Qucate allows you to create fully audited test plans for compliance, visibility, and transparency, whilst improving quality and test coverage. If you’re looking for a test management platform, head to our Qucate page for more information, and start your 30-day free trial today!