Users complain that the system is slow, so you do some googling and find a bunch of articles. You run a handful of diagnostic queries, but by now the slow performance has subsided. You’ve no idea what went wrong, and you don’t have a performance monitoring tool – they’re expensive, after all!
Instead of diagnosing a SQL Server problem after users have complained about performance, you should be monitoring key metrics over time and building a picture of what ‘good performance’ looks like. That way, when things go wrong, you have a baseline for comparison.
If your system has already slowed down and is continuously slow, our previous blog covers the seven most common causes of SQL Server slowdowns and how to address them.
Here are the key things to capture and monitor in SQL Server, and what the numbers are telling you.

Waiting and blocking
Whenever SQL Server needs access to a resource – whether that’s hardware such as CPU or disk, or data pages in memory – it may have to wait for that resource to become available.
Hopefully the wait time is just a couple of milliseconds, but if the resource is busy, it can stretch into seconds or even minutes!
SQL Server tracks exactly what it’s waiting on at any given moment. These wait statistics are one of the most informative monitoring data points available because they tell you directly where time is being spent.
You can capture these wait statistics yourself (using a script like this) and periodically log them to a table. Or use built-in tools like Query Store to analyse which queries are experiencing the highest waits.
Two common categories of waits you’ll encounter are lock-related waits (LCK_M_*) and I/O-related waits (PAGEIOLATCH_*). They represent very different bottlenecks inside the engine.
LCK_M_* waits occur when one session needs to read or modify data that is already locked by another session. SQL Server uses locks to maintain consistency, so this is normal, but excessive locking usually points to a blocking issue.
PAGEIOLATCH_* waits are very different. These indicate that SQL Server is waiting for a data page to be read from disk into memory. A significant spike in these waits could indicate memory pressure (reduced cache hit rate), slow storage / high I/O latency, or inefficient query/index design causing excessive reads.
Remember, waiting is normal. What’s crucial is to understand whether a particular type of wait increased during your performance problem, because that might direct you to the source of the problem. You can read more about why SQL Server waits in our blog.
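As a sketch, one way to see your instance’s heaviest waits is to query the sys.dm_os_wait_stats DMV. The list of benign ‘background’ waits filtered out below is illustrative rather than exhaustive – tune it for your environment:

```sql
-- Sketch: top waits since the last restart (or since the stats were cleared).
-- The benign-wait filter is a starting point, not a complete list.
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms   -- time spent queueing for CPU after the resource freed up
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    'SLEEP_TASK', 'LAZYWRITER_SLEEP', 'CHECKPOINT_QUEUE',
    'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT',
    'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', 'DIRTY_PAGE_POLL'
)
ORDER BY wait_time_ms DESC;
```

Because the DMV is cumulative, a single snapshot only tells you about waits since startup; to build a baseline, log this output to a table on a schedule and compare consecutive snapshots.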

Disk I/O latency
SQL Server is highly sensitive to storage performance. When disk reads or writes are slow, queries pause while they wait for data, creating bottlenecks that can significantly reduce throughput.
It’s important to monitor read and write latency separately, and to look at data files, TempDB, and the transaction log independently. They behave differently and affect the system in different ways.
Transaction log latency is especially critical. Every committed transaction must be written to the log before it can complete. Even modest log write latency has a system‑wide impact, slowing down OLTP workloads across the board.
TempDB latency is equally important, often more so. TempDB is used heavily for sorts, hashes, row versioning, temporary objects, and spill operations. When TempDB I/O is slow, every database on the instance can suffer because TempDB is a shared system-wide resource. Poor TempDB performance can amplify bottlenecks far more than slow user-database storage.
As a rough guide under normal OLTP load, latency consistently above 20 ms for data file reads/writes, or above 5 ms for log writes, is worth investigating. You can capture I/O performance metrics for your environment using Windows Performance Monitor, or a script like this one.
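As a rough sketch, per-file latency can be derived from the sys.dm_io_virtual_file_stats DMV, which tracks cumulative reads, writes, and stall times for every database file:

```sql
-- Sketch: average read/write latency per file, cumulative since startup.
-- Snapshot and diff over an interval to see *current* latency rather than
-- a long-term average.
SELECT
    DB_NAME(vfs.database_id)  AS database_name,
    mf.type_desc              AS file_type,   -- ROWS (data) vs LOG
    mf.physical_name,
    CASE WHEN vfs.num_of_reads  = 0 THEN 0
         ELSE vfs.io_stall_read_ms  / vfs.num_of_reads  END AS avg_read_ms,
    CASE WHEN vfs.num_of_writes = 0 THEN 0
         ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY vfs.io_stall DESC;
```

The type_desc column lets you separate data files from transaction logs, and filtering on DB_NAME(vfs.database_id) = 'tempdb' isolates TempDB, so you can apply the different thresholds discussed above to each.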

Query execution trends
Individual queries are often the start of performance problems, but they’re also where monitoring tends to be weakest. Most teams know their server is slow but don’t have the historical data to know which queries have got slower, or when the change happened.
Query Store addresses this by capturing query performance over time, so you can literally see when a query got slower. You can see how the average duration, CPU consumption, and logical reads for each query have changed over time, and by execution plan.
If poor performance is spiky and happening right now, Adam Machanic’s sp_WhoIsActive is a great way to see what’s currently running on the server. You can also tell it to fetch blocking information, so you can see what’s causing a query to slow down. For example, a management report that runs just a few times a day could be locking an entire key table and blocking all other queries from modifying it.
If poor performance has been more general and consistent, then focus your investigation on those queries that have the highest total impact on the system (CPU, logical reads) rather than the slowest individual execution.
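As an illustrative sketch, the Query Store catalog views can be aggregated to rank queries by total impact. This assumes Query Store is enabled on the database, and the 24-hour window is an arbitrary example:

```sql
-- Sketch: queries with the highest total CPU over the last 24 hours.
-- avg_cpu_time is reported in microseconds, hence the /1000 to get ms.
SELECT TOP (10)
    q.query_id,
    SUM(rs.count_executions)                           AS executions,
    SUM(rs.count_executions * rs.avg_cpu_time) / 1000  AS total_cpu_ms,
    SUM(rs.count_executions * rs.avg_logical_io_reads) AS total_logical_reads,
    MAX(qt.query_sql_text)                             AS sample_text
FROM sys.query_store_runtime_stats AS rs
JOIN sys.query_store_runtime_stats_interval AS i
  ON i.runtime_stats_interval_id = rs.runtime_stats_interval_id
JOIN sys.query_store_plan AS p        ON p.plan_id       = rs.plan_id
JOIN sys.query_store_query AS q       ON q.query_id      = p.query_id
JOIN sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id
WHERE i.start_time > DATEADD(HOUR, -24, SYSUTCDATETIME())
GROUP BY q.query_id
ORDER BY total_cpu_ms DESC;
```

Swapping the ORDER BY to total_logical_reads ranks by read pressure instead, which is often more useful when the bottleneck is I/O rather than CPU.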

Memory
Memory is critical to SQL Server. Almost all data processing depends on it. When there isn’t enough available memory, SQL Server is forced to read data from disk more frequently, and this has a significant impact on performance because disk access is far slower than memory access.
SQL Server’s max server memory setting controls how much RAM the engine is allowed to use. It’s important to understand this value and configure it appropriately. In most cases, you set it close to the maximum available, while leaving several gigabytes for the operating system and any other services on the server. Once SQL Server starts using memory, it will generally hold on to it, so it’s normal to see high SQL Server memory usage in Task Manager, even when the server is idle.
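As a sketch, max server memory is set with sp_configure. The figures below assume a hypothetical server with 64 GB of RAM, leaving roughly 6 GB for the operating system and other services – adjust for your own hardware:

```sql
-- Sketch: cap SQL Server at 58 GB on an assumed 64 GB server.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 59392;   -- 58 * 1024 MB
RECONFIGURE;
```

The change takes effect immediately, without a restart, though SQL Server may take a little while to release memory down to a lowered cap.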
What matters far more than the raw memory usage is how long data pages remain in memory before SQL Server must evict them to make room for newly read pages. This is measured by Page Life Expectancy (PLE), a performance counter that indicates, in seconds, the average time a page stays in memory.
When PLE is high, SQL Server is effectively serving most requests from memory.
When PLE drops and remains low for an extended period, this usually indicates memory pressure: SQL Server is repeatedly discarding cached pages and re-reading them from disk, which slows down queries.
So, what is a “good” PLE value?
Older guidance suggested a target of 300 seconds, but this is outdated. That rule came from a time when servers had far less RAM. Modern servers with large memory allocations naturally produce much higher PLE values, several thousand seconds or more. A temporary dip is not a concern; what matters is whether PLE stays low and how quickly it recovers.
The best approach is to establish a baseline for your own environment: capture PLE during periods of good performance and use that as your reference point. You can track this using Windows Performance Counters, and you can even read these counters directly from SQL Server and log them to a table, if preferred.
Another useful indicator is Memory Grants Pending, which measures the number of queries waiting for workspace memory. This type of memory is used for operations such as sorts, hashes, and large aggregations. When SQL Server doesn’t have enough free memory to satisfy these requests, queries queue until memory becomes available.
A consistently non‑zero value, especially if it rises above five to ten for extended periods, suggests that the server doesn’t have enough memory to comfortably support the current workload. In practice, this often coincides with low PLE, but not always. You can have adequate buffer cache for data pages yet still experience memory pressure for query execution.
Monitoring both PLE and Memory Grants Pending gives a more complete picture of the server’s memory health.
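As a sketch, both counters can be read straight from the sys.dm_os_performance_counters DMV, which makes them easy to log to a baseline table on a schedule:

```sql
-- Sketch: read Page Life Expectancy and Memory Grants Pending from SQL Server.
-- object_name is matched with LIKE because it carries an instance-specific prefix.
SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE (counter_name = 'Page life expectancy'
         AND [object_name] LIKE '%Buffer Manager%')
   OR (counter_name = 'Memory Grants Pending'
         AND [object_name] LIKE '%Memory Manager%');
```

PLE is reported in seconds and Memory Grants Pending as a current count, so both can be inserted as-is into a logging table alongside a timestamp for trend analysis.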

Conclusion
Even without professional monitoring tools, you can capture enough information to understand why performance is dropping and spot trends before they become persistent problems. Consistent, thoughtful tracking of key metrics, combined with a clear baseline for your environment, is often all it takes to stay ahead of issues and keep SQL Server healthy and responsive.
If performance concerns are already impacting your business, we’re here to help. At Koderly, we provide database administration and support to market-leading organisations. Head to our database services page or contact us below for more information.

