Four keys I look for when reviewing software pull requests

Over the past 15 years I've reviewed hundreds upon hundreds of pull requests - from PRs against projects I built and maintain, to PRs submitted as promotion portfolio reviews in projects I've never seen or worked on, and everything in between.

Over time I've developed a set of four key items I look for in each pull request review - regardless of the circumstances of the change. I'm synthesizing and sharing these keys with you in hopes they're helpful to you as well.

Can someone besides the author understand and maintain this code?

Code is written once, but read and maintained for years. And often maintained by folks besides the original authors.

So one of the key things I look for: can someone besides the author understand and maintain this code? Writing code that a compiler can understand is straightforward. But writing code that a human can successfully understand and maintain later can be another thing entirely.

For example, I look for things like:

  • Are the methods broken out into concise, self-contained sections of functionality, or is there one long 400-line public method in a class?
  • Are method and variable names descriptive or are they overly-terse and opaque?
  • Are there comments explaining the "why" when code deviates from the expected? (more on when code comments are valuable)
  • and more
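As a hypothetical sketch of the first bullet: instead of one long method that filters, groups, and totals in a single block, each step gets its own small, descriptively named helper (the function and data shapes here are invented for illustration):

```python
def summarize_orders(orders):
    """Return the total spent on completed orders, grouped by customer."""
    completed = _completed_orders(orders)
    return _totals_by_customer(completed)

def _completed_orders(orders):
    # Each helper does one obvious thing, so a future maintainer can
    # read summarize_orders top-to-bottom like a summary.
    return [o for o in orders if o["status"] == "completed"]

def _totals_by_customer(orders):
    totals = {}
    for order in orders:
        totals[order["customer"]] = totals.get(order["customer"], 0) + order["total"]
    return totals
```

The public method now reads like a description of the work, and each helper is small enough to understand and test on its own.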

If you've ever maintained code written by someone else (especially someone who has since left the team or company), you know how much you appreciate it when they took care to make it maintainable. And you can help ensure that happens in code your team produces - your future selves, future teammates, and other future maintainers will appreciate it!

Are these changes covered by valuable tests?

In addition to readability, one of the essential elements to maintainable code is that it is covered by tests. And not just any tests - tests that are valuable for this specific area of code.

There isn't a one-size-fits-all approach to testing. All types of testing - unit, integration, functionality, performance, etc. - have their place. Which types of testing are valuable depends on the code being written. But in general - if a test suite will fail if the underlying code is broken, then it is a valuable set of tests.

For example, if a code change includes a new database query, a test against a real database (such as a Postgres instance spun up via Testcontainers) will help ensure the query works with the real database - much more valuable than a test that heavily relies on mocks for the internals of the database.
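To sketch the idea in a self-contained way, here's a hypothetical query tested against a real (in-memory SQLite) database using Python's standard library - in a real project this could just as well be a Postgres instance started via Testcontainers. The schema and function names are invented for illustration:

```python
import sqlite3

def recent_user_names(conn, cutoff):
    """The query under test: names of users created after a cutoff date."""
    rows = conn.execute(
        "SELECT name FROM users WHERE created_at > ? ORDER BY created_at",
        (cutoff,),
    )
    return [name for (name,) in rows]

def test_recent_user_names():
    # A real database executes the SQL, so a broken query breaks the test -
    # a mock of the database internals would happily pass either way.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, created_at TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("old_user", "2020-01-01"), ("new_user", "2024-06-01")],
    )
    assert recent_user_names(conn, "2023-01-01") == ["new_user"]
```

The key property: if someone later edits the SQL and breaks it, this test fails. A mock-based test would keep passing, because it only verifies that the code called the mock.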

Or on the frontend, a React snapshot test that verifies a rendered DOM matches a pre-canned DOM tree is barely helpful. But a test that simulates user behavior - such as filling out a search form and verifying the correct request is sent to the backend - is much more valuable.

Valuable tests will catch issues when:

  • Expected behavior is accidentally broken by future changes
  • Dependencies are upgraded
  • Bugs are accidentally re-introduced
  • and more

Valuable tests for code go hand-in-hand with readability from the first key item above. Unreadable and untested code won't survive - your hard work will likely get deleted and rewritten.

But with both valuable tests and care applied in readability, your code can live on in the future and be safely maintained.

Can we observe this code's behavior running in production?

Software changes in the codebase are one thing, but code running in production is a different beast. Once this code is deployed to production, is there enough instrumentation to understand how it's functioning - is it fast enough, are operations successful, etc.? And when (not if) something unexpected happens in production, can we get enough information from the instrumentation to debug that situation?

If an error occurs, can we debug it with this level of instrumentation? Can we understand the software's performance, and diagnose any performance bottlenecks that might arise?

In addition to functional and performance instrumentation, is there sufficient instrumentation to learn if a feature is valuable to users? Are they discovering the feature and able to use it to accomplish their goals?

If the answer is "no" to any of the above, then the code changes need additional instrumentation.
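As a minimal sketch of what "enough instrumentation" can mean, here's a hypothetical order-processing function instrumented with timing, success/failure logging, and enough context (the order ID) to debug an individual failure later. The names and log format are assumptions, and a real system would likely emit metrics to a monitoring backend rather than only logs:

```python
import logging
import time

logger = logging.getLogger("checkout")

def process_order(order_id, charge_fn):
    """Charge an order, logging latency on success and context on failure."""
    start = time.monotonic()
    try:
        charge_fn(order_id)
    except Exception:
        # Log the stack trace plus the order ID, so the inevitable
        # "one user hit an error" report can be traced to a specific failure.
        logger.exception("order_failed order_id=%s", order_id)
        raise
    else:
        elapsed_ms = (time.monotonic() - start) * 1000
        logger.info("order_succeeded order_id=%s elapsed_ms=%.1f",
                    order_id, elapsed_ms)
```

With this in place, "is it fast enough?" and "are operations successful?" become questions you can answer from the logs, instead of guesses.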

And similar to the efforts spent on code readability, initial efforts on instrumentation will pay dividends later. You can identify and fix issues before they impact significant numbers of users. Or when the inevitable bug report comes in affecting only a small set of users, have a fighting chance at pinpointing and fixing the issue. You can find out if a feature is valuable to users and double down on it. Or learn a feature isn't landing with users the way you expected, and needs to be changed or removed.

In addition to the issues tests can catch, production instrumentation is key for future maintainers to be able to safely modify existing code without unknowingly breaking production. And you can help ensure your software efforts are making a difference - that the features you're building are valuable to your users.

What did the author do well that I can highlight?

You submit a pull request and post a message in your team's Slack channel asking for a review. A bit later, GitHub emails you that there is a comment on your pull request. You think "oh no, what did I do wrong this time?"

But instead of an issue, the comment is positive - it points out something the reviewer thought you did well!

Positive comments are not just fluff - they can also help reinforce good patterns in code changes so the author knows they are valuable and is more likely to continue those patterns in the future. See thorough test cases for a bug fix? Comment on that! Or carefully crafted methods for enhanced readability? Comment on that as well!

Adding positive comments helps encourage behavior you want to reinforce. And it also makes work just a bit more pleasant.

Why not use a checklist?

I could create a multi-page checklist of every type of code issue I've ever encountered and use it in every pull request review. That way, none of these issues would ever slip through again - right?

In my experience, checklists like this start out as well-meaning efforts. Maybe an issue made it through testing and code review and into prod, and we want to stop it from happening again. So folks think they should start a code review checklist so that specific issue isn't missed again. Later another type of issue makes it through to prod, and gets added to the checklist.

Over time, the checklist grows to many different items - aiming to guard against issues that historically made it through to prod. This is great, right?

But now this list is quite large. And these checklist items aren't applicable to all types of changes. As an example, in the past someone may have added a new index to a large table in Postgres, and it locked up writes on the table while the index was being built. The fix for that is to create the index concurrently. But having a review checklist item for that specific case will only apply in the few cases where someone is adding a new index to a large table. Every other time, that item is just noise in the review process.

So reviews can easily devolve into mindlessly checking off items in the list, and skipping over many as they don't apply to the change under review. And the reviewer turns off their brain during the review and easily misses other issues in the pull request not covered by the checklist.

Instead, by using a limited set of four items I look for in all reviews, I free up my brain from mindless list-checking to focus on and thoroughly analyze the specific change I'm reviewing.


By checking these four items in all pull request reviews, I can get the most value out of my review without switching my brain into checklist mode. I can help ensure the code is maintainable for the future, and make the review process less painful with positive comments as well.

And I apply these same criteria to the code I write as well - to help future readers and maintainers of my code. And those future maintainers include myself - six months later, after I've forgotten all the context around my original changes!

Hopefully these techniques help you as well.