Python Code Review Checklist: 25 Things to Check for Engineering Teams

Jan 16, 2026
Molisha Shah

A comprehensive Python code review checklist should include 25 prioritized checks covering style, correctness, security, and documentation, as inconsistent review criteria lead to subjective debates that waste engineering time while missing genuine risks.

TL;DR

Python code reviews fail when teams lack shared pass/fail criteria, leading to subjective debates over style while security vulnerabilities slip through. This checklist organizes 25 concrete checks by priority level (must-fix, high, medium, recommended), with automation commands and examples validated across mature engineering teams that follow PEP standards, Google practices, and OWASP guidelines.

Code reviews represent a significant investment of engineering resources, yet teams frequently debate the same style choices, miss critical security issues, and block PRs for minor preference differences. The friction stems from undefined review standards: without explicit pass/fail criteria, reviewers default to personal preferences while overlooking genuine risks.

This checklist addresses that gap by categorizing 25 concrete checks into must-fix blockers and recommended improvements. The distinction follows Google's documented standard: "Reviewers should favor approving a CL once it is in a state where it definitely improves the overall code health of the system being worked on, even if the CL isn't perfect."

Each checklist item includes tool configurations, code examples, and authoritative source citations that reviewers can reference during PR discussions.

Augment Code's Context Engine processes entire codebases across 400,000+ files, enabling teams to automate consistency checks while AI-assisted reviews catch architectural issues that manual review misses.

Style and PEP 8 Compliance Standards for Consistent Codebases

Style checks represent the foundation of consistent codebases. Pre-commit hooks configured for formatting (Black/Ruff), linting (Flake8/Pylint), and related style tools reduce style debates in human review and enable reviewers to focus on logic, architecture, and domain knowledge.

1. Indentation and Line Length

Consistent indentation and line length form the visual backbone of readable Python code. When developers encounter inconsistent spacing or excessively long lines, they spend cognitive energy parsing structure rather than understanding logic.

Priority: Must-Fix

Pass Criteria: 4 spaces per indentation level, no mixed tabs/spaces, lines ≤79 characters (PEP 8 standard) or ≤88 characters (Black formatter compatibility)

python
# ❌ FAIL - Mixed indentation
def calculate_total(items):
  total = 0  # 2 spaces
    for item in items:  # 4 spaces
        total += item.price
    return total

# ✅ PASS - Consistent 4-space indentation
def calculate_total(items):
    total = 0
    for item in items:
        total += item.price
    return total

Automation: Configure .flake8 with max-line-length = 88, then enforce via pre-commit hooks.

2. Naming Conventions

Names serve as the primary documentation layer in code: they communicate intent, scope, and type at a glance. Python's established conventions (snake_case for functions, PascalCase for classes) provide a shared vocabulary that experienced developers parse unconsciously.

Priority: Must-Fix

Pass Criteria: Functions and variables use snake_case, classes use PascalCase, constants use UPPER_CASE

python
# ✅ PASS
class DataProcessor:
    MAX_CONNECTIONS = 100

    def process_data(self):
        pass

3. Import Organization

Import statements act as a dependency manifest at the top of every module. Well-organized imports immediately communicate what external systems a module relies on.

Priority: Must-Fix

Pass Criteria: Three groups separated by blank lines: standard library, third-party, and local imports. Alphabetically sorted within groups.

python
# ✅ PASS - Proper three-group structure
import os
import sys

from flask import Flask

from myapp.models import User

Automation: Use isort with Black-compatible configuration in pyproject.toml.

4. Pre-Commit Hook Enforcement

Pre-commit hooks shift quality enforcement left in the development workflow, catching issues before they enter version control. This approach prevents style violations and security issues from ever reaching the PR stage.

Priority: Must-Fix

Pass Criteria: Repository includes .pre-commit-config.yaml with formatters, linters, and security scanners configured

text
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 24.1.1
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.6
    hooks:
      - id: bandit
        args: ['-ll']

Test Coverage and Type Checking Requirements for Production Code

Test coverage and type checking represent the primary defense against regressions. Teams using AI-assisted code review tools can accelerate test generation while maintaining coverage thresholds that prevent regressions.

5. Test Coverage Threshold

Test coverage metrics provide objective evidence that code paths have been exercised, but the number alone doesn't tell the full story. Effective coverage combines quantitative thresholds with qualitative review of edge cases, error paths, and boundary conditions.

Priority: Must-Fix

Pass Criteria: ≥80% branch coverage on new/modified code with systematic edge-case testing

python
import pytest

# ✅ PASS - Comprehensive edge case coverage
@pytest.mark.parametrize("values,expected", [
    ([], 0),         # Empty collection
    ([1], 1),        # Single item boundary
    ([1, 2, 3], 6),  # Normal case
    pytest.param(None, 0, id="none_input"),
])
def test_sum_values(values, expected):
    assert sum_values(values) == expected


def test_sum_values_error_handling():
    """Test error paths and invalid input types."""
    with pytest.raises(ValueError, match="Invalid format"):
        sum_values("invalid")

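Automation: Configure pytest with --cov=src --cov-branch --cov-fail-under=80 in pyproject.toml. The --cov-branch flag is needed because the pass criteria measure branch coverage rather than line coverage.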
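The tests above call sum_values without showing it. As a minimal sketch, assuming the function should treat None as an empty collection and reject strings, a hypothetical implementation that satisfies those tests could look like this (the name and behavior are inferred from the test cases, not taken from a real library):

```python
from collections.abc import Iterable


def sum_values(values):
    """Hypothetical function under test: sum numbers, treating None as empty."""
    if values is None:
        return 0
    # Strings are iterable, so reject them explicitly before the generic check.
    if isinstance(values, (str, bytes)) or not isinstance(values, Iterable):
        raise ValueError("Invalid format: expected an iterable of numbers")
    return sum(values)
```

Each branch here (None, invalid type, normal path) maps to at least one test case, which is what a branch coverage gate is meant to verify.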

6. Type Hints with Modern Syntax

Type hints transform Python from a dynamically typed language into one with optional static analysis capabilities. Modern Python syntax (3.9+) eliminates the need for typing module imports in most cases. Teams using Augment Code's Context Engine can identify type inconsistencies across 400,000+ files, catching interface mismatches that manual review overlooks.

Priority: Must-Fix (Python 3.9+)

Pass Criteria: Uses list[str] instead of List[str] (Python 3.9+, PEP 585), X | None instead of Optional[X] (Python 3.10+, PEP 604), mypy runs in CI with strict mode

python
# ❌ FAIL - Legacy typing aliases (List and Dict deprecated since Python 3.9)
from typing import List, Dict, Optional

def process(items: List[str]) -> Optional[Dict[str, int]]:
    pass

# ✅ PASS - Modern syntax (Python 3.10+)
def process(items: list[str]) -> dict[str, int] | None:
    pass

7. Exception Handling Specificity

Exception handling determines how gracefully code responds to unexpected conditions. An overly broad exception handler masks bugs by silently swallowing errors that should propagate.

Priority: Must-Fix

Pass Criteria: No bare except clauses, specific exception types caught, and logger.exception() used for tracebacks

python
import logging

logger = logging.getLogger(__name__)

# ❌ FAIL - Bare except catches SystemExit and KeyboardInterrupt
try:
    process_data()
except:
    pass

# ✅ PASS - Specific handling with logging
try:
    process_data()
except ValueError:
    logger.exception("Invalid data format")
    raise

8. Assertion Specificity in Tests

Assertions serve as executable documentation of expected behavior. Vague assertions like 'assert result' pass whenever any truthy value is returned, so they provide no protection against incorrect values.

Priority: Recommended

Pass Criteria: Assertions validate specific expected values, and exception messages are validated with pytest.raises(match=)

python
# ❌ FAIL - Insufficient specificity
assert response.status_code == 200
assert user.id is not None  # Passes for any id, including the wrong user's

# ✅ PASS - Specific validation
assert response.status_code == 200
assert user.email == "test@example.com"

9. Test Independence: No Shared Mutable State

Test independence ensures that each test validates behavior in isolation, making failures deterministic and debuggable. Shared mutable state between tests creates order-dependent behavior.

Priority: Recommended

Pass Criteria: Tests run independently in any order with no shared mutable state
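One way to picture the criterion, as a rough sketch: each test builds its own state through a factory rather than mutating a module-level object, so the tests pass in any order. The names make_cart and run_in_shuffled_order are hypothetical helpers invented for this illustration; in a pytest suite a function-scoped fixture would usually play the role of the factory.

```python
import random


def make_cart() -> list[str]:
    """Hypothetical factory: every test gets its own fresh cart."""
    return []


def test_add_item():
    cart = make_cart()
    cart.append("apple")
    assert cart == ["apple"]


def test_cart_starts_empty():
    # Would fail after test_add_item if both tests shared one module-level list.
    cart = make_cart()
    assert cart == []


def run_in_shuffled_order(tests, seed: int) -> list[str]:
    """Run the tests in a seeded random order and return the order used."""
    ordered = list(tests)
    random.Random(seed).shuffle(ordered)
    for test in ordered:
        test()
    return [test.__name__ for test in ordered]
```

Running the suite under several seeds is a cheap way to surface hidden order dependence before it shows up as a flaky CI failure.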

10. Fixture Scoping

Fixture scoping balances test isolation against setup performance. Function-scoped fixtures guarantee a fresh state for each test but may repeatedly perform expensive operations.

Priority: Recommended

Pass Criteria: Expensive setup uses @pytest.fixture(scope="session"), test isolation uses @pytest.fixture(scope="function")

| Check | Priority | Pass Criteria | Automation |
| --- | --- | --- | --- |
| Indentation/Line Length | Must-Fix | 4 spaces per level, ≤88 characters | Black/Ruff |
| Naming Conventions | Must-Fix | snake_case functions/variables, PascalCase classes | Flake8 |
| Import Organization | Must-Fix | 3 groups sorted, separated by blank lines | isort |
| Test Coverage | Must-Fix | ≥80% branch coverage on changed code | pytest-cov |
| Type Hints | Must-Fix | Modern syntax (PEP 585/604), strict mypy | mypy |
| Exception Handling | Must-Fix | Specific exception types only, no bare except | Flake8/Bandit |

Readability and Code Structure Checks for Maintainable Python

Readability checks require human judgment but follow consistent principles. Most of these items improve maintainability without blocking deployment; the exceptions are marked Must-Fix.

11. Cyclomatic Complexity

Cyclomatic complexity measures the number of linearly independent paths through code, directly correlating with testing difficulty and bug likelihood.

Priority: High (Strong Recommendation)

Pass Criteria: Cyclomatic complexity ≤10 per function

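sh
# -a prints the average; -nb lists only blocks ranked B or worse (complexity 6+).
# Use -nc instead to list only blocks that exceed the threshold of 10.
radon cc src/ -a -nb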
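When a function exceeds the threshold, one common remedy is to replace a branch chain with a lookup table, since every if/elif adds an independent path. The sketch below is illustrative only; the shipping regions and rates are invented for the example.

```python
# Before: each elif adds one more path through the function.
def shipping_cost_branchy(region: str) -> float:
    if region == "us":
        return 5.0
    elif region == "eu":
        return 7.5
    elif region == "apac":
        return 9.0
    else:
        raise ValueError(f"Unknown region: {region!r}")


# After: the data lives in a table and the function keeps a single decision point.
SHIPPING_RATES = {"us": 5.0, "eu": 7.5, "apac": 9.0}


def shipping_cost(region: str) -> float:
    try:
        return SHIPPING_RATES[region]
    except KeyError:
        raise ValueError(f"Unknown region: {region!r}") from None
```

Adding a region to the second version changes the table, not the control flow, so the complexity score stays flat as the list grows.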

12. Function Length

Function length serves as a proxy for single-responsibility adherence. Long functions typically handle multiple concerns, making them harder to test and reuse.

Priority: High (Strong Recommendation)

Pass Criteria: Functions fit on one screen (40-60 lines) with clear single responsibility

13. Context Manager Usage

Context managers guarantee resource cleanup regardless of how a code block exits: whether it exits normally, via an exception, or through an early return.

Priority: Must-Fix

Pass Criteria: File operations use with statements, and custom context managers return False from __exit__ so exceptions are not silently suppressed

python
# ❌ FAIL - Manual resource management
f = open('file.txt')
data = f.read()
f.close()

# ✅ PASS - Context manager
with open('file.txt') as f:
    data = f.read()
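The pass criteria also cover custom context managers: __exit__ should return False so that an exception raised inside the with block still propagates after cleanup. A minimal sketch, using a hypothetical ManagedResource class invented for illustration:

```python
class ManagedResource:
    """Hypothetical resource wrapper used only to illustrate __exit__."""

    def __init__(self) -> None:
        self.closed = False

    def __enter__(self) -> "ManagedResource":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        self.closed = True  # Cleanup runs on normal exit and on exceptions.
        return False        # False means: do not suppress the exception.


def use_and_fail() -> ManagedResource:
    """Raise inside the with block and show that cleanup still happened."""
    resource = ManagedResource()
    try:
        with resource:
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass  # The error propagated out of the with block, as intended.
    return resource
```

Returning True from __exit__ would swallow the RuntimeError, which is the same failure mode as a bare except clause.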

14. Idiomatic Python Patterns

Idiomatic Python leverages the language's built-in features to write code that is both more concise and less error-prone.

Priority: Must-Fix

Pass Criteria: List comprehensions for simple transformations, enumerate() instead of manual counters, no mutable default arguments

python
# ❌ FAIL - Mutable default argument
def append_to(element, target=[]):
    target.append(element)
    return target

# ✅ PASS - Proper default argument handling
def append_to(element, target=None):
    if target is None:
        target = []
    target.append(element)
    return target
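The other two pass criteria, comprehensions and enumerate(), can be sketched the same way. The function and variable names below are hypothetical and chosen only for this example.

```python
names = ["ada", "grace", "linus"]

# List comprehension for a simple one-step transformation.
capitalized = [name.capitalize() for name in names]


def number_lines(lines: list[str], start: int = 1) -> list[str]:
    """Prefix each line with its position, using enumerate() instead of a manual counter."""
    return [f"{index}: {line}" for index, line in enumerate(lines, start=start)]
```

A comprehension stays readable for a single transformation or filter; once it needs nested conditions or side effects, an explicit loop is usually the clearer choice.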

15. Minimal Try Block Scope

Try blocks should contain only the code that can raise the specific exception being caught. Overly broad try blocks obscure which operation actually failed.

Priority: Recommended

Pass Criteria: Try blocks contain only code that can raise the specific caught exception
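A small sketch of the idea, assuming a hypothetical read_port helper and DEFAULT_PORT constant: only the dictionary lookup sits inside the try block, and the conversion moves to the else clause so its errors are not mistaken for a missing key.

```python
DEFAULT_PORT = 8080  # Hypothetical fallback used only in this example.


def read_port(config: dict[str, str]) -> int:
    """Return the configured port, falling back to the default when the key is absent."""
    try:
        raw = config["port"]  # The only statement that can raise KeyError.
    except KeyError:
        return DEFAULT_PORT
    else:
        # A ValueError from int() propagates instead of being masked by the handler above.
        return int(raw)
```

If int(raw) lived inside the try block alongside a broader handler, a malformed value such as "abc" could silently become the default port.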

Performance Optimization Checks for Scalable Python Applications

Performance checks require profiling evidence before optimization. Profile first using tools like cProfile, and optimize only the hot paths identified through data.

16. Algorithmic Complexity

Algorithmic complexity determines how code scales with input size. Selecting appropriate data structures provides O(1) operations that maintain consistent performance regardless of collection size.

Priority: High (Context-Dependent)

Pass Criteria: O(1) set/dict lookups for membership testing, no O(n²) patterns in loops

python
# ❌ FAIL - O(n×m) complexity when allowed_ids is a list
def filter_events(events, allowed_ids):
    return [e for e in events if e.user_id in allowed_ids]

# ✅ PASS - O(n + m) complexity: build the set once, then O(1) average-case lookups
def filter_events(events, allowed_ids):
    allowed_ids = set(allowed_ids)
    return [e for e in events if e.user_id in allowed_ids]

17. Generator Usage for Large Data

Generators enable processing of datasets larger than available memory by yielding items one at a time rather than materializing entire collections.

Priority: Recommended (for large datasets)

Pass Criteria: Generators used for streaming large data, no full-file loads when streaming is possible
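As an illustration, assuming a log-filtering task invented for this example, a generator function can process one line at a time so memory use stays flat regardless of file size. The names iter_error_lines and count_errors are hypothetical.

```python
from collections.abc import Iterable, Iterator


def iter_error_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield matching lines one at a time instead of building a full list."""
    for line in lines:
        if "ERROR" in line:
            yield line.rstrip("\n")


def count_errors(path: str) -> int:
    """Stream a file through the generator; a file object iterates lazily by line."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in iter_error_lines(handle))
```

The pattern to flag in review is the opposite one: calling handle.read() or handle.readlines() and then filtering the resulting list.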

18. Memory-Efficient Classes

Python objects carry significant memory overhead from their instance dictionaries. The __slots__ declaration reduces memory footprint by 40-50% for simple data classes.

Priority: Medium

Pass Criteria: __slots__ on classes with >10,000 instances
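A minimal sketch of the declaration, using two hypothetical point classes: the slotted version has no per-instance dictionary, which is where the savings come from, and it also rejects attributes that were not declared.

```python
class PointWithDict:
    """Regular class: every instance carries its own __dict__."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


class PointWithSlots:
    """Slotted class: attributes are stored in fixed slots, with no __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y
```

The actual saving depends on the Python version and the number of attributes, so it is worth measuring on the real class before treating this as a required change. On Python 3.10+, dataclasses accept slots=True to generate the same layout.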

Security Vulnerability Detection in Python Code Reviews

High-impact or critical security vulnerabilities are typically treated as must-fix deployment blockers, while lower-severity issues are handled through risk-based prioritization.

19. SQL Injection Prevention

SQL injection remains one of the most prevalent and dangerous web application vulnerabilities despite being entirely preventable. Parameterized queries completely eliminate this attack vector by separating code from data.

Priority: Must-Fix

Pass Criteria: All queries use parameterized statements

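python
# ❌ FAIL - User input interpolated directly into the SQL string
query = f"SELECT * FROM users WHERE username = '{username}'"
cursor.execute(query)

# ✅ PASS - Driver binds the value separately
# (placeholder style varies by driver: %s for psycopg2, ? for sqlite3)
query = "SELECT * FROM users WHERE username = %s"
cursor.execute(query, (username,))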
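For a runnable version of the same idea, the sketch below uses the standard library's sqlite3 module with an in-memory database. The table layout and the find_user helper are assumptions made for the example; note that sqlite3 uses ? as its placeholder.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list[tuple]:
    """Look up a user with a bound parameter; the input is never parsed as SQL."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cursor.fetchall()


# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])
```

A classic injection string such as ' OR '1'='1 is compared literally against the username column here, so it matches no rows instead of returning every user.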

20. Secure Deserialization

Deserialization of untrusted data can lead to remote code execution when using formats like pickle. Safe alternatives, such as yaml.safe_load(), restrict deserialization to data types only.

Priority: Must-Fix

Pass Criteria: No pickle.loads() on untrusted data, yaml.safe_load() instead of yaml.load()
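To see why pickle is unsafe for untrusted input, consider the sketch below. The Malicious class is a hypothetical stand-in for attacker-controlled bytes, and it names the harmless built-in len where a real payload would name something like os.system. The JSON loader shown afterwards is one possible safe alternative, not the only one.

```python
import json
import pickle


class Malicious:
    """Stand-in for attacker data: pickle calls whatever __reduce__ names."""

    def __reduce__(self):
        # Harmless placeholder callable; an attacker would choose a dangerous one.
        return (len, ("attacker-chosen call",))


payload = pickle.dumps(Malicious())
# Unpickling invokes len(...) rather than rebuilding a Malicious instance.
result = pickle.loads(payload)


def load_untrusted(text: str) -> dict:
    """Parse untrusted input as JSON, which can only produce plain data types."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

The value of result is whatever the named callable returned, which shows that the byte stream, not the receiving code, decides what gets executed.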

21. No Hardcoded Secrets

Hardcoded credentials in source code inevitably leak through version control history, log files, or repository exposure. When teams use Augment Code's Context Engine to scan codebases across 400,000+ files, hardcoded secrets are flagged automatically during development, preventing credentials from reaching version control.

Priority: Critical (Must-Fix)

Pass Criteria: All credentials sourced from environment variables or secret management

python
# ❌ FAIL - Hardcoded credentials
def connect_database():
    connection = db.connect(
        host="prod-db.example.com",
        user="admin",
        password="SuperSecret123!"
    )
    return connection

# ✅ PASS
import os
from dotenv import load_dotenv

load_dotenv()

def connect_database():
    connection = db.connect(
        host=os.getenv("DB_HOST"),
        user=os.getenv("DB_USER"),
        password=os.getenv("DB_PASSWORD")
    )
    return connection

Automation: Bandit detects hardcoded secrets: bandit -r src/ -f json -o bandit-report.json
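One caveat with os.getenv is that it returns None when a variable is missing, so a misconfigured environment may only fail later at connection time. A small helper can fail fast instead; require_env is a hypothetical name, and this is a sketch rather than a prescribed pattern.

```python
import os


def require_env(name: str) -> str:
    """Return the variable's value, or raise immediately if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

Calling require_env("DB_PASSWORD") at startup surfaces a missing secret as a clear error that names the variable without ever printing a secret value.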

22. Dependency Vulnerability Scanning

Third-party dependencies extend your attack surface to include every vulnerability in your dependency tree. Automated scanning in CI catches vulnerable dependencies before deployment.

Priority: Must-Fix (HIGH/CRITICAL CVEs)

Pass Criteria: pip-audit runs in CI; HIGH/CRITICAL vulnerabilities block merge

Documentation and Dependency Standards to Prevent Technical Debt

Documentation and dependency hygiene prevent technical debt accumulation. Clear documentation standards and explicit dependency management reduce onboarding friction and long-term maintenance costs.

23. Google-Style Docstrings

Docstrings serve as the primary documentation for API consumers. Google-style docstrings provide a consistent structure for documenting parameters, return values, and exceptions.

Priority: Must-Fix

Pass Criteria: All public APIs have Args/Returns/Raises sections

python
def calculate_tax(income, rate):
    """Calculate tax amount.

    Args:
        income: Gross income amount.
        rate: Tax rate as decimal (0.0 to 1.0).

    Returns:
        Calculated tax amount.

    Raises:
        ValueError: If rate is negative or exceeds 1.0.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    return income * rate

24. Dependency Lock Files

Lock files capture the exact versions and cryptographic hashes of all dependencies at a known-good point in time, ensuring reproducible builds across all environments.

Priority: Must-Fix

Pass Criteria: pyproject.toml plus lock files with exact versions and hashes committed

25. New Dependency Justification

Every dependency added to a project represents a maintenance commitment and potential security liability. Explicit justification ensures teams consciously evaluate trade-offs.

Priority: Recommended

Pass Criteria: Written justification for new dependencies explaining necessity, maintenance status, and security history

Implementing This Python Code Review Checklist in Your Workflow

This checklist transforms subjective code review discussions into consistent, evidence-based decisions by organizing 25 actionable checks into priority levels. Start with automation: add a .pre-commit-config.yaml using Black, isort, Flake8, and Bandit. Establish coverage gates by configuring pytest-cov with the 80% threshold in CI. Enable type checking with mypy in strict mode.

Augment Code's Context Engine identifies code quality issues across 400,000+ files through architectural analysis, accelerating review cycles by catching style violations, type mismatches, and security vulnerabilities before human review begins.
