What coverage percentage should Python teams require for code review approval?

Most mature engineering teams target ≥80% branch coverage on new and modified code, though the specific threshold matters less than consistent enforcement and prioritizing meaningful test quality over raw coverage metrics.

Should code reviews block PRs for style violations?

Style violations should rarely reach the PR stage if teams configure pre-commit hooks properly. When style issues do appear in reviews, they should not block merges of code that otherwise improves system health, in line with Google's reviewer approval standards.

How should teams handle type-hint requirements across different Python versions?

Teams targeting Python 3.9+ should use built-in generics, such as list[str], instead of typing.List[str], while Python 3.10+ teams can adopt the cleaner union syntax (X | None) instead of Optional[X]. The requires-python field in pyproject.toml should specify the minimum version to ensure consistency.

What distinguishes must-fix security issues from recommended improvements?

Must-fix security issues are deployment blockers that create immediate risk: SQL injection vulnerabilities, hardcoded secrets, HIGH/CRITICAL CVEs in dependencies, and bare except clauses that mask errors. Recommended improvements enhance code health but can be addressed in follow-up PRs without blocking the current change.

How can teams reduce time spent on style discussions during code review?

Pre-commit hooks configured with Black, isort, and Flake8 eliminate virtually all style discussions by enforcing formatting before code reaches version control. This automation frees reviewers to focus on logic, architecture, and domain-specific concerns that require human judgment.

Python Code Review Checklist: 25 Things to Check for Engineering Teams

A comprehensive Python code review checklist should include 25 prioritized checks covering style, correctness, security, and documentation, as inconsistent review criteria lead to subjective debates that waste engineering time while missing genuine risks.

TL;DR

Python code reviews fail when teams lack shared pass/fail criteria, leading to subjective debates over style while security vulnerabilities slip through. This checklist organizes 25 concrete checks by priority level (critical, high, medium), with automation commands and examples validated across mature engineering teams that follow PEP standards, Google practices, and OWASP guidelines.

Code reviews represent a significant investment of engineering resources, yet teams frequently debate the same style choices, miss critical security issues, and block PRs for minor preference differences. The friction stems from undefined review standards: without explicit pass/fail criteria, reviewers default to personal preferences while overlooking genuine risks.

This checklist addresses that gap by categorizing 25 concrete checks into must-fix blockers and recommended improvements. The distinction follows Google's documented standard: "Reviewers should favor approving a CL once it is in a state where it definitely improves the overall code health of the system being worked on, even if the CL isn't perfect."

Each checklist item includes tool configurations, code examples, and authoritative source citations that reviewers can reference during PR discussions.

Augment Code's Context Engine processes entire codebases across 400,000+ files, enabling teams to automate consistency checks while AI-assisted reviews catch architectural issues that manual review misses.

Style and PEP 8 Compliance Standards for Consistent Codebases

Style checks represent the foundation of consistent codebases. Pre-commit hooks configured for formatting (Black/Ruff), linting (Flake8/Pylint), and related style tools reduce style debates in human review and enable reviewers to focus on logic, architecture, and domain knowledge.

1. Indentation and Line Length

Consistent indentation and line length form the visual backbone of readable Python code. When developers encounter inconsistent spacing or excessively long lines, they spend cognitive energy parsing structure rather than understanding logic.

Priority: Must-Fix

Pass Criteria: 4 spaces per indentation level, no mixed tabs/spaces, lines ≤79 characters (PEP 8 standard) or ≤88 characters (Black formatter compatibility)

python

# ❌ FAIL - Inconsistent indentation widths (Python rejects this with IndentationError)
def calculate_total(items):
  total = 0  # 2 spaces
    for item in items:  # 4 spaces
        total += item.price
    return total

# ✅ PASS - Consistent 4-space indentation
def calculate_total(items):
    total = 0
    for item in items:
        total += item.price
    return total

Automation: Configure .flake8 with max-line-length = 88, then enforce via pre-commit hooks.

2. Naming Conventions

Names serve as the primary documentation layer in code: they communicate intent, scope, and type at a glance. Python's established conventions (snake_case for functions, PascalCase for classes) provide a shared vocabulary that experienced developers parse unconsciously.

Priority: Must-Fix

Pass Criteria: Functions and variables use snake_case, classes use PascalCase, constants use UPPER_CASE

python

# ✅ PASS
class DataProcessor:
    MAX_CONNECTIONS = 100
    
    def process_data(self):
        pass

3. Import Organization

Import statements act as a dependency manifest at the top of every module. Well-organized imports immediately communicate what external systems a module relies on.

Priority: Must-Fix

Pass Criteria: Three groups separated by blank lines: standard library, third-party, and local imports. Alphabetically sorted within groups.

python

# ✅ PASS - Proper three-group structure
import os
import sys

from flask import Flask

from myapp.models import User

Automation: Use isort with Black-compatible configuration in pyproject.toml.

4. Pre-Commit Hook Enforcement

Pre-commit hooks shift quality enforcement left in the development workflow, catching issues before they enter version control. This approach prevents style violations and security issues from ever reaching the PR stage.

Priority: Must-Fix

Pass Criteria: Repository includes .pre-commit-config.yaml with formatters, linters, and security scanners configured

text

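# .pre-commit-config.yaml
repos:
  # Formatter
  - repo: https://github.com/psf/black
    rev: 24.1.1
    hooks:
      - id: black
  # Import sorter, using the Black-compatible profile
  - repo: https://github.com/PyCQA/isort
    rev: 5.13.2
    hooks:
      - id: isort
        args: ['--profile', 'black']
  # Linter
  - repo: https://github.com/PyCQA/flake8
    rev: 7.0.0
    hooks:
      - id: flake8
  # Security scanner (-ll reports medium severity and above)
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.6
    hooks:
      - id: bandit
        args: ['-ll']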

Test Coverage and Type Checking Requirements for Production Code

Test coverage and type checking represent the primary defense against regressions. Teams using AI-assisted code review tools can accelerate test generation while maintaining coverage thresholds that prevent regressions.

5. Test Coverage Threshold

Test coverage metrics provide objective evidence that code paths have been exercised, but the number alone doesn't tell the full story. Effective coverage combines quantitative thresholds with qualitative review of edge cases, error paths, and boundary conditions.

Priority: Must-Fix

Pass Criteria: ≥80% branch coverage on new/modified code with systematic edge-case testing

python

# ✅ PASS - Comprehensive edge case coverage
import pytest

# sum_values is the function under test; import it from the module being reviewed.

@pytest.mark.parametrize("values,expected", [
    ([], 0),              # Empty collection
    ([1], 1),             # Single item boundary
    ([1, 2, 3], 6),       # Normal case
    pytest.param(None, 0, id="none_input"),
])
def test_sum_values(values, expected):
    assert sum_values(values) == expected

def test_sum_values_error_handling():
    """Test error paths and invalid input types."""
    with pytest.raises(ValueError, match="Invalid format"):
        sum_values("invalid")

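Automation: Configure pytest with --cov=src --cov-branch --cov-fail-under=80 in pyproject.toml. The --cov-branch flag is what switches measurement from line coverage to branch coverage. This gate applies to the whole measured package; enforcing the threshold only on new or modified lines requires a diff-aware tool such as diff-cover.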

6. Type Hints with Modern Syntax

Type hints add optional static analysis to Python without changing its dynamic runtime behavior. Built-in generics (Python 3.9+) and union syntax (Python 3.10+) eliminate the need for typing module imports in most cases. Teams using Augment Code's Context Engine can identify type inconsistencies across 400,000+ files, catching interface mismatches that manual review overlooks.

Priority: Must-Fix (Python 3.9+)

Pass Criteria: Uses list[str] instead of List[str] (Python 3.9+), X | None instead of Optional[X] (Python 3.10+), mypy runs in CI with strict mode

python

# ❌ FAIL - Legacy typing aliases (List and Dict are deprecated as of Python 3.9)
from typing import List, Dict, Optional

def process(items: List[str]) -> Optional[Dict[str, int]]:
    pass

# ✅ PASS - Modern syntax (Python 3.10+)
def process(items: list[str]) -> dict[str, int] | None:
    pass

7. Exception Handling Specificity

Exception handling determines how gracefully code responds to unexpected conditions. An overly broad exception handler masks bugs by silently swallowing errors that should propagate.

Priority: Must-Fix

Pass Criteria: No bare except clauses, specific exception types caught, and logger.exception() used for tracebacks

python

# ❌ FAIL - Bare except catches SystemExit
try:
    process_data()
except:
    pass

# ✅ PASS - Specific handling with logging
import logging
logger = logging.getLogger(__name__)

try:
    process_data()
except ValueError as e:
    logger.exception("Invalid data format")
    raise

8. Assertion Specificity in Tests

Assertions serve as executable documentation of expected behavior. A vague assertion such as assert result passes whenever any truthy value is returned, so it provides no protection against incorrect truthy values.

Priority: Recommended

Pass Criteria: Assertions validate specific expected values, and exception messages are validated with pytest.raises(match=)

python

# ❌ FAIL - Passes for any truthy response and any non-None id
assert response
assert user.id is not None

# ✅ PASS - Validates specific expected values
assert response.status_code == 200
assert user.email == "test@example.com"

9. Test Independence: No Shared Mutable State

Test independence ensures that each test validates behavior in isolation, making failures deterministic and debuggable. Shared mutable state between tests creates order-dependent behavior.

Priority: Recommended

Pass Criteria: Tests run independently in any order with no shared mutable state

10. Fixture Scoping

Fixture scoping balances test isolation against setup performance. Function-scoped fixtures guarantee a fresh state for each test but may repeatedly perform expensive operations.

Priority: Recommended

Pass Criteria: Expensive setup uses @pytest.fixture(scope="session"), test isolation uses @pytest.fixture(scope="function")
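
The two scopes can be sketched side by side. The fixture names and the exchange-rate data below are hypothetical stand-ins; the session-scoped fixture represents expensive setup and should be treated as read-only, because every test receives the same object.

```python
import pytest


@pytest.fixture(scope="session")
def reference_data():
    # Stands in for expensive, read-only setup; built once per test run.
    return {"USD": 1.0, "EUR": 0.9}


@pytest.fixture  # Default scope is "function": rebuilt for every test.
def basket():
    return []


def test_basket_starts_empty(basket):
    assert basket == []


def test_basket_accepts_known_currency(basket, reference_data):
    basket.append("EUR")  # Safe to mutate: this list belongs to this test only
    assert all(code in reference_data for code in basket)
```

If a test needs to modify session-scoped data, copying it inside a function-scoped fixture keeps the expensive setup shared while preserving isolation.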

Check	Priority	Pass Criteria	Automation
Indentation/Line Length	Must-Fix	4 spaces per level, ≤88 characters	Black/Ruff
Naming Conventions	Must-Fix	snake_case functions/variables, PascalCase classes	Flake8
Import Organization	Must-Fix	3 groups sorted, separated by blank lines	isort
Test Coverage	Must-Fix	≥80% branch coverage on changed code	pytest-cov
Type Hints	Must-Fix	Modern syntax (PEP 585/604), strict mypy	mypy
Exception Handling	Must-Fix	Specific exception types only, no bare except	Flake8/Bandit

Readability and Code Structure Checks for Maintainable Python

Readability checks require human judgment but follow consistent principles. These items improve maintainability without blocking deployment.

11. Cyclomatic Complexity

Cyclomatic complexity measures the number of linearly independent paths through code, directly correlating with testing difficulty and bug likelihood.

Priority: High (Strong Recommendation)

Pass Criteria: Cyclomatic complexity ≤10 per function

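radon cc src/ -a -nb

The -nb flag lists blocks ranked B or worse (complexity 6 and above) and -a appends the average; -nc narrows the report to blocks above the threshold of 10.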
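
One common way to bring a function back under the threshold is to replace a branch chain with a lookup table. The sketch below uses hypothetical function names and shipping rates purely for illustration.

```python
# Before: one decision point per branch, so complexity grows with every new region.
def shipping_rate_branchy(region):
    if region == "us":
        return 5.0
    elif region == "eu":
        return 7.5
    elif region == "asia":
        return 9.0
    else:
        raise ValueError(f"Unknown region: {region}")


# After: a lookup table; adding a region no longer adds a branch.
SHIPPING_RATES = {"us": 5.0, "eu": 7.5, "asia": 9.0}


def shipping_rate(region):
    try:
        return SHIPPING_RATES[region]
    except KeyError:
        raise ValueError(f"Unknown region: {region}") from None
```

Both versions behave the same for callers, so existing tests should continue to pass after the refactor.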

12. Function Length

Function length serves as a proxy for single-responsibility adherence. Long functions typically handle multiple concerns, making them harder to test and reuse.

Priority: High (Strong Recommendation)

Pass Criteria: Functions fit on one screen (40-60 lines) with clear single responsibility

13. Context Manager Usage

Context managers guarantee resource cleanup regardless of how a code block exits: whether it exits normally, via an exception, or through an early return.

Priority: Must-Fix

Pass Criteria: File operations use with statements; custom context managers return False from __exit__ so exceptions are not silently suppressed

python

# ❌ FAIL - Manual resource management
f = open('file.txt')
data = f.read()
f.close()

# ✅ PASS - Context manager
with open('file.txt') as f:
    data = f.read()
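
For custom context managers, the same cleanup guarantee depends on how __exit__ is written. A minimal sketch, using a hypothetical Timer class: cleanup runs on every exit path, and returning False lets any exception propagate to the caller.

```python
import time


class Timer:
    """Hypothetical context manager that records elapsed time for a block."""

    def __enter__(self):
        self.start = time.perf_counter()
        self.elapsed = None
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Cleanup runs whether the block finished normally or raised.
        self.elapsed = time.perf_counter() - self.start
        # Returning False tells Python not to suppress the exception.
        return False


with Timer() as timer:
    total = sum(range(1000))
```

Returning True from __exit__ would swallow the exception, which is rarely what a reviewer wants to see outside of deliberate suppression helpers.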

14. Idiomatic Python Patterns

Idiomatic Python leverages the language's built-in features to write code that is both more concise and less error-prone.

Priority: Must-Fix

Pass Criteria: List comprehensions for simple transformations, enumerate() instead of manual counters, no mutable default arguments

python

# ❌ FAIL - Mutable default argument
def append_to(element, target=[]):
    target.append(element)
    return target

# ✅ PASS - Proper default argument handling
def append_to(element, target=None):
    if target is None:
        target = []
    target.append(element)
    return target
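
The other two pass criteria, enumerate() and list comprehensions, can be sketched the same way. The names list is illustrative data only.

```python
names = ["ada", "grace", "linus"]

# ❌ FAIL - Manual counter and append loop
index = 0
labels_manual = []
for name in names:
    labels_manual.append(f"{index}: {name.title()}")
    index += 1

# ✅ PASS - enumerate() supplies the index; a comprehension builds the list
labels = [f"{i}: {name.title()}" for i, name in enumerate(names)]
```

Both versions produce the same list; the second removes the counter variable that can be forgotten or incremented in the wrong place.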

15. Minimal Try Block Scope

Try blocks should contain only the code that can raise the specific exception being caught. Overly broad try blocks obscure which operation actually failed.

Priority: Recommended

Pass Criteria: Try blocks contain only code that can raise the specific caught exception
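
A minimal sketch of why scope matters, using hypothetical read_port helpers. In the broad version, two different operations can raise ValueError, so the handler cannot tell which one failed; the narrow version handles only the parsing error.

```python
import json


# ❌ FAIL - Two operations inside one try block can both raise ValueError
def read_port_broad(raw):
    try:
        config = json.loads(raw)       # May raise JSONDecodeError (a ValueError)
        return int(config["port"])     # May also raise ValueError
    except ValueError:
        return None  # Malformed JSON or a bad port value? The handler cannot tell.


# ✅ PASS - Only the parsing call sits inside the try block
def read_port(raw):
    try:
        config = json.loads(raw)
    except json.JSONDecodeError:
        return None  # Only malformed JSON is handled here
    return int(config["port"])  # A bad port value raises to the caller
```

With the narrow version, a configuration containing "port": "abc" surfaces as an error instead of being silently treated like missing input.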

Performance Optimization Checks for Scalable Python Applications

Performance checks require profiling evidence before optimization. Profile first using tools like cProfile, and optimize only the hot paths identified through data.

16. Algorithmic Complexity

Algorithmic complexity determines how code scales with input size. Selecting appropriate data structures provides O(1) operations that maintain consistent performance regardless of collection size.

Priority: High (Context-Dependent)

Pass Criteria: O(1) set/dict lookups for membership testing, no O(n²) patterns in loops

python

# ❌ FAIL - O(n×m) when allowed_ids is a list: each `in` check scans the whole list
def filter_events(events, allowed_ids):
    return [e for e in events if e.user_id in allowed_ids]

# ✅ PASS - O(n + m): build the set once, then each lookup is O(1) on average
def filter_events(events, allowed_ids):
    allowed_ids = set(allowed_ids)
    return [e for e in events if e.user_id in allowed_ids]

17. Generator Usage for Large Data

Generators enable processing of datasets larger than available memory by yielding items one at a time rather than materializing entire collections.

Priority: Recommended (for large datasets)

Pass Criteria: Generators used for streaming large data, no full-file loads when streaming is possible
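
A minimal sketch of the streaming pattern. The iter_error_lines helper and the log contents are hypothetical; io.StringIO stands in for a large file opened with open().

```python
import io


def iter_error_lines(lines):
    """Yield matching lines one at a time instead of building a full list."""
    for line in lines:
        if "ERROR" in line:
            yield line.rstrip("\n")


# File objects are iterated line by line, so only one line is held at a time.
log = io.StringIO("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
error_count = sum(1 for _ in iter_error_lines(log))
```

By contrast, calling read() or readlines() on the file would load the entire contents into memory before any filtering starts.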

18. Memory-Efficient Classes

Python objects carry significant memory overhead from their instance dictionaries. Declaring __slots__ can reduce the memory footprint by roughly 40-50% for simple data classes, though the exact saving depends on the Python version and the number of attributes.

Priority: Medium

Pass Criteria: __slots__ defined on classes with >10,000 instances
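
A minimal sketch of the mechanism, using hypothetical Point classes. The slotted version declares a fixed attribute set, so instances carry no per-instance dictionary.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    __slots__ = ("x", "y")  # Fixed attribute set; no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y


regular = Point(1, 2)
slotted = SlottedPoint(1, 2)

has_dict_regular = hasattr(regular, "__dict__")
has_dict_slotted = hasattr(slotted, "__dict__")
```

The trade-off is that slotted instances reject attributes not listed in __slots__. On Python 3.10+, @dataclass(slots=True) generates the declaration automatically. Measuring with tracemalloc on a realistic instance count is a reasonable way to confirm the saving before requiring the change.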

Security Vulnerability Detection in Python Code Reviews

High-impact or critical security vulnerabilities are typically treated as must-fix deployment blockers, while lower-severity issues are handled through risk-based prioritization.

19. SQL Injection Prevention

SQL injection remains one of the most prevalent and dangerous web application vulnerabilities despite being entirely preventable. Parameterized queries completely eliminate this attack vector by separating code from data.

Priority: Must-Fix

Pass Criteria: All queries use parameterized statements

python

# ❌ FAIL
query = f"SELECT * FROM users WHERE username = '{username}'"

# ✅ PASS
query = "SELECT * FROM users WHERE username = %s"
cursor.execute(query, (username,))
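
The difference can be demonstrated end to end with the standard library's sqlite3 module and an in-memory database. The table, data, and payload below are illustrative; note that sqlite3 uses ? placeholders, while drivers such as psycopg2 use %s as in the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.execute("INSERT INTO users VALUES (?)", ("alice",))

# A classic injection payload; bound as a parameter, it is treated as plain data.
payload = "nobody' OR '1'='1"

safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (payload,)
).fetchall()

# For contrast only: string formatting lets the payload rewrite the query.
unsafe_rows = conn.execute(
    f"SELECT username FROM users WHERE username = '{payload}'"
).fetchall()
```

The parameterized query matches no rows because no user has that literal name, while the formatted query returns every row in the table.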

20. Secure Deserialization

Deserialization of untrusted data can lead to remote code execution when using formats like pickle. Safe alternatives, such as yaml.safe_load(), restrict deserialization to data types only.

Priority: Must-Fix

Pass Criteria: No pickle.loads() on untrusted data, yaml.safe_load() instead of yaml.load()
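
When the payload format is under the team's control, one option is to accept untrusted input only in a data-only format such as JSON, which cannot execute code while loading. A minimal sketch with a hypothetical load_settings helper:

```python
import json


def load_settings(raw_bytes):
    """Decode untrusted input with json, which only produces plain data types."""
    data = json.loads(raw_bytes)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    return data


settings = load_settings(b'{"theme": "dark", "retries": 3}')
```

Decoding safely is only the first step; the resulting values still need validation before they are used.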

21. No Hardcoded Secrets

Hardcoded credentials in source code inevitably leak through version control history, log files, or repository exposure. When teams use Augment Code's Context Engine to scan codebases across 400,000+ files, hardcoded secrets are flagged automatically during development, preventing credentials from reaching version control.

Priority: Critical (Must-Fix)

Pass Criteria: All credentials sourced from environment variables or secret management

python

# ❌ FAIL - Hardcoded credentials
def connect_database():
    connection = db.connect(
        host="prod-db.example.com",
        user="admin",
        password="SuperSecret123!"
    )
    return connection

# ✅ PASS
import os
from dotenv import load_dotenv

load_dotenv()

def connect_database():
    connection = db.connect(
        host=os.getenv("DB_HOST"),
        user=os.getenv("DB_USER"),
        password=os.getenv("DB_PASSWORD")
    )
    return connection

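Automation: Bandit flags common hardcoded-password patterns (checks B105-B107): bandit -r src/ -f json -o bandit-report.json. These checks match on variable and argument names, so they will not catch every credential format.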

22. Dependency Vulnerability Scanning

Third-party dependencies extend your attack surface to include every vulnerability in your dependency tree. Automated scanning in CI catches vulnerable dependencies before deployment.

Priority: Must-Fix (HIGH/CRITICAL CVEs)

Pass Criteria: pip-audit runs in CI; HIGH/CRITICAL vulnerabilities block merge

Documentation and Dependency Standards to Prevent Technical Debt

Documentation and dependency hygiene prevent technical debt accumulation. Clear documentation standards and explicit dependency management reduce onboarding friction and long-term maintenance costs.

23. Google-Style Docstrings

Docstrings serve as the primary documentation for API consumers. Google-style docstrings provide a consistent structure for documenting parameters, return values, and exceptions.

Priority: Must-Fix

Pass Criteria: All public APIs have Args/Returns/Raises sections

python

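def calculate_tax(income: float, rate: float) -> float:
    """Calculate tax amount.

    Args:
        income: Gross income amount.
        rate: Tax rate as decimal (0.0 to 1.0).

    Returns:
        Calculated tax amount.

    Raises:
        ValueError: If rate is negative or exceeds 1.0.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    return income * rate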

24. Dependency Lock Files

Lock files capture the exact versions and cryptographic hashes of all dependencies at a known-good point in time, ensuring reproducible builds across all environments.

Priority: Must-Fix

Pass Criteria: pyproject.toml plus lock files with exact versions and hashes committed

25. Build System Configuration

Every dependency added to a project represents a maintenance commitment and potential security liability. Explicit justification ensures teams consciously evaluate trade-offs.

Priority: Recommended

Pass Criteria: Written justification for new dependencies explaining necessity, maintenance status, and security history

Implementing This Python Code Review Checklist in Your Workflow

This checklist transforms subjective code review discussions into consistent, evidence-based decisions by organizing 25 actionable checks into priority levels. Start with automation: add a .pre-commit-config.yaml using Black, isort, Flake8, and Bandit. Establish coverage gates by configuring pytest-cov with the 80% threshold in CI. Enable type checking with mypy in strict mode.

Augment Code's Context Engine identifies code quality issues across 400,000+ files through architectural analysis, accelerating review cycles by catching style violations, type mismatches, and security vulnerabilities before human review begins.

Written by

Molisha Shah
