Integrating Static Analysis for Cleaner Code

Making it easier for developers to contribute to GitHub Echo

Introduction

This week in my open-source development course at Seneca, we focused on integrating static analysis tools into our command-line projects. If you’ve been following my blog, you know I’ve been working on GitHub-echo, a command-line tool that extracts actionable insights from GitHub repositories.

Static analysis is a technique used to check code for issues without actually running it. Tools like Flake8 and Pylint analyze code quality, enforce consistency, and help catch errors early. Security-focused tools like Bandit can scan for vulnerabilities, while formatters like Black make the codebase more readable.
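
To make "checking code without running it" concrete, here is a minimal sketch that uses Python's built-in ast module to spot unused imports in a source string. It is only an illustration of the idea, not how Flake8, Pylint, or Ruff are implemented; the find_unused_imports function and the SOURCE sample are made up for this example.

```python
import ast

# A small sample program, held as text so that it is never executed.
SOURCE = """
import os
import sys

def greet(name):
    print("hello", name, file=sys.stderr)
"""


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in the source."""
    tree = ast.parse(source)  # parse only; nothing is run
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


print(find_unused_imports(SOURCE))  # ['os']
```

Real linters apply hundreds of checks like this one, with far more care around scoping and edge cases.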

Integrating these tools improves code quality, security, and maintainability, which is especially useful for open-source projects like GitHub-echo, where high standards and readable code are key to attracting contributions. To get there, I opened an issue in my repo focused on making the project easier to contribute to through static analysis and good documentation.

Creating CONTRIBUTING.md and CODE_OF_CONDUCT.md files

Good documentation starts with creating a clear, concise CONTRIBUTING.md file. In my previous project, I only had a single, lengthy README.md that included everything: instructions on usage, development setup, PR guidelines, and commit message standards. This approach is far from ideal, as the README should primarily focus on what the tool does and how end users can get started with it.

Information on contributing—like PR guidelines, commit standards, and ways to run the project for development—should be in a dedicated CONTRIBUTING.md file. So, in this project, I created a CONTRIBUTING.md file to keep developer-specific instructions separate, making the README focused and more user-friendly. This separation helps new developers and contributors get up to speed more quickly while keeping the documentation organized.

I used the CONTRIBUTING.md Generator, which also helped generate a CODE_OF_CONDUCT.md file for my open-source project. I’ve noticed many projects include a code of conduct, so I thought it would be beneficial to add one to my project as well. The result is a new CODE_OF_CONDUCT.md file that promotes a positive and welcoming environment for contributors. Having both these files improves clarity and inclusivity, making it easier for people to engage and collaborate effectively.

Configuring the Ruff linter and code formatter

Ruff is a powerful linter and code formatter that I’ve previously used in several Python projects (Ruff Tutorial). However, this time I delved deeper into configuring it to optimize my setup, which was a valuable learning experience. I began with Ruff’s basic setup guide to understand the structure of its configuration files and the customization options available.

In my earlier projects, I had only set up Ruff’s formatter through the VSCode extension without a dedicated configuration file, and the linter was not enabled. After reading the Ruff documentation on configuration, I created a ruff.toml file to establish a custom setup. Here’s what that configuration file looks like, along with comments explaining each line:

# Ref Doc: https://docs.astral.sh/ruff/tutorial/#configuration

# Enable fix mode for automatically fixing linting errors
fix = true

# Set the maximum line length for code to 79 characters
line-length = 79

[lint]
# Specify the categories of linting errors to check
select = [
    "A", # flake8-builtins (shadowing of Python builtins)
    "B", # flake8-bugbear (potential bugs)
    "E", # pycodestyle errors (style violations)
    "F", # pyflakes (checks for errors in Python code)
    "I", # isort (enforces consistent import order)
    "N", # pep8-naming (enforces naming conventions)
    "W", # pycodestyle warnings (e.g. whitespace issues)
]

# Exclude specific directories from linting to avoid unnecessary checks
exclude = [
    ".git",        # Git version control directory
    "__pycache__", # Compiled Python files directory
    "venv",        # Virtual environment directory
    "build",       # Build artifacts
    "dist",        # Distribution packages
]

[format]
# Specify formatting preferences for the code
quote-style = "single"           # Use single quotes for strings
docstring-code-format = true     # Format code examples inside docstrings
docstring-code-line-length = 72  # Set max line length for code in docstrings
indent-style = "space"           # Use spaces for indentation

[lint.pycodestyle]
# Configure specific settings for pycodestyle checks
max-line-length = 100     # E501 (line too long) only fires above 100 characters; the formatter still targets 79

[lint.pydocstyle]
# Configure pydocstyle for docstring style checks
# Note: this only takes effect if the "D" (pydocstyle) rules are added to `select`
convention = "google"     # Use Google style for docstrings

This configuration file enables both linting and formatting with Ruff, though getting it right required more effort than it might sound. I had to go through Ruff’s comprehensive list of rules to determine which ones suited my project. There was a fair amount of trial and error involved as I tested different settings and adjusted them to ensure they aligned with my code’s needs. Eventually, this approach allowed me to fine-tune Ruff’s configuration for optimal results.
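
To give a feel for what the selected rule families catch, here is a small, deliberately flawed snippet. The function names are invented for illustration, and the rule codes in the comments are the ones I would expect Ruff to report based on its rule list, so treat them as a guide rather than exact output.

```python
import os  # F401 (pyflakes): `os` is imported but never used


# N802 (pep8-naming): function names should be lowercase
# B006 (flake8-bugbear): mutable default argument
def AppendItem(item, bucket=[]):
    bucket.append(item)
    return bucket


# A002 (flake8-builtins): argument `id` shadows a Python builtin
def describe(id):
    # E711 (pycodestyle): comparison to None should use `is None`
    if id == None:
        return "missing"
    return f"id={id}"


# The B006 warning points at a real bug: the default list is shared
# between calls, so state leaks from one call to the next.
print(AppendItem(1))  # [1]
print(AppendItem(2))  # [1, 2]
```

The code runs, which is exactly the point: none of these problems raise an error on their own, so a linter is often the first thing to notice them.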

The aftermath

After configuring Ruff, I explored Ruff’s documentation on running the linter and formatter. Running the linter initially revealed around 100 linting issues, partly because some of the rules I had enabled overlapped or conflicted; for instance, PEP8 standards sometimes vary from Flake8 standards. I learned that it’s essential to select the rules that work best for your project rather than trying to apply multiple, sometimes overlapping, standards. Auto-fixing with ruff check --fix resolved some issues, but I still had to address the rest manually.

While tedious, this process ensured consistent coding style across the project, making it easier to maintain.
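
When the first run reports around a hundred issues, tallying them per rule code can help decide which rules to fix first and which to drop. Below is a rough sketch, assuming the report comes from ruff check --output-format json and is a list of objects that each carry a code field. The count_by_rule helper and the sample report are made up for illustration and are not output from my project.

```python
import json
from collections import Counter


def count_by_rule(report_json: str) -> Counter:
    """Tally violations per rule code from a JSON lint report."""
    violations = json.loads(report_json)
    # Fall back to "unknown" when an entry has no rule code
    return Counter(v.get("code") or "unknown" for v in violations)


# Illustrative sample, not real output from the project.
SAMPLE_REPORT = json.dumps([
    {"code": "F401", "filename": "app.py"},
    {"code": "E501", "filename": "app.py"},
    {"code": "F401", "filename": "cli.py"},
])

for code, count in count_by_rule(SAMPLE_REPORT).most_common():
    print(code, count)
```

Ruff can also print a similar summary by itself with ruff check --statistics, which is the simpler option when no further processing is needed.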

Creating custom scripts using poetry

Once all code adhered to a consistent style, I formatted it using Ruff’s formatter. To make linting and formatting easier for other users, I created scripts in my pyproject.toml using Poetry commands:

  • poetry run lint: Lints the project using Ruff, fixing fixable errors.

  • poetry run format: Formats all code.

  • poetry run lint-and-format: Lints and formats the project in one go.

Since I use Poetry for dependency management, this setup mirrors npm scripts in Node.js, allowing for easier command execution and consistency. I plan to add a testing script here as I expand the project’s test suite.
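
One caveat: entries under [tool.poetry.scripts] are entry points of the form module:function rather than shell commands, so each script needs a small Python function behind it. Here is a minimal sketch of such a wrapper, assuming a hypothetical scripts.py module. The module, function, and TASKS names are my own choices for illustration and are not taken from the repo.

```python
import subprocess
import sys

# Commands each task runs; the task names match the Poetry scripts above.
TASKS = {
    "lint": [["ruff", "check", "--fix", "."]],
    "format": [["ruff", "format", "."]],
    "lint-and-format": [
        ["ruff", "check", "--fix", "."],
        ["ruff", "format", "."],
    ],
}


def run_task(name: str) -> int:
    """Run each command for a task, stopping at the first failure."""
    for command in TASKS[name]:
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0


def lint() -> None:
    sys.exit(run_task("lint"))


def format_code() -> None:
    sys.exit(run_task("format"))


def lint_and_format() -> None:
    sys.exit(run_task("lint-and-format"))
```

With an entry such as lint = "scripts:lint" under [tool.poetry.scripts], poetry run lint would call the matching function, provided the module is part of the installed package.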

Setting Up Auto-Formatting in VSCode

To auto-format code on save, I configured Visual Studio Code (VSCode) to integrate Ruff. The steps I followed are outlined in Ruff’s VSCode editor setup guide. Here’s how I configured my .vscode folder.

  1. .vscode/settings.json: This file defines VSCode’s formatting behavior. Here’s what each setting accomplishes:

     {
         "editor.insertSpaces": true,               // Uses spaces instead of tabs
         "files.eol": "\n",                         // Enforces Unix-style line endings
         "files.insertFinalNewline": true,          // Adds a newline at the end of files
         "[python]": {
             "editor.formatOnSave": true,           // Formats Python files on save
             "editor.codeActionsOnSave": {
                 "source.fixAll": "explicit",       // Automatically fixes all fixable issues on save
                 "source.organizeImports": "explicit" // Organizes imports on save
             },
             "editor.defaultFormatter": "charliermarsh.ruff" // Sets Ruff as the default formatter for Python
         },
         "ruff.configuration": "ruff.toml",         // Points to Ruff’s config file for linting rules
         "[markdown]": {
             "editor.formatOnSave": true,
             "editor.defaultFormatter": "esbenp.prettier-vscode" // Uses Prettier to format Markdown files
         },
         "[json]": {
             "editor.formatOnSave": true,
             "editor.defaultFormatter": "esbenp.prettier-vscode" // Uses Prettier to format JSON files
         }
     }
    

    These settings ensure that Python files are auto-formatted and linted on save, using Ruff’s rules from the ruff.toml config file. Markdown and JSON files are formatted with Prettier.

  2. .vscode/extensions.json: This file suggests recommended extensions for users opening the project in VSCode, providing a streamlined setup for consistent development. Here’s what each recommendation accomplishes:

     {
         "recommendations": [
             "ms-python.python",                   // Python development support
             "esbenp.prettier-vscode",             // Prettier for consistent code formatting across languages
             "ms-python.vscode-pylance",           // Pylance for advanced Python features and better type-checking
             "yzhang.markdown-all-in-one",         // Markdown support with features like table of contents generation
             "charliermarsh.ruff"                  // Ruff extension for linting and formatting Python code
         ]
     }
    

The extensions.json provides helpful recommendations for new contributors, ensuring a consistent development environment across different setups. After setting this up, I now have auto-formatting on save, adhering to Ruff’s linting and formatting standards, with minimal manual effort needed during development.

This approach improves code quality and keeps the project clean and consistent for both new contributors and experienced developers.

Pre-commit hooks

To enhance code quality and maintain consistency across the codebase, this project utilizes pre-commit hooks. Pre-commit hooks are scripts that run automatically before a commit is made, allowing you to enforce certain coding standards and styles, perform linting, or run tests to catch errors early in the development process.

Benefits of Pre-Commit Hooks

  • Automated Code Quality Checks: Pre-commit hooks automatically run linters and formatters on your code before it gets committed, ensuring adherence to defined coding standards.

  • Consistency Across Teams: By enforcing consistent style and quality checks, pre-commit hooks help maintain a uniform coding style, making collaboration easier.

  • Error Prevention: These hooks can catch errors and potential issues before they reach the main branch, reducing the likelihood of bugs in production.

  • Saves Time: Automating the process of formatting and linting allows developers to focus on writing code rather than worrying about style inconsistencies.
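
To show the mechanism behind these benefits, here is a hand-rolled sketch of what a Git pre-commit hook does: collect the staged Python files, run a linter on them, and report a non-zero status so Git aborts the commit. This is not how the pre-commit framework is implemented, and the function names are hypothetical. It assumes git and ruff are available on the PATH.

```python
import subprocess


def python_files(paths: list[str]) -> list[str]:
    """Keep only the Python source files from a list of paths."""
    return [p for p in paths if p.endswith(".py")]


def staged_paths() -> list[str]:
    """Ask Git for files that are added, copied, or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def main() -> int:
    files = python_files(staged_paths())
    if not files:
        return 0  # nothing to check, let the commit proceed
    # A non-zero exit status from a hook makes Git abort the commit.
    return subprocess.run(["ruff", "check", *files]).returncode
```

Saved as an executable .git/hooks/pre-commit script that ends with a call like sys.exit(main()), this would run on every commit. The pre-commit framework takes care of this plumbing, plus installing pinned tool versions, from a single YAML file.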

Integration with Poetry

In this project, pre-commit has been added as a dependency in the pyproject.toml file:

[tool.poetry.dependencies]
python = "^3.9"
typer = "^0.12.5"
google-generativeai = "^0.8.0"
python-dotenv = "^1.0.1"
httpx = "^0.27.2"
pytest = "^8.3.3"
single-source = "^0.4.0"
groq = "^0.11.0"
toml = "^0.10.2"
pre-commit = "^4.0.1"  # Added pre-commit as a dependency

With this in place, running poetry install also installs the pre-commit tool. The Git hook itself still has to be activated once per clone with pre-commit install.

Pre-Commit Configuration

The .pre-commit-config.yaml file is configured to use Ruff for linting and formatting. You can read more about this in the ruff-pre-commit repository.

# .pre-commit-config.yaml
# Reference: https://github.com/astral-sh/ruff-pre-commit
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.7.1  # Specify the version of Ruff to use
    hooks:
      - id: ruff
        args: [--fix]  # Automatically fix linting issues
      - id: ruff-format  # Run the formatter to apply consistent styling

As a contributor, all I have to do is run the following command once after cloning the repo and running poetry install (from inside the Poetry environment, for example by prefixing it with poetry run):

pre-commit install

Once set up, every time you commit changes, the pre-commit hooks will run, automatically linting and formatting your code as per the configurations specified, thereby maintaining a high standard of code quality throughout the development process.

Summary: Setting Up Ruff for Newbies

As discussed above, integrating static analysis tools is crucial for code quality in open-source projects, and Ruff stands out as a powerful linter and code formatter. Here's a step-by-step guide for newbies on setting up Ruff and understanding its benefits.

Why Use Ruff?

  1. Comprehensive Linting and Formatting: Ruff supports a variety of linting rules and automatic code formatting, ensuring that your code adheres to established style guidelines and maintains readability.

  2. Improved Code Quality: By catching errors early in the development process, Ruff helps maintain high standards, making it easier for others to contribute.

  3. Ease of Configuration: Ruff allows you to tailor its behavior through a simple configuration file (ruff.toml), enabling you to select specific rules and formatting preferences that suit your project’s needs.

Steps to Set Up Ruff

  1. Install Ruff: Begin by adding Ruff to your project using your dependency manager, such as Poetry:

     poetry add ruff
    
  2. Create a Configuration File: Create a ruff.toml file in your project’s root directory. This file allows you to customize Ruff’s behavior:

     # Enable automatic fixing of linting errors
     fix = true
     line-length = 79
    
     [lint]
     select = ["A", "B", "E", "F", "I", "N", "W"]
     exclude = [".git", "__pycache__", "venv", "build", "dist"]
    
     [format]
     quote-style = "single"
     indent-style = "space"
    
     [lint.pycodestyle]
     max-line-length = 100
    
     [lint.pydocstyle]
     convention = "google"
    
  3. Run Ruff: Once configured, run Ruff to lint your code:

     ruff check
    
  4. Set Up Scripts: For ease of use, add scripts in your pyproject.toml to automate linting and formatting:

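     [tool.poetry.scripts]
     # Poetry scripts must point to Python callables ("module:function"), not shell commands.
     # "scripts" is a hypothetical module whose functions invoke Ruff via subprocess.
     lint = "scripts:lint"
     format = "scripts:format_code"
     lint-and-format = "scripts:lint_and_format"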
    
  5. Integrate with VSCode: Install the Ruff VSCode extension, then create or modify the .vscode/settings.json file so that Ruff lints and formats Python files on save:

     {
         "[python]": {
             "editor.formatOnSave": true,
             "editor.codeActionsOnSave": {
                 "source.fixAll": "explicit",
                 "source.organizeImports": "explicit"
             },
             "editor.defaultFormatter": "charliermarsh.ruff"
         }
     }
    
  6. Implement Pre-Commit Hooks: Use pre-commit hooks to ensure that code is linted and formatted before it’s committed. Add Ruff as a pre-commit hook in your .pre-commit-config.yaml:

     repos:
       - repo: https://github.com/astral-sh/ruff-pre-commit
         rev: v0.7.1
         hooks:
           - id: ruff
             args: [--fix]
           - id: ruff-format
    

    To install pre-commit and activate the hooks, run:

     poetry add pre-commit
     poetry run pre-commit install
    

By following these steps, newcomers can set up Ruff effectively, ensuring a consistent and high-quality codebase that is easy to maintain and contribute to. This process not only enhances individual coding practices but also fosters a collaborative environment for open-source projects.

Conclusion

After ensuring that all files followed the Ruff standards, I squashed all the commits in my working branch. Squashing is useful as it combines all commit changes into a single commit, making the history cleaner.

As you can see in the picture below, I had to change almost all of my files to make them adhere to the style guidelines that I had opted for.

This final, organized commit was then merged with the main branch, resulting in a consistent codebase.

By implementing everything that I talked about in this blog, I was able to create a codebase that adheres to consistent styling and formatting, ensuring long-term maintainability for contributors.