Navigating Open Source Development

Welcome to Week 01 of the open source journey!

This week, we focused on collaborating with classmates by cloning their repositories, testing their code, and opening 3-5 issues based on bugs we discovered or enhancements we thought could be made. This exercise was an awesome way to get comfortable with reading and running someone else’s code, which is pretty much the core of contributing to open source projects.

Introduction

I teamed up with one of my classmates, Peter Wan, who is working on a CLI tool called gimme_readme that automatically generates a README.md file by analyzing your code. It simplifies the process of creating documentation and ensures that the README reflects the project structure. In return, Peter tested my tool, github-echo, a command-line tool designed to provide in-depth, actionable insights about GitHub repositories. It’s built to extract and present information that’s often challenging to decipher manually.

We tested each other's code and opened issues in each other’s repositories, highlighting areas where features could be added, improvements made, or enhancements introduced based on our findings.

Communication Approach

At the beginning of this course, we found that synchronous communication worked best for us. We've been great friends for a while, so our interactions were mostly casual. However, when it came to working on GitHub, we made sure to keep things professional, especially when writing detailed issue descriptions.

Most of our collaboration happened over Discord, where we’d discuss each other's code, brainstorm features, and talk about improvements. This setup allowed us to cross-reference issues in each other's repositories and get comfortable with the flow of opening issues and then submitting pull requests (PRs) to resolve them. It was a great learning experience, as we constantly bounced ideas off each other and learned from our different approaches.

While synchronous communication was great for building momentum and staying engaged early on, we recognized that it wouldn’t always be sustainable. Ideally, open-source collaboration should be more asynchronous so that contributors don't need to meet in real time. With this in mind, now that we’re familiar with the flow and how to contribute to each other’s repositories, we’re planning to transition to a more asynchronous style.

This shift will allow us to work at our own pace while still staying in sync through well-documented issues and PRs. It’s all part of adapting to a more open-source mindset, where contributors can collaborate from anywhere, on their own time.

Testing Someone Else’s Code

Testing someone else’s code can be either a breeze or a headache, and it mostly depends on how well the project is documented. Luckily for us, we both had pretty solid documentation, so things were fairly straightforward. We even discussed how it felt to run each other’s code, and the key takeaway was simple: it's easy as long as the instructions are clear.

What made our experience even more interesting was the fact that I’m using a Mac, while my friend is on Windows. This gave us the opportunity to check whether our documentation was truly OS-independent. It turned out to be one of the most valuable lessons because ensuring your project runs smoothly across different systems is crucial. For example, when I tried to run Peter’s tool on my MacBook, we discovered that his code snippet looking for a config file in the root directory didn’t work on macOS. These kinds of issues are essential to catch when you're developing software meant for a wide audience.
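To make the cross-platform point concrete, here is a minimal Python sketch of an OS-independent config lookup. It is not Peter's code (gimme_readme is an npm package), and it assumes the config file lives in the user's home directory; the file name `.example_tool.toml` and the function `find_config` are hypothetical names made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Hypothetical config file name, used only for this illustration.
CONFIG_NAME = ".example_tool.toml"


def find_config(filename: str = CONFIG_NAME, home: Optional[Path] = None) -> Optional[Path]:
    """Return the config file path inside the home directory, or None if it is absent.

    Path.home() resolves the home directory on Windows, macOS, and Linux,
    so no path separator or drive letter is hard-coded here.
    """
    base = home if home is not None else Path.home()
    candidate = base / filename
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_config()
    print(found if found is not None else "No config file found; using defaults.")
```

The optional `home` parameter exists only so the lookup can be pointed at a temporary directory in tests instead of the real home directory.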

On the flip side, when Peter tested my code, he struggled with setting up a GitHub token. This highlighted a flaw in my documentation—it wasn't clear enough. As developers, certain things seem intuitive to us, but they may not be obvious to someone running our code for the first time. It was a good reminder that detailed, beginner-friendly documentation is key to a smooth user experience.
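One way to soften this kind of setup problem is to have the tool itself explain what is missing. The sketch below is only an assumption about how that could look: the environment variable name `GITHUB_TOKEN`, the `MissingTokenError` class, and the `load_github_token` function are hypothetical and are not taken from github-echo.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable name, chosen for illustration.
TOKEN_ENV_VAR = "GITHUB_TOKEN"


class MissingTokenError(RuntimeError):
    """Raised when no GitHub token is configured."""


def load_github_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the token from the environment, failing with setup instructions."""
    source = os.environ if env is None else env
    token = source.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise MissingTokenError(
            "No GitHub token found. Create a personal access token in your "
            f"GitHub settings, then set the {TOKEN_ENV_VAR} environment "
            "variable before running the tool."
        )
    return token
```

A first-time user who skips the token step then sees what to do next instead of a stack trace from a failed API call.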

Another thing worth mentioning is that we didn’t actually set up a full-fledged development environment for each other’s repositories on our computers to battle-test the code against various use cases. As we were always in sync and talking about our code, we had a pretty good idea of what the other person was working on. So instead, we just downloaded each other’s packages from npm and PyPI and ran them to see how they performed for end users. Based on this, we analyzed what could be improved and what features could be added to make the overall user experience better.

In the end, our testing was very user-centric, focusing on how the tools performed for end users rather than digging into code optimization or making developer-specific suggestions. It was a great way to look at our projects from a fresh perspective, and we learned a lot about the importance of cross-platform compatibility and clear documentation.

Filing GitHub Issues

GitHub Issues are a way to track bugs, feature requests, and improvements for a project. You'd file an issue in someone else's repository to report problems, suggest enhancements, or ask questions related to the project. It's a key tool for open-source collaboration, helping maintainers keep track of community feedback.

We opened issues in each other's repos based on what we found while testing each other's projects, as described in the previous section:

Issues I Filed in Peter’s Repository

During our project, Peter and I were working on different timelines. I had most of my features in place, but Peter was still working on his. To help him catch up and ensure we met the Release 0.1 spec provided by our professor, I decided to file a few issues in his repository to guide him in adding the required features. Here’s a list of the issues I created and how things turned out:

Implement support to output to stdout or a file specified by the user (Enhancement)
Since Peter already had the -o or --output flag in place, this issue was about implementing its functionality. The idea was to allow users to specify whether the output should be sent to the terminal or saved to a file, giving more flexibility to the tool.
Status: Implemented.

Allow the user to specify an API provider (Enhancement)
Peter’s code initially defaulted to using the Google Gemini API for fetching information. We had discussed allowing users to choose from multiple API providers, like OpenAI's ChatGPT API or Grok API. While he didn’t implement support for ChatGPT, he did allow users to choose between Grok and Gemini.
Status: Implemented with Grok and Gemini API options.

Publish gimme_readme as a package in the npm registry
This issue focused on packaging up Peter's tool and publishing it on the npm registry, allowing other users to easily install and run it via npm install.
Status: Successfully published. You can find it on the npm registry here: gimme_readme.
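For the first issue in this list, the behavior of an -o/--output flag can be sketched roughly as follows. This is a Python illustration using the standard argparse module, not Peter's implementation (his tool is an npm package); the program name `example-tool` and the helper `write_result` are hypothetical.

```python
import argparse
import sys
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with an optional -o/--output flag."""
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument(
        "-o",
        "--output",
        metavar="FILE",
        default=None,
        help="write the result to FILE instead of printing it to the terminal",
    )
    return parser


def write_result(text: str, output_path: Optional[str]) -> None:
    """Send text to stdout when no path is given, otherwise write it to the file."""
    if output_path is None:
        sys.stdout.write(text + "\n")
    else:
        with open(output_path, "w", encoding="utf-8") as handle:
            handle.write(text + "\n")


if __name__ == "__main__":
    args = build_parser().parse_args()
    write_result("# Generated README", args.output)
```

Keeping stdout as the default means the tool still works in shell pipelines, while the flag covers users who want a file on disk.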

Issues Filed by Peter in My Repository

Throughout our collaboration, Peter and I frequently discussed each other's projects, so I was well aware of the issues he planned to file. These were focused on improving the development process and adding new features to meet the Release 0.1 spec. Below are the issues Peter created in my repository, along with how I implemented each one.

Initialize a testing framework + update the CI pipeline

My GitHub Actions CI Pipeline was already automating the process of checking if my source code was properly formatted. Peter suggested expanding it by adding a new job that involves running tests automatically. This was more of a setup step, meant to get my project ready for test-driven development (TDD).
To implement this, I initialized a basic test. The test checks if running the command with the --version flag outputs the correct version number. While there’s only one test for now, the goal was to set up pytest and make sure the CI pipeline could run tests successfully.
In addition, I modified the CI pipeline so that it now runs tests alongside Python code checks and markdown lint checks.
I also set up branch protection rules. This means any code changes will only be merged after a pull request (PR) is made, and I verify that the tests pass. Again, the primary goal was to set up a TDD mindset, not to achieve full test coverage just yet.
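A single smoke test of this kind might look roughly like the sketch below. It assumes an argparse-based CLI, which may not match how github-echo is actually built; the version string, the program name `example-tool`, and `build_parser` are hypothetical stand-ins so that the example is self-contained.

```python
import argparse
import contextlib
import io

# Hypothetical version string for the illustration.
__version__ = "0.1.0"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --version flag prints the version and exits."""
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument(
        "--version",
        action="version",
        version=f"%(prog)s {__version__}",
    )
    return parser


def test_version_flag_prints_version() -> None:
    """pytest collects this function because its name starts with test_."""
    buffer = io.StringIO()
    exit_code = None
    with contextlib.redirect_stdout(buffer):
        try:
            build_parser().parse_args(["--version"])
        except SystemExit as exc:
            # argparse exits after printing the version.
            exit_code = exc.code
    assert exit_code == 0
    assert __version__ in buffer.getvalue()
```

Even one test like this gives the CI job something real to run, which is enough to confirm that the pipeline itself is wired up correctly.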

Containerize the application using Docker

We decided to Dockerize the application to make it easier for users who may not want to manually handle different Python environments. Docker allows the tool to run in an isolated environment, so users don’t need to worry about setting up dependencies or managing a Python3 or Conda environment.
Dockerizing the app was straightforward. I didn’t use a multi-stage build because I’m using Poetry for Python dependency management. Unlike npm, where you can simply copy over node_modules, Poetry works differently, and a multi-stage build isn't as beneficial in this context.
The Docker container allows users to run the tool without needing to download any additional requirements, making the setup much simpler for them.

Allow the user to control the temperature of the AI model via a flag

Another enhancement we discussed was giving users more control over the AI model’s behavior by allowing them to adjust the temperature via the -t or --temperature flag.
In generative AI models, temperature is a parameter that controls the randomness or creativity of the output. A temperature near 0 makes the model more deterministic, producing more predictable and consistent responses. A higher temperature, closer to 1, makes the model more creative and diverse in its answers.
While Gemini allows a temperature range of 0 to 2, I limited it to between 0 and 1, as going above 1 tends to produce overly creative responses that can become vague or irrelevant.
The implementation of this feature was simple. I added another option that allows users to set the temperature when generating responses, giving them more flexibility in how they want the model to behave.
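Restricting the flag to the 0 to 1 range can be sketched with a small validation function. This is a minimal illustration that assumes an argparse-based CLI; the names `temperature_value` and `example-tool`, as well as the default of 0.5, are hypothetical and not taken from github-echo.

```python
import argparse


def temperature_value(raw: str) -> float:
    """Parse a temperature and reject anything outside the range 0 to 1."""
    try:
        value = float(raw)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"temperature must be a number, got {raw!r}"
        ) from None
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(
            f"temperature must be between 0 and 1, got {value}"
        )
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a validated -t/--temperature flag."""
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument(
        "-t",
        "--temperature",
        type=temperature_value,
        default=0.5,  # hypothetical default, for illustration only
        help="sampling temperature between 0 and 1",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Using temperature {args.temperature}")
```

Validating at the argument-parsing stage means an out-of-range value is rejected with a usage message before any request is sent to the model.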

Conclusion and Learning

Testing someone else's code was a great exercise in identifying potential issues, bugs, or areas for improvement. I realized that effective communication, especially in open source, often happens asynchronously. Writing clear and concise GitHub issues is crucial so that the person reading them understands exactly what needs to be fixed or improved, even if you’re in different time zones or working on the project at different times.

This whole flow of opening issues served as a central point for organizing and tracking problems, bugs, or suggestions in our projects. By using this system, we not only tracked issues for each other’s repositories but also ensured that every issue had a clear resolution process, from opening an issue to closing it with a pull request (PR).

I also learned the value of testing across different operating systems. This is essential in open source projects where contributors and users might be on any platform.

In short, the whole process of testing, reviewing, and filing issues gave me a much better grasp of how open source projects operate. It showed me how to approach someone else's code respectfully, how to communicate suggestions clearly, and how to manage a project efficiently with minimal real-time communication.