Make a stale bot, learn GitHub Actions

A few months ago I had the chance to play with the first Beta of GitHub Actions, but for some reason it didn’t click for me: the point-and-click interface, the HCL syntax and the confusion in the Marketplace sent me back to my comfort zone, in the good company of the usual CI/CD systems we have all been using for years.

Fast forward to a few weeks ago, when GitHub announced that Actions were finally approaching general availability; a couple of days later, my personal GitHub account was promptly enrolled in the Beta program. It was time to give it another try, and this is the story of my second attempt at learning Actions.

Introducing the Stale Bot

When I learn something new, it’s vital for me to have some sort of prototype I can change and evolve as I ramp up on the subject. That’s why, after a few basic experiments and a couple of easy projects ported from other CI/CD platforms, I challenged myself to write a GitHub bot implemented in a single Workflow, relying on third-party Actions as little as possible.

The requirements are admittedly idiotic but they made a good excuse to learn how to:

use scheduled Workflows
execute jobs conditionally
interact with the GitHub API from a Workflow

Requirements

A stale bot is an automated mechanism responsible for marking issues and pull requests as “stale” when there’s no activity around them after a given amount of time. For the sake of simplicity, we’ll limit the scope of the bot to issues only.

The requirements are the following. For each issue on a GitHub repository:

Apply the label stale after 30 days without any kind of activity,
remove the label stale when there’s activity on the issue,
close the issue if it’s been stale for more than 15 days.

Create the Workflow

First of all, let’s create a GitHub Workflow. You can do it from the GitHub UI or by simply creating a YAML file under .github/workflows/ at the root of your repo. You can have several Workflows defined for a single repo, and the file name can be anything you like; in my case it’s stalebot.yaml. We also have to give our Workflow a name that will be displayed in the GitHub UI: let’s call it “Stale Bot”. Our YAML file will look like this:

name: Stale Bot

Determine when the Workflow has to run

We have to tell GitHub when we want to run our Workflow and we can do it with the keyword on. According to our requirements, the Workflow should be triggered by different events:

Every night, to attach the stale label and to close issues that’ve been stale long enough.
Every time the issue is edited, or a label or milestone is attached.
Every time somebody leaves a comment on the issue.

To run a Workflow periodically at a given time, we can use the schedule keyword, followed by one or more schedules written in POSIX cron syntax, like in a crontab file. For example, if we want our bot to run every night at midnight, we add:

on:
  schedule:
    - cron: "0 0 * * *"
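
If crontab syntax isn’t fresh in your mind, the five fields are, in order: minute, hour, day of month, month and day of week. Here’s a tiny sketch that just labels them; describeCron is a made-up helper for illustration, not something GitHub provides, and it doesn’t validate the values. Keep in mind that GitHub evaluates schedules in UTC, so “midnight” means midnight UTC.

```javascript
// Illustrative only: split a five-field cron expression into named parts.
// `describeCron` is a hypothetical helper; it labels fields and nothing more.
function describeCron(expression) {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error('expected 5 fields, got ' + fields.length);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

// Minute 0 of hour 0, every day of every month, any day of the week.
console.log(describeCron('0 0 * * *'));
```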

To also run the workflow every time an issue has activity, we add the issues keyword, so that the previous snippet will look like this:

on:
  schedule:
    - cron: "0 0 * * *"
  issues:

The GitHub API sends different events for the different things that can happen to an issue: by leaving the issues mapping empty, we accept the defaults. You can look up the default for any value of the on keyword in the docs page about events; in our case, the default for issues is “any possible kind of interaction”, which is a bit too much. That’s not a problem, because we can limit the kinds of events that trigger our bot by listing them explicitly in a sequence under the keyword types, like this:

on:
  schedule:
    - cron: "0 0 * * *"
  issues:
    types: [edited, milestoned, labeled]

Our bot will then be summoned when an issue is edited, a milestone is added or a label is attached. Since we also want the bot to be activated when somebody leaves a comment on the issue, we need one more event in our on mapping, specifically the issue_comment event:

on:
  schedule:
    - cron: "0 0 * * *"
  issues:
    types: [edited, milestoned, labeled]
  issue_comment:

The default types for the issue_comment event work for us, no need to add a types keyword here.
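
To double check my understanding of the on mapping, I find it useful to think of it as a lookup table. The sketch below is only a mental model written in plain JavaScript: GitHub does the matching for us, and both the triggers object and the wouldTrigger function are names I made up for illustration.

```javascript
// Hypothetical model of our `on` mapping; not how GitHub implements it.
const triggers = {
  schedule: null,                              // cron runs have no activity type
  issues: ['edited', 'milestoned', 'labeled'], // explicit `types` list
  issue_comment: null,                         // null = accept the default types
};

// Returns true when an event with the given name and activity type
// would start the Workflow under the model above.
function wouldTrigger(eventName, action) {
  if (!Object.prototype.hasOwnProperty.call(triggers, eventName)) {
    return false;
  }
  const types = triggers[eventName];
  return types === null || types.includes(action);
}

console.log(wouldTrigger('issues', 'labeled'));
console.log(wouldTrigger('issues', 'closed'));
```

In this model, closing an issue doesn’t wake up the bot, while any comment activity does.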

Define the jobs

A Workflow consists of a set of jobs that, by default, run in parallel; each job consists of a sequence of steps that run one after another. We’re going to define a single job named stale-bot-logic containing one step for each requirement we have, for a total of three steps. Since we can choose the platform on which our Workflow will run, we pick Linux, specifically ubuntu-latest. Let’s add this to our Workflow YAML definition:

on:
  schedule:
    - cron: "0 0 * * *"
  issues:
    types: [edited, milestoned, labeled]
  issue_comment:

jobs:
  stale-bot-logic:
    runs-on: ubuntu-latest
    steps:

A step can run a command on the host running the Workflow, run a Docker container, or execute a special kind of Node.js application called an Action. There are a lot of Actions available, ready to be included in a Workflow; to get a better idea you can browse the GitHub Marketplace - chances are that somebody else has already solved the problem you’re having, and no wheels will be reinvented that day.

Mark issue stale

The first step of the job will be marking an issue as stale after a certain amount of time: this means we need to query the GitHub API and guess what: there’s an Action for that. The Action we need is called actions/github-script and it’ll be the only one we’re going to use. It exposes a JavaScript client that can make API calls straight from the YAML code. The Workflow will then look more or less like this (on section omitted for brevity):

jobs:
  stale-bot-logic:
    runs-on: ubuntu-latest
    steps:
      - name: Mark stale
        uses: actions/github-script@0.2.0
        with:
          github-token: ${{ github.token }}
          script: // Here goes some JavaScript code that uses the GitHub API client

The uses keyword tells the CI system we want to use a certain Action at version 0.2.0. Actions may expose a few parameters you can use to configure their behaviour (refer to each Action’s docs for the details) and that’s what the with keyword is for: we’ll pass in a valid GitHub token using the param github-token and some JavaScript code using the param script.

The logic for this job will be the following:

Get a list of all open issues.
For each one, get the time and date of the last update.
If the difference between now and the last update for an issue is above a certain threshold, attach a label called stale to it.
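
The threshold check in the last point is plain date arithmetic, so it can be sketched as a standalone function before wiring it into the Workflow. The names isStale and DAY_MS are mine, chosen for illustration; the only assumption about the API is that the last update arrives as an ISO 8601 string, as the updated_at field does.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns true when `updatedAt` (an ISO 8601 string) is at least
// `days` days older than `now` (a Date).
function isStale(updatedAt, now, days) {
  const idleMs = now.getTime() - new Date(updatedAt).getTime();
  return idleMs >= days * DAY_MS;
}

const now = new Date('2020-03-01T00:00:00Z');
console.log(isStale('2020-01-01T00:00:00Z', now, 30)); // idle for 60 days
console.log(isStale('2020-02-15T00:00:00Z', now, 30)); // idle for 15 days
```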

The code will look like this (I’ll paste only the step definition for brevity):

- name: Mark stale
  uses: actions/github-script@0.2.0
  with:
    github-token: ${{github.token}}
    script: |
      // Fetch the list of all open issues
      const opts = github.issues.listForRepo.endpoint.merge({
        ...context.repo,
        state: 'open',
      });
      const issues = await github.paginate(opts);

      // How many days of inactivity before we mark the issue stale
      const elapsedDays = 30
      // Get the time window in milliseconds
      const elapsed = elapsedDays * 24 * 60 * 60 * 1000;

      const now = new Date().getTime();
      for (const issue of issues) {
        // Ignore issues below the threshold
        if (now - new Date(issue.updated_at).getTime() < elapsed) {
          continue;
        }

        // If we got here, mark as stale; await the call so failures surface.
        await github.issues.addLabels({
          ...context.repo,
          issue_number: issue.number,
          labels: ['stale']
        });
      }

That’s pretty much all the code we need, but there’s one detail we need to take care of: according to the on configuration we defined earlier, the Workflow will run every day as a cron job and also every time an issue is updated. There’s no need to have the “Mark stale” step run every time an issue is updated (obviously an issue that gets updated is not stale) and we want to skip its execution in this case. In other words, we want to run the step only when the Workflow was kicked off by the cron scheduler. To do so, we can use the if keyword on our step:

- name: Mark stale
  uses: actions/github-script@0.2.0
  if: github.event_name == 'schedule'
  with: ...

The step will then be skipped unless the expression in the if value evaluates to true. Despite the resemblance, that’s not JavaScript but GitHub’s own expression syntax; from there you can access several objects called contexts, and in this case we’re reading the event_name field of the github context.

Remove stale label

Sometimes a stale issue gets noticed and some activity kicks off again. In this case, we need to remove the stale label as soon as possible. We use the github-script Action again and this time we’ll ask the CI system to skip the step unless the source event is either an issue update or a new comment. Note how we use a slightly more complex if clause here:

- name: Remove stale
  if: github.event_name == 'issues' || github.event_name == 'issue_comment'
  uses: actions/github-script@0.2.0

This time the logic will be the following:

Get all the labels of the issue that triggered the Workflow.
Search whether it has a stale label and remove it.

The final step definition will then be:

- name: Remove stale
  if: github.event_name == 'issues' || github.event_name == 'issue_comment'
  uses: actions/github-script@0.2.0
  with:
    github-token: ${{github.token}}
    script: |
      // Fetch the list of labels attached to the issue that
      // triggered the workflow
      const opts = github.issues.listLabelsOnIssue.endpoint.merge({
        ...context.repo,
        issue_number: context.issue.number
      });
      const labels = await github.paginate(opts);

      for (const label of labels) {
        // If the issue has a label named 'stale', remove it
        if (label.name === 'stale') {
          await github.issues.removeLabel({
            ...context.repo,
            issue_number: context.issue.number,
            name: 'stale'
          })
          return;
        }
      }
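
As a side note, the “does this issue carry the label?” check can also be written without an explicit loop. This is just an alternative sketch, not what the Workflow above uses: hasLabel is a made-up name, and the only assumption is that each label is an object with a name field, as in the script above.

```javascript
// `labels` mimics the shape used above: objects with a `name` property.
function hasLabel(labels, name) {
  return labels.some((label) => label.name === name);
}

const labels = [{ name: 'bug' }, { name: 'stale' }];
console.log(hasLabel(labels, 'stale'));
```

With a helper like this, the step would boil down to one check followed by a single removeLabel call.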

Close stale

We’ve one last step left to complete our stale bot: an issue should be closed if it’s been marked as stale for a certain amount of time. The logic is the following:

Load all the issues having the stale label attached.
For each one, compute the difference between now and the last update of the issue (we assume it coincides with the moment the stale label was added).
If the difference is above our threshold, close the issue.

The step definition looks like this:

- name: Close stale
  if: github.event_name == 'schedule'
  uses: actions/github-script@0.2.0
  with:
    github-token: ${{github.token}}
    script: |
      // Fetch the list of all open issues that have the 'stale' label
      // attached
      const opts = github.issues.listForRepo.endpoint.merge({
        ...context.repo,
        state: 'open',
        labels: ['stale'],
      });
      const issues = await github.paginate(opts);

      // How many days of inactivity before we close a stale issue
      const elapsedDays = 15
      const elapsed = elapsedDays * 24 * 60 * 60 * 1000;

      const now = new Date().getTime();
      for (const issue of issues) {
        // Ignore issues below the threshold
        if (now - new Date(issue.updated_at).getTime() < elapsed) {
          // This is how to print debug information
          console.log('skip issue ' + issue.number);
          continue;
        }

        // If we arrive here, the issue has to be closed
        console.log('closing issue ' + issue.number);
        await github.issues.update({
          ...context.repo,
          issue_number: issue.number,
          state: 'closed'
        });
      }

You can find the complete workflow here.
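
To recap how the three if clauses split the work, here’s the same decision written as a plain JavaScript function. It’s only a summary of the conditions shown above; stepsFor is a hypothetical name and nothing like it exists in the Workflow itself.

```javascript
// Maps the name of the triggering event to the steps whose `if`
// condition passes, in the order they appear in the job.
function stepsFor(eventName) {
  const steps = [];
  if (eventName === 'schedule') {
    steps.push('Mark stale');
  }
  if (eventName === 'issues' || eventName === 'issue_comment') {
    steps.push('Remove stale');
  }
  if (eventName === 'schedule') {
    steps.push('Close stale');
  }
  return steps;
}

console.log(stepsFor('schedule'));
console.log(stepsFor('issue_comment'));
```

The nightly run marks and closes, while any activity on an issue only ever removes the label.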

Caveat and final considerations

First things first: this bot shouldn’t be considered feature complete by any means; the goal was just to ramp up on a new technology without getting bored. Now that you’re more familiar with GitHub Actions, a bunch of different and possibly more efficient workflows to solve the same problems might have popped into your mind already; if you need a stale bot for your project, I don’t see any problem with continuing down this path and improving the implementation.

That said, there’s a subtle problem with this approach that I think is worth discussing: the code we wrote is just a piece of text defined in a YAML file that GitHub will happily pass to the entrypoint of the actions/github-script Action. This means a few terrible things:

No syntax highlighting
No code validation
No tests

Which can turn into respectively:

A miserable user experience for authors and contributors
A frustrating development cycle
Bugs and poor performance

This is actually very interesting because this problem can help us better define what the boundaries of a Workflow should be. I’d phrase it like this:

If a workflow contains more than two lines of logic in plain text form, it’s time to write a custom Action.

With a custom Action (whether JavaScript or container based) you can dramatically simplify the workflow definition while having tests and the ability to exercise your code locally. You could also look at this problem symmetrically:

If your problem can be solved with two lines of logic, writing a custom Action is probably overkill

I’m a big fan of custom Actions and I enjoy TypeScript, but I have to say they don’t come for free: you basically need to write a full-fledged Node application, and whether that’s a problem is up to you.
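
To give an idea of what “tests and the ability to exercise your code locally” could look like, here’s a rough sketch of the “Mark stale” logic written as a function that receives its API client from outside. Everything here is hypothetical: markStale, makeFakeClient and the client methods listOpenIssues and addLabels are names I invented for the example, not part of any real library; a real custom Action would wrap the actual GitHub client behind them.

```javascript
// Hypothetical core of a custom Action: the API client is injected,
// so a test can pass in a fake that records calls instead of hitting GitHub.
async function markStale({ client, repo, now, days }) {
  const thresholdMs = days * 24 * 60 * 60 * 1000;
  const issues = await client.listOpenIssues(repo);
  const marked = [];
  for (const issue of issues) {
    const idleMs = now.getTime() - new Date(issue.updated_at).getTime();
    if (idleMs < thresholdMs) {
      continue; // still active, leave it alone
    }
    await client.addLabels(repo, issue.number, ['stale']);
    marked.push(issue.number);
  }
  return marked;
}

// A fake client for local runs; both method names are made up.
function makeFakeClient(issues) {
  const calls = [];
  return {
    calls,
    listOpenIssues: async () => issues,
    addLabels: async (repo, number, labels) => {
      calls.push({ number, labels });
    },
  };
}
```

A test can then build a fake client with a couple of hand-written issues, call markStale with a fixed date, and assert on the recorded calls, all without a token or a network connection.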
