My First GitHub Action

Aug 29, 2023 in PROJECTS

Background

I recently had an issue with GitHub Actions where I needed to be able to clone my repo (with the checkout action) but also recurse submodules for private repos. I learned a lot in the process of making this work and gained a much better understanding of GitHub Actions security boundaries.

In general, the action I mentioned looks something like this:

- uses: actions/checkout@v3
  with:
    submodules: recursive

However, when the submodule is a private repo in your organization, this won’t succeed. I looked into how to authenticate to GitHub APIs inside a GitHub Actions workflow and quickly found the guide for automatic token authentication using the GITHUB_TOKEN secret. This is the preferred way to authenticate inside a GitHub Actions workflow, but this token is only scoped to the repository the workflow originates from. In other words, my submodule clone (which is cloning a separate repo) still fails because GITHUB_TOKEN only has permissions in the originating repo.

Per the docs, I was left with two choices:

Use a PAT (🙃️)
Authenticate with a GitHub App

Although a lot of work has gone into fine-grained PATs, the post announcing them literally says:

For long-term automation needs, we recommend using GitHub Actions or GitHub Apps wherever possible. GitHub Apps provide the same highly targeted permissions options and administrator controls available with fine-grained PATs. They’re also long-lived, and they’re not associated with an individual user who may leave your company or project.

Since the only reasonable approach to solving this problem appeared to be GitHub App authentication, I decided to dig into what this would take. GitHub has a docs page describing how to make this auth happen, and even suggests using a third-party action. Since the action they referenced is not official (and needs access to my private repositories), I decided to try making the action myself.

Types of GitHub Actions

When looking into GitHub Actions, you find there are three types of custom actions:

JavaScript
Docker
Composite

JavaScript is not my preferred language, and composite actions are basically just combining existing actions together, so I ended up going for the Docker type. This allowed me to write my action in golang and have it execute in a Docker container.

How it works

The action itself is fairly straightforward. You take the PEM key that you get when you create a GitHub App, use it to sign a JWT, then use that JWT to call the GitHub API and get a short-lived token that workflows can use for auth. The workflow now looks like this:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: aro5000/gha-private-auth@v1
        id: token-generator
        with:
          pem: ${{ secrets.APP_PEM }}
          appId: ${{ secrets.APP_ID }}
          installId: ${{ secrets.APP_INSTALL_ID }}
      - uses: actions/checkout@v3
        with:
          submodules: recursive
          token: ${{ steps.token-generator.outputs.token }}

All of this behavior is driven by the action.yml, which is where all of the inputs and the execution parameters are defined. The action.yml file is really interesting from a security perspective because ultimately an empty repo with just an action.yml file could be a valid GitHub Action.

Docker Action Security Considerations

The action itself was fun to create, and certainly solved a problem for me, but understanding the security implications of GitHub Actions is where most of my learnings happened.

action.yml

Specifically, the runs: section in the action.yml file defines how the action executes. One of the examples they give in the docs is:

runs:
  using: 'docker'
  image: 'Dockerfile'

image: 'Dockerfile' is saying “use the Dockerfile in the repo to build the container image on the runner before executing the action”. In other words, every time the action is executed, an entire docker build process happens. From a security perspective, this might be ideal because you know that the action image is being built from the originating source code. However, it is incredibly slow. I almost gave up on this project altogether when I thought this was the only option, because a docker build on every execution added tens of seconds to an otherwise very quick action. A while later, I found there was another solution. You can actually specify a pre-built Docker image in this section:

runs:
  using: 'docker'
  image: 'docker://ghcr.io/aro5000/gha-private-auth'

This made the action execution considerably faster (an alpine image with a golang binary is pretty small) and made it feel like a normal GitHub action execution speed. Once I got over my initial excitement of getting this to work, I started to think “huh, so you can just specify any Docker image”? 🤔️

It’s true, any publicly accessible Docker image (at the time of this writing, private container registries are not supported) can be used in the action.yml file, even if it has nothing to do with the repo. The part that makes this worse is that even if you are trying to do the right thing, you have to be really thoughtful about specifying the container image version in your action.yml file. For example, many people will want to ensure they are using a specific version of a GitHub Action, so they will reference it via a git SHA. This is one way you could do this for my action:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: aro5000/gha-private-auth@923a32b52fcbf6a9adcee9c52e0fc68cabddc18d
        id: token-generator
        with:
          pem: ${{ secrets.APP_PEM }}
          appId: ${{ secrets.APP_ID }}
          installId: ${{ secrets.APP_INSTALL_ID }}
      - uses: actions/checkout@v3
        with:
          submodules: recursive
          token: ${{ steps.token-generator.outputs.token }}

This basically says, “use the version of this GitHub Action at commit 923a32b52fcbf6a9adcee9c52e0fc68cabddc18d”. Generally this approach works pretty well because a git commit hash is immutable, so you can be sure you are getting the same action.yml file each time. The problem arises if the action.yml file specifies a mutable container image tag in the definition. Take the following example:

runs:
  using: 'docker'
  image: 'docker://ghcr.io/aro5000/gha-private-auth:v1.2.3'

At first this might seem reasonable: the container image tag is set to the same version, v1.2.3, as the git tag. However, tags can be changed. If the repo was compromised, or I turned into a bad person, I could easily push a “new” version to ghcr.io/aro5000/gha-private-auth:v1.2.3 which contains code to steal secrets, or cause other harm. In order to have an immutable container image with your immutable git hash, you need to specify the image by its SHA-256 digest in your action.yml file as well:

runs:
  using: 'docker'
  image: 'docker://ghcr.io/aro5000/gha-private-auth@sha256:341d6e7ca956bc9e67c24c137b4eaed71b1929414caa271c03d2f54f313346c4'

This configuration actually pins an immutable container image to the immutable git commit and gives you some guarantee that you will be executing the same code each time. Unfortunately, I think most of this is a moot point because most Docker GitHub Actions I’ve seen (including one of the examples they reference in the docs) specify a mutable container image tag. Further, people generally reference a “stable tag” of the action to ensure they are getting regular updates, so you don’t get that guarantee anyways (uses: aro5000/gha-private-auth@v1 vs uses: aro5000/gha-private-auth@923a32b52fcbf6a9adcee9c52e0fc68cabddc18d).
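
If you wanted to audit references for this kind of pinning, a small check along these lines might be a starting point. This is only a sketch: isCommitPinned and isDigestPinned are hypothetical helpers, and the patterns assume a full 40-character commit SHA and a docker:// image reference ending in a sha256 digest, which is narrower than everything the runner accepts.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// A `uses:` value that ends in a full 40-character commit SHA.
	commitRef = regexp.MustCompile(`@[0-9a-f]{40}$`)
	// A docker:// image reference that ends in a sha256 digest.
	digestRef = regexp.MustCompile(`^docker://.+@sha256:[0-9a-f]{64}$`)
)

// isCommitPinned reports whether a workflow `uses:` value is pinned to a commit.
func isCommitPinned(uses string) bool { return commitRef.MatchString(uses) }

// isDigestPinned reports whether an action.yml `image:` value is pinned to a
// digest. A local 'Dockerfile' value is reported as not pinned because it is
// built from source rather than pulled.
func isDigestPinned(image string) bool { return digestRef.MatchString(image) }

func main() {
	// References taken from the examples in this post.
	for _, uses := range []string{
		"aro5000/gha-private-auth@v1",
		"aro5000/gha-private-auth@923a32b52fcbf6a9adcee9c52e0fc68cabddc18d",
	} {
		fmt.Printf("commit-pinned=%v  %s\n", isCommitPinned(uses), uses)
	}
	for _, image := range []string{
		"docker://ghcr.io/aro5000/gha-private-auth",
		"docker://ghcr.io/aro5000/gha-private-auth:v1.2.3",
		"docker://ghcr.io/aro5000/gha-private-auth@sha256:341d6e7ca956bc9e67c24c137b4eaed71b1929414caa271c03d2f54f313346c4",
	} {
		fmt.Printf("digest-pinned=%v  %s\n", isDigestPinned(image), image)
	}
}
```

Both checks have to pass for the guarantee to hold: a commit-pinned action whose action.yml points at a mutable tag is still mutable.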

Secrets

The way custom actions consume secrets was pleasantly surprising to me. In my action, I originally tried to just read directly from secrets specified as environment variables at the workflow level, but this did not work. When using a custom action, you must pass these variables into the action explicitly, either as inputs or action-scoped environment variables, so at least that is an intentional step. Note, I used GitHub hosted runners, and if you use self-hosted runners, there may be ways to read environment variables that exist on the host itself. (The next section shows how docker.sock is mounted 🙃️).

Permissions

Docker permissions were another interesting consideration I discovered while building this action. The whole point of my action is to emit the token at the end as an output that downstream steps can use. GitHub Actions facilitates this via a special environment variable, $GITHUB_OUTPUT, which points to a file where you are supposed to write your output in a key=value format. When I create a Dockerfile, I generally scope the user and permissions to be very restrictive. Especially in golang apps, you generally just need to execute the binary in the image, so using an unprivileged user is normally trivial to implement. When trying to run my custom action for the first time, I was getting permission issues like this:

/app/entrypoint.sh: line 7: can't create /github/file_commands/set_output_d1c48a68-1759-44f9-8191-9a3460fe85c2: Permission denied

I ended up having to allow my image to run as root for this command to work, but looking closer at the action logs, it started to make sense why this was the case. The way GitHub Actions runs a Docker action is by mounting a TON of values from the host into the container image. The following is a log from the actual run while I was testing (notice how it mounts /var/run/docker.sock!):

I guess it makes sense to have all of this if you (for example) needed your Docker custom action to run Docker commands and manipulate things on the host, but it was surprising to me how much host-level permission was given by default.

Final thoughts

I learned a ton going through the process of actually building a GitHub Action, and have a much deeper understanding of the execution process. GitHub Actions are a really powerful way to share automation, but hopefully this blog post shows you that there are several security implications to consider. The consumer of a GitHub Action likely doesn’t know or care what kind of action is being executed in their workflow, so you should probably consider applying some limitations for actions available inside your organization. If there is an action you want to use that is Docker type, you should ensure that the image is referenced by an immutable digest, or to be on the safe side you should copy the action.yml file to a repo and the container image to a registry that you control.

Let me know in the comments if there are other security considerations I’m missing that would be helpful for people to know. Thanks for reading!