CI/CD pipeline - get list of changed files

Hi folks,

On my Gitlab CI/CD pipeline, is there a way I can get a list of the changed files?

Basically, I’ve got some linting set up - but the repository contains a number of separate files which don’t work together to form an application, they’re just individual files hosted in the same place.

I’d like to lint only files changed on commit/push. Is this possible?

Many thanks,
Duncan


I have the same question! From what I can tell (below), we’ll have to use the git command: git diff-tree --no-commit-id --name-only -r <commit hash> in our CI scripts to obtain the list of changed files, one per line, and then iterate over that list.

How to get a list of changed files in a commit (GitLab Forum)
How to list all the files in a commit? (Stack Overflow).


Thanks for this @quilty. This worked nicely.

files=$(git diff-tree --no-commit-id --name-only -r "$CI_COMMIT_SHA")

You can then use a standard Bash for loop to iterate through the files.
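A minimal sketch of that iteration, not the poster's exact script: it reads the list line by line with while read rather than a for loop, so file names containing spaces stay intact. lint_each is a hypothetical helper name, and yamllint in the usage comment is only a stand-in for whatever linter you run. It assumes git is available in the job image and that GitLab sets CI_COMMIT_SHA.

```shell
# Hypothetical helper: run a command once per file name read from stdin.
lint_each() {
  local file
  while IFS= read -r file; do
    # Skip blank lines and files that no longer exist (e.g. deleted by the commit).
    [ -n "$file" ] && [ -f "$file" ] || continue
    "$@" "$file"
  done
}

# Possible use in a CI job script (yamllint is just an example linter):
#   git diff-tree --no-commit-id --name-only -r "$CI_COMMIT_SHA" | lint_each yamllint
```

Skipping paths that are not regular files avoids handing the linter a file that the commit removed.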


If there are multiple commits in a merge request, how do I get all the changed files?

git diff can work with two commit SHAs.

Reference:

If the pipeline runs on merge requests, one option is to compare against the target branch:
git diff-tree --name-only --no-commit-id $CI_MERGE_REQUEST_TARGET_BRANCH_SHA


I am running the command below in my pipeline and it throws this error:
git command not found

command:
git diff-tree --no-commit-id --name-only -r

I am trying to capture the list of changed files in my pipeline. Is there another way?

This command does not work for me.
I get the error fatal: ambiguous argument '{branch_name}': unknown revision or path not in the working tree., even though the merge request originates from this branch.

A quick
git branch -a
returns
* (HEAD detached at b487990)

Which explains why it does not work but I don’t understand why the head is detached in the first place.

Finally found a solution, starting from @mpp’s idea and an answer in this thread:

  1. If this is a private repository, you will need to add a new ssh key in the CI.

  2. Fetch the branch you want to merge:
    git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME

  3. Compare the current commit with the branch you want to merge with:
    git diff --name-only $CI_COMMIT_SHA $CI_MERGE_REQUEST_TARGET_BRANCH_NAME

Voilà !


You saved my life!

I’ve been experimenting with this, but unless I’m missing something, this only seems to work for public repos. For a private repo it seems that I would need to set up some credentials, for example having an account with a key checked into the repo so I can pass it to the git command. That isn’t crazy, but it seems a little roundabout and potentially insecure for a use case that will become more and more popular: taking action based on a change set, not the full file set.

None of the above really worked for me in a GitLab CI/CD pipeline, but what I ultimately did was very similar to what @loicm did. Here’s the complete solution; these are the commands I added to my .gitlab-ci.yml:

git checkout $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME
DIVERGE=$(git merge-base origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME)
FILES_CHANGED=$(git diff --name-only $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME $DIVERGE)
echo -e "Files changed since diverge: $FILES_CHANGED"
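The same "changed since the branches diverged" idea can also be sketched with git's three-dot diff syntax, which diffs the second revision against the merge base of the two, so the separate merge-base call is not needed. changed_since_diverge is a hypothetical helper name. This assumes the clone is deep enough to contain the merge base; a shallow CI clone may need a larger GIT_DEPTH or an extra fetch.

```shell
# Hypothetical helper: list files changed on a head revision since it
# diverged from a base ref. "base...head" diffs head against their merge base.
changed_since_diverge() {
  git diff --name-only "$1...${2:-HEAD}"
}

# Possible use in a merge request pipeline:
#   git fetch origin "$CI_MERGE_REQUEST_TARGET_BRANCH_NAME"
#   changed_since_diverge "origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME" "$CI_COMMIT_SHA"
```

Because the comparison starts at the merge base, commits that landed on the target branch after the branches diverged do not show up in the list.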

For some reason a lot of these pre-defined environment variables are empty for me during CI build for a merge request:

somejob:
  script:
    - echo $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME
    - echo $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
    - echo $CI_MERGE_REQUEST_SOURCE_BRANCH_SHA
    - echo $CI_MERGE_REQUEST_TARGET_BRANCH_SHA

I tried multiple suggestions above and none of them work for me because these variables have no values. The only one that has a value is $CI_COMMIT_SHA.
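One likely cause, as far as I understand GitLab's predefined variables: the CI_MERGE_REQUEST_* variables are only populated in merge request pipelines (and the *_BRANCH_SHA ones only in merged results pipelines), so an ordinary branch pipeline leaves them empty. A small sketch of a guard that picks a diff base either way; diff_base is a hypothetical helper name, and falling back to the parent commit is an assumption about what you want in branch pipelines.

```shell
# Hypothetical helper: print the revision to diff the current commit against.
diff_base() {
  if [ -n "${CI_MERGE_REQUEST_TARGET_BRANCH_NAME:-}" ]; then
    # Merge request pipeline: compare against the target branch on the remote.
    printf '%s\n' "origin/${CI_MERGE_REQUEST_TARGET_BRANCH_NAME}"
  else
    # Other pipelines (assumed fallback): compare against the parent commit.
    printf '%s\n' "${CI_COMMIT_SHA}^"
  fi
}

# Possible use:
#   git diff --name-only "$(diff_base)" "$CI_COMMIT_SHA"
```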

Hi guys, I just wanted to drop my solution. It’s based on all the answers here, so I wanted to share it back; maybe it is useful for someone. I wanted to find all changed YAML files in my merge request and run a job against them. The solution is pretty easy once you know it:

script:
    - git fetch
    - git diff --name-only origin/$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME | grep '\.yaml$' | xargs yamllint
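One edge case with a pipeline like this is a merge request that touches no YAML files: grep then exits non-zero and, with GNU xargs, the linter is still started once with no arguments. A hedged sketch of a filter that tolerates an empty result; filter_ext is a hypothetical helper name, and the -r flag in the usage comment is an xargs extension that may not exist in every image.

```shell
# Hypothetical helper: keep only file names that end in ".<extension>".
filter_ext() {
  # grep exits 1 when nothing matches; "|| true" keeps a "set -e" script alive.
  grep -- "\\.$1\$" || true
}

# Possible use (xargs -r skips running the linter when the list is empty):
#   git diff --name-only "origin/$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME" \
#     "origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME" | filter_ext yaml | xargs -r yamllint
```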

I’m glad someone posted a solution for this, and git diff-tree --name-only does work well… if you have git installed inside the Docker image! Turns out that some images don’t have git installed, and they run as a regular user, so you can’t (easily) install git to pull those file lists.

I ended up having to do some unholy hacks to work around this problem… It would be great if there were some variable that contained the file list, or maybe even some artifact (because variables might be too small for larger lists)…

Basically: the runner has git installed and clones the repo. It could generate that file list and store it in a well-known file location to be reused inside the container…


Similarly, if it’s only file names that you want, you can use something like:

git ls-files '*.yml' | xargs -n1 yamllint

or:

git ls-files -z '*.yml' | xargs -0 mdl

if all the filenames should be on the same line.

@anarcat I don’t think it’s so hacky to abuse artefacts for this!


The only thing that finally worked (after much trial and error) for getting all the changes of a multi-commit MR, instead of just the last commit, was:

git diff-tree --no-commit-id --name-only -r origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME -r $CI_COMMIT_SHA

The tricky part is that it doesn’t work without the origin/ part; you get

fatal: ambiguous argument 'master': unknown revision or path not in the working tree.

without it.
