Deploy with LFTP uploads all files (even unchanged ones)

I am trying to deploy to an (S)FTP server using the lftp command.

The problem is that it uploads all files, even unchanged ones. I know I could limit it to only compare file sizes, but then very minor changes don't get deployed (a ";" added, or a number changed from 0 to 1, or something like that).

Currently I use this command:

lftp -c "set net:timeout 5; set net:max-retries 3; set net:reconnect-interval-base 5; set ftp:ssl-force yes; set ftp:ssl-protect-data true; set sftp:auto-confirm yes; set ssl:verify-certificate no; open $host:$port; user $username $password; mirror $exclusions -v -c -P 10 -R …/ $remoteFolder"

With this command, it basically uploads all files whose modification time has changed. The problem is that every file's modification time is set to the moment the job starts (according to ls -la, see the screenshot below).
So I was wondering: is there a way to keep the times the files were last changed in the GitLab repository (instead of the time the job started)?

[screenshot: ls -la output showing every file with the job start time as its modification time]

Thanks for reading this block of text.

At the filesystem level? I doubt it, because the repository is "copied" every time you start the job. Or actually… you can try changing the CI Git strategy from "clone" to "fetch". This only works if you're not on a virtualized runner (e.g. docker, parallels, virtualbox, kubernetes), because these runners, by default, create a new environment for each job, so it will just fall back to the "clone" strategy.

If you're using the "shell" or "ssh" executor, or a runner that, by default, doesn't delete the repo after the job finishes, add the following at the top of your .gitlab-ci.yml file and test whether the file modified/updated dates still change.

variables:
  GIT_STRATEGY: fetch

Keep in mind that the runner needs to preserve the repository between jobs. You can also set GIT_STRATEGY per job, say for only the "deploy" job; it is basically just an environment variable.
You can have one runner on a VM that is used only for deployment and never removes the repository.
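For example, a minimal sketch of a per-job override (the job name and script line are placeholders, not something from your pipeline):

deploy:
  stage: deploy
  variables:
    # Reuse the existing working copy and only fetch new commits, so file
    # modification times are not reset to the moment the job starts.
    GIT_STRATEGY: fetch
  script:
    - ./deploy.sh   # placeholder for the actual lftp mirror command

Again, this only helps if the executor keeps the working directory around between jobs.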

I do currently use Docker (I'm on the shared GitLab.com runners); this is my .gitlab-ci.yml:

https://hastebin.com/utedajeruc.pl

Looking at the documentation, it doesn’t say that it’s impossible with docker, but I would have to look into how it might be possible (https://docs.gitlab.com/ee/ci/yaml/#git-strategy).

So, I was thinking, would it work with this scenario?

A "setup" job that uses the fetch strategy and caches its result across pipelines, with other jobs then fetching the results of this setup job.

Cached directories/files don't have this issue; if I cache /vendor/, for example, it does seem to keep the older modification date (as it doesn't get uploaded every time).

I don’t fully know how I would implement this though.

Well yeah, I suppose you can also use cache. Have you tried this method? Did it work out?
I wasn't able to reply earlier, because new accounts cannot post more than X replies per day on this forum… Tbh they should raise this limit, because it doesn't seem like any staff members are really paying attention to it anyway…

This took me way longer than it should have.

Anyways, I fixed it by adding an additional stage called “setup” where I execute the following: https://pastebin.com/bGCFd3iX

I found a utility called "git-restore-mtime", threw it into a setup job, and after it runs I create an artifact of the result so it can be used in later jobs within the pipeline.
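The pastebin has the exact commands, but the rough shape of the job is something like this (the install step is only hinted at in a comment, since it depends on the image):

setup:
  stage: setup
  script:
    # Assumes the git-restore-mtime script is on the PATH (install it however
    # the image requires; the pastebin above has the exact commands).
    # It resets each file's mtime to the date of the last commit that touched it.
    - git restore-mtime
  artifacts:
    paths:
      - '*'
    expire_in: 1 hour

Jobs in later stages download the artifact automatically, so lftp's time comparison sees the commit dates instead of the job start time.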

Hah, interesting. Glad you managed to fix it :smiley:

Yeah, and as a note, caching did seem to work somewhat. The issue was that the git fetch ran before the cache got loaded, which is why that wasn't an option.

Running Docker on a local machine gives you other options, as you can set up a shared folder where the repositories get stored (at least that's how I understood it), and on that you could use the fetch strategy. But on the public GitLab shared runners, git-restore-mtime seems like the best option currently.

I really think you should remove the

  artifacts:
    paths:
      - '*'
    expire_in: 1 hour

If you really have to do it like this, at least use cache and not artifacts…

Okay…

I mean, I could try caching for that. The only reason I used artifacts is that I want to use the files between jobs in the same pipeline.

I currently use the cache for node_modules & vendor across all jobs & pipelines. I would have to look at how to cache the repository between jobs in the same pipeline and, on top of that, node_modules & vendor across jobs & pipelines.
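For reference, the dependency cache is roughly this shape (the key here is just an example):

cache:
  key: "$CI_COMMIT_REF_SLUG"   # example key: one cache per branch
  paths:
    - node_modules/
    - vendor/

A second, pipeline-scoped entry with a key like "$CI_PIPELINE_ID" would be the natural way to share the checkout only within one pipeline, though as noted earlier the git fetch runs before the cache gets loaded.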

I was dealing with similar issues. I dealt with it by creating a custom Docker image containing FTP Deployment (which handles the change detection) and LFTP for handling the parallel upload.

If you are interested, I wrote an article about it here: https://dev.to/arxeiss/parallel-incremental-ftp-deploy-in-ci-pipeline-2511