Bitbucket Cloud import to GitLab.com never completes

We’re trying to migrate a number of repositories from Bitbucket Cloud to GitLab.com. The repos themselves are not too big (200-400MB) and most complete without a problem. But two of them have large LFS data (38GB and 17GB), and these are proving problematic.

The first time I tried the imports, they all completed, but a lot of LFS objects were missing. I guessed that only the objects referenced at HEAD were included, but I wasn’t sure.

Question 1: Is it expected that the Bitbucket Cloud importer would import all LFS objects?

I have since invited users to the GitLab.com group, in the hope of getting the Merge Request authors to map correctly. I also purchased 40GB of extra storage to accommodate all the LFS data (if I can somehow get it there).

Now I’ve deleted those repos and tried to import again, but the ones with large LFS seem to stay in the “importing” state indefinitely. The GitLab usage page shows one of them as consuming 7.7GB, but this is not increasing (it’s been running overnight). I’ve tried a few times, deleting the project and starting again, using both the Bitbucket Cloud and the From URL importers, and the import always seems to get stuck, with a varying data size shown on the usage page.

Question 2: How can I move forward with this? Is there any diagnostic information available in GitLab.com? I’ve tried hitting the import status API endpoint but that doesn’t give any more information - just that the import is “started” and there are apparently no errors.
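For reference, the status check I was using looks something like this (the project ID and access token are placeholders):

```
curl --header "PRIVATE-TOKEN: <your_access_token>" \
     "https://gitlab.com/api/v4/projects/<project_id>/import"
```

The response only shows an import_status of “started” and an empty import_error.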

We are currently on the free tier, intending to move to Premium, but we are reluctant to upgrade if we can’t even get past the import.

Any ideas how we could proceed?

With some further experimentation, and some pointers from GitLab, I managed to answer my own questions and found a workaround for migrating large amounts of LFS data.

Answer 1:
Yes, the Bitbucket Cloud importer should import all LFS objects, but this can time out if the LFS data is large. I believe the import status API endpoint eventually reported that it failed with a 24-hour timeout. I suspect there was some throttling going on, possibly on the Bitbucket side, but I have no way of confirming that.

Answer 2:
This is the workaround I used, in case it’s helpful to others.

  1. Clone the repo locally from Bitbucket.
  2. Use git lfs fetch --all to pull down the complete LFS data set.
  3. Add a .lfsconfig file with a dummy url property ( [lfs] url = https://foo ) to the master branch and push it to the Bitbucket repo. This prevents the GitLab importer from trying to copy all the LFS data (see the sketch after this list).
  4. Import the repo into GitLab using the Bitbucket Cloud importer.
  5. Remove the .lfsconfig file from the new GitLab repo using the Web UI.
  6. Enable LFS for the repo using the GitLab API (also shown in the sketch after this list).
  7. Clone the GitLab repo locally. This will fail to checkout the LFS objects.
  8. Copy all the LFS objects ( .git/lfs/objects ) from the Bitbucket clone to the new GitLab clone.
  9. Push all the LFS objects to the GitLab repo using find .git/lfs/objects/ -type f -execdir basename {} \; | xargs git lfs push origin --object-id
  10. Clone the GitLab repo locally again, to a different folder. There should be no errors this time.
  11. Use git lfs fetch --all to pull down the complete LFS data set. Validate that the .git/lfs/objects folders for this repo clone and the original Bitbucket repo clone are the same (using something like KDiff3).
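For anyone following the same path, here is a rough sketch of steps 1-3, 6, and 7-9 as shell commands. The folder names (bitbucket-clone, gitlab-clone) and the <workspace>, <group>, <project>, <project_id> and <token> values are placeholders I’ve made up for illustration, and the dummy LFS URL just needs to point somewhere unreachable:

```
# Steps 1-3: clone from Bitbucket, fetch the full LFS data set,
# then push a dummy .lfsconfig so the importer skips LFS downloads
git clone https://bitbucket.org/<workspace>/<project>.git bitbucket-clone
cd bitbucket-clone
git lfs fetch --all
cat > .lfsconfig <<'EOF'
[lfs]
    url = https://foo
EOF
git add .lfsconfig
git commit -m "Add dummy .lfsconfig to skip LFS during import"
git push origin master
cd ..

# Step 6: after the import finishes, re-enable LFS on the GitLab project
curl --request PUT --header "PRIVATE-TOKEN: <token>" \
     "https://gitlab.com/api/v4/projects/<project_id>?lfs_enabled=true"

# Steps 7-9: clone from GitLab (the LFS checkout will fail, which is expected),
# copy the objects across, and push them by object ID
git clone https://gitlab.com/<group>/<project>.git gitlab-clone
mkdir -p gitlab-clone/.git/lfs/objects
cp -R bitbucket-clone/.git/lfs/objects/. gitlab-clone/.git/lfs/objects/
cd gitlab-clone
find .git/lfs/objects/ -type f -execdir basename {} \; \
  | xargs git lfs push origin --object-id
```

Steps 10 and 11 are then just a fresh clone and a comparison of the two .git/lfs/objects folders to validate that everything made it across.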