Fatal: unable to access 'https://git.demo.example.com/user.name/pipeline-testing.git/': Could not resolve host: git.demo.example.com

Problem to solve

Pipeline execution using the Kubernetes executor fails with the following error:

fatal: unable to access 'https://git.domain.example.com/user.name/pipeline-testing.git/': Could not resolve host: git.domain.example.com

This error pops up during the cloning stage of my pipeline, which utilizes the Kubernetes executor. I’ve been battling this for a week and haven’t been able to crack the code.

I have set up k8s in a VM. On the same VM, I registered the runner using the gitlab-runner register command.

VM details: Ubuntu 22, aarch64

GitLab Runner version:
Version: 16.9.1
Git revision: 782c6ecb
Git branch: 16-9-stable
GO version: go1.21.7
Built: 2024-02-28T16:51:21+0000
OS/Arch: linux/arm64

Kubectl version:
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.7

Gitlab-runner register command used:
gitlab-runner register -n --url https://git.domain.example.com/ -r demoToken --name test-runner-1 --executor kubernetes --kubernetes-namespace "gitlab-runner" --run-untagged --locked --kubernetes-host https://x.x.x.x:6443 --kubernetes-image localhost:5000/demo_environment --kubernetes-helper-image-autoset-arch-and-os --kubernetes-helper-image-flavor ubuntu

Here’s what I’ve tried so far:

  • Another post mentioned that this DNS issue is caused by the default helper image, so I added --kubernetes-helper-image-flavor ubuntu to the register command, but the issue persists. I also tried several other helper images.

  • Verified the DNS configuration inside a container (created manually from the image used for the execution environment) using docker exec and cat /etc/resolv.conf.

  • Checked whether the hostname resolves using ping domain.example.com. Everything works fine in the manually created container.

  • Explored alternative approaches, such as using the direct Git URL with an access token (with caution and a temporarily increased scope).

  • I also tried clone_url, extra_hosts, pre_get_sources_script, host_aliases, dns_config, and so on in config.toml, as suggested in other related posts, but none of them solved the issue. I can’t even look inside the pod, because it is deleted as soon as the job fails.
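
For reference, here is a minimal sketch of how the host_aliases and dns_config settings are laid out in config.toml for the Kubernetes executor. The runner name, URL, namespace, and image come from the register command above; the IP addresses are placeholders (assumptions), not values from my actual setup, and the runner token is omitted.

```toml
[[runners]]
  name = "test-runner-1"
  url = "https://git.domain.example.com/"
  executor = "kubernetes"
  [runners.kubernetes]
    namespace = "gitlab-runner"
    image = "localhost:5000/demo_environment"

    # Extra DNS settings applied to the job pod.
    # 10.0.0.53 is a placeholder for a DNS server that can resolve the GitLab host.
    [runners.kubernetes.dns_config]
      nameservers = ["10.0.0.53"]

    # Static /etc/hosts entry added to the job pod's containers.
    # 10.0.0.10 is a placeholder for the GitLab server's IP address.
    [[runners.kubernetes.host_aliases]]
      ip = "10.0.0.10"
      hostnames = ["git.domain.example.com"]
```

Both tables are nested under [runners.kubernetes]. I am not sure whether my own attempts had the nesting exactly right, so this layout may be worth double-checking.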

However, the issue persists.

What I’m hoping to achieve:

  • Effectively troubleshoot and resolve the DNS resolution problem to ensure smooth pipeline execution.
  • Gain insights and best practices from experienced DevOps professionals to prevent similar issues in the future.

I’d greatly appreciate any assistance you can offer, including:

  • Potential causes or troubleshooting steps I might have overlooked.
  • Recommendations for secure access methods within the pipeline environment.
  • Any other relevant suggestions or best practices to address DNS issues in DevOps.

Please feel free to ask questions or request further details about my setup. I’m eager to learn and resolve this challenge with the help of the community.

Thank you in advance for your support!
