Is `artifacts:reports:[codequality:]` affected by `artifacts:expire_in:`?

I have a GitLab CI/CD pipeline for merge requests, where jobs use both `artifacts:paths:` and `artifacts:reports:codequality:` combined with `artifacts:expire_in:`. The troubleshooting section for Code Quality contains this:

Code Quality reports from the source or target branch may be missing for comparison on the merge request, so no information can be displayed.

Missing report on the target branch can be due to:

  • The artifacts:expire_in CI/CD setting can cause the Code Quality artifacts to expire faster than desired.

As my company is notoriously short on storage space, our IT department plans to disable Keep artifacts from most recent successful jobs globally for all projects. My understanding is that this will also affect the Code Quality reports for all branches, including any target branch. As such, the next merge request will (probably) no longer find any Code Quality report for the target branch, and no report will be shown in the merge request.

Is that reading correct?

Is there some variant to expire only (large) artifacts, but keep the (Code Quality) reports of long-lived/protected/selected branches, so the next merge request targeting them would show the Code Quality report?

`artifacts:reports:codequality:` states this:

The artifacts:expire_in value is set to 1 week.

That affects both source and target branches, but the report would still expire after 1 week for the target branch when Keep artifacts from most recent successful jobs is disabled. Mimicking its behaviour, I would like a variant of `expire_in: never`, but only for the last successful job of a ref: I only want the last CC report of the last job, not all CC reports of all historic jobs. (I assume all artifacts/reports are auto-expired when the associated ref is deleted.)
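For context, the relevant job configuration looks roughly like this (a sketch using standard GitLab CI keywords; the point is that `expire_in` applies to the job's artifacts as a whole, with no separate knob for the report):

```yaml
code_quality:
  artifacts:
    paths:
      - gl-code-quality-report.json
    reports:
      # The codequality report is an artifact like any other ...
      codequality: gl-code-quality-report.json
    # ... so expire_in covers it too; there is no per-report expiry.
    expire_in: 1 week
```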

PS: The documentation for `artifacts:reports:` and CI/CD artifacts reports types contain the following statement:

… You can use artifacts:expire_in to set an expiration date for the artifacts.

AFAIK there is no fine-grained expiry control for artifacts:reports types.

Workaround: I reasoned through a few alternative ideas with Claude and share them below. Note: I did not test whether the code and CI/CD config actually work; I only reviewed them at a high level for accuracy.

  1. Option A: Store reports in the Generic Package Registry

This is probably the cleanest workaround. After the code quality job runs, a second job uploads the JSON report as a generic package artifact keyed by branch name (or commit SHA). Before comparing in an MR, a job downloads the target branch’s report.

The catch: generic packages also have expiration/cleanup policies, so the same storage pressure problem could resurface. But it’s decoupled from CI artifact expiry, and you can manage retention independently.

"""
Approach A – Upload Code Quality report to the Generic Package Registry.

Reads gl-code-quality-report.json produced by the code_quality job and
uploads it to the project's Generic Package Registry, keyed by branch name.
Re-uploading to the same package name/version keeps the download URL stable,
and the download endpoint serves the most recently uploaded file, which mirrors
the "keep most recent successful job" behaviour without relying on the global
instance setting. (Note: unless duplicate uploads are disallowed in the package
settings, GitLab also retains the older files under the same version.)

Environment variables (all provided automatically by GitLab CI):
  CI_SERVER_URL          GitLab instance URL
  CI_PROJECT_ID          Numeric project ID
  CI_COMMIT_REF_NAME     Branch or tag name (used as package version)
  CI_JOB_TOKEN           Short-lived token with package registry write access
  PYTHON_GITLAB_TOKEN    Optional – personal/project token for richer API use.
                         Falls back to CI_JOB_TOKEN if not set.
"""

import json
import os
import sys
from pathlib import Path

import gitlab

REPORT_FILE = Path("gl-code-quality-report.json")
PACKAGE_NAME = "code-quality"
PACKAGE_FILE_NAME = "report.json"


def sanitise_version(ref: str) -> str:
    """
    Generic Package Registry versions must match:
      [A-Za-z0-9._+-]+
    Replace the most common problematic character (/) from branch names like
    feature/my-branch.
    """
    return ref.replace("/", "-")


def main() -> None:
    server_url = os.environ["CI_SERVER_URL"]
    project_id = os.environ["CI_PROJECT_ID"]
    ref_name = os.environ["CI_COMMIT_REF_NAME"]

    # Prefer a long-lived token for richer API access; fall back to job token.
    token = os.environ.get("PYTHON_GITLAB_TOKEN")
    job_token = os.environ.get("CI_JOB_TOKEN")

    if not REPORT_FILE.exists():
        print(f"ERROR: {REPORT_FILE} not found. Did the code_quality job run?")
        sys.exit(1)

    report_size = REPORT_FILE.stat().st_size
    print(f"Report file: {REPORT_FILE} ({report_size} bytes)")

    # Validate JSON before uploading
    try:
        issues = json.loads(REPORT_FILE.read_text())
        print(f"Report contains {len(issues)} issue(s).")
    except json.JSONDecodeError as exc:
        print(f"ERROR: Report is not valid JSON: {exc}")
        sys.exit(1)

    # Connect to GitLab
    if token:
        gl = gitlab.Gitlab(server_url, private_token=token)
    elif job_token:
        gl = gitlab.Gitlab(server_url, job_token=job_token)
    else:
        print("ERROR: Neither PYTHON_GITLAB_TOKEN nor CI_JOB_TOKEN is set.")
        sys.exit(1)

    project = gl.projects.get(project_id)
    package_version = sanitise_version(ref_name)

    print(f"Uploading to package: {PACKAGE_NAME} / version: {package_version}")

    project.generic_packages.upload(
        package_name=PACKAGE_NAME,
        package_version=package_version,
        file_name=PACKAGE_FILE_NAME,
        path=str(REPORT_FILE),
    )

    # Build the download URL for reference in job logs
    download_url = (
        f"{server_url}/api/v4/projects/{project_id}/packages/generic"
        f"/{PACKAGE_NAME}/{package_version}/{PACKAGE_FILE_NAME}"
    )
    print("Upload successful.")
    print(f"Download URL: {download_url}")
    print()
    print("To fetch this report in another job using the CI job token:")
    print(
        f'  curl --header "JOB-TOKEN: $CI_JOB_TOKEN" \\\n'
        f"    \"{download_url}\" \\\n"
        f"    -o target-report.json"
    )


if __name__ == "__main__":
    main()

.gitlab-ci.yml

# ---------------------------------------------------------------------------
# Approach A – Upload report to Generic Package Registry
#
# Uses the branch name as the package version so each branch has exactly one
# current report. Uploading overwrites the previous version, matching the
# behaviour of "keep only the latest per ref."
#
# To download the target branch report in another job:
#   curl --header "JOB-TOKEN: $CI_JOB_TOKEN" \
#     "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/generic/code-quality/${CI_MERGE_REQUEST_TARGET_BRANCH_NAME}/report.json" \
#     -o target-report.json
# ---------------------------------------------------------------------------
upload_code_quality_report:
  stage: report
  image: python:3.12-slim
  needs:
    - job: code_quality
      artifacts: true
  before_script:
    - pip install python-gitlab --quiet
  script:
    - python scripts/upload_code_quality_report.py
  rules:
    # Run on all branches so the target branch always has a fresh report
    - if: $CI_COMMIT_BRANCH
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
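To complete the picture on the MR side, the download step could look like this (an untested sketch mirroring the upload script's assumptions — package name `code-quality`, file name `report.json`, the same `/` → `-` ref sanitising — fetching with the CI job token via stdlib `urllib` instead of python-gitlab):

```python
import os
import urllib.request


def build_download_url(api_v4_url: str, project_id: str, ref: str,
                       package_name: str = "code-quality",
                       file_name: str = "report.json") -> str:
    """Build the generic package download URL for a branch's report.

    The ref is sanitised the same way as on upload ('/' -> '-') so that
    branch names like feature/my-branch resolve to the right version.
    """
    version = ref.replace("/", "-")
    return (
        f"{api_v4_url}/projects/{project_id}/packages/generic"
        f"/{package_name}/{version}/{file_name}"
    )


def fetch_target_report(dest: str = "target-report.json") -> None:
    """Download the target branch's report inside an MR pipeline,
    authenticating with the short-lived CI job token."""
    url = build_download_url(
        os.environ["CI_API_V4_URL"],
        os.environ["CI_PROJECT_ID"],
        os.environ["CI_MERGE_REQUEST_TARGET_BRANCH_NAME"],
    )
    request = urllib.request.Request(
        url, headers={"JOB-TOKEN": os.environ["CI_JOB_TOKEN"]}
    )
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        out.write(response.read())
```

A job that needs the comparison would call `fetch_target_report()` before diffing against the freshly generated source report.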

  2. Option B: GitLab Pages

Less ideal. Pages are public (or require auth on self-managed), and you’d be storing structured JSON data as a static file. Linking from MRs is possible but awkward. There’s no clean API to “fetch the Pages artifact for branch X.” Not recommended for this use case.

  3. Option C: MR comments with the data

Feasible and has zero storage concerns. A job parses the code quality JSON and posts a summary as an MR note. It won’t give the native MR widget diff view, but it surfaces the information.

"""
Approach C – Post a Code Quality summary as an MR note.

Parses gl-code-quality-report.json and creates a structured comment on the
merge request. If a previous comment from this script already exists (detected
via a marker string in the note body), it is replaced rather than duplicated.

Severity ordering and emoji follow the standard Code Climate/GitLab convention:
  blocker > critical > major > minor > info

Environment variables (all provided automatically by GitLab CI):
  CI_SERVER_URL              GitLab instance URL
  CI_PROJECT_ID              Numeric project ID
  CI_MERGE_REQUEST_IID       MR internal ID (only set in MR pipelines)
  CI_COMMIT_SHA              Used to link issues to the correct commit
  CI_PROJECT_URL             Used to build file links
  PYTHON_GITLAB_TOKEN        Project/personal token with api scope.
                             CI_JOB_TOKEN does NOT have MR note write access,
                             so this variable is required.

Optional tuning via CI variables:
  CQ_MAX_ISSUES              Max issues shown in full detail (default: 25)
  CQ_SEVERITIES              Comma-separated list of severities to include.
                             Default: blocker,critical,major,minor,info
"""

import json
import os
import sys
from pathlib import Path
from collections import defaultdict

import gitlab

REPORT_FILE = Path("gl-code-quality-report.json")
NOTE_MARKER = "<!-- gitlab-code-quality-comment -->"

SEVERITY_ORDER = ["blocker", "critical", "major", "minor", "info"]
SEVERITY_EMOJI = {
    "blocker": "🚫",
    "critical": "🔴",
    "major": "🟠",
    "minor": "🟡",
    "info": "🔵",
}


def severity_key(issue: dict) -> int:
    sev = issue.get("severity", "info").lower()
    try:
        return SEVERITY_ORDER.index(sev)
    except ValueError:
        return len(SEVERITY_ORDER)


def build_file_link(project_url: str, commit_sha: str, path: str, line: int | None) -> str:
    if line:
        return f"{project_url}/-/blob/{commit_sha}/{path}#L{line}"
    return f"{project_url}/-/blob/{commit_sha}/{path}"


def build_note_body(issues: list, project_url: str, commit_sha: str, max_issues: int) -> str:
    lines = [NOTE_MARKER]

    if not issues:
        lines.append("## ✅ Code Quality")
        lines.append("")
        lines.append("No issues found in this pipeline run.")
        return "\n".join(lines)

    # Counts by severity
    counts = defaultdict(int)
    for issue in issues:
        counts[issue.get("severity", "info").lower()] += 1

    summary_parts = []
    for sev in SEVERITY_ORDER:
        if counts[sev]:
            emoji = SEVERITY_EMOJI.get(sev, "")
            summary_parts.append(f"{emoji} {counts[sev]} {sev}")

    total = len(issues)
    lines.append("## 🔍 Code Quality Report")
    lines.append("")
    lines.append(f"**{total} issue(s) found:** " + " · ".join(summary_parts))
    lines.append("")

    # Sort by severity then file path
    sorted_issues = sorted(issues, key=lambda i: (severity_key(i), i.get("location", {}).get("path", "")))
    shown = sorted_issues[:max_issues]

    lines.append("| Severity | Location | Description |")
    lines.append("|----------|----------|-------------|")

    for issue in shown:
        sev = issue.get("severity", "info").lower()
        emoji = SEVERITY_EMOJI.get(sev, "")
        label = f"{emoji} {sev.capitalize()}"

        loc = issue.get("location", {})
        path = loc.get("path", "")
        line = loc.get("lines", {}).get("begin")

        if path:
            file_url = build_file_link(project_url, commit_sha, path, line)
            location_cell = f"[`{path}`]({file_url})" + (f" line {line}" if line else "")
        else:
            location_cell = "—"

        description = issue.get("description", "").replace("|", "\\|").replace("\n", " ")
        # Truncate very long descriptions
        if len(description) > 120:
            description = description[:117] + "..."

        lines.append(f"| {label} | {location_cell} | {description} |")

    if total > max_issues:
        remaining = total - max_issues
        lines.append("")
        lines.append(f"_…and {remaining} more issue(s) not shown. Download the full report artifact for details._")

    lines.append("")
    lines.append(
        f"<sub>Generated by `comment_code_quality` job · commit `{commit_sha[:8]}`</sub>"
    )

    return "\n".join(lines)


def find_existing_note(mr, marker: str):
    """Return the first MR note that contains our marker, or None."""
    for note in mr.notes.list(iterator=True):
        if marker in note.body:
            return note
    return None


def main() -> None:
    server_url = os.environ["CI_SERVER_URL"]
    project_id = os.environ["CI_PROJECT_ID"]
    project_url = os.environ["CI_PROJECT_URL"]
    commit_sha = os.environ["CI_COMMIT_SHA"]
    mr_iid = os.environ.get("CI_MERGE_REQUEST_IID")
    token = os.environ.get("PYTHON_GITLAB_TOKEN")

    max_issues = int(os.environ.get("CQ_MAX_ISSUES", "25"))
    severity_filter_raw = os.environ.get("CQ_SEVERITIES", "")
    severity_filter = (
        [s.strip().lower() for s in severity_filter_raw.split(",") if s.strip()]
        if severity_filter_raw
        else None
    )

    if not mr_iid:
        print("Not an MR pipeline (CI_MERGE_REQUEST_IID not set). Skipping.")
        sys.exit(0)

    if not token:
        print("ERROR: PYTHON_GITLAB_TOKEN is required to post MR notes.")
        print("CI_JOB_TOKEN does not have the necessary permissions.")
        sys.exit(1)

    if not REPORT_FILE.exists():
        print(f"ERROR: {REPORT_FILE} not found. Did the code_quality job run?")
        sys.exit(1)

    try:
        issues = json.loads(REPORT_FILE.read_text())
    except json.JSONDecodeError as exc:
        print(f"ERROR: Report is not valid JSON: {exc}")
        sys.exit(1)

    print(f"Loaded {len(issues)} issue(s) from {REPORT_FILE}.")

    # Apply optional severity filter
    if severity_filter:
        before = len(issues)
        issues = [i for i in issues if i.get("severity", "info").lower() in severity_filter]
        print(f"Severity filter {severity_filter}: {before} → {len(issues)} issue(s).")

    gl = gitlab.Gitlab(server_url, private_token=token)
    project = gl.projects.get(project_id)
    mr = project.mergerequests.get(mr_iid)

    body = build_note_body(issues, project_url, commit_sha, max_issues)

    existing = find_existing_note(mr, NOTE_MARKER)
    if existing:
        print(f"Found existing note (id={existing.id}). Updating.")
        existing.body = body
        existing.save()
        print("Note updated.")
    else:
        print("No existing note found. Creating new note.")
        mr.notes.create({"body": body})
        print("Note created.")


if __name__ == "__main__":
    main()

.gitlab-ci.yml

# ---------------------------------------------------------------------------
# Approach C – Post MR comment with Code Quality summary
#
# Parses gl-code-quality-report.json and creates (or updates) a note on the
# MR. Only runs in MR pipelines. Uses a bot-style header to find and replace
# any previous comment from this job, avoiding duplicate notes.
# ---------------------------------------------------------------------------
comment_code_quality:
  stage: report
  image: python:3.12-slim
  needs:
    - job: code_quality
      artifacts: true
  before_script:
    - pip install python-gitlab --quiet
  script:
    - python scripts/comment_code_quality.py
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

Nice idea, and thank you for your work, but I really do not want to re-implement parts of GitLab just because there is no `artifacts:reports:[codequality:]expire_in: "after next $CI_JOB_NAME for $CI_COMMIT_REF_NAME succeeds"`. There is a lot more functionality in GitLab, like

  • filtering out issues by fingerprint that already exist in the target branch / before the fork
  • linking issues to file location
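For completeness, the fingerprint filtering mentioned above boils down to roughly this (a minimal sketch, assuming each issue carries the `fingerprint` field that the Code Climate report format defines):

```python
def new_issues(source_report: list[dict], target_report: list[dict]) -> list[dict]:
    """Return only the issues whose fingerprint does not occur in the
    target branch report, i.e. the issues introduced by the MR."""
    known = {issue.get("fingerprint") for issue in target_report}
    return [issue for issue in source_report if issue.get("fingerprint") not in known]


# Inline example data:
target = [{"fingerprint": "aaa", "description": "pre-existing issue"}]
source = [
    {"fingerprint": "aaa", "description": "pre-existing issue"},
    {"fingerprint": "bbb", "description": "introduced by this MR"},
]
print([i["fingerprint"] for i in new_issues(source, target)])  # prints ['bbb']
```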

"Keep Latest Artifacts" Improvements (#14065) · Epics · GitLab.org · GitLab lists many more issues related to artifacts, so I'm not alone. I'd better spend my time adding such a feature to GitLab itself rather than re-implementing parts of GitLab myself.

Great idea. If you need help where to start: https://contributors.gitlab.com/

I just wanted to chip in, as I created an open feature proposal requesting the ability to keep only security artifacts/reports from the latest successful pipelines. Not sure if it would help in your case, but if you’re interested give the issue a :+1: :slight_smile:

I did exactly that just yesterday, when I searched for pre-existing work items and found your issue :wink: