GitLab CI Fleet Indexer

The GitLab CI pipeline (gitlab-batho.yaml) provides the same incremental patching strategy as the GitHub Actions workflow, adapted for GitLab's CI/CD system.

Pipeline Flow

The job runs these steps in order: download the artifact from the last successful pipeline on the branch, unzip it, then either load and patch the existing graph or run a full build, export the updated bundle, and upload it as the job artifact.

Full Pipeline YAML

Copy the following into .gitlab-ci.yml:

stages:
  - index

batho-indexer:
  stage: index
  image: python:3.12
  timeout: 30 minutes
  rules:
    # Run on changes to main, or on any Merge Request
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
  before_script:
    # Install unzip for extracting GitLab artifacts, then install Batho
    - apt-get update && apt-get install -y unzip curl
    - pip install batho
  script:
    - echo "Attempting to download previous Batho artifact..."
    - |
      # Fetch the artifact zip from the last successful pipeline on this branch
      # Uses --fail to silently skip if this is the very first run
      curl -s --fail --location --output artifacts.zip \
        --header "JOB-TOKEN: $CI_JOB_TOKEN" \
        "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/jobs/artifacts/${CI_COMMIT_REF_NAME}/download?job=${CI_JOB_NAME}" || echo "No previous artifact found."

    - |
      # Extract the database if the download was successful
      if [ -f artifacts.zip ]; then
        echo "Unzipping previous artifact..."
        unzip -o artifacts.zip
        rm artifacts.zip
      fi

    - |
      # Batho stores the code graph as Arrow IPC files packed into artifact_<dirname>.batho
      if ls artifact_*.batho 1> /dev/null 2>&1; then
        echo "✅ Found existing Batho artifact. Running incremental patch..."
        batho load --root . artifact_*.batho --force
        batho patch --root . --verbose
      else
        echo "⚠️ No existing artifact found. Running full build..."
        batho build --root . --full --verbose
      fi

      # Export the updated .batho/ bundle into a transport artifact
      batho export --root .

  artifacts:
    name: "batho-database-$CI_COMMIT_SHORT_SHA"
    paths:
      - artifact_*.batho
    expire_in: 90 days
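
The choice between an incremental patch and a full build can be tried locally before committing the pipeline. The following is a minimal sketch that assumes only the artifact_*.batho naming used above; choose_mode is a hypothetical helper name and is not part of Batho.

```shell
#!/bin/sh
# choose_mode DIR: print "patch" if DIR holds at least one artifact_*.batho
# file, otherwise "full". Hypothetical helper that mirrors the pipeline's check.
choose_mode() {
  dir="${1:-.}"
  for f in "$dir"/artifact_*.batho; do
    # An unmatched glob stays literal, so confirm the path really exists.
    if [ -e "$f" ]; then
      echo "patch"
      return 0
    fi
  done
  echo "full"
}
```

In the pipeline, "patch" corresponds to the batho load and batho patch commands, and "full" to batho build --full.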

Key Configuration

| Key | Value | Purpose |
| --- | --- | --- |
| Stage | index | Single-stage pipeline |
| Image | python:3.12 | Official Python Docker image |
| Timeout | 30 minutes | Fail fast if indexing hangs |
| Rules | main branch + MR events | Run on commits and merge requests |
| Before script | apt-get install unzip curl | Install artifact extraction tools |
| Install | pip install batho | Pulls latest stable from PyPI |
| Artifact name | batho-database-$CI_COMMIT_SHORT_SHA | Unique per commit |
| Expiration | 90 days | Long enough for agent access |

First-Run Behavior

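On the first pipeline run for a branch, no previous artifact exists: the download returns a 404, curl --fail exits non-zero, and the trailing || echo keeps the job going. With no artifact_*.batho file present, the script falls through to a full batho build --full. The same fallback also hides other download failures, such as a 401 or 403 response.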
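
If a first run needs to be told apart from an authentication or server error, one option is to inspect the HTTP status instead of relying on --fail alone. This is a sketch under the assumption that only a 404 should mean "no previous artifact". classify_download and fetch_previous_artifact are hypothetical names; the status is read with curl's --write-out '%{http_code}'.

```shell
#!/bin/sh
# classify_download STATUS: map an HTTP status to the next pipeline action.
classify_download() {
  case "$1" in
    200) echo "incremental" ;;  # artifact downloaded, patch it
    404) echo "full-build" ;;   # no earlier artifact on this ref
    *)   echo "error" ;;        # auth or server problem: surface it
  esac
}

# fetch_previous_artifact URL: download to artifacts.zip and print the action.
# Not called here; it needs the GitLab CI environment (CI_JOB_TOKEN).
fetch_previous_artifact() {
  status="$(curl --silent --location --output artifacts.zip \
    --write-out '%{http_code}' \
    --header "JOB-TOKEN: $CI_JOB_TOKEN" "$1")" || status="000"
  action="$(classify_download "$status")"
  # Without --fail, curl saves error bodies too, so discard them.
  [ "$action" = "incremental" ] || rm -f artifacts.zip
  echo "$action"
}
```

A job using this could exit non-zero when the action is "error", so that an expired token fails the pipeline rather than silently triggering a full rebuild.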

AI Agent Access

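Agents can download the latest artifact and restore the graph. The example below uses CI_JOB_TOKEN, which is valid only inside a running CI job: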

# Download the latest artifact from the main branch.
# --location follows redirects; --fail stops before unzip if the download fails.
curl --fail --location --header "JOB-TOKEN: $CI_JOB_TOKEN" \
  "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/jobs/artifacts/main/download?job=batho-indexer" \
  --output batho-database.zip &&
  unzip batho-database.zip

# Restore the Arrow IPC graph store
batho load --root . artifact_*.batho
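
An agent that runs outside GitLab CI has no job token and needs a different credential. The sketch below assumes an access token that may read the project's API, stored in a hypothetical GITLAB_TOKEN variable and sent in GitLab's PRIVATE-TOKEN header. artifact_url and download_graph are illustrative names, not part of Batho or GitLab.

```shell
#!/bin/sh
# artifact_url API_BASE PROJECT_ID REF JOB: build the "latest artifacts" URL.
artifact_url() {
  printf '%s/projects/%s/jobs/artifacts/%s/download?job=%s\n' "$1" "$2" "$3" "$4"
}

# download_graph API_BASE PROJECT_ID: fetch and unpack main's latest artifact.
# GITLAB_TOKEN is an assumed variable holding an access token.
# Not called here; it needs network access and a real project.
download_graph() {
  url="$(artifact_url "$1" "$2" main batho-indexer)"
  curl --fail --silent --show-error --location \
    --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
    --output batho-database.zip "$url" &&
    unzip -o batho-database.zip
}
```

After download_graph succeeds, the same batho load --root . artifact_*.batho command restores the graph.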

Platform Notes

  • Branch Handling: Downloads from CI_COMMIT_REF_NAME to support branch-specific artifact chains.
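  • Job Name: Inside the pipeline, CI_JOB_NAME resolves to the current job, so the download URL always targets batho-indexer. External callers hard-code job=batho-indexer, so renaming the job means updating those URLs as well.
  • Token: Uses CI_JOB_TOKEN for authenticated artifact access, so the pipeline itself needs no extra secrets.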

Troubleshooting

| Issue | Cause | Resolution |
| --- | --- | --- |
| Artifact download fails | First run (no previous artifact) | Expected; pipeline continues with full build |
| batho load fails | Schema version mismatch | Delete artifact to trigger full rebuild |
| Build timeout | Large repository | Increase timeout or split into multiple jobs |
| Artifact size quota | Bundle exceeds storage limits | Implement cleanup or use external storage |
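
The "batho load fails" case could also be handled inside the job rather than by deleting the artifact by hand. The sketch below uses only the batho commands from the pipeline above and assumes that batho load reports failure with a non-zero exit status, which is not confirmed here. load_or_rebuild is a hypothetical helper name.

```shell
#!/bin/sh
# load_or_rebuild: try the incremental path; if the load fails, delete the
# stale artifact and run a full build instead. Prints the path taken.
load_or_rebuild() {
  if batho load --root . artifact_*.batho --force; then
    batho patch --root . --verbose && echo "patched"
  else
    echo "load failed; rebuilding from scratch" >&2
    rm -f artifact_*.batho
    batho build --root . --full --verbose && echo "rebuilt"
  fi
}
```

Calling load_or_rebuild in place of the load and patch lines keeps the job green after a schema change, at the cost of one slower full build.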