From 5a1dbce6d56966ccc5393bfb53392c6efc5f8f09 Mon Sep 17 00:00:00 2001
From: shankar0123
Date: Thu, 14 May 2026 20:57:24 +0000
Subject: [PATCH] =?UTF-8?q?fix(deploy):=20Hotfix=20#18=20=E2=80=94=20apt-g?=
 =?UTF-8?q?et=20retry=20loop=20in=20libest=20Dockerfile=20(transient=20mir?=
 =?UTF-8?q?ror=20flake)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CI image-and-supply-chain job failed building
deploy/test/libest/Dockerfile:

  Get:62 http://deb.debian.org/debian bullseye/main amd64 libssh2-1 amd64 1.9.0-2+deb11u1 [156 kB]
  Err:62 http://deb.debian.org/debian bullseye/main amd64 libssh2-1 amd64 1.9.0-2+deb11u1
    Error reading from server - read (104: Connection reset by peer) [IP: 151.101.202.132 80]
  E: Failed to fetch http://deb.debian.org/debian/pool/main/libs/libssh2/libssh2-1_1.9.0-2%2bdeb11u1_amd64.deb
  E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

Root cause:

Transient TCP reset from Fastly's Debian mirror at 151.101.202.132
mid-fetch of one of 73 packages. Mirrors flake; the apt error message
itself suggests "--fix-missing".

This was NOT a code regression: the build sequence completed
Dockerfile (main server), Dockerfile.agent, and
f5-mock-icontrol/Dockerfile cleanly before hitting the flake on the
4th and final Dockerfile. The Go + npm steps for the main image all
succeeded.

The main Dockerfile already wraps `npm ci` in a 3-retry loop (Hotfix #9
from the Storybook lockfile saga; npm registry has the same flake
profile as Debian mirrors). The libest Dockerfile's two apt-get
install sites (builder stage line 85, runtime stage line 189) had no
such wrapping.

Fix:

Wrap both apt-get install invocations in a 3-retry loop matching the
main Dockerfile's npm-ci pattern. Each attempt runs
`apt-get update && apt-get install --fix-missing ...`, exits the loop
on success, sleeps 5s between attempts.
After 3 failed attempts the build fails (preserves CI's signal for a
genuinely broken mirror state).

--fix-missing tells apt to continue past temporarily-missing packages
on subsequent retries; combined with the update + sleep, the
3-attempt loop covers the typical mirror-flake window (~30-60s of
churn before another mirror takes over).

Both apt-get sites in the libest Dockerfile get the same treatment
(builder + runtime). The two are independent install operations so
failure in one is independent of the other.

Verification (sandbox):

  • Visual diff of both apt-get blocks: consistent retry shape +
    --fix-missing + error message + sleep cadence
  • No Go-side code touched; this is a pure CI-infrastructure
    Dockerfile change
  • Other Dockerfiles in the repo (main + agent + f5-mock-icontrol)
    don't need this fix today; the main Dockerfile already has the
    retry loop for npm ci, and agent + f5-mock use Alpine `apk` which
    has its own retry semantics

Ground-truth: origin/master tip 7268d12 (FE-M6 just pushed) verified
via GitHub API BEFORE commit.

Falsifiable proof for the next CI run: the image-and-supply-chain
job's libest build should either succeed on first attempt OR retry
through the flake automatically. The expected outcome is a green
build; a real broken-mirror state would still fail after 3 attempts
(which is the right signal).
---
 deploy/test/libest/Dockerfile | 57 ++++++++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 17 deletions(-)

diff --git a/deploy/test/libest/Dockerfile b/deploy/test/libest/Dockerfile
index 35f0080..07d5002 100644
--- a/deploy/test/libest/Dockerfile
+++ b/deploy/test/libest/Dockerfile
@@ -82,16 +82,30 @@ ARG LIBEST_REF
 # is the same major version libest r3.2.0 was tested against. libest
 # also wants libcurl + libsafec; we install both via apt rather than
 # building from source for reproducibility.
-RUN apt-get update && apt-get install --no-install-recommends -y \
-    autoconf \
-    automake \
-    build-essential \
-    ca-certificates \
-    git \
-    libcurl4-openssl-dev \
-    libssl-dev \
-    libtool \
-    pkg-config \
+#
+# Hotfix #18 (2026-05-14): wrap in a 3-retry loop with --fix-missing
+# fallback to absorb transient Debian mirror flakes. The original
+# unwrapped apt-get install failed CI run #N on a "Connection reset
+# by peer" mid-fetch of libssh2-1 from fastly's debian.org mirror at
+# 151.101.202.132. Mirrors flake; production-grade Dockerfiles wrap
+# network ops in retry. Same pattern as the main Dockerfile's npm-ci
+# 3-retry loop from Hotfix #9.
+RUN for i in 1 2 3; do \
+        apt-get update && \
+        apt-get install --no-install-recommends -y --fix-missing \
+            autoconf \
+            automake \
+            build-essential \
+            ca-certificates \
+            git \
+            libcurl4-openssl-dev \
+            libssl-dev \
+            libtool \
+            pkg-config \
+        && break; \
+        echo "apt-get install attempt $i/3 failed"; \
+        [ "$i" -lt 3 ] || exit 1; sleep 5; \
+    done \
     && rm -rf /var/lib/apt/lists/*
 
 WORKDIR /src
@@ -172,13 +186,22 @@ RUN git clone --depth 1 --branch ${LIBEST_REF} https://github.com/cisco/libest.g
 # Pinned to the same digest as the builder above (Bundle A / H-001).
 FROM debian:bullseye-slim@sha256:1a4701c321b1d28b1ff5f0230e766791e4b79b1d4c6c7a70064f4b297b1a330f
 
-RUN apt-get update && apt-get install --no-install-recommends -y \
-    bash \
-    ca-certificates \
-    curl \
-    libcurl4 \
-    libssl1.1 \
-    openssl \
+# Hotfix #18 (2026-05-14): same 3-retry pattern as the builder stage
+# above. Runtime image installs are also vulnerable to transient
+# mirror flakes.
+RUN for i in 1 2 3; do \
+        apt-get update && \
+        apt-get install --no-install-recommends -y --fix-missing \
+            bash \
+            ca-certificates \
+            curl \
+            libcurl4 \
+            libssl1.1 \
+            openssl \
+        && break; \
+        echo "apt-get install attempt $i/3 failed"; \
+        [ "$i" -lt 3 ] || exit 1; sleep 5; \
+    done \
     && rm -rf /var/lib/apt/lists/* \
     && useradd --create-home --uid 1000 estuser
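
The retry shape used in both RUN steps can be sketched as a standalone POSIX shell function. This is a minimal illustration, not code from the patch: `retry` and `RETRY_DELAY` are hypothetical names introduced here. It makes the exit status explicit, because a shell `for` or `while` loop reports the status of the last command it ran (a trailing `sleep` would report success), so the failure has to be returned deliberately once the final attempt is spent.

```shell
#!/bin/sh
# retry N CMD [ARGS...]: run CMD up to N times.
# Returns 0 as soon as one attempt succeeds and 1 if every attempt fails.
# "retry" and RETRY_DELAY are hypothetical names used for illustration only.
retry() {
    attempts=$1
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Stop at the first successful attempt.
        "$@" && return 0
        echo "attempt $i/$attempts failed: $*" >&2
        # Sleep only when another attempt follows (default 5 seconds).
        [ "$i" -lt "$attempts" ] && sleep "${RETRY_DELAY:-5}"
        i=$((i + 1))
    done
    # Every attempt failed: report it so a chained "&&" stops here.
    return 1
}
```

A call such as `retry 3 sh -c 'apt-get update && apt-get install -y git'` returns non-zero only when all three attempts fail, which is what lets a following `&& rm -rf /var/lib/apt/lists/*` halt the build. Inside a Dockerfile the same logic would have to be inlined in the RUN step or copied in as a script, since each RUN starts a fresh shell.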