Commit Graph

81 Commits

Author SHA1 Message Date
shankar0123 f48520c86a fix: add go.sum and indirect deps for MCP SDK
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 17:00:30 -04:00
shankar0123 956230aec1 feat: M18a — MCP server exposing all 76 API endpoints as AI-native tools
Separate standalone binary (cmd/mcp-server/) using official MCP Go SDK
(modelcontextprotocol/go-sdk v1.4.1) with stdio transport. Stateless HTTP
proxy translates MCP tool calls to certctl REST API requests. 76 tools
across 16 resource domains with typed input structs and jsonschema tags
for automatic LLM-friendly schema generation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 16:49:39 -04:00
shankar0123 ff20b33b75 docs: add OpenAPI 3.1 spec covering all 78 API operations
Generate api/openapi.yaml from handler and domain source code. Covers
all 76 endpoints under /api/v1/ plus /health and /ready (78 total).
Includes full request/response schemas, domain model definitions,
enum types (CertificateStatus, JobType, RevocationReason, etc.),
reusable pagination envelope, error responses, and common parameters.

This spec serves as the contract for the upcoming MCP server (M18a)
and enables Swagger/Redocly interactive documentation immediately.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 16:21:24 -04:00
shankar0123 b47f56d60a docs: restructure V2 roadmap milestones, add missing env vars to README
Restructure remaining V2 milestones to reflect split ordering:
M18a (MCP Server, V2.1) ships first, M16 split into M16a (notifier
connectors, parallel with M19) and M16b (CLI + bulk import, after
discovery), M18 split into M18a/M18b, compliance mapping docs added.

Add 13 previously undocumented env vars to Configuration table: CORS,
rate limiting, migrations path, scheduler intervals, DNS-01 scripts,
step-ca key/password. Update scheduler diagram with 5th loop
(short-lived expiry check).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 16:08:40 -04:00
shankar0123 e9e9c6c8fc docs: add M19 immutable API audit log to V2 roadmap, complete API endpoint documentation
- Add M19 (Immutable API Audit Log) to V2 roadmap — HTTP middleware
  logging every API call to audit_events table
- Mark M14 (Observability) as complete in README roadmap
- Add 16 missing endpoints to API Overview (profiles, agent-groups,
  job approve/reject, audit/notification detail)
- Add Observability section with stats + metrics endpoints
- Fix endpoint count (78 → 76 under /api/v1/)
- Fix page count (17 → 19 pages)
- Note V2 audit trail expansion in Security section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 00:33:42 -04:00
shankar0123 ee75f149ae feat: M14 — Observability (dashboard charts, agent fleet, stats API, metrics, structured logging, rollback)
Backend: StatsService with 5 aggregation methods, JSON metrics endpoint, slog-based
structured logging middleware. Stats API: dashboard summary, certificates-by-status,
expiration timeline, job trends, issuance rate. 23 new backend tests.

Frontend: Recharts-powered dashboard with 4 charts (status pie, expiration heatmap,
job trends line, issuance bar), agent fleet overview page with OS/arch grouping and
version breakdown, deployment rollback buttons on version history. 7 new frontend tests.

78 API endpoints, 744+ total tests (658 Go + 86 Vitest).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 19:46:13 -04:00
shankar0123 2f65dd1a61 docs: correct test counts, endpoint count, and M15b file locations
README.md:
- Endpoint count: 72 → 71 (actual route registrations)
- Test count: 660+ → 677+ (497 Go functions + 101 subtests + 79 frontend)
- Corrected per-layer test breakdown

architecture.md:
- DER CRL and OCSP responder are implemented (were marked "planned for M15b")
- Added OCSP endpoint and short-lived cert exemption documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 19:20:07 -04:00
shankar0123 28bef63569 fix: handle 'not found' errors as 404 in CRL/OCSP handlers
The error routing only checked for "issuer not found" but not
"certificate not found", causing cert-not-found errors to fall
through to a generic 500. Broadened the check to match any
"not found" error string for both CRL and OCSP handlers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 19:06:37 -04:00
shankar0123 ed989d81fd fix: wire issuer registry in revocation tests, correct CRL/OCSP handler test URLs
Service tests: newRevocationTestService() was missing SetIssuerRegistry(),
causing all 8 CRL/OCSP tests to fail with "issuer registry not configured".

Handler tests: CRL tests used /api/v1/issuers/{id}/crl but handler parses
/api/v1/crl/{id}. OCSP tests used query string ?serial=X but handler
expects path param /api/v1/ocsp/{id}/{serial}. Fixed all 9 test URLs.

All issues pre-date CI on v2-dev — introduced during M15b.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 16:02:36 -04:00
shankar0123 d4fd46155e fix: unused certRepo variable and missing revocation wiring
- revocation_test.go: certRepo unused in TestGenerateDERCRL_Success,
  replaced with blank identifier
- lifecycle_test.go: missing revocationRepo init and setter calls
  (SetRevocationRepo, SetNotificationService, SetIssuerRegistry)
  that negative_test.go already had

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 15:55:44 -04:00
shankar0123 5407fabe1d fix: remove extra context.Context args from CRL/OCSP tests
GenerateDERCRL and GetOCSPResponse don't take a context parameter,
but all 8 test calls passed context.Background() as the first arg.
Pre-existing issue never caught because CI wasn't running on v2-dev.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 15:40:38 -04:00
shankar0123 ed41d21eac fix: use service-layer types in adapter tests
GenerateCRL test was passing []issuer.RevokedCertEntry instead of
[]CRLEntry, causing go vet failure. Also fixed OCSPSignRequest
references to use service-layer type for consistency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 15:21:03 -04:00
shankar0123 5854d4406d fix: remove unused variable failing go vet in CI
expectedCRL was declared but never used in TestIssuerConnectorAdapter_GenerateCRL_Success.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 15:18:30 -04:00
shankar0123 20de13e48e ci: add v2-dev branch to CI triggers
Pushes to v2-dev were not running CI because the workflow only
triggered on master. Add v2-dev to the push branch list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 15:13:22 -04:00
shankar0123 8af4e42f44 feat: M13 — GUI operations (bulk ops, deployment timeline, policy editor, target wizard, audit export, short-lived creds)
Bulk certificate operations: multi-select checkboxes on certificates list with
bulk action bar for triggering renewal, revocation (with RFC 5280 reason modal
and progress bar), and owner reassignment across selected certificates.

Deployment status timeline: visual 4-step lifecycle pipeline (Requested → Issued
→ Deploying → Active) on certificate detail page, powered by per-cert job queries
with animated status indicators for active steps and failure states.

Inline policy editor: edit/save/cancel interface on certificate detail page for
changing renewal policy and certificate profile assignments via dropdown selectors
with lazy-loaded policy and profile lists.

Target connector configuration wizard: 3-step modal (Select Type → Configure →
Review) with type-specific configuration fields for NGINX, Apache, HAProxy, F5
BIG-IP, and IIS targets including required field validation.

Audit trail export: CSV and JSON download buttons on audit page with applied
filters preserved in export. Added action filter input for narrower searches.

Short-lived credentials dashboard: new page at /short-lived showing ephemeral
certificates (profile TTL < 1 hour) with live TTL countdown, auto-refresh every
10 seconds, profile lookup, and stats bar (active/expired/profiles).

DataTable enhanced with optional selectable/selectedKeys/onSelectionChange props
for checkbox multi-select with select-all toggle and row highlighting.

Frontend tests expanded from 53 to 79: full API client endpoint coverage for
profiles, owners, teams, agent groups, revocation, approval/rejection, policy
violations, and issuer creation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 15:07:10 -04:00
shankar0123 762c523d59 feat: M15b — OCSP responder, DER CRL, short-lived exemption, revocation GUI
Backend:
- Embedded OCSP responder: GET /api/v1/ocsp/{issuer_id}/{serial} returns
  signed OCSP responses (good/revoked/unknown) using CA key
- DER-encoded X.509 CRL: GET /api/v1/crl/{issuer_id} returns proper DER CRL
  signed by issuing CA with 24h validity window
- Short-lived cert exemption: certs with profile TTL < 1 hour skip CRL/OCSP
  (expiry is sufficient revocation for ephemeral workloads)
- Extended issuer connector interface with GenerateCRL and SignOCSPResponse
- Local CA implements full CRL/OCSP signing; ACME and step-ca return
  appropriate "use native endpoint" errors
- IssuerConnectorAdapter bridges new methods between layers

Frontend:
- Revoke button on certificate detail page with RFC 5280 reason modal
- Revocation banner with reason display and timestamp
- Revocation status indicators in lifecycle section
- "Revoked" filter option in certificates list
- API client: revokeCertificate() function and Certificate type extensions

Tests: ~31 new tests across connector, service, handler, and adapter layers
Docs: milestones renumbered (M13-M14, M16-M18), M15b marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 14:39:10 -04:00
shankar0123 12e6150219 docs: remove version labels from public docs to avoid telegraphing roadmap
Strip explicit V2/V3 version labels from planned features in README,
architecture, connectors, and demo docs. F5/IIS now say "interface only"
and "implementation planned" without version targets. DigiCert and other
future issuers say "planned" without version numbers. Keeps completed
milestones detailed (social proof) while keeping future work abstract.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 13:50:03 -04:00
shankar0123 a93e9f677c docs: restructure roadmap for V2/V3 product strategy
Trim V2 roadmap to free-tier features only (GUI operations, CLI, notifiers,
Prometheus metrics, OCSP, MCP server, filesystem discovery). Move enterprise
features to V3 with deliberately vague descriptions. Remove specific version
references for F5/IIS implementations and SSE/WebSocket from docs. Add
roadmap.md to gitignore for private strategy tracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 13:19:37 -04:00
shankar0123 d5f63dc082 docs: update README, architecture, and demo docs for M15a revocation
Update test counts (525+ → 600+), table counts (17 → 18), endpoint
counts (68 → 70), add revocation/CRL endpoints to API overview, add
certificate_revocations table to schema docs, update M15 roadmap to
show M15a complete and M15b remaining.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 11:03:37 -04:00
shankar0123 5d98e373e3 feat: M15a — certificate revocation API, CRL endpoint, and revocation notifications
Implements core revocation infrastructure: POST /api/v1/certificates/{id}/revoke
with all 8 RFC 5280 reason codes, JSON-formatted CRL at GET /api/v1/crl, webhook
and email revocation notifications, best-effort issuer notification, and immutable
revocation audit trail. Includes 48 new tests across service, handler, integration,
and domain layers (600+ total). Fixes 3 pre-existing test bugs (team_test error
matching, agent_group delete status code, team handler per_page validation).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 10:59:18 -04:00
shankar0123 d881403d11 fix: correct agent_group_members seed data — reference actual agent IDs
The seed_demo.sql referenced nonexistent agent IDs (agent-web-1, agent-api-1,
agent-db-1) in the agent_group_members table, causing a FK constraint violation
on fresh database initialization. Fixed to use the actual agent IDs defined
earlier in the same file (ag-web-prod, ag-web-staging, ag-iis-prod).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 10:28:41 -04:00
shankar0123 690765b53e test: comprehensive test expansion — 330+ to 525+ tests, close M11b coverage gaps
Add 195+ new tests across service, handler, connector, and integration layers:
- Service tests: team (23), owner (21), agent_group (25), issuer (18), issuer_adapter (6)
- Handler tests: teams (26), owners (21)
- NGINX target connector tests (13): config validation, deployment, reload
- Integration tests: 19 M11b endpoint subtests (teams, owners, agent groups CRUD)
- CI pipeline: add ./internal/connector/target/... to test coverage path
- Docs: update test counts to 525+ across README, architecture, CLAUDE.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 23:43:32 -04:00
shankar0123 aa183efdca docs: cross-validate all documentation against codebase, fix 21 inaccuracies
Fact-checked every doc file against actual source code. Key corrections:
- Table count 14→17 (added profiles, agent_groups, agent_group_members)
- Endpoint count 55→68 (counted from router.go)
- Test count 250+→330+ (99 service + 165 handler + 53 frontend + connectors)
- Dashboard views 14→16 pages (counted from web/src/pages/)
- step-ca marked implemented (was "Planned V2") across all docs
- ACME DNS-01 marked implemented (was "planned") in concepts.md
- Removed ADCS as separate planned connector (handled via sub-CA mode)
- Fixed pointer types in connectors.md interface docs (*string, *time.Time)
- Added 3 missing tables to architecture.md ER diagram
- Added 5 missing env vars to README config table
- Updated M11/M12 to  in README roadmap
- Issuer count in quickstart demo data 3→4 (added step-ca)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 23:12:23 -04:00
shankar0123 f5fed74d6f feat: M12 — sub-CA mode, ACME DNS-01 challenges, step-ca issuer connector
Sub-CA mode: Local CA loads CA cert+key from disk (CERTCTL_CA_CERT_PATH +
CERTCTL_CA_KEY_PATH) to operate as subordinate CA under enterprise root
(e.g., ADCS). Supports RSA, ECDSA, PKCS#8 keys. Validates IsCA and
KeyUsageCertSign. Falls back to self-signed when paths unset.

DNS-01 challenges: Pluggable DNSSolver interface with script-based hook
implementation. User-provided scripts create/cleanup _acme-challenge TXT
records for any DNS provider. Configurable propagation wait. Enables
wildcard certs and non-HTTP-accessible hosts.

step-ca connector: Smallstep private CA via native /sign API with JWK
provisioner auth. Issuance, renewal, revocation. Registered as iss-stepca.

23 new tests across 3 files. CI test path widened to ./internal/connector/issuer/...

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 22:55:50 -04:00
shankar0123 d7a4d40d47 docs: fix SC-081v3 voting claim — not unanimous, zero opposition with 5 abstentions
The ballot passed 25-0-5 among CAs and 4-0-0 among browsers.
Not unanimous due to 5 CA abstentions (Entrust, IdenTrust,
Japan Registry Services, SECOM, TWCA).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 16:40:51 -04:00
shankar0123 9a12ee18b2 docs: codify proxy agent pattern, sub-CA capability, IIS dual-mode design
Three architectural decisions from user feedback:

1. Pull-only deployment model — server never initiates outbound
   connections. Network appliances (F5, Palo Alto, FortiGate, Citrix)
   use a proxy agent in the same network zone. Added as design principle
   #2 across all docs.

2. IIS dual-mode — agent-local PowerShell (primary/recommended) + proxy
   agent WinRM (for agentless targets). Replaces the previous WinRM-only
   design. Updated connectors.md, architecture.md, demo-advanced.md.

3. Sub-CA to ADCS — Local CA can load a pre-signed CA cert+key from
   disk, so all issued certs chain to the enterprise root. Replaces the
   planned standalone ADCS issuer connector. Updated concepts.md,
   connectors.md, demo-advanced.md issuer diagram.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 21:45:18 -04:00
shankar0123 b0549e6f05 feat: M11b — ownership tracking, agent groups, interactive renewal approval
Ownership: owners/teams GUI pages, notification email resolution via
resolveRecipient (owner_id → owner.email lookup). Agent groups: dynamic
device grouping by OS/arch/IP CIDR/version with manual include/exclude
membership, migration 000004, full CRUD stack (domain → repo → service →
handler → frontend). Interactive approval: AwaitingApproval job state,
approve/reject API endpoints with reason tracking. Tests: 12 agent group
handler tests, 8 approve/reject job handler tests, integration tests
updated for 13-param RegisterHandlers. Docs updated across architecture,
concepts, and seed data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 21:02:35 -04:00
shankar0123 a579a84c7f feat: M11a — certificate profiles, crypto policy enforcement, short-lived cert expiry
Add certificate profiles as named enrollment templates that control allowed
key algorithms, max TTL, permitted EKUs, required SAN patterns, and optional
SPIFFE URI SANs. CSR submissions are validated against profile rules at
signing time (key type + minimum size). Short-lived certs (TTL < 1 hour)
auto-expire via a new scheduler loop — expiry acts as revocation, no
CRL/OCSP needed.

New files:
- Migration 000003: certificate_profiles table, FK columns on
  managed_certificates/renewal_policies, key metadata on certificate_versions
- domain/profile.go: CertificateProfile + KeyAlgorithmRule structs
- repository/postgres/profile.go: full CRUD with JSONB marshaling
- service/profile.go: ProfileService with validation + audit logging
- service/crypto_validation.go: CSR-against-profile validation (RSA/ECDSA/Ed25519)
- handler/profiles.go: 5 HTTP endpoints under /api/v1/profiles
- web/src/pages/ProfilesPage.tsx: profiles management page

Modified:
- renewal.go: CSR validation in CompleteAgentCSRRenewal, ExpireShortLivedCertificates
- scheduler.go: 30s short-lived expiry check loop
- certificate.go (repo): nullable profile FK, key metadata on versions
- main.go: profile repo/service/handler wiring, 8-param NewRenewalService
- router.go: 12-param RegisterHandlers with profile routes
- seed_demo.sql: 4 demo profiles (standard, mtls, short-lived, high-security)
- Frontend: types, API client, routing, sidebar nav

Tests: 40 new tests across handler (15), service (13), crypto validation (12)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 20:39:49 -04:00
shankar0123 7450fcfb07 docs: add 47-day cert lifespan motivation, update roadmap, cross-validate all docs
README: lead with CA/Browser Forum Ballot SC-081v3 (47-day certs by 2029)
and certctl's end-to-end automation positioning. Update architecture
diagram and target lists to include Apache/HAProxy. Update roadmap
with new M15 (Revocation Infrastructure), renumbered M16-M18, and
V3.1 cert-manager/IAM Roles Anywhere additions.

concepts.md: rewrite "Why Do Certificates Expire?" with shrinking
lifespan timeline and automation imperative.

quickstart.md: add 47-day framing in intro.

architecture.md: add Apache/HAProxy to system diagram, target connector
diagram, deployment section, and ER diagram (agent metadata columns).
Update planned targets list for V3.1. Fix test count (230+).

connectors.md: fix notifier planned version reference (V2 not V2.1).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 19:28:02 -04:00
shankar0123 07275bf92f feat: M10 — agent metadata collection, Apache httpd + HAProxy target connectors
Agents now report OS, architecture, IP address, hostname, and version
via heartbeat using runtime.GOOS, runtime.GOARCH, and net.Dial. New
migration adds columns to agents table. Heartbeat handler, service,
and repository updated to accept and persist metadata. GUI shows
OS/Arch in agent list and full system info in agent detail page.

Apache httpd connector: separate cert/chain/key files, apachectl
configtest validation, graceful reload. HAProxy connector: combined
PEM file (cert+chain+key), optional config validation, reload.
Both wired into agent binary's target connector switch.

14 tests for new connectors. All existing tests updated for new
Heartbeat/UpdateHeartbeat signatures. Docs updated across README,
architecture, concepts, and connectors guides.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 02:19:28 -04:00
shankar0123 e06ea310a8 docs: update all documentation for v1.0.0 release
- Fix demo certificate count: 14 → 15 across README, quickstart,
  demo-guide (wildcard cert was added but count never updated)
- Fix negative_test subtest count: 12 → 14 in architecture.md
- Update README roadmap: v1.0.0 released (no longer "tag pending")
- Update status badge: "active development" → "v1.0.0"
- Remove stale POSTGRES_IMPLEMENTATION.md and POSTGRES_PATTERNS.md
  (scaffold-era dev notes, not referenced anywhere)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:43:18 -04:00
shankar0123 60b6464c76 ci: add release workflow for Docker image publishing on tag push
Builds and pushes certctl-server and certctl-agent images to ghcr.io
when a version tag (v*) is pushed. Also creates a GitHub Release with
auto-generated release notes and Docker pull instructions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
v1.0.0
2026-03-20 01:35:41 -04:00
shankar0123 76618be2b1 docs: update architecture and quickstart for v1.0 hardening changes
- Architecture: correct test count (127 handler tests), 5 rule types,
  scheduler timeout table, ErrorBoundary, logging section, .env.example
- Quickstart: production credentials section referencing deploy/.env.example

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:37 -04:00
shankar0123 bb7a78352e fix: frontend error handling — ErrorBoundary, type-safe errors, stable keys
- React ErrorBoundary wrapping entire app for graceful crash recovery
- fetchJSON error handling uses try/catch instead of .catch() chain
- CertificateDetailPage: instanceof checks replace unsafe type casts
- DataTable: keyField prop replaces array index keys

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:32 -04:00
shankar0123 7618c5a734 fix: externalize credentials and add agent key volume persistence
- POSTGRES_PASSWORD and CERTCTL_API_KEY read from .env file
- Added deploy/.env.example with documentation
- Agent key volume (agent_keys) for key persistence across restarts
- Agent healthcheck via pgrep
- Resource limits: server 1CPU/512M, agent 0.5CPU/256M

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:27 -04:00
shankar0123 1a2e9f9212 perf: add 5 database indexes for scheduler query optimization
- idx_jobs_status_scheduled_at: job processor queries
- idx_certificate_versions_cert_created: latest version lookups
- idx_audit_events_timestamp_desc: audit trail pagination
- idx_agents_online_heartbeat: health check partial index
- idx_deployment_targets_agent_name: unique constraint on agent+name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:21 -04:00
shankar0123 829c7c0064 fix: add operation-level context timeouts to scheduler loops
Prevents runaway operations from blocking scheduler goroutines:
- Renewal check: 5 minute timeout
- Job processor: 2 minute timeout
- Agent health check: 1 minute timeout
- Notification processor: 1 minute timeout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:15 -04:00
shankar0123 239a1792d2 fix: harden agent with backoff, panic recovery, and error handling
- Exponential backoff on consecutive poll/heartbeat failures (max 5min)
- Panic recovery wrapper on agent.Run goroutine
- All 9 silent reportJobStatus errors now logged properly
- Key read failures return error and report job failure
- CommonName validation before CSR creation
- KeyDir permissions enforced with os.Chmod after MkdirAll
- splitPEMChain rewritten to use encoding/pem instead of string parsing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:10 -04:00
shankar0123 e03a75ed9a fix: replace fmt.Printf with structured slog logging across all services
All 10 service files now use slog.Error for failure logging instead of
fmt.Printf. Audit event recording errors are checked and logged rather
than silently discarded. Adds consistent structured context (resource IDs,
operation names) to all error log statements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:03 -04:00
shankar0123 393cf548b2 docs: expand V2/V3/V4 roadmap with high-value features from competitive analysis
V2.0: Apache httpd, HAProxy targets, crypto policy enforcement, cert ownership
V2.1: PagerDuty/OpsGenie notifiers
V2.2: Compliance scoring
V2.3 (new): MCP server, CT Log monitoring, DigiCert issuer, filesystem discovery
V3: Restructured into discovery engine, cloud/network targets (AWS, Azure, Palo
Alto, FortiGate, Citrix, K8s), extended issuers (Entrust, GlobalSign, Google CAS,
EJBCA, Vault), ServiceNow, Ansible, compliance mapping
V4+: LDAP auth, API key scoping, multi-tenancy, Docker Secrets, Tomcat/JKS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 18:13:21 -04:00
shankar0123 2e7bed9bbe docs: add agent metadata collection and dynamic device grouping to V2 roadmap
Community feedback requested fleet inventory and policy-based targeting.
Agents will report OS, platform, IP, hostname via heartbeat; dynamic
grouping enables policy scoping by agent criteria instead of manual assignment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 11:55:11 -04:00
shankar0123 eeaf914590 docs: add ADCS issuer connector to V2 roadmap
Active Directory Certificate Services (ADCS) added as a planned
issuer connector across README, architecture, connectors, and
demo-advanced docs. Requested by community feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 21:00:25 -04:00
shankar0123 66f04f7afe style: run gofmt -s across all Go files
Fixes Go Report Card gofmt score from 52% to 100%.
Pure formatting changes — no logic modifications.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 19:32:29 -04:00
shankar0123 7422fe0829 docs: add step-ca and OpenSSL CA to V2 roadmap, fix F5/IIS status
- Added step-ca and OpenSSL/Custom CA as planned V2 issuer connectors
  across README, architecture, connectors, and demo-advanced docs
- Fixed F5 BIG-IP and IIS target status from "Implemented" to
  "Interface only" — both are stubs with mapped-out flows but no
  actual API calls yet
- Updated all diagrams and tables to be consistent across docs
- DNS-01, step-ca, OpenSSL, F5, IIS all listed under V2.0 roadmap

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 09:50:10 -04:00
shankar0123 e0fa20b369 docs: clarify ACME is HTTP-01 only, DNS-01 planned for V2
The concepts guide implied DNS-01 was supported. Made it explicit
that v1 uses HTTP-01 and DNS-01 (wildcards) is on the V2 roadmap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 23:39:35 -04:00
shankar0123 5c787eea12 docs: add DNS-01 challenge support to V2 roadmap
DNS-01 enables wildcard certificates and validation for hosts that
can't serve HTTP on port 80. Planned with provider adapters
(Cloudflare, Route53) and custom script hooks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 21:58:04 -04:00
shankar0123 b603ebcc99 docs: fix README headline — source-available, not open source
BSL 1.1 is not OSI-approved open source. Changed headline to
"Self-Hosted Certificate Lifecycle Platform" to be accurate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 20:24:22 -04:00
shankar0123 8c87cd26df chore: remove CLAUDE.md from repo
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 19:34:35 -04:00
shankar0123 976059054b docs: add 9 dashboard screenshots
Screenshots of all major dashboard views: dashboard, certificates,
agents, jobs, notifications, policies, issuers, targets, audit trail.
Referenced from README.md screenshot grid added in a10989f.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 15:13:37 -04:00
shankar0123 a10989f5eb docs: add dashboard screenshots to README
Added 9 screenshots showing all dashboard views: dashboard overview,
certificates list, agents fleet, jobs queue, notifications inbox,
policies, issuers, targets, and audit trail. Screenshots are displayed
in a 2-column grid in the README.

Note: actual .png files need to be added to docs/screenshots/ — this
commit includes the README markup and directory placeholder.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 15:02:16 -04:00