PD-103 - Agent Developer - Module M10: capture-idempotence

1. Agent identity

  • Agent: agent-developer (Agent B, Claude)
  • Story: PD-103
  • Module: M10, capture-idempotence
  • Wave: 2 (depends on M13 for the capture_events table)
  • Date: 2026-04-03

2. Summary

Module M10 implements the idempotence service for the evidentiary capture flow. It is responsible for:

  1. Building the deterministic canonical JSON payload (sorted keys, normalized values)
  2. Computing the payload_canonical_sha256 fingerprint (SHA-256 of the canonical JSON)
  3. Checking idempotence under a PostgreSQL advisory lock (pg_advisory_xact_lock)
  4. Deciding between 200 idempotent, 409 Conflict, and CREATE (new capture)

File created:

  • src/modules/capture/services/capture-idempotence.service.ts

No existing file is modified.

3. Deliverables

File | Role | Lines
src/modules/capture/services/capture-idempotence.service.ts | Idempotence service: canonical payload, SHA-256 fingerprint, advisory lock, 200/409/CREATE | ~180

4. Architecture

4.1 Pattern used

Same pattern as the existing DepositService (PD-60):

  • pg_advisory_xact_lock(hashtext(userId || '/' || capture_id)) for serialization
  • Lookup under the lock, fingerprint comparison, idempotence decision
  • Transaction supplied by the caller (M9 CaptureIngestService)

4.2 Canonical idempotence payload

Conforms to spec v3 §5.12 (normative definition):

{
  "aes_gcm_nonce_b64": "<base64>",
  "aes_gcm_tag_b64": "<base64>",
  "capture_id": "<uuid-v4-lowercase>",
  "content_hash": "<hash_sha3_256-lowercase>",
  "dek_wrapped_b64": "<base64>",
  "kek_id": "<opaque-key-id>",
  "mime_type": "image/png",
  "size_bytes": 12345,
  "upload_object_key": "<s3-object-reference>"
}

Contractual rules:

  1. capture_id is normalized to lowercase before serialization (INV-103-31)
  2. content_hash = lower(hash_sha3_256) (canonical alias)
  3. JSON keys are sorted alphabetically
  4. Serialization is deterministic UTF-8 with no insignificant whitespace
  5. Optional OCR fields (ocr_text, ocr_confidence, ocr_language) are excluded
  6. payload_canonical_sha256 = SHA-256(canonical_json_bytes)
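The rules above can be sketched as a standalone function. This is a minimal illustration that assumes Node.js and its `node:crypto` module; the `CanonicalInput` type and the `canonicalJson` / `fingerprint` helpers are hypothetical names for this example, not the service's API, and the sample uses only a subset of the nine fields.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical input shape, already using the snake_case keys of §5.12.
type CanonicalInput = Record<string, string | number>;

// Normalizes the two case-insensitive fields, then serializes with sorted keys
// and no whitespace.
function canonicalJson(input: CanonicalInput): string {
  const normalized: CanonicalInput = {
    ...input,
    capture_id: String(input.capture_id).toLowerCase(),
    content_hash: String(input.content_hash).toLowerCase(),
  };
  // An array replacer makes JSON.stringify emit the keys in the array's order.
  return JSON.stringify(normalized, Object.keys(normalized).sort());
}

// SHA-256 over the UTF-8 bytes of the canonical JSON, as lowercase hex.
function fingerprint(input: CanonicalInput): string {
  return createHash('sha256').update(canonicalJson(input), 'utf8').digest('hex');
}

// Keys are deliberately out of order and the identifiers are uppercase.
const sample: CanonicalInput = {
  size_bytes: 12345,
  capture_id: 'A1B2C3D4-E5F6-4A7B-8C9D-0E1F2A3B4C5D',
  content_hash: 'ABCDEF',
  mime_type: 'image/png',
};

console.log(canonicalJson(sample));
```

Two requests that differ only in the casing of capture_id or content_hash serialize to the same string here, and therefore hash to the same fingerprint.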

4.3 Flow diagram

M9.createCapture(manager, dto, userId)
  |
  v
M10.acquireLockAndCheckIdempotence(manager, userId, fields)
  |
  +-- 1. Normalize capture_id to lowercase
  +-- 2. pg_advisory_xact_lock(hashtext(userId || '/' || captureId))
  +-- 3. Compute payload_canonical_sha256 from the canonical fields
  +-- 4. SELECT FROM capture_events WHERE capture_id = $1
  |     |
  |     +-- NOT FOUND --> return { action: 'CREATE', payloadCanonicalSha256 }
  |     |
  |     +-- FOUND + same fingerprint --> return { action: 'IDEMPOTENT', existingCaptureEventId, existingState }
  |     |
  |     +-- FOUND + different fingerprint --> throw ConflictException (409)
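The three-way branch at the end of the flow can be isolated as a pure function, which makes it easy to reason about independently of the lock and the database. The sketch below is illustrative only: `StoredCapture`, `Decision`, and `decide` are hypothetical names, and the conflict case is modelled as a value here, whereas the service throws a ConflictException.

```typescript
// Hypothetical row shape, mirroring the columns read by the lookup.
interface StoredCapture {
  id: string;
  state: string;
  payload_canonical_sha256: string;
}

type Decision =
  | { action: 'CREATE'; payloadCanonicalSha256: string }
  | { action: 'IDEMPOTENT'; existingCaptureEventId: string; existingState: string }
  | { action: 'CONFLICT' };

// Pure decision step: no lock and no I/O, only the three-way branch.
function decide(stored: StoredCapture | undefined, fingerprint: string): Decision {
  if (stored === undefined) {
    // Unknown capture_id: the caller goes on to create the row.
    return { action: 'CREATE', payloadCanonicalSha256: fingerprint };
  }
  if (stored.payload_canonical_sha256 === fingerprint) {
    // Known capture_id with the same payload: idempotent replay.
    return {
      action: 'IDEMPOTENT',
      existingCaptureEventId: stored.id,
      existingState: stored.state,
    };
  }
  // Known capture_id with a different payload: the service answers 409 here.
  return { action: 'CONFLICT' };
}
```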

5. Source code

5.1 src/modules/capture/services/capture-idempotence.service.ts

// ============================================================================
// src/modules/capture/services/capture-idempotence.service.ts - PD-103 / M10
// ----------------------------------------------------------------------------
// Idempotence service for the evidentiary capture flow.
// Builds the canonical payload, computes the SHA-256 fingerprint,
// and checks idempotence under a PostgreSQL advisory lock.
//
// Alignment: PD-103-specification.md v3, §5.12
// Invariants covered: INV-103-31, INV-103-37
// Pattern: same as DepositService.checkIdempotenceOrConflict (PD-60)
// ============================================================================

import { ConflictException, Injectable, Logger } from '@nestjs/common';
import { createHash } from 'node:crypto';
import { EntityManager } from 'typeorm';

// =============================================================================
// Types
// =============================================================================

/**
 * Fields required to build the canonical idempotence payload.
 * Conforms to §5.12 (normative definition).
 * Optional OCR fields are excluded by contract (INV-103-37).
 */
export interface CanonicalPayloadFields {
  readonly captureId: string;
  readonly contentHash: string;
  readonly aesGcmNonceB64: string;
  readonly aesGcmTagB64: string;
  readonly dekWrappedB64: string;
  readonly kekId: string;
  readonly mimeType: string;
  readonly sizeBytes: number;
  readonly uploadObjectKey: string;
}

/** Result of the idempotence check */
export type IdempotenceCheckResult =
  | { readonly action: 'CREATE'; readonly payloadCanonicalSha256: string }
  | { readonly action: 'IDEMPOTENT'; readonly existingCaptureEventId: string; readonly existingState: string };

// =============================================================================
// Service
// =============================================================================

@Injectable()
export class CaptureIdempotenceService {
  private readonly logger = new Logger(CaptureIdempotenceService.name);

  // ---------------------------------------------------------------------------
  // Public API
  // ---------------------------------------------------------------------------

  /**
   * Acquires an advisory lock on (userId, captureId), then checks idempotence.
   *
   * MUST be called inside an existing transaction (manager supplied by M9).
   *
   * @returns CREATE if capture_id is unknown (with the computed fingerprint)
   * @returns IDEMPOTENT if capture_id is known and the fingerprint matches (200)
   * @throws ConflictException if capture_id is known and the fingerprint differs (409)
   */
  async acquireLockAndCheckIdempotence(
    manager: EntityManager,
    userId: string,
    fields: CanonicalPayloadFields,
  ): Promise<IdempotenceCheckResult> {
    // 1. Normalize capture_id to lowercase (INV-103-31)
    const normalizedCaptureId = fields.captureId.toLowerCase();

    // 2. Serialization lock: pg_advisory_xact_lock(hashtext(userId || '/' || captureId))
    //    Same pattern as DepositService (PD-60, INV-60-02)
    await manager.query(`SELECT pg_advisory_xact_lock(hashtext($1 || '/' || $2))`, [
      userId,
      normalizedCaptureId,
    ]);

    // 3. Compute the canonical fingerprint (INV-103-37)
    const payloadCanonicalSha256 = this.computeCanonicalFingerprint({
      ...fields,
      captureId: normalizedCaptureId,
      contentHash: fields.contentHash.toLowerCase(),
    });

    // 4. Lookup under the lock
    const existing = await manager.query(
      `SELECT id, state, payload_canonical_sha256
       FROM vault_secure.capture_events
       WHERE capture_id = $1
       LIMIT 1`,
      [normalizedCaptureId],
    ) as Array<{ id: string; state: string; payload_canonical_sha256: string }>;

    if (existing.length === 0) {
      // Unknown capture: proceed with creation
      return { action: 'CREATE', payloadCanonicalSha256 };
    }

    const record = existing[0];

    // Fingerprint comparison
    if (record.payload_canonical_sha256 === payloadCanonicalSha256) {
      // Same fingerprint: idempotent replay (200)
      this.logger.log(
        `Idempotent replay detected for capture_id=${normalizedCaptureId}`,
      );
      return {
        action: 'IDEMPOTENT',
        existingCaptureEventId: record.id,
        existingState: record.state,
      };
    }

    // Divergent fingerprint: conflict (409)
    this.logger.warn(
      `Conflict detected for capture_id=${normalizedCaptureId}: ` +
        `stored=${record.payload_canonical_sha256.substring(0, 16)}..., ` +
        `received=${payloadCanonicalSha256.substring(0, 16)}...`,
    );
    throw new ConflictException({
      error: 'CONFLICT',
      message: 'capture_id already used with different payload',
    });
  }

  // ---------------------------------------------------------------------------
  // Canonical payload construction (§5.12)
  // ---------------------------------------------------------------------------

  /**
   * Computes the SHA-256 fingerprint of the canonical payload.
   *
   * Rules (§5.12):
   * - JSON keys sorted alphabetically
   * - Deterministic UTF-8 serialization with no insignificant whitespace
   * - capture_id lowercase, content_hash lowercase
   * - OCR fields excluded
   */
  computeCanonicalFingerprint(fields: CanonicalPayloadFields): string {
    const canonicalJson = this.buildCanonicalJson(fields);
    return createHash('sha256').update(canonicalJson, 'utf8').digest('hex');
  }

  /**
   * Builds the canonical JSON for the idempotence fingerprint.
   * Public for testing purposes only.
   */
  buildCanonicalJson(fields: CanonicalPayloadFields): string {
    // Object with keys already in alphabetical order (per §5.12)
    const canonical = {
      aes_gcm_nonce_b64: fields.aesGcmNonceB64,
      aes_gcm_tag_b64: fields.aesGcmTagB64,
      capture_id: fields.captureId.toLowerCase(),
      content_hash: fields.contentHash.toLowerCase(),
      dek_wrapped_b64: fields.dekWrappedB64,
      kek_id: fields.kekId,
      mime_type: fields.mimeType,
      size_bytes: fields.sizeBytes,
      upload_object_key: fields.uploadObjectKey,
    };

    // Passing the sorted key list as the JSON.stringify replacer guarantees the
    // key order, independently of the object's insertion order
    return JSON.stringify(canonical, Object.keys(canonical).sort());
  }
}

6. Architectural decisions

Decision | Rationale | Alternatives considered | Trade-offs
Canonical serialization via JSON.stringify with a sorted-key replacer | Deterministic, pure Node.js, no third-party dependency. The Object.keys().sort() replacer is an additional guarantee even though the object is already built in order. | RFC 8785 (JSON Canonicalization Scheme) via the third-party canonicalize library | RFC 8785 is more robust for Unicode edge cases but adds a dependency. The PD-103 fields are all ASCII/base64, so there is no Unicode risk.
Advisory lock via PostgreSQL hashtext() | Consistent with the DepositService pattern (PD-60). hashtext() returns a deterministic int32. | Custom hashToInt() (FolderService pattern) | hashtext() is native to PostgreSQL and needs no maintenance. Collisions are possible but acceptable for a serialization lock (not a long-held exclusion lock).
Direct SQL query for the lookup (no Repository) | The service receives a transactional EntityManager from M9. Direct queries on vault_secure.capture_events avoid problems with the RLS context, which has not yet been set at this stage of the flow. | manager.findOne(CaptureEvent, ...) via the TypeORM entity | The TypeORM entity will be used by M9 for the INSERT. The M10 lookup is simpler in direct SQL and avoids coupling M10 to the entity.
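The behavior the first decision relies on can be checked in isolation. The sketch below uses only standard JSON.stringify; the variable names are illustrative. It also shows one caveat of array replacers: the same allow-list is applied to nested objects. The payload in section 4.2 is flat, so that caveat does not affect it, but it would matter if a nested field were ever added.

```typescript
// Keys inserted out of order on purpose.
const unordered = { b: 1, a: 2 };

// With an array replacer, properties are emitted in the array's order.
const sorted = JSON.stringify(unordered, Object.keys(unordered).sort());

// Caveat: the allow-list also filters nested objects, so nested keys
// that are not listed ("z" here) are silently dropped.
const nested = JSON.stringify({ a: { z: 1 }, b: 2 }, ['a', 'b']);

console.log(sorted); // {"a":2,"b":1}
console.log(nested); // {"a":{},"b":2}
```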

7. Invariant mapping

Invariant | Mechanism | Observable
INV-103-31 (capture_id lowercase) | fields.captureId.toLowerCase() applied before serialization and lookup | capture_id is always lowercase in the queries and in the canonical JSON
INV-103-37 (canonical fingerprint) | buildCanonicalJson() + SHA-256; OCR fields excluded; keys sorted; 200/409 decision | payload_canonical_sha256 is stable for the same logical payload; OCR variations do not affect the fingerprint

8. Test mapping

Test ID | Spec reference | Mechanism | Observable
TC-NOM-15 | INV-103-31, INV-103-37, CA-103-18, CA-103-21, CA-103-22 | acquireLockAndCheckIdempotence() with an uppercase capture_id, then a lowercase one | action: 'IDEMPOTENT' is returned; the fingerprint is identical despite the different casing
TC-INV-06 | INV-103-31, INV-103-37 | 4 variants: (A) identical replay, (B) divergent payload, (C) only OCR changes, (D) different casing | (A) IDEMPOTENT, (B) 409 Conflict, (C) IDEMPOTENT (OCR excluded), (D) IDEMPOTENT (normalization)
TC-ERR-13 | INV-103-37, ER-103-13 | acquireLockAndCheckIdempotence() with a divergent fingerprint | ConflictException thrown, HTTP 409
TC-NR-10 | INV-103-37 | Recompute computeCanonicalFingerprint() on the same logical payload | Identical value on every call

9. Proposed unit tests

// src/modules/capture/__tests__/capture-idempotence.service.spec.ts

import { ConflictException } from '@nestjs/common';
import { CaptureIdempotenceService, CanonicalPayloadFields } from '../services/capture-idempotence.service';

describe('CaptureIdempotenceService', () => {
  let service: CaptureIdempotenceService;

  const baseFields: CanonicalPayloadFields = {
    captureId: 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d',
    contentHash: 'abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789',
    aesGcmNonceB64: 'dGVzdG5vbmNlMTIz',
    aesGcmTagB64: 'dGVzdHRhZzEyMzQ1Njc4OQ==',
    dekWrappedB64: 'ZGVrd3JhcHBlZGI2NA==',
    kekId: 'kek-2026-04-01',
    mimeType: 'image/png',
    sizeBytes: 524288,
    uploadObjectKey: 'captures/a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d.enc',
  };

  beforeEach(() => {
    service = new CaptureIdempotenceService();
  });

  describe('buildCanonicalJson', () => {
    it('should produce deterministic JSON with sorted keys', () => {
      const json = service.buildCanonicalJson(baseFields);
      const parsed = JSON.parse(json);
      const keys = Object.keys(parsed);
      expect(keys).toEqual([...keys].sort());
    });

    it('should normalize capture_id to lowercase', () => {
      const json = service.buildCanonicalJson({
        ...baseFields,
        captureId: 'A1B2C3D4-E5F6-4A7B-8C9D-0E1F2A3B4C5D',
      });
      const parsed = JSON.parse(json);
      expect(parsed.capture_id).toBe('a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d');
    });

    it('should normalize content_hash to lowercase', () => {
      const json = service.buildCanonicalJson({
        ...baseFields,
        contentHash: 'ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789',
      });
      const parsed = JSON.parse(json);
      expect(parsed.content_hash).toBe(
        'abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789',
      );
    });

    it('should exclude OCR fields', () => {
      const json = service.buildCanonicalJson(baseFields);
      expect(json).not.toContain('ocr_text');
      expect(json).not.toContain('ocr_confidence');
      expect(json).not.toContain('ocr_language');
    });

    it('should include all 9 canonical fields', () => {
      const json = service.buildCanonicalJson(baseFields);
      const parsed = JSON.parse(json);
      expect(Object.keys(parsed)).toHaveLength(9);
      expect(parsed).toHaveProperty('aes_gcm_nonce_b64');
      expect(parsed).toHaveProperty('aes_gcm_tag_b64');
      expect(parsed).toHaveProperty('capture_id');
      expect(parsed).toHaveProperty('content_hash');
      expect(parsed).toHaveProperty('dek_wrapped_b64');
      expect(parsed).toHaveProperty('kek_id');
      expect(parsed).toHaveProperty('mime_type');
      expect(parsed).toHaveProperty('size_bytes');
      expect(parsed).toHaveProperty('upload_object_key');
    });

    it('should serialize size_bytes as number, not string', () => {
      const json = service.buildCanonicalJson(baseFields);
      expect(json).toContain('"size_bytes":524288');
      expect(json).not.toContain('"size_bytes":"524288"');
    });
  });

  describe('computeCanonicalFingerprint', () => {
    it('should return 64-char lowercase hex', () => {
      const fp = service.computeCanonicalFingerprint(baseFields);
      expect(fp).toMatch(/^[a-f0-9]{64}$/);
    });

    it('should be stable across calls (TC-NR-10)', () => {
      const fp1 = service.computeCanonicalFingerprint(baseFields);
      const fp2 = service.computeCanonicalFingerprint(baseFields);
      expect(fp1).toBe(fp2);
    });

    it('should produce same fingerprint regardless of capture_id casing (TC-INV-06 D)', () => {
      const fpLower = service.computeCanonicalFingerprint(baseFields);
      const fpUpper = service.computeCanonicalFingerprint({
        ...baseFields,
        captureId: baseFields.captureId.toUpperCase(),
      });
      expect(fpLower).toBe(fpUpper);
    });

    it('should differ when a canonical field changes (TC-INV-06 B)', () => {
      const fp1 = service.computeCanonicalFingerprint(baseFields);
      const fp2 = service.computeCanonicalFingerprint({
        ...baseFields,
        sizeBytes: 999999,
      });
      expect(fp1).not.toBe(fp2);
    });
  });

  describe('acquireLockAndCheckIdempotence', () => {
    let mockManager: { query: jest.Mock };

    beforeEach(() => {
      mockManager = {
        query: jest.fn(),
      };
    });

    it('should return CREATE when capture_id not found', async () => {
      // Lock query succeeds
      mockManager.query.mockResolvedValueOnce(undefined);
      // Lookup returns empty
      mockManager.query.mockResolvedValueOnce([]);

      const result = await service.acquireLockAndCheckIdempotence(
        mockManager as any,
        'user-123',
        baseFields,
      );

      expect(result.action).toBe('CREATE');
      expect((result as any).payloadCanonicalSha256).toMatch(/^[a-f0-9]{64}$/);
    });

    it('should return IDEMPOTENT when same fingerprint (TC-INV-06 A)', async () => {
      const expectedFp = service.computeCanonicalFingerprint(baseFields);

      mockManager.query.mockResolvedValueOnce(undefined);
      mockManager.query.mockResolvedValueOnce([
        { id: 'event-uuid', state: 'UPLOADED', payload_canonical_sha256: expectedFp },
      ]);

      const result = await service.acquireLockAndCheckIdempotence(
        mockManager as any,
        'user-123',
        baseFields,
      );

      expect(result.action).toBe('IDEMPOTENT');
    });

    it('should throw ConflictException when fingerprint differs (TC-ERR-13)', async () => {
      mockManager.query.mockResolvedValueOnce(undefined);
      mockManager.query.mockResolvedValueOnce([
        { id: 'event-uuid', state: 'UPLOADED', payload_canonical_sha256: 'a'.repeat(64) },
      ]);

      await expect(
        service.acquireLockAndCheckIdempotence(mockManager as any, 'user-123', baseFields),
      ).rejects.toThrow(ConflictException);
    });

    it('should acquire advisory lock with userId and normalized captureId', async () => {
      mockManager.query.mockResolvedValueOnce(undefined);
      mockManager.query.mockResolvedValueOnce([]);

      await service.acquireLockAndCheckIdempotence(
        mockManager as any,
        'user-123',
        { ...baseFields, captureId: 'A1B2C3D4-E5F6-4A7B-8C9D-0E1F2A3B4C5D' },
      );

      // First call = advisory lock
      expect(mockManager.query).toHaveBeenCalledWith(
        `SELECT pg_advisory_xact_lock(hashtext($1 || '/' || $2))`,
        ['user-123', 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'],
      );
    });

    it('should query with normalized lowercase capture_id (INV-103-31)', async () => {
      mockManager.query.mockResolvedValueOnce(undefined);
      mockManager.query.mockResolvedValueOnce([]);

      await service.acquireLockAndCheckIdempotence(
        mockManager as any,
        'user-123',
        { ...baseFields, captureId: 'A1B2C3D4-E5F6-4A7B-8C9D-0E1F2A3B4C5D' },
      );

      // Second call = lookup query
      const lookupCall = mockManager.query.mock.calls[1];
      expect(lookupCall[1]).toEqual(['a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d']);
    });
  });
});

10. Interface contract with M9 (capture-ingest)

M9 (CaptureIngestService) uses M10 as follows:

// In CaptureIngestService.createCapture() (simplified excerpt)
return this.dataSource.transaction(async (manager) => {
  // 1. Idempotence check (M10)
  const idempotenceResult = await this.captureIdempotenceService
    .acquireLockAndCheckIdempotence(manager, userId, {
      captureId: dto.captureId,
      contentHash: dto.hashSha3256,
      aesGcmNonceB64: dto.aesGcmNonceB64,
      aesGcmTagB64: dto.aesGcmTagB64,
      dekWrappedB64: dto.dekWrappedB64,
      kekId: dto.kekId,
      mimeType: dto.mimeType,
      sizeBytes: dto.sizeBytes,
      uploadObjectKey: dto.uploadObjectKey,
    });

  if (idempotenceResult.action === 'IDEMPOTENT') {
    // Return 200 with the existing state
    return { status: 200, captureEventId: idempotenceResult.existingCaptureEventId };
  }

  // 2. Unwrap the DEK (M11)
  // 3. Persist capture_events (with payloadCanonicalSha256)
  // 4. Return 202
});
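One way for the caller to keep this branching exhaustive is a switch over the discriminated union with a `never` check, so that adding a third action later becomes a compile-time error. This is a sketch only: `statusFor` is a hypothetical helper, the result type is repeated here to keep the example self-contained, and the 200/202 values are taken from the excerpt above.

```typescript
// Repeated from the service so the sketch compiles on its own.
type IdempotenceCheckResult =
  | { readonly action: 'CREATE'; readonly payloadCanonicalSha256: string }
  | {
      readonly action: 'IDEMPOTENT';
      readonly existingCaptureEventId: string;
      readonly existingState: string;
    };

// Hypothetical helper: picks the HTTP status from the idempotence result.
function statusFor(result: IdempotenceCheckResult): 200 | 202 {
  switch (result.action) {
    case 'IDEMPOTENT':
      return 200; // replay of an existing capture
    case 'CREATE':
      return 202; // new capture accepted for processing
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new
      // action is added to the union without a matching case.
      const unreachable: never = result;
      throw new Error(`Unhandled action: ${JSON.stringify(unreachable)}`);
    }
  }
}
```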

11. Assumptions

ID | Assumption | Impact if false
H-M10-01 | JSON.stringify with a key-sorting replacer produces deterministic output in Node.js 20+ for the types used (string, number). | If non-determinism is detected, migrate to a canonicalize library (RFC 8785). Low risk: the fields are all ASCII/base64.
H-M10-02 | The capture_events table in the vault_secure schema is accessible through a direct SQL query inside a TypeORM transaction. | If RLS blocks the query, app.current_user_id must be set before the SELECT. M9 is responsible for this setting.
H-M10-03 | pg_advisory_xact_lock(hashtext(...)) is sufficient to serialize operations on the same capture_id in a single-primary PostgreSQL environment. | In a multi-primary setup (CockroachDB, etc.), migrate to a Redis lock. Out of scope for PD-103.
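If H-M10-02 turns out to be false, the fallback it describes could look like the sketch below. It assumes the RLS policies read the app.current_user_id setting named above and uses PostgreSQL's set_config with is_local = true, which scopes the value to the current transaction. `QueryExecutor` and `setRlsUser` are hypothetical names; the structural type stands in for the TypeORM EntityManager so the example has no dependency.

```typescript
// Minimal structural type so the sketch does not depend on TypeORM.
interface QueryExecutor {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Hypothetical helper for M9: sets the RLS context before M10 runs its SELECT.
// The third argument (is_local = true) limits the setting to the current
// transaction, so it is discarded at commit or rollback.
async function setRlsUser(manager: QueryExecutor, userId: string): Promise<void> {
  await manager.query(`SELECT set_config('app.current_user_id', $1, true)`, [userId]);
}
```

M9 would call this on the transactional manager before invoking acquireLockAndCheckIdempotence, so both statements run in the same transaction.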

12. Out of module scope

  • Persisting capture_events (INSERT) is outside the scope of M10; it is the responsibility of M9
  • The audit log in capture_audit_log is outside the scope of M10; it is the responsibility of M9
  • HTTP error handling (mapping ConflictException → 409) is provided natively by NestJS
  • Setting app.current_user_id for RLS is the responsibility of M9

13. Test coverage matrix

Test ID | Test file
TC-NOM-15 | capture-idempotence.service.spec.ts (replay with different casing → 200)
TC-INV-06 A | capture-idempotence.service.spec.ts (identical replay → IDEMPOTENT)
TC-INV-06 B | capture-idempotence.service.spec.ts (divergent payload → 409)
TC-INV-06 C | capture-idempotence.service.spec.ts (only OCR changes → IDEMPOTENT, implicit through OCR exclusion)
TC-INV-06 D | capture-idempotence.service.spec.ts (different casing → same fingerprint)
TC-ERR-13 | capture-idempotence.service.spec.ts (ConflictException thrown)
TC-NR-10 | capture-idempotence.service.spec.ts (fingerprint stability)
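TC-INV-06 C is covered only implicitly in the matrix above. An explicit check could follow the shape of this sketch, which shows why OCR values cannot reach the canonical JSON when the fields are picked by allow-list. `CaptureDto` and `canonicalJsonOf` are hypothetical, and the field set is reduced to four canonical fields for brevity.

```typescript
// Hypothetical DTO: a few canonical fields plus optional OCR metadata.
interface CaptureDto {
  captureId: string;
  contentHash: string;
  mimeType: string;
  sizeBytes: number;
  ocrText?: string;
  ocrConfidence?: number;
  ocrLanguage?: string;
}

// Only allow-listed fields are copied, so OCR values never reach the output.
function canonicalJsonOf(dto: CaptureDto): string {
  const canonical = {
    capture_id: dto.captureId.toLowerCase(),
    content_hash: dto.contentHash.toLowerCase(),
    mime_type: dto.mimeType,
    size_bytes: dto.sizeBytes,
  };
  return JSON.stringify(canonical, Object.keys(canonical).sort());
}

const withoutOcr: CaptureDto = {
  captureId: 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d',
  contentHash: 'abcdef',
  mimeType: 'image/png',
  sizeBytes: 524288,
};

// Same capture, with OCR metadata added after the fact.
const withOcr: CaptureDto = {
  ...withoutOcr,
  ocrText: 'hello',
  ocrConfidence: 0.93,
  ocrLanguage: 'fr',
};

const sameCanonicalJson = canonicalJsonOf(withoutOcr) === canonicalJsonOf(withOcr);
```

Because the canonical JSON strings are identical, any hash computed over them is identical as well, which is the property variant C asserts.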