PD-295: Implementation plan

Mode: balanced. Internal single-user tool; the Python/FAISS/Ollama/Markdown/YAML stack is already in place and is not reworked. Priority goes to simplicity and to non-blocking behavior with no single point of failure.

Resolution of the Gate 3 reservations (RESERVE): the plan settles 4 major gaps technically, without modifying the spec or the tests (Art. I §Immuabilité).

Reservation	Spec gap	Plan decision	Reference
M-01	Gap 1: `reuse_score` not normalized	Score = raw weighted sum (not a ratio). The 0.3/0.6 thresholds apply to the raw sum. Consistent with TC-NOM-05 (1.0 / 1.8). Documented in §2.3 + H-01.	§2.3, H-01
M-02	Gap 2: exact filters vs KNN for clarifications	Clarification retrieval = exact filter `{domain, project}` + sort by recency desc, `top_k=3`. The §5bis diagram is schematic. Documented in §2.5 + H-02.	§2.5, H-02
M-03	Gap 3: non-canonical `tags_hash`	Strict canonicalization: NFC + lowercase → dedup → ascending sort → join with `\n` → SHA-256 lowercase hex. Single function `compute_tags_hash()` shared by the learnings producers and scoring.	§2.6, INV-plan-03
M-06	Gap 6: JSONL atomicity	Tempfile + fsync + atomic rename (POSIX) pattern for every rewrite. Appends guarded by an advisory flock (`LOCK_EX`) on a per-file `.lock` sentinel. The reader works on a snapshot (fd opened on entry).	§2.7, INV-plan-06

1. Component breakdown

Module	Responsibility	Owner (step 6b agent)	Files
`data-format`	Canonicalization of tags_hash, validation of the D-295-* schemas, atomic JSONL helpers (tempfile+rename, flock). No business logic.	`agent-tooling-core`	`scripts/lib/data_format.py`, `scripts/lib/jsonl_atomic.py`
`veille-indexer` (B1)	Scans `ProbatioVault-doc/docs/veille/**/*.md`, extracts the frontmatter, produces `data/veille.jsonl`, the Ollama embeddings, and the FAISS index `data/veille-index.faiss` + `data/veille-embeddings.npy`.	`agent-tooling-indexer`	`scripts/veille-index.py`, `scripts/lib/veille_parser.py`
`veille-search`	CLI `/veille-search` with `--impact` and `--verdict` filters, KNN top_k.	`agent-tooling-indexer`	`scripts/veille-search.py`
`clarifications-store` (B2)	Verbatim persistence of `PD-XX-clarifications.md` (4 PO answers), indexing into `data/clarifications.jsonl` + a dedicated FAISS index.	`agent-tooling-clarifications`	`scripts/clarifications-persist.py`, `scripts/clarifications-index.py`
`clarifications-search`	CLI `/clarifications`: exact filter `--domain --project` + sort by recency.	`agent-tooling-clarifications`	`scripts/clarifications-search.py`
`scoring` (B3)	Reads `learnings-injections.jsonl`, computes `reuse_score` (formula INV-295-05), writes `data/learnings-scores.jsonl`, joinable via `{story, gate, tags_hash}`.	`agent-tooling-scoring`	`scripts/compute-reuse-scores.py`
`lifecycle` (B4)	Initial migration to `scope=story`, transition engine (FSM), promotion `story→domain→global`, eviction to `learnings-archive.jsonl`.	`agent-tooling-lifecycle`	`scripts/learnings-lifecycle.py`, `scripts/lib/scope_fsm.py`
`injection` (B5)	Step 0 orchestration: retrieval from 3 sources, assembly of a 3-section Markdown block (5/3/3 max), non-blocking degraded mode, stderr on error.	`agent-tooling-injection`	`scripts/gov-learnings-inject.py` (extension of the existing script), `scripts/lib/inject_block.py`
`gov-step-0-glue`	Hook in `.claude/commands/gov-step-0.md` / `scripts/gov-step.sh` that triggers `clarifications-store` once the PO answers are collected and `injection` at the start of step 0.	`agent-tooling-glue`	`scripts/gov-step.sh` (patch), `.claude/commands/gov-step-0.md` (patch)
`analyze-compounding`	Measures KPIs CS-1..CS-4 post-merge (outside Gate 8).	`agent-tooling-metrics`	`scripts/analyze-compounding.py` (extension)

No other module is modified (cross-module guards): PD-295 is fully confined to ProbatioVault-ia-governance/scripts/ and ProbatioVault-ia-governance/data/. No HTTP route, no NestJS guard, no SQL schema.

2. Technical flows

2.1 Flow B1: veille indexing

1. veille-index.py walks ProbatioVault-doc/docs/veille/**/*.md with glob.glob(recursive=True).
2. For each file: parse the YAML frontmatter with python-frontmatter. If the frontmatter is missing or invalid, the note is skipped with print(f"[veille] skipped: {path}: {err}", file=sys.stderr).
3. Schema validation: required fields title, date, verdict_veille ∈ {signal,bruit,veille}, impact_pv ∈ {fort,modere,faible,aucun}. Validation goes through data_format.validate_veille_record().
4. Extraction: the H1 is the fallback for title; the summary is the first paragraph under ## Résumé, or else the first 400 characters.
5. JSONL line construction: {path, title, date, verdict_veille, impact_pv, summary, tags[]}. Written via jsonl_atomic.write_all() (tempfile + fsync + rename).
6. Embeddings: for each summary, call Ollama POST /api/embeddings with model nomic-embed-text, timeout 30000 ms. Expected dimension is 768, otherwise raise ValueError.
7. FAISS: IndexFlatIP with 768 dimensions, mapping index position → line of veille.jsonl. data/veille-index.faiss and data/veille-embeddings.npy are persisted via atomic rename.
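The validation step of this flow can be sketched as follows. This is a minimal illustration, not the actual `data_format.validate_veille_record()`: the enum values come from the plan, while the ISO date check, the helper `_in_enum`, and the "list of error strings" return shape are assumptions made for the example.

```python
from datetime import date

# Enum values are taken from the plan; everything else here is an assumption.
VERDICTS = {"signal", "bruit", "veille"}
IMPACTS = {"fort", "modere", "faible", "aucun"}
REQUIRED = ("title", "date", "verdict_veille", "impact_pv")


def _in_enum(value, allowed: set) -> bool:
    """True only for a string that belongs to the allowed set."""
    return isinstance(value, str) and value in allowed


def validate_veille_record(meta: dict) -> list:
    """Return the list of validation errors; an empty list means the record is valid."""
    errors = [f"missing field: {name}" for name in REQUIRED if not meta.get(name)]
    if meta.get("verdict_veille") and not _in_enum(meta["verdict_veille"], VERDICTS):
        errors.append(f"invalid verdict_veille: {meta['verdict_veille']!r}")
    if meta.get("impact_pv") and not _in_enum(meta["impact_pv"], IMPACTS):
        errors.append(f"invalid impact_pv: {meta['impact_pv']!r}")
    raw_date = meta.get("date")
    if raw_date and not isinstance(raw_date, date):
        try:
            date.fromisoformat(str(raw_date))  # assumed format: YYYY-MM-DD
        except ValueError:
            errors.append(f"invalid date: {raw_date!r}")
    return errors
```

A caller would skip the note and log to stderr whenever the returned list is non-empty, which keeps the scan non-blocking.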

2.2 Flow B2: persisted clarifications

1. gov-step-0-glue collects the 4 PO answers via gov_ask_po text (existing helper scripts/lib/gov-interact.sh, compliant with Ringbearer mode).

2. clarifications-persist.py --story PD-XXX --project ia-governance writes the Markdown file:

# PD-XXX — Clarifications PO
Date : 2026-04-12
Domain : tooling
Project : ia-governance

## Q1. {question 1}
{verbatim R1}

## Q2. {question 2}
...

3. Re-entry rule: if PD-XXX-clarifications.md already exists, create PD-XXX-clarifications-{YYYY-MM-DD-HHMMSS}.md next to it (history is appended, never overwritten). See §9 (re-entry, Gap 12): all successive versions are indexed, and the most recent one is returned by default.
4. clarifications-index.py reads every PD-*-clarifications*.md in the workspace, produces data/clarifications.jsonl (one record per question, i.e. 4 lines per file) with {story_id, domain, project, question_idx, verbatim, date}, then the FAISS index data/clarifications-index.faiss. Ollama embeddings are computed on the concatenated verbatim.
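The re-entry rule can be illustrated with a small path-selection helper. This is a sketch under assumptions: the function name `clarification_path` and its parameters are hypothetical, and only the two file-name patterns come from the plan.

```python
from datetime import datetime
from pathlib import Path


def clarification_path(directory: Path, story: str, now: datetime) -> Path:
    """Pick the file to write: the base name on first run, a timestamped sibling afterwards."""
    base = directory / f"{story}-clarifications.md"
    if not base.exists():
        return base
    stamp = now.strftime("%Y-%m-%d-%H%M%S")  # matches {YYYY-MM-DD-HHMMSS}
    return directory / f"{story}-clarifications-{stamp}.md"
```

Passing `now` explicitly, rather than calling `datetime.now()` inside, keeps the helper deterministic in tests.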

2.3 Flow B3: reuse scoring (resolves M-01)

compute-reuse-scores.py reads data/learnings-injections.jsonl (if missing: treated as an empty corpus + print(..., file=sys.stderr), see ERR-295-03).
Aggregation by (story, gate, tags_hash) → {nb_injections, set(stories_gate8_go), set(domains)}.
Application of the raw formula INV-295-05:
```
reuse_score = (
    0.4 * nb_injections
    + 0.4 * nb_stories_gate8_go_apres_injection
    + 0.2 * nb_domains_distincts
)
```
No normalization: the score is a weighted sum with values in [0, 999999.9999] (D-295-08). TC-NOM-05 validates score(L1)=1.8 and score(L2)=1.0, which confirms the intent.
Interpretation of the promotion thresholds (INV-295-09/10):
- reuse_score >= 0.3 (i.e. nb_injections>=1 OR gate8_go>=1 OR domains>=2) → eligible for domain
- reuse_score >= 0.6 (for example nb_injections>=2, or nb_inj=1 + gate8_go=1) → eligible for global
data/learnings-scores.jsonl is written atomically, one line per key: {story, gate, tags_hash, reuse_score, nb_injections, nb_stories_gate8_go_apres_injection, nb_domains_distincts, computed_at}.
Learnings that cannot be joined (no line in learnings-injections.jsonl) produce no line in learnings-scores.jsonl. Downstream semantics (resolves Gap 9): any scope_fsm.get_score() lookup on an unscored learning returns 0.0 via dict.get(key, 0.0). Secondary sort INV-295-07: a missing score is treated as 0.0, and the sort is stable (Python's sorted() is always stable; it takes no stable argument).
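As a worked illustration of the aggregation and the raw formula, here is a minimal sketch. The record fields `target_story`, `target_domain` and `target_gate8_go` are hypothetical names chosen for the example (the plan does not list the columns of learnings-injections.jsonl), and rounding to 4 decimals is an assumption derived from the stated range.

```python
from collections import defaultdict


def compute_reuse_score(nb_injections: int, nb_gate8_go: int, nb_domains: int) -> float:
    """Raw weighted sum from INV-295-05; deliberately not normalized."""
    return 0.4 * nb_injections + 0.4 * nb_gate8_go + 0.2 * nb_domains


def score_injections(records: list) -> dict:
    """Aggregate injection records by (story, gate, tags_hash) and score each key."""
    groups = defaultdict(lambda: {"n": 0, "go": set(), "domains": set()})
    for rec in records:
        group = groups[(rec["story"], rec["gate"], rec["tags_hash"])]
        group["n"] += 1
        if rec["target_gate8_go"]:  # hypothetical flag: the receiving story passed Gate 8
            group["go"].add(rec["target_story"])
        group["domains"].add(rec["target_domain"])
    return {
        key: round(compute_reuse_score(g["n"], len(g["go"]), len(g["domains"])), 4)
        for key, g in groups.items()
    }
```

With two injections into two Gate 8 GO stories of the same domain, the key scores 0.8 + 0.8 + 0.2 = 1.8; a key absent from the result is read downstream as `scores.get(key, 0.0)`.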

2.4 Flow B4: promotion and eviction

1. learnings-lifecycle.py --migrate: reads learnings.jsonl and adds "scope": "story" to every entry that has no scope field. Idempotent: if scope is already present, no-op. A scope_migrated_at field is persisted as a completion sentinel (resolves Gap 7).
2. learnings-lifecycle.py --promote: FSM from scope_fsm.py. For each learning, look up reuse_score via the {story, gate, tags_hash} join. Applies strictly:
- story → domain if reuse_score >= 0.3
- domain → global if reuse_score >= 0.6
- Never a downgrade: any backward transition raises TransitionRefused; the caller logs to stderr and leaves the learning unchanged.
3. learnings-lifecycle.py --evict: for each learning, computes age_days = (today - date_created).days. The reference field is date_created, persisted in the learning header (created by gov-compounder). If it is absent, fall back to the D-295-12 date, interpreted as the creation date (resolves Gap 14, H-03). Eviction if and only if age_days > 56 AND nb_injections == 0. Move to learnings-archive.jsonl: append to the archive (flock), then atomically rewrite learnings.jsonl without the evicted learning.
4. ARCHIVED is terminal: scope_fsm.transition(archived, *) always raises TransitionRefused("ARCHIVED terminal, résolution manuelle uniquement").
5. The active indexes (learnings-index.faiss, learnings-cache.json) are rebuilt at the end of a lifecycle run by calling the existing learnings-index-rebuild.py, with learnings-archive.jsonl excluded from the source corpus.
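The promotion and eviction rules above can be sketched as a whitelist FSM. This is an illustration under assumptions, not the content of scope_fsm.py: the three `→ archived` edges in `ALLOWED_TRANSITIONS` are a guess (the plan only states that archived is terminal), and `promote` and `should_evict` are hypothetical helper names.

```python
from datetime import date


class TransitionRefused(Exception):
    """Raised for any transition that is not explicitly whitelisted."""


# Assumption: eviction may start from any active scope; promotions come from the plan.
ALLOWED_TRANSITIONS = {
    ("story", "domain"), ("domain", "global"),
    ("story", "archived"), ("domain", "archived"), ("global", "archived"),
}
PROMOTION_RULES = {"story": ("domain", 0.3), "domain": ("global", 0.6)}


def transition(current: str, target: str) -> str:
    """Return the new scope, or raise TransitionRefused (covers downgrades and archived)."""
    if (current, target) not in ALLOWED_TRANSITIONS:
        raise TransitionRefused(f"{current} -> {target} refused")
    return target


def promote(scope: str, reuse_score: float) -> str:
    """Apply at most one promotion step; story never jumps straight to global."""
    rule = PROMOTION_RULES.get(scope)
    if rule is None or reuse_score < rule[1]:
        return scope
    return transition(scope, rule[0])


def should_evict(date_created: date, today: date, nb_injections: int) -> bool:
    """Conjunctive rule: older than 56 days AND never injected."""
    return (today - date_created).days > 56 and nb_injections == 0
```

Because the whitelist contains no edge leaving `archived`, the terminal-state rule needs no special case.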

2.5 Flow B5: unified step 0 injection (resolves M-02)

1. gov-learnings-inject.py receives --story PD-XXX --domain tooling --project ia-governance --query-text "...".
2. Section 1, learnings (top 5): embed(query_text) via Ollama, KNN on learnings-index.faiss (archive excluded), primary sort = cosine score, secondary sort = reuse_score desc (fallback 0.0), tertiary sort = date_created desc (deterministic tie-break, resolves Gap 13).
3. Section 2, veille (top 3): embed(query_text), KNN on veille-index.faiss, post-KNN filter impact_pv ∈ {fort, modere}, then truncation to 3. A recommended test scenario (TC-PLAN-04, §5) was added to cover INV-295-15/16 (Gap 11).
4. Section 3, clarifications (top 3): exact filter on clarifications.jsonl by {domain, project} (no KNN, resolves M-02), sort by date desc, truncation to 3. No embedding and no Ollama call for this section, so it cannot fail because of Ollama.

5. Strict Markdown assembly:

## Learnings injectés (reuse_score)
- [PD-XXX gate 5] {extrait} (score=1.8)
- ...

## Fiches veille pertinentes
- [2026-04-10] {title} (impact={impact_pv})
- ...

## Clarifications PO passées (domain/project)
- [PD-YYY] Q1: {verbatim 200 chars}
- ...

6. Non-blocking degraded mode, per section:

| Case | Behavior |
| --- | --- |
| 0 results | Section contains _aucun résultat_ |
| Exception (IO/FAISS/Ollama) | try/except Exception as e: print(f"[inject] section {name} failed: {e}", file=sys.stderr); section="_source indisponible_" |

7. Output goes to stdout → captured by gov-step-0.md and concatenated to the step 0 prompt.
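The assembly and the per-section degraded mode can be sketched as below. The function names `render_section` and `build_block` and the "fetcher callable" interface are assumptions for illustration; the section titles, the 5/3/3 limits, the placeholder strings and the stderr format come from the plan.

```python
import sys
from typing import Callable, List


def render_section(name: str, title: str, fetch: Callable[[], List[str]], limit: int) -> str:
    """Render one section; a failing source degrades to a placeholder instead of raising."""
    try:
        items = fetch()[:limit]
        body = "\n".join(f"- {item}" for item in items) if items else "_aucun résultat_"
    except Exception as e:  # broad on purpose: IO, FAISS and Ollama errors all degrade
        print(f"[inject] section {name} failed: {e}", file=sys.stderr)
        body = "_source indisponible_"
    return f"## {title}\n{body}"


def build_block(fetch_learnings, fetch_veille, fetch_clarifications) -> str:
    """Assemble the fixed 3-section Markdown block (5/3/3 items max)."""
    return "\n\n".join([
        render_section("learnings", "Learnings injectés (reuse_score)", fetch_learnings, 5),
        render_section("veille", "Fiches veille pertinentes", fetch_veille, 3),
        render_section("clarifications", "Clarifications PO passées (domain/project)",
                       fetch_clarifications, 3),
    ])
```

Each fetcher is isolated in its own try/except, so one unavailable source never prevents the other two sections from being rendered.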

2.6 tags_hash canonicalization (resolves M-03)

Single function data_format.compute_tags_hash(tags: list[str]) -> str:

import hashlib, unicodedata

def compute_tags_hash(tags: list[str]) -> str:
    if tags is None:
        tags = []
    normalized = []
    for t in tags:
        t = unicodedata.normalize("NFC", t).strip().lower()
        if not t.startswith("#"):
            t = f"#{t}"
        normalized.append(t)
    canonical = "\n".join(sorted(set(normalized)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

Contractual rules (INV-plan-03):
- Every producer of a learnings.jsonl OR learnings-scores.jsonl line MUST call compute_tags_hash(); no other implementation is allowed.
- Mandatory unit test test_tags_hash_canonical_idempotent: permutations, case, dedup, whitespace, NFD/NFC → same hash.
- Cross-module invariant in code-contracts.yaml: forbidden: "Implémenter sa propre canonicalisation tags_hash".
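A possible shape for the mandatory idempotence test is sketched below. The function body is restated from this section so that the snippet runs on its own; the sample tags and the helper name `hashes_agree` are illustrative assumptions.

```python
import hashlib
import unicodedata


def compute_tags_hash(tags) -> str:
    """Restated from §2.6 so the example is self-contained."""
    normalized = []
    for t in tags or []:
        t = unicodedata.normalize("NFC", t).strip().lower()
        if not t.startswith("#"):
            t = f"#{t}"
        normalized.append(t)
    canonical = "\n".join(sorted(set(normalized)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hashes_agree(*variants) -> bool:
    """True when every variant of the tag list canonicalizes to the same hash."""
    return len({compute_tags_hash(v) for v in variants}) == 1


# Same tags with different order, case, whitespace, duplicates and Unicode forms.
nfd_cafe = unicodedata.normalize("NFD", "café")
variant_a = ["Python", "  faiss ", "#python", nfd_cafe]
variant_b = ["#faiss", "café", "python"]
```

`hashes_agree(variant_a, variant_b)` holds because both lists reduce to the same sorted set `#café`, `#faiss`, `#python` before hashing.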

2.7 JSONL atomicity (resolves M-06)

Module scripts/lib/jsonl_atomic.py, 3 primitives:

import fcntl
import json
import os
from pathlib import Path
from typing import Iterable

def write_all(path: Path, lines: Iterable[dict]) -> None:
    """Full rewrite: temp file + fsync + atomic POSIX rename."""
    # Temp file lives next to the target so the rename stays on one filesystem.
    tmp = path.with_suffix(path.suffix + ".tmp")
    with open(tmp, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(json.dumps(line, ensure_ascii=False) + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic on POSIX

def append_locked(path: Path, line: dict, timeout: float = 5.0) -> None:
    """Append guarded by an advisory flock (LOCK_EX) on a .lock sentinel."""
    lock_path = path.with_suffix(path.suffix + ".lock")
    with open(lock_path, "a+") as lock_fd:
        # _acquire_flock: module-private helper that waits up to `timeout` seconds for the lock.
        _acquire_flock(lock_fd, fcntl.LOCK_EX, timeout)
        try:
            with open(path, "a", encoding="utf-8") as f:
                f.write(json.dumps(line, ensure_ascii=False) + "\n")
                f.flush()
                os.fsync(f.fileno())
        finally:
            fcntl.flock(lock_fd, fcntl.LOCK_UN)  # released even if the write fails

def read_snapshot(path: Path) -> list[dict]:
    """Snapshot read: open the fd, then read everything. A concurrent rename cannot corrupt it."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(l) for l in f if l.strip()]

Contractual rules (INV-plan-06):
- Every write to veille.jsonl, clarifications.jsonl, learnings-scores.jsonl, learnings.jsonl, learnings-archive.jsonl MUST go through jsonl_atomic.*.
- Forbidden: direct open(path, "w"), shell > redirection, pandas.to_json(path).
- The reader (injection) uses read_snapshot() → a concurrent rename by the writer never corrupts the read (Linux/macOS: rename(2) is atomic and the fd keeps pointing at the original inode).
- The mono-instance hypothesis H-295-01 (spec §9) still holds functionally, but these primitives reinforce it technically: even if mono-instance is accidentally violated, there is no corruption.

2bis. Step 6b agent dependency diagram

graph LR
    subgraph Wave1["Wave 1: Foundations (parallel)"]
        CORE[agent-tooling-core<br/>data-format + jsonl_atomic]
    end
    subgraph Wave2["Wave 2: Indexers (parallel, depend on CORE)"]
        VEI[agent-tooling-indexer<br/>veille-indexer + search]
        CLR[agent-tooling-clarifications<br/>clarifications-store + search]
        SCO[agent-tooling-scoring<br/>reuse_score]
    end
    subgraph Wave3["Wave 3: Lifecycle + Injection"]
        LIF[agent-tooling-lifecycle<br/>scope FSM + eviction]
        INJ[agent-tooling-injection<br/>unified 5/3/3 block]
    end
    subgraph Wave4["Wave 4: Glue + Metrics"]
        GLU[agent-tooling-glue<br/>gov-step-0 hook]
        MET[agent-tooling-metrics<br/>analyze-compounding]
    end
    subgraph Wave5["Wave 5: Tests"]
        TST[agent-tooling-tests<br/>pytest + fixtures]
    end

    CORE --> VEI
    CORE --> CLR
    CORE --> SCO
    SCO --> LIF
    VEI --> INJ
    CLR --> INJ
    LIF --> INJ
    INJ --> GLU
    LIF --> MET
    GLU --> TST
    MET --> TST

5 waves. Wave 1 (CORE) must be merged before the others start, because it provides the shared primitives (tags_hash, jsonl_atomic). Wave 5 consumes everything for the integration tests.

3. Mapping invariants → mechanisms

Invariant ID	Requirement	Mechanism	Component	Observable	Risk
INV-295-01	Non-blocking veille scan	`glob.glob("**/*.md", recursive=True)` + per-file `try/except` with stderr	`veille-indexer`	stderr `[veille] skipped` + a line produced for each valid note	Ambiguous YAML frontmatter
INV-295-02	Compliance with §5.1 veille	`data_format.validate_veille_record()` regex/enums D-295-13/14	`data-format`	Validation failure → line rejected + stderr	D-295 fields under-specified (Gap 5) → see H-04
INV-295-03	Raw verbatim of the 4 PO answers	Markdown written without transformation (byte-for-byte copy)	`clarifications-store`	SHA-256 hash of input R1..R4 == hash of output sections Q1..Q4	Re-entry (Gap 12) → §2.2 step 3
INV-295-04	Clarifications indexing	`clarifications-index.py` produces JSONL + FAISS	`clarifications-store`	`ls data/clarifications.jsonl data/clarifications-index.faiss` OK	JSONL corruption → `jsonl_atomic`
INV-295-05	Exact `reuse_score` formula	Function `compute_reuse_score(inj, g8, doms)` with hardcoded constants `0.4, 0.4, 0.2`	`scoring`	Unit test TC-NOM-05 checks 1.8 / 1.0	Normalization not requested: M-01 resolved
INV-295-06	Unambiguous join	Composite key `(story, gate, tags_hash)` + uniqueness assert on write	`scoring`	Size of the key set = number of lines	Diverging tags_hash → M-03 resolved
INV-295-07	Secondary sort by reuse_score	`sorted(hits, key=lambda h: (h.similarity, h.reuse_score, h.date_created), reverse=True)`	`injection`	Test TC-NOM-07	Missing score → `dict.get(key, 0.0)`
INV-295-08	Migration to `scope=story`	`learnings-lifecycle.py --migrate`, idempotent	`lifecycle`	`scope_migrated_at` sentinel persisted	Re-run (Gap 7) → idempotence §2.4
INV-295-09	Threshold 0.3 story→domain	`if score >= 0.3: fsm.promote(story, domain)`	`lifecycle`	TC-NOM-09 boundaries 0.29/0.30	-
INV-295-10	Threshold 0.6 domain→global	`if score >= 0.6: fsm.promote(domain, global)`	`lifecycle`	TC-NOM-09 boundaries 0.59/0.60	-
INV-295-11	Conjunctive stale eviction	`if age_days > 56 and nb_injections == 0: archive()`	`lifecycle`	TC-NOM-10 4-case matrix	Date basis (Gap 14) → H-03
INV-295-12	Archive excluded from the active index	Rebuild of `learnings-index.faiss` excludes `learnings-archive.jsonl`	`lifecycle`	TC-NOM-11: archived hit absent	Reader uses a snapshot during rebuild
INV-295-13	3-section block, 5/3/3 max	`hits[:5]`, `hits[:3]`, `hits[:3]` + fixed Markdown assembler	`injection`	Markdown parsing: 3 `##` sections, cardinalities	-
INV-295-14	Non-blocking `aucun résultat`	`if not hits: section="_aucun résultat_"`	`injection`	TC-NOM-13	-
INV-295-15	Non-blocking source error	Per-section `try/except Exception` + `print(..., file=sys.stderr)`	`injection`	TC-NOM-14 / TC-ERR-04/05: stderr + empty section + exit 0	-
INV-295-16	Exclusions honored	No dependency on HMAC, a Vault SDK, a JCS library, PII regexes, etc.	All	`grep -r "hmac\\|vault\\|jcs\\|PII" scripts/` → 0 hits outside tests	-
INV-295-TR-01/03/05/06/09	Allowed transitions	`scope_fsm.ALLOWED_TRANSITIONS` dict	`lifecycle`	TC-NOM-16	-
INV-295-TR-02/04/07/08/10..13	Forbidden transitions	`scope_fsm.transition()` raises `TransitionRefused`	`lifecycle`	TC-ERR-07, TC-ERR-11: stderr + scope unchanged	-

Invariants added by the plan (not in the spec)

ID	Rule	Rationale	Target test
INV-plan-03	Single `tags_hash` canonicalization via `data_format.compute_tags_hash()`	M-03: avoid divergence between producers	`test_tags_hash_canonical_idempotent`
INV-plan-06	Every JSONL write goes through `jsonl_atomic.*` (tempfile+rename or flock append)	M-06: anti-corruption	`test_jsonl_atomic_concurrent_write_rename`
INV-plan-07	A hit with a missing score is treated as 0.0 (not null)	Gap 9: deterministic sort	`test_sort_missing_score_as_zero`
INV-plan-08	Tertiary sort by `date_created` desc	Gap 13: deterministic tie-break	`test_tiebreak_date_created_desc`
INV-plan-09	Age basis = `date_created`, then fallback to D-295-12	Gap 14: explicit semantics	`test_eviction_age_reference`
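INV-295-07, INV-plan-07 and INV-plan-08 together describe a three-level descending sort. A minimal sketch, assuming a hypothetical `Hit` dataclass and a `scores` dictionary keyed by `(story, gate, tags_hash)`:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    key: tuple          # (story, gate, tags_hash), used to look up the reuse_score
    similarity: float   # cosine similarity returned by the KNN search
    date_created: date


def rank(hits: list, scores: dict) -> list:
    """Sort by similarity, then reuse_score (missing = 0.0), then date_created, all descending."""
    return sorted(
        hits,
        key=lambda h: (h.similarity, scores.get(h.key, 0.0), h.date_created),
        reverse=True,  # one flag covers the three criteria and also works for date objects
    )
```

Using `reverse=True` instead of negating each field avoids negating a `date`, which Python does not support.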

4. Mapping acceptance criteria → mechanisms

Criterion ID	Mechanism(s)	Component	Observable	Risk
CA-295-01	§2.1 steps 1-5	`veille-indexer`	`wc -l data/veille.jsonl` == number of valid notes	-
CA-295-02	`data_format.validate_veille_record()`	`data-format`	Pass/fail per line	Gap 5: fields not enumerated
CA-295-03	§2.2 step 2	`clarifications-store`	Byte-level diff of verbatim input/output	-
CA-295-04	§2.2 step 4	`clarifications-store`	`ls` + search returns a hit	M-02 resolved (exact filter)
CA-295-05	`compute_reuse_score()` pure function	`scoring`	Unit test TC-NOM-05	-
CA-295-06	Composite key + uniqueness assert	`scoring`	`len(set(keys)) == len(lines)`	M-03 resolved
CA-295-07	§2.5 step 2 secondary sort	`injection`	TC-NOM-07	-
CA-295-08	§2.4 step 1 migration	`lifecycle`	`jq '.scope == "story"' learnings.jsonl` on pre-migration entries	-
CA-295-09	§2.4 step 2 strict thresholds	`lifecycle`	TC-NOM-09, TC-NOM-16	-
CA-295-10	§2.4 step 3 conjunctive rule	`lifecycle`	TC-NOM-10 matrix	-
CA-295-11	§2.4 step 5 index rebuild excluding the archive	`lifecycle`	TC-NOM-11	-
CA-295-12	§2.5 step 5 fixed assembler	`injection`	Markdown parsing: 3 `##`, counts 5/3/3 max	-
CA-295-13	§2.5 step 6 degraded mode	`injection`	TC-NOM-13, TC-ERR-06	-
CA-295-14	§2.5 step 6 try/except	`injection`	TC-NOM-14, TC-ERR-04/05/09	-
CA-295-15	`scope_fsm.transition()`	`lifecycle`	TC-ERR-07	-
CA-295-16	`ARCHIVED` terminal	`lifecycle`	TC-ERR-11	-
CA-295-17	Python/FAISS/Ollama/MD/YAML stack	All	`pip list` + no Swift/Spring in the repo	-

5. Mapping tests (TC-*) → mechanisms + observables

Test ID	Spec reference	Mechanism(s)	Observation point(s)	Level
TC-NOM-01	INV-295-01, CA-295-01	§2.1 glob + per-file try/except	`data/veille.jsonl` has 3 lines, stderr has 2 warnings	Integration
TC-NOM-02	INV-295-02, CA-295-02	`validate_veille_record()`	No invalid line accepted	Unit
TC-NOM-03	INV-295-03, CA-295-03	§2.2 step 2	Hash of verbatim input/output	Integration
TC-NOM-04	INV-295-04, CA-295-04	§2.2 step 4	JSONL + FAISS present + search hit	Integration
TC-NOM-05	INV-295-05, CA-295-05	`compute_reuse_score()`	Scores exactly 1.8 and 1.0	Unit
TC-NOM-06	INV-295-06, CA-295-06	Key `(story,gate,tags_hash)` + canonical tags_hash	`len(set(keys))==len(lines)`	Unit + Integration
TC-NOM-07	INV-295-07, CA-295-07	§2.5 step 2 multi-key sorted	Order of hits	Unit
TC-NOM-08	INV-295-08, CA-295-08	`--migrate` idempotent	`scope_migrated_at` present, all entries in `story`	Integration
TC-NOM-09	INV-295-09/10, CA-295-09	`scope_fsm.promote()` thresholds	Scope after the run	Unit
TC-NOM-10	INV-295-11, CA-295-10	Conjunctive `--evict`	`learnings-archive.jsonl` contains A, not B/C/D	Integration
TC-NOM-11	INV-295-12, CA-295-11	Rebuild excludes the archive	Search never returns an archived learning	Integration
TC-NOM-12	INV-295-13, CA-295-12	Fixed assembler	Markdown parser	Integration
TC-NOM-13	INV-295-14, CA-295-13	`if not hits: "_aucun résultat_"`	stdout contains `_aucun résultat_`, exit 0	Integration
TC-NOM-14	INV-295-15, CA-295-14	Per-section try/except	stderr + exit 0 + block rendered	Integration
TC-NOM-15	CA-295-17	N/A (documentation)	grep on the spec	Doc
TC-NOM-16	INV-295-TR-01/03/05/06/09	`scope_fsm.transition()` matrix	FSM table	Unit
TC-ERR-01	ERR-295-01	Per-line `validate_*()`	stderr + valid lines still processed	Integration
TC-ERR-02	ERR-295-02	§2.1 try/except	Same	Integration
TC-ERR-03	ERR-295-03	Empty corpus → `dict.get(k, 0.0)`	Scoring continues, stderr	Integration
TC-ERR-04	ERR-295-04	try/except on Ollama timeout	Empty section + stderr, exit 0	Integration
TC-ERR-05	ERR-295-05	try/except on FAISS load	Same	Integration
TC-ERR-06	ERR-295-06	§2.5 step 6	`_aucun résultat_`	Integration
TC-ERR-07	ERR-295-07, INV-295-TR-02/04/07/08	`TransitionRefused`	Scope unchanged + stderr	Unit
TC-ERR-08	ERR-295-08	Bound assertions §5.2	Test fails	Unit
TC-ERR-09	ERR-295-09	try/except OSError on write	stderr, step 0 continues	Integration
TC-ERR-10	ERR-295-10	Empty join → exclusion	stderr + learning excluded	Unit
TC-ERR-11	INV-295-TR-10..13	`ARCHIVED` terminal	All transitions refused	Unit
TC-NEG-01	§5.1 D-295	`validate_*()`	Rejected according to the Invalid column	Unit
TC-NEG-02	§5.2	Bound assertions	Fails	Unit
TC-NEG-03	INV-295-11	`--evict` boundaries 56/57	Scope diff	Unit
TC-NEG-04	D-295-15/16	`validate_clarification_filename()`	Refused	Unit
TC-NEG-05	Dim 768	Assert `embedding.shape == (768,)`	Index not published	Unit
TC-NR-01..05	§6	Dependency scan + behavior	Smoke tests	Integration

Plan tests (new, covering the resolutions)

Test ID	Resolution	Mechanism
TC-PLAN-01	M-03	`test_tags_hash_canonical_idempotent`: 10 permutations → same hash
TC-PLAN-02	M-06	`test_jsonl_atomic_concurrent_write_rename`: 2 processes, 1 writer + 1 reader, no corruption
TC-PLAN-03	Gap 9	`test_sort_missing_score_as_zero`: a learning without a score is ranked as score=0.0
TC-PLAN-04	Gap 11	`test_injection_veille_filter_impact_pv`: notes with `impact_pv=faible` are excluded from the injection block
TC-PLAN-05	Gap 13	`test_tiebreak_date_created_desc`: equal similarity and score → most recent date first

Test scope

Level	In scope	Out of scope
Unit	`data-format`, `scoring`, `scope_fsm`, `injection` assembler, `jsonl_atomic`	-
Integration	Complete flows B1..B5 on fixtures, cross-module `scoring ↔ lifecycle ↔ injection`	-
E2E	Full `gov-step-0` with a mocked Ollama on the PD-295 fixture itself (self-test)	-
Performance	-	Out of scope: no P95/P99 requirement (spec §5.2, percentile column is `n/a`)
Security	Regex audit for Markdown injection, absence of PII filters (compliant with INV-295-16)	Pentest, PII filters, HMAC signature (explicitly excluded by spec §2)

All functional levels are in scope. Performance and advanced security are out of scope by contractual decision of the spec (§2 exclusions).

6. Error handling

ID	Case	Handling	Observable
ERR-295-01	Invalid data §5.1	Line rejected + `print(f"[{module}] rejected: {reason}", file=sys.stderr)`	stderr + line absent from the artifact
ERR-295-02	Incomplete veille frontmatter	Skipped (§2.1 step 2)	stderr `[veille] skipped: {path}`
ERR-295-03	`learnings-injections.jsonl` missing	Empty corpus, `defaultdict(int)`, scoring continues	stderr `[scoring] injections corpus empty`
ERR-295-04	Ollama failure (30 s timeout or 5xx)	Per-section `try/except (TimeoutError, HTTPError)`	stderr `[inject] ollama {section} failed: {e}`
ERR-295-05	FAISS index missing/corrupted	`try/except (FileNotFoundError, RuntimeError)` on load	stderr `[inject] faiss {section} failed: {e}`
ERR-295-06	0 valid results	Section `_aucun résultat_`	stdout contains the string
ERR-295-07	Forbidden transition	`raise TransitionRefused(msg)` caught by the caller, stderr	scope unchanged, counter `refused_transitions++`
ERR-295-08	Value outside the §5.2 bounds	Python `assert` + failing test	pytest red
ERR-295-09	Clarification cannot be written	`try/except OSError`, step 0 continues without persistence	stderr `[clarifications] write failed: {path}: {e}`
ERR-295-10	Join impossible	Learning excluded from `learnings-scores.jsonl`	stderr `[scoring] no injection record for {key}`

Global policy: every stderr line follows the format [{module}] {short_msg}: {details} so that /gov-retrospective can parse it downstream (terminal.log injection, see project memory). No sys.exit(1) in B1/B2/B5: step 0 is non-blocking by contract.
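The stderr convention can be centralized in a tiny helper, with a matching parser for the downstream consumer. Both `log_err` and `parse_err` are hypothetical names, and the regular expression is an assumption about how /gov-retrospective might split the line (it expects no colon inside `short_msg`).

```python
import re
import sys

# Mirrors the documented format: [{module}] {short_msg}: {details}
ERR_LINE = re.compile(r"^\[(?P<module>[^\]]+)\] (?P<short_msg>[^:]+): (?P<details>.*)$")


def log_err(module: str, short_msg: str, details: str) -> str:
    """Write one formatted line to stderr and return it (handy for tests)."""
    line = f"[{module}] {short_msg}: {details}"
    print(line, file=sys.stderr)
    return line


def parse_err(line: str):
    """Return the three fields as a dict, or None if the line does not follow the format."""
    match = ERR_LINE.match(line.strip())
    return match.groupdict() if match else None
```

Routing every error message through one function makes it harder for a module to drift away from the shared format.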

7. Security impacts

Risk	Mitigation	Logging
Markdown injection via the PO verbatim (CommonMark passes the text through as-is)	The verbatim is inserted into Markdown blocks under `## Qi.` with no HTML interpretation. Claude Code renders it in monospace, so there is no active XSS vector.	-
Markdown injection via veille tags/summary	Same analysis. Veille notes are internal; author == user.	-
PO verbatim leaking into the FAISS index	Accepted by the spec (§2 excludes GDPR purge, §10 Gap 10). H-05 makes this limit explicit in §8.	Manual deletion = delete `PD-XXX-clarifications*.md`, then run a full reindex
JSONL corruption through concurrent writes	`jsonl_atomic.*` primitives (INV-plan-06)	stderr `[atomic] lock timeout` if the flock is blocked for more than 5 s
Ollama embeddings capture secrets in clear text	Ollama runs 100% locally (IA-Server or MacBook) and makes no call outside the local network, so embeddings are not sent to a third party.	-
Unspecified scope jump `story→global`	`scope_fsm` refuses any transition that is not explicitly allowed (whitelist)	stderr

Compliance with Art. VI (accountability): every try/except Exception is scoped to a single section or file and logs to stderr. No silent pass. No .catch(() => log) pattern (anti-catch-absorb, see universal learnings).

8. Technical assumptions

ID	Assumption	Impact if false
H-01	`reuse_score` is a raw weighted sum, not a ratio. TC-NOM-05 (scores 1.0/1.8) confirms it. The 0.3/0.6 thresholds apply to the raw sum (resolves M-01).	If a normalized ratio was actually intended, the 0.3/0.6 thresholds would not carry the same meaning: on the raw sum, a single injection (0.4) already crosses 0.3. In that case the spec must be amended in PD-295bis, not in PD-295.
H-02	The step 0 clarifications lookup is an exact filter on `{domain, project}` + sort by recency, not a vector KNN. The §5bis CLR diagram is schematic. `top_k_clarifications_step0=3` is a pagination limit on the filtered result (resolves M-02). A clarifications FAISS index is still built, to feed a future semantic `/clarifications-search` outside step 0.	If KNN was intended, switching the call from `store.search_exact(d,p)` to `store.search_knn(embed)` is enough: a change of fewer than 10 lines.
H-03	The age basis for eviction is `date_created`, persisted by gov-compounder. Fallback to the D-295-12 `date` if absent.	Incorrect eviction is possible for learnings that predate the introduction of `date_created`. Mitigation: one-shot backfill script setting `date_created = date` when absent, run during the initial B4 migration.
H-04	Veille fields not listed in §5.1 (`title`, `summary`) are validated on a best-effort basis: max length 256 / 2000, no strict regex. The §5.1 fields D-295-13/14 are strict.	Gap 5: if Gate 5 requires a strict regex on every field, add it to `validate_veille_record()`.
H-05	The raw PO verbatim stays persisted without PII filter or GDPR purge. Compliant with the §2 exclusions. Deletion = manual removal of the file + full reindex.	No impact before a PD-295+N that would introduce GDPR compliance.
H-06	Re-running step 0 for the same story: a new timestamped file is created next to the existing one (`PD-XXX-clarifications-{ts}.md`). All of them are indexed. Search returns the most recent one by default.	Impact on Gap 12: the behavior is made contractual by the plan.
H-07	Mono-instance (spec H-295-01) is the rule, but `jsonl_atomic` provides defense in depth. No correctness guarantee under real multi-process writes beyond what §2.7 documents.	-
H-08	Ollama is reachable at `http://localhost:11434` (laptop) or `http://192.168.1.82:11434` (IA-Server). The `OLLAMA_HOST` variable selects the fallback.	If both are down → degraded mode (empty sections), step 0 continues (INV-295-15).

9. Points of attention

Gap 5 (spec): spec §5.1 does not explicitly assign the D-295 fields to each artifact. The plan interprets this; Gate 5 must confirm that the interpretation of which regexes apply to which artifact is accepted.
Gap 8 (target vs max cardinalities): the plan chooses max (fewer is acceptable). If 0 hits → _aucun résultat_; if 1-4 hits → section with N items. No error for 1 ≤ N < top_k.
Gap 10 (GDPR): no targeted deletion feature. Debt documented as H-05.
Gap 11 (impact_pv filter): TC-PLAN-04 was added to cover a possible silent regression.
Gap 14 (date basis): the date_created field must be introduced together with the B4 migration, otherwise the fallback is likely to be used systematically.
Clarifications re-entry: accumulating PD-XXX-clarifications-{ts}.md files can become noisy for stories that are relaunched often. To be re-evaluated if a single story exceeds 5 versions.
Clarifications FAISS index debt: built but not used by step 0 (H-02). To be removed if decision M-02 is confirmed in the long term, or wired to the /clarifications CLI.
Cosine ranking with FAISS IndexFlatIP: requires L2 normalization of the embeddings before insertion (faiss.normalize_L2). Must not be forgotten in veille-indexer and learnings-indexer.
Parallelization of step 6b agents: the wave 2 agents (indexers) are independent of each other, but all of them consume data-format. Wave 1 must be 100% finished before wave 2 is scheduled. Watch for merge conflicts on scripts/lib/data_format.py.
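The IndexFlatIP point above rests on a simple identity: the inner product of two L2-normalized vectors equals their cosine similarity. The following standard-library sketch illustrates it without FAISS; in the real indexers the normalization would be done by `faiss.normalize_L2`, and the function names here are illustrative.

```python
import math


def l2_normalize(v: list) -> list:
    """Scale a vector to unit L2 norm; a zero vector cannot be normalized."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def inner_product(a: list, b: list) -> float:
    """Plain dot product, which is the metric an inner-product index ranks by."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list, b: list) -> float:
    """Reference cosine similarity computed directly from the raw vectors."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )
```

Without normalization, a long vector can outrank a better-aligned short one, which is why both the indexed embeddings and the query embedding need the same treatment.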

10. Out of scope

HMAC signature, JCS canonicalization, Vault key, strict fail-closed, PII filter, multi-artifact GDPR purge, read rate-limiting, neural reranker, hybrid BM25, knowledge graph, skill router, hierarchical summaries (spec §2).
Storage engine migration, rework of the existing pipeline outside B1..B5.
Any multi-tenant SaaS or external compliance requirement.
P95/P99 performance percentiles (none in spec §5.2).
Restoration from ARCHIVED (terminal state, manual resolution outside automation).
Exact enum validation of the project field D-295-04 (not enumerated in the spec; treated as a best-effort string and flagged as debt).
Perfect tie-break when similarity AND reuse_score AND date_created are all equal (extremely rare; accepted as non-deterministic beyond Python's stable ordering).
