3 Measuring Identity in Machines
Identity must produce empirical signals. We operationalise it along four measurable dimensions:
Aspect: Decision Persistence
Metric: mean cosine similarity between embedding trajectories of successive outputs for paraphrased prompts.
Interpretation: stability ≥ 0.75 indicates a consistent reasoning style; < 0.5 signals drift.
Aspect: Episodic Recall Accuracy
Metric: precision/recall on multi-session “who/what/where/when” tasks (Kočiský et al. 2018).
Interpretation: ≥ 80 % recall indicates effective episodic linkage; < 50 % indicates context collapse.
Aspect: Preference Drift
Metric: KL-divergence of resonance-weight distributions across epochs.
Interpretation: gradual divergence (< 0.1 nats/epoch) indicates healthy adaptation; spikes > 0.5 nats signal identity instability.
Aspect: Identity Transferability
Metric: behavioural Δ before/after importing another model’s memory graph.
Interpretation: low Δ (< 10 %) indicates successful transfer of the personality signature.
Use case: reproducing expert agents across domains or auditing safety replication.
These thresholds mirror prior similarity-stability studies (Graves et al. 2016; Madaan et al. 2023).
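Three of these metrics (persistence, drift, and transferability) reduce to short computations once outputs are embedded as vectors and resonance weights are expressed as probability distributions. The sketch below makes those assumptions explicit; the toy inputs and the 384-dimensional embedding size are illustrative, not part of any benchmark:

```python
import numpy as np

def decision_persistence(embeddings: np.ndarray) -> float:
    """Mean cosine similarity between successive output embeddings
    for paraphrased prompts (>= 0.75 suggests a stable style)."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = np.sum(normed[:-1] * normed[1:], axis=1)  # cosines of successive pairs
    return float(sims.mean())

def preference_drift(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """KL divergence D(p || q) in nats between resonance-weight
    distributions from consecutive epochs (< 0.1 nats/epoch is healthy)."""
    p, q = p + eps, q + eps          # smooth to avoid log(0)
    p, q = p / p.sum(), q / q.sum()  # renormalise after smoothing
    return float(np.sum(p * np.log(p / q)))

def behavioural_delta(before: list[str], after: list[str]) -> float:
    """Fraction of probe answers that changed after importing another
    model's memory graph (< 0.10 suggests a successful transfer)."""
    changed = sum(a != b for a, b in zip(before, after))
    return changed / len(before)

# Toy data only, to show the call pattern:
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 384))  # 5 paraphrase outputs, 384-d embeddings
print(decision_persistence(emb))
print(preference_drift(rng.dirichlet(np.ones(10)), rng.dirichlet(np.ones(10))))
print(behavioural_delta(["yes", "no", "yes"], ["yes", "no", "no"]))  # 0.33...
```

The thresholds from the table (0.75/0.5 for persistence, 0.1 nats/epoch for drift, 10 % for Δ) apply directly to these return values.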
Continuity QA Benchmarks — extend NarrativeQA (Kočiský et al. 2018) or Dialog bAbI (Weston et al. 2016) to multi-session tests:
Session 1: “My dog Max is afraid of fireworks.”
Session 2: “What did I say bothers Max?”
A model with episodic recall answers correctly without re-prompting.
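A minimal probe harness for this two-session pattern is sketched below. The agent interface is an assumption (new_session() clears only the context window, chat() returns a reply), and ToyAgent is a deliberately naive stand-in included only so the sketch runs:

```python
class ToyAgent:
    """Stand-in with long-term memory that survives session resets."""
    def __init__(self):
        self.long_term = []   # persists across sessions
        self.context = []     # cleared by new_session()

    def new_session(self):
        self.context = []

    def chat(self, text: str) -> str:
        self.context.append(text)
        self.long_term.append(text)
        # Naive recall: echo the stored utterance sharing the most words.
        words = set(text.lower().split())
        return max(self.long_term[:-1], default="",
                   key=lambda u: len(words & set(u.lower().split())))

def episodic_recall_score(agent, probes) -> float:
    """probes: (fact, question, expected_keyword) triples spanning two sessions."""
    hits = 0
    for fact, question, expected in probes:
        agent.new_session()
        agent.chat(fact)        # Session 1: state the fact
        agent.new_session()     # Session 2: fresh context window
        answer = agent.chat(question)
        hits += expected.lower() in answer.lower()
    return hits / len(probes)

probes = [("My dog Max is afraid of fireworks.",
           "What did I say bothers Max?",
           "fireworks")]
print(episodic_recall_score(ToyAgent(), probes))  # 1.0 for this toy store
```

A purely in-context model scores near zero on the same probes, because the fact never survives the session reset.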
Narrative Reasoning Benchmarks — use CLEVR (Johnson et al. 2017) or GQA (Hudson & Manning 2019) with temporal chains to measure event sequence accuracy.
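Scoring the temporal chains themselves can be done independently of the underlying dataset. One hedged sketch, assuming the benchmark yields a gold event order and the model emits a predicted order, is pairwise (Kendall-tau-style) order preservation:

```python
def event_sequence_accuracy(gold: list[str], pred: list[str]) -> float:
    """Fraction of adjacent gold event pairs whose relative order
    is preserved in the model's predicted sequence."""
    pos = {event: i for i, event in enumerate(pred)}
    pairs = [(a, b) for a, b in zip(gold, gold[1:]) if a in pos and b in pos]
    if not pairs:
        return 0.0
    return sum(pos[a] < pos[b] for a, b in pairs) / len(pairs)

print(event_sequence_accuracy(["wake", "eat", "leave"],
                              ["wake", "leave", "eat"]))  # 0.5
```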
Forgetting Tests — simulate GDPR deletion (European Commission 2018): after deleting “Patient ID 1234”, later queries about that record must fail while related knowledge is retained (Cao & Reiter 2021).
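Such a deletion audit reduces to a single pass/fail check. The sketch below assumes a hypothetical memory store exposing delete and answer operations; ToyStore is an illustrative stand-in, not a real memory-editing backend:

```python
class ToyStore:
    """Stand-in key-value memory; real systems would edit model memory."""
    def __init__(self, facts: dict[str, str]):
        self.facts = dict(facts)

    def delete(self, record_id: str):
        self.facts.pop(record_id, None)

    def answer(self, query: str) -> str:
        # Naive lookup: return the fact whose key appears in the query.
        return next((v for k, v in self.facts.items() if k in query), "")

def deletion_passes(store, record_id, direct_probes, related_probes) -> bool:
    """True iff the deleted record leaks nowhere and related knowledge survives."""
    store.delete(record_id)                                      # GDPR erasure
    no_leak = all(store.answer(q) == "" for q in direct_probes)
    retained = all(store.answer(q) != "" for q in related_probes)
    return no_leak and retained

store = ToyStore({"Patient ID 1234": "diagnosed with asthma",
                  "asthma": "a chronic inflammatory airway disease"})
print(deletion_passes(store, "Patient ID 1234",
                      direct_probes=["What is known about Patient ID 1234?"],
                      related_probes=["What is asthma?"]))  # True
```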
Together these benchmarks quantify continuity and controlled forgetting — core conditions for machine identity.
