
AI Agent Audit Logs for HIPAA, SOC 2, and GDPR Compliance

Akshay Sarode May 10, 2025

Direct answer

For HIPAA, SOC 2, and GDPR, auditors want four things: a record of who did what and when, with tamper evidence; a retention policy that matches the regulation; documented PII handling; and export capability. A tamper-evident hash-chained log, a clear retention policy, PII redaction at append time, and an open-source verifier together cover the technical bar. The non-technical bar (policy docs, training, access reviews) is on you.

"We have logs" is not "we have an audit log." Auditors care about specific properties. Here's what they actually look for and how to design your agent's audit trail to meet the bar.

The compliance regimes (short summary)

Regime	What it cares about (audit-wise)
HIPAA	Access logs for PHI; min 6-year retention; tamper resistance; access by role
SOC 2	Access logs for production; tamper resistance; review process; retention per policy (typically 1–7 years)
GDPR	Records of processing activity; data subject rights (export, delete); breach notification
PCI-DSS	Access logs for cardholder data environment; min 1-year retention; daily review
ISO 27001	Access logs; review process; tamper resistance; clock sync

The technical bar

1. Tamper evidence

Auditors want to know the log wasn't modified after the fact. A hash chain — row.hash = sha256(prev.hash || row.body) — gives this property cryptographically. Modifying any past row breaks every subsequent hash.
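To make the property concrete, here is a minimal sketch using Node's built-in crypto module. The row bodies are plain strings and the helper names (linkHash, buildChain, firstBrokenIndex) are made up for illustration; they are not the API of any particular library.

```javascript
const { createHash } = require('crypto');

// Hex SHA-256 of the previous hash concatenated with the row body.
function linkHash(prevHash, body) {
  return createHash('sha256').update(prevHash + body, 'utf8').digest('hex');
}

// Build a chain from plain string bodies; the first row uses "" as its previous hash.
function buildChain(bodies) {
  const rows = [];
  let prev = '';
  for (const body of bodies) {
    const hash = linkHash(prev, body);
    rows.push({ body, hash });
    prev = hash;
  }
  return rows;
}

// Return the index of the first row whose stored hash no longer matches, or -1.
function firstBrokenIndex(rows) {
  let prev = '';
  for (let i = 0; i < rows.length; i++) {
    if (linkHash(prev, rows[i].body) !== rows[i].hash) return i;
    prev = rows[i].hash;
  }
  return -1;
}

const chain = buildChain(['login agent:abc', 'read record:17', 'logout agent:abc']);
const intactIndex = firstBrokenIndex(chain);   // -1: nothing was changed

chain[1].body = 'read record:99';              // edit a past row in place
const brokenIndex = firstBrokenIndex(chain);   // 1: the edited row is flagged
```

The edited row is caught because its stored hash no longer matches its body. An attacker who also rewrites that hash simply moves the mismatch to the next row.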

2. Append-only by construction

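The chain makes violations detectable rather than impossible. Inserting, deleting, or editing a row in the middle changes every subsequent hash, so the verifier catches it. Someone with write access could still recompute the entire tail, which is why the last verified hash should be kept somewhere the log writer cannot change.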

3. Independent verification

An auditor should be able to verify the chain themselves. Open-source the verifier. Anyone with the export + the last verified hash can re-walk the chain and confirm integrity.
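The role of the last verified hash is easiest to see in a small sketch. It assumes the head hash was recorded somewhere outside the system that stores the log; the function names are illustrative, and row bodies are plain strings to keep it short.

```javascript
const { createHash } = require('crypto');

const sha256 = (s) => createHash('sha256').update(s, 'utf8').digest('hex');

// Recompute every hash from the bodies alone and return the final (head) hash.
function recomputeHead(bodies) {
  return bodies.reduce((prev, body) => sha256(prev + body), '');
}

// The auditor's check: the exported bodies must hash forward to the head
// that was recorded earlier, outside the system that stores the log.
function matchesAnchoredHead(bodies, anchoredHead) {
  return recomputeHead(bodies) === anchoredHead;
}

const original = ['grant cap:email', 'send email:1', 'revoke cap:email'];
const anchoredHead = recomputeHead(original);   // recorded at the last verification

// Someone with write access rewrites a row and recomputes every later hash.
// The rewritten chain is internally consistent, but its head has changed.
const rewritten = ['grant cap:email', 'send email:2', 'revoke cap:email'];

const originalOk = matchesAnchoredHead(original, anchoredHead);
const rewriteOk = matchesAnchoredHead(rewritten, anchoredHead);
```

Here originalOk is true and rewriteOk is false. Without the anchored head, the rewritten chain would pass a purely internal consistency check.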

4. Retention

HIPAA: 6 years from the date of creation or the date it was last in effect, whichever is later. SOC 2: per policy (typically 1–7 years). PCI-DSS: 1 year minimum. Pick the longest period that applies to your business and document it.
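"Pick the longest applicable" is simple enough to encode next to your retention config. A sketch, using only the figures quoted above; the function name and the soc2PolicyYears parameter are hypothetical, and SOC 2 is a parameter because it follows your own written policy.

```javascript
// Minimum retention in years, taken from the figures quoted in this article.
const FIXED_MINIMUM_YEARS = new Map([['HIPAA', 6], ['PCI-DSS', 1]]);

// Return the longest retention period among the regimes that apply to you.
function retentionYears(regimes, soc2PolicyYears) {
  if (regimes.length === 0) {
    throw new Error('List at least one regime');
  }
  const years = regimes.map((regime) => {
    if (regime === 'SOC 2') {
      // SOC 2 has no fixed number; it comes from your documented policy.
      if (!Number.isFinite(soc2PolicyYears)) {
        throw new Error('SOC 2 retention comes from your policy; pass soc2PolicyYears');
      }
      return soc2PolicyYears;
    }
    if (!FIXED_MINIMUM_YEARS.has(regime)) {
      throw new Error(`No retention figure recorded for ${regime}`);
    }
    return FIXED_MINIMUM_YEARS.get(regime);
  });
  return Math.max(...years);
}

const healthcareSaas = retentionYears(['HIPAA', 'SOC 2'], 3);   // 6
const paymentsOnly = retentionYears(['PCI-DSS', 'SOC 2'], 2);   // 2
```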

5. Clock sync

Timestamps must come from a trusted source. An NTP-synced server clock is fine; auditors don't expect you to run an atomic clock. Document the source.

6. PII handling

Don't log PII you can't justify retaining for the full retention period. Redact at append time — once a row is in the chain, you can't modify it without breaking the chain.

What goes in the log

For an AI agent in a regulated environment:

Agent created / authenticated / scope changed
Capability granted / revoked / used (with target resource)
Data accessed (with resource ID, NOT the data itself unless you must)
Data modified (with diff or hash, depending on PII rules)
External communication (email sent, API called, with recipient hash if PII-redacted)
Approval requested / decided (with approver ID, decision, timestamp)
Spend cap thresholds crossed
Sandbox blocks (kernel decisions)
Login / logout / failed auth
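As an illustration of one entry from this list, here is what an "external communication" row for a model API call might carry: actor, action, target URL, token count, latency, and response code. The field names and the modelCallEvent helper are assumptions for the sketch, not a fixed schema.

```javascript
// Build the audit body for one outbound model call. Only metadata is recorded;
// the prompt and completion text stay out of the log.
function modelCallEvent({ actor, url, status, totalTokens, startedAtMs, finishedAtMs }) {
  return {
    actor,
    action: 'http.call',
    target: url,
    response_code: status,
    token_count: totalTokens,
    latency_ms: finishedAtMs - startedAtMs,
  };
}

const event = modelCallEvent({
  actor: 'agent:abc',
  url: 'https://api.openai.com/v1/chat/completions',
  status: 200,
  totalTokens: 512,
  startedAtMs: 1000,
  finishedAtMs: 1840,
});
```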

What does NOT go in the log: PHI body content, full PII bodies, sensitive prompt content. Reference these by ID; store the actual data in a separately controlled store with its own access logging.
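Because a row cannot be edited once it is in the chain, a cheap guard in front of the append call can help. The sketch below rejects events that still contain an email-like string. The regular expression is a rough assumption for illustration, not a complete PII detector, and the helper names are hypothetical.

```javascript
// Rough pattern for email-like strings; illustrative only.
const EMAIL_LIKE = /[^\s@]+@[^\s@]+\.[^\s@]+/;

// Walk every string value in the event and collect the paths that look like raw emails.
function findRawEmails(value, path = 'event') {
  if (typeof value === 'string') {
    return EMAIL_LIKE.test(value) ? [path] : [];
  }
  if (value !== null && typeof value === 'object') {
    return Object.entries(value).flatMap(([key, child]) =>
      findRawEmails(child, `${path}.${key}`));
  }
  return [];
}

// Refuse to append an event that still carries an email-like value.
function assertRedacted(event) {
  const hits = findRawEmails(event);
  if (hits.length > 0) {
    throw new Error(`Unredacted value at: ${hits.join(', ')}`);
  }
  return event;
}

const safe = { actor: 'agent:abc', action: 'record.read', target: 'patient:8841' };
const unsafe = { actor: 'agent:abc', action: 'postbox.send', body: { to: 'user@example.com' } };

const safeHits = findRawEmails(safe);       // no paths flagged
const unsafeHits = findRawEmails(unsafe);   // flags event.body.to
```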

The hash chain

row[N].hash = sha256(row[N-1].hash || canonicalize(row[N].body))   for N ≥ 1
row[0].hash = sha256("" || canonicalize(row[0].body))

canonicalize: stable JSON serialization with sorted keys. The choice is part of the protocol; switching mid-flight invalidates everything.
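One possible canonicalize, sketched for JSON-compatible values only (no undefined, functions, or dates). This is an illustration of "sorted keys at every level", not necessarily the serialization any given library uses; if you need an interoperable format, a published scheme such as RFC 8785 is worth a look.

```javascript
const { createHash } = require('crypto');

// Serialize JSON-compatible values with object keys sorted at every level,
// so the same logical body always yields the same string.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(canonicalize).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value).sort();
    return '{' + keys.map((k) => JSON.stringify(k) + ':' + canonicalize(value[k])).join(',') + '}';
  }
  return JSON.stringify(value);
}

// Hash of the previous row's hash concatenated with the canonical body.
function rowHash(prevHash, body) {
  return createHash('sha256').update(prevHash + canonicalize(body), 'utf8').digest('hex');
}

// Same content, different key order.
const a = { actor: 'agent:abc', action: 'postbox.send', body: { size: 12, ok: true } };
const b = { body: { ok: true, size: 12 }, action: 'postbox.send', actor: 'agent:abc' };

const sameString = canonicalize(a) === canonicalize(b);   // true
const sameHash = rowHash('', a) === rowHash('', b);       // true
```

Plain JSON.stringify would serialize a and b differently, which is why the serialization has to be pinned down as part of the protocol.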

The verifier

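const { verifyChain } = require('@axy/audit-chain');

// `firestore` is your initialized Firestore client.
async function checkAuditChain(firestore) {
  const result = await verifyChain(firestore, 'audit');
  if (!result.valid) {
    // Swap in your paging or alerting hook here.
    console.error(`Chain broke at seq=${result.firstBadSeq}`);
  }
  console.log(`Verified through seq ${result.lastValidSeq} at ${result.verifiedAt}`);
  return result;
}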

Run hourly. Alert on failure. @axy/audit-chain is open source under Apache-2.0 (~200 LoC TS, ~200 LoC Python).

PII redaction at append

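const { createHash } = require('crypto');
// Hex SHA-256 helper; the snippet needs one in scope.
const sha256 = (s) => createHash('sha256').update(s, 'utf8').digest('hex');

// At log time, inside an async function; `subject` and `bodyText` come from the outgoing message
const event = {
  actor: 'agent:abc',
  action: 'postbox.send',
  target_hash: sha256('user@example.com'),  // not the email itself
  body: { subject_hash: sha256(subject), size: bodyText.length },
};
// Await the write so a failed append surfaces here instead of being dropped.
await appendChainEntry(firestore, 'audit', event);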

Auditors can verify the log without seeing user emails. With the original data (held separately), you can reconstruct identities for legitimate review.
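One caveat worth weighing: a plain SHA-256 of an email address can be reversed by hashing candidate addresses and comparing. A keyed hash (HMAC) closes that gap as long as the key stays outside the log. The sketch below uses Node's crypto.createHmac. The pseudonymize helper, the key handling, and the lowercase normalization are assumptions for illustration.

```javascript
const { createHmac } = require('crypto');

// Keyed hash: without the key, candidate emails cannot be hashed and compared.
// `pseudonymKey` is a hypothetical secret kept outside the audit log.
function pseudonymize(pseudonymKey, value) {
  return createHmac('sha256', pseudonymKey)
    .update(value.trim().toLowerCase(), 'utf8')   // normalization is an assumption
    .digest('hex');
}

const key = 'replace-with-a-secret-from-your-key-store';   // placeholder only
const h1 = pseudonymize(key, 'user@example.com');
const h2 = pseudonymize(key, ' User@Example.com ');
const h3 = pseudonymize('a-different-key', 'user@example.com');

const stable = h1 === h2;      // true: same address after normalization
const keyBound = h1 !== h3;    // true: a different key gives a different hash
```

Whoever holds the key can still match a known address to its rows for legitimate review, so access to the key deserves the same controls as access to the PII store.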

Data subject rights (GDPR)

"Right to deletion" is a tension with append-only logs. The standard resolution: store PII in a deletable store (Firestore docs, S3 objects); reference them in the audit log by hash. When the data subject requests deletion, you delete the original PII; the audit log retains hashes and shape but not the data. The hash chain is unbroken.

Export for auditor review

Three things to export:

1. The chain rows (JSON dump from your Firestore audit/ collection)
2. The last verified hash (so the auditor's verifier can confirm integrity at export time)
3. The verifier source (or pointer to public repo)

The auditor runs the verifier (3) against the rows (1) and confirms that the recomputed last hash matches (2). Provided (2) was recorded somewhere the log writer cannot change, that is strong cryptographic evidence that the export wasn't tampered with.

What auditors don't see (but you should still have)

Documented retention policy
Documented PII redaction policy
Access controls on who can write to the log (the daemon's service account)
Access controls on who can read the log (operator UID + read-only auditor)
Log review process — periodic human review for anomalies
Incident response runbook for "the chain broke"

What auditors will absolutely ask about

"Show me the log entry for [specific incident]" — by user UID, by date, by action type
"Prove this entry wasn't modified" — verify the chain at and around that entry
"How long do you retain logs and where do they live" — your retention policy + your storage
"How are timestamps trustworthy" — NTP, server clock source
"What's logged when an agent makes an OpenAI API call" — actor, action, target (URL), token count, latency, response code

What we ship

Ujex Audit is the implementation of all of this. The chain library is open-source (Apache-2.0). The retention policy is yours to configure. The verifier runs hourly via Cloud Scheduler. PII redaction helpers are in the SDK.

FAQ

Is hash chaining required for compliance?

Not literally. Most regulations ask for tamper evidence or integrity controls without specifying the mechanism. Hash chains are one well-understood implementation. A database in append-only mode, with immutability controls and its own audit trail on the audit table, is another path.

How do I handle the right-to-deletion tension?

Reference PII by hash in the chain; store actual PII separately. Delete the PII when requested; the chain stays intact (it never had the data).
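A small in-memory sketch of that flow, with a Map standing in for the deletable PII store and an array standing in for the audit collection. The action names, and the choice to log the erasure itself as a new row, are assumptions for illustration.

```javascript
const { createHash } = require('crypto');
const sha256 = (s) => createHash('sha256').update(s, 'utf8').digest('hex');

// Deletable store for the actual PII, keyed by an opaque subject ID.
const piiStore = new Map([['subject:1', { email: 'user@example.com' }]]);

// Audit rows carry only references and hashes, never the email itself.
const auditRows = [];

function append(body) {
  const prev = auditRows.length ? auditRows[auditRows.length - 1].hash : '';
  const serialized = JSON.stringify(body);   // use a canonical serialization in practice
  auditRows.push({ serialized, hash: sha256(prev + serialized) });
}

function chainIsIntact() {
  let prev = '';
  return auditRows.every((row) => {
    const ok = sha256(prev + row.serialized) === row.hash;
    prev = row.hash;
    return ok;
  });
}

append({ action: 'postbox.send', target_hash: sha256(piiStore.get('subject:1').email) });

piiStore.delete('subject:1');                                       // erasure removes the PII itself
append({ action: 'erasure.completed', subject_ref: 'subject:1' });  // the deletion is logged as a new row

const piiGone = !piiStore.has('subject:1');   // true
const stillIntact = chainIsIntact();          // true: no existing audit row was touched
```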

Does Ujex's audit log meet HIPAA / SOC 2 by itself?

It provides the technical primitives. Compliance is your full program — policy docs, BAAs, training, access reviews, incident response. The audit log is one piece.

What about real-time alerting on the audit log?

Cloud Function trigger on each new row; pattern-match for known anomalies (failed auth bursts, capability escalations, large outbound transfers). Out of scope for the audit primitive itself.
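The pattern-matching side can be a plain function that the trigger calls. Here is a sketch for one of the anomalies mentioned, failed auth bursts. It assumes rows shaped like { actor, action, ts } with ts in epoch milliseconds and an 'auth.failed' action name; the threshold and window defaults are arbitrary.

```javascript
// Flag actors with at least `threshold` failed logins inside any window of `windowMs`.
function failedAuthBursts(rows, { threshold = 5, windowMs = 60000 } = {}) {
  const byActor = new Map();
  for (const row of rows) {
    if (row.action !== 'auth.failed') continue;
    if (!byActor.has(row.actor)) byActor.set(row.actor, []);
    byActor.get(row.actor).push(row.ts);
  }
  const flagged = [];
  for (const [actor, times] of byActor) {
    times.sort((a, b) => a - b);
    // Slide over sorted timestamps: does any run of `threshold` failures fit in the window?
    for (let i = 0; i + threshold - 1 < times.length; i++) {
      if (times[i + threshold - 1] - times[i] <= windowMs) {
        flagged.push(actor);
        break;
      }
    }
  }
  return flagged;
}

const rows = [
  // Five failures within 20 seconds for one agent.
  ...[0, 5, 10, 15, 20].map((s) => ({ actor: 'agent:abc', action: 'auth.failed', ts: s * 1000 })),
  // A single failure followed by a success for another.
  { actor: 'agent:xyz', action: 'auth.failed', ts: 0 },
  { actor: 'agent:xyz', action: 'auth.ok', ts: 1000 },
];
const flagged = failedAuthBursts(rows);   // only agent:abc is flagged
```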