R analyses leave no audit trail by default. regulog adds
one — a tamper-evident, hash-chained record of every action, change,
decision, and signature. Every entry is attributed to a named user,
time-stamped in UTC, and cryptographically linked to the previous entry
so that any modification after the fact is detectable.
This vignette walks through the complete API from session initialisation to regulatory export.
1. Initialise a session
regulog_init() creates the session object. Every
subsequent log call is attached to this object.
| Argument | Required | Purpose |
|---|---|---|
| app | Yes | Application or system name |
| version | No | Application version (default: "unknown") |
| user | No | Acting user (default: Sys.info()[["user"]]) |
| path | No | File path for persistent .rlog storage |
| hash_algo | No | Hashing algorithm (default: "sha256") |
log <- regulog_init(
app = "primary-analysis",
version = "1.0.0",
user = "jsmith"
# Provide path = "logs/audit.rlog" in production for persistent storage
)
log
#> <regulog>
#> App: primary-analysis v1.0.0
#> User: jsmith
#> Entries: 0
#> Path: (in-memory only)
When path is omitted, the log lives in memory only, which is
suitable for development and testing. In production, always supply a
path so entries survive the R session.
The genesis record is written immediately on
regulog_init(). Its SHA-256 hash anchors the entire chain —
see vignette("hash-chain") for how the cryptographic
linking works.
2. Log actions
log_action() records a discrete event. The
reason argument is mandatory with no
default — undocumented entries are rejected.
log_action(log,
action = "data_read",
object = "adsl.sas7bdat",
reason = "Reading subject-level dataset for primary efficacy analysis"
)
#> regulog: logged action 'data_read' on 'adsl.sas7bdat'
The action and object fields accept any
strings; choose a controlled vocabulary that suits your organisation.
Common patterns:
# Analytical steps
log_action(log,
action = "model_fit",
object = "primary_ANCOVA",
reason = "Fitting ANCOVA: CHG ~ TRT01P + BASE + SITEID per SAP section 6.1"
)
#> regulog: logged action 'model_fit' on 'primary_ANCOVA'
# Data exports
log_action(log,
action = "export",
object = "Table14_1.rtf",
reason = "Primary efficacy table exported for clinical study report"
)
#> regulog: logged action 'export' on 'Table14_1.rtf'
# Review and approval events
log_action(log,
action = "approved",
object = "primary_results_v3",
reason = "QC review complete — all outputs match SAP-specified formats"
)
#> regulog: logged action 'approved' on 'primary_results_v3'
# User can override the session user for a single entry
log_action(log,
action = "co_reviewed",
object = "primary_results_v3",
reason = "Independent statistical review complete",
user = "second.reviewer"
)
#> regulog: logged action 'co_reviewed' on 'primary_results_v3'
3. Log field changes
log_change() captures a before/after modification — the
primary mechanism for satisfying 21 CFR Part 11 §11.10(e) change
documentation.
log_change(log,
object = "alpha",
field = "value",
before = "0.05",
after = "0.025",
reason = "Significance level updated per protocol amendment 2 (2026-05-01)"
)
#> regulog: logged change to alpha$value
The before and after arguments are coerced
to character, so they accept any R value:
# Data correction
log_change(log,
object = "subject_01042",
field = "ae_onset_date",
before = "2026-03-01",
after = "2026-03-11",
reason = "Transcription error — corrected per source CRF page 47, query Q-0192"
)
#> regulog: logged change to subject_01042$ae_onset_date
# Configuration update
log_change(log,
object = "model_config",
field = "covariance_structure",
before = "compound_symmetry",
after = "unstructured",
reason = "Unstructured covariance pre-specified in SAP section 6.1.2"
)
#> regulog: logged change to model_config$covariance_structure
# Population definition change
log_change(log,
object = "analysis_population",
field = "SAFFL_definition",
before = "RANDFL = 'Y'",
after = "RANDFL = 'Y' AND EXOCCUR = 'Y'",
reason = "Protocol amendment 3: safety population requires confirmed dosing"
)
#> regulog: logged change to analysis_population$SAFFL_definition
4. Log notes and decisions
log_note() captures free-text annotations — any
rationale, observation, or decision that does not fit a discrete action
verb or a before/after field change. Common uses:
# Outlier decision
log_note(
log,
"Outlier identified for subject 01-042 at Week 16 (AVAL = 98.4,
upper fence = 62.1). Discussed with medical monitor on 2026-06-20.
Retained in primary analysis per SAP section 8.3 — no protocol
deviation recorded. Sensitivity analysis without outlier pre-specified
in SAP section 10.4."
)
#> regulog: note logged
# Protocol deviation
log_note(
log,
"Subject 01-007: visit window deviation at Week 8 (visited Day 61,
window Day 50-58). Classified as minor deviation per deviation
assessment log entry DEV-0031. Subject retained in ITT population."
)
#> regulog: note logged
# Query resolved
log_note(
log,
"Data query Q-0047 resolved 2026-06-15: lab value for subject 01-019
at Screening confirmed as 4.2 mmol/L per site laboratory report.
Original value 42.0 was a decimal error."
)
#> regulog: note logged
# Analysis assumption documented
log_note(
log,
"Missing baseline value for subject 01-033: LOCF imputation applied
per SAP section 7.2 — previous non-missing value (Visit 1) used.
Imputed value: 24.6."
)
#> regulog: note logged
5. Logging data reads
Manually calling log_action() for every file read is
error-prone and easy to forget. regulog provides two ways
to log reads explicitly: rl_read() for a single call, and
with_log() for a scoped block where multiple reads share
the same logging context.
Single reads with rl_read()
rl_read(log, reader, ...) calls reader(...)
and logs the result as a data_read ACTION entry — capturing
the resolved file path, row count, and column count automatically.
adsl <- rl_read(log, haven::read_sas, "data/adsl.sas7bdat")
adae <- rl_read(log, haven::read_sas, "data/adae.sas7bdat")
rl_read() works with any reader function
(haven::read_sas, readr::read_csv,
data.table::fread, utils::read.csv, or a
custom function) because it wraps the call explicitly rather than
depending on a fixed list of patched functions.
The file path is resolved from a named argument (file,
path, data_file, or input) if
present, falling back to the first unnamed argument, so named calls
with reordered arguments still record the correct path.
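The lookup order described above can be pictured with a small base R sketch. This is only an illustration, not regulog's internal code: resolve_path() is a hypothetical helper, and checking the four argument names in the order listed is an assumption.

```r
# Hypothetical helper illustrating the lookup order; not regulog's internals.
resolve_path <- function(...) {
  args <- list(...)
  arg_names <- names(args)
  if (is.null(arg_names)) arg_names <- rep("", length(args))

  # 1. Prefer a recognised named argument (order of checks is an assumption).
  for (nm in c("file", "path", "data_file", "input")) {
    if (nm %in% arg_names) return(args[[nm]])
  }

  # 2. Otherwise fall back to the first unnamed argument.
  unnamed <- which(arg_names == "")
  if (length(unnamed) > 0L) return(args[[unnamed[1L]]])

  NA_character_
}

resolve_path("data/adsl.sas7bdat")                              # positional path
resolve_path(col_types = "c", file = "config/parameters.csv")   # named, reordered
resolve_path(delim = ",", "data/adae.csv")                      # mixed named/unnamed
```

In all three calls the helper returns the file path rather than one of the other arguments, which is the behaviour the paragraph above describes for reordered calls.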
Scoped logging with with_log()
For a block containing several reads, with_log()
provides a local read() binding so the log
argument doesn’t need to be repeated at every call:
with_log(log, {
adsl <- read(haven::read_sas, "data/adsl.sas7bdat")
adae <- read(haven::read_sas, "data/adae.sas7bdat")
adlb <- read(haven::read_sas, "data/adlb.sas7bdat")
params <- read(readr::read_csv, "config/parameters.csv")
})
read() is only available inside the
with_log() block; calling a reader function bare (without
read(...)) is not logged. This is deliberate: every logged
read is visible at its call site, with no implicit or hidden logging
behaviour.
Each logged entry captures the file path, row count, and column count. For example:
action: data_read
object: data/adsl.sas7bdat
reason: haven::read_sas("data/adsl.sas7bdat") — 298 rows, 47 cols
with_log() guarantees expr is evaluated in
an isolated scope: the read() binding for one
with_log() call cannot interfere with another, even across
concurrent sessions (for example, two users in the same Shiny
application). If expr errors, the error propagates normally
and any entries logged before the error remain intact in the chain.
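To make the scoping and error behaviour concrete, here is a minimal base R sketch of the general pattern: a read() binding that exists only inside one evaluation scope, and log entries that survive an error. It is a toy model under stated assumptions, not regulog's implementation: new_toy_log() and with_toy_log() are hypothetical names, the log is a bare environment, and objects assigned inside the block stay local to the scope here, which may differ from how with_log() behaves.

```r
# Hypothetical stand-in for a log: an environment, so entries appended
# inside a function remain visible to the caller (reference semantics).
new_toy_log <- function() {
  log <- new.env()
  log$entries <- character()
  log
}

# Hypothetical scoped helper: evaluates `expr` in a fresh child environment
# in which `read()` is bound to a logging wrapper for this log only.
with_toy_log <- function(log, expr) {
  scope <- new.env(parent = parent.frame())
  scope$read <- function(reader, ...) {
    result <- reader(...)
    log$entries <- c(
      log$entries,
      sprintf("data_read: %d rows, %d cols", nrow(result), ncol(result))
    )
    result
  }
  eval(substitute(expr), scope)
}

toy_log <- new_toy_log()
csv_file <- tempfile(fileext = ".csv")
write.csv(head(mtcars, 5), csv_file, row.names = FALSE)

# The block fails after one successful read.
outcome <- try(
  with_toy_log(toy_log, {
    cars <- read(utils::read.csv, csv_file)
    stop("simulated failure after the first read")
  }),
  silent = TRUE
)

inherits(outcome, "try-error")  # TRUE: the error propagated to the caller
toy_log$entries                 # the entry written before the error is still present
```

Because each call creates its own scope environment, two calls with different logs cannot see each other's read() binding in this sketch.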
6. Electronic signatures
log_signature() records a named, dated, meaningful
sign-off. Two things happen automatically, with no user input required:
- Signer identity is resolved from the session user set at regulog_init(); it cannot be overridden at signing time.
- Entries covered is captured as the count of prior entries in the session at the moment of signing.
log_signature(
log,
"I certify that this primary analysis is accurate and complete,
conducted in accordance with SAP version 2.0 dated 2026-05-01"
)
#> regulog: signature applied by 'jsmith' covering 13 entries
Multiple signatures are supported, for example a lead statistician and an independent reviewer:
log_signature(
log,
"Statistical analysis complete and accurate per SAP v2.0.
All deviations documented."
)
# Second reviewer — create a new log or log against the same path with
# a different session user
log2 <- regulog_init(
app = "primary-analysis", version = "1.0.0",
user = "second.reviewer",
path = "logs/trial001_audit.rlog"
)
log_signature(
log2,
"Independent QC review complete. Results independently verified."
)
7. Verify chain integrity
verify_log() recomputes every entry hash and confirms
each prev_hash links correctly to its predecessor. Works on
both a live regulog object and a .rlog file
path.
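As a rough illustration of what such a check involves, the base R sketch below builds a tiny chain and then walks it, recomputing each hash and comparing each prev_hash with its predecessor. It is not regulog's implementation: hash_string(), append_entry(), and first_broken() are hypothetical helpers, entries are reduced to a reason and a prev_hash, and MD5 via tools::md5sum() stands in for SHA-256 purely so the sketch runs without extra packages.

```r
# Hypothetical helper: hash a string using base R only.
# MD5 is a stand-in for SHA-256 to keep the sketch dependency-free.
hash_string <- function(x) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(charToRaw(x), tmp)
  unname(tools::md5sum(tmp))
}

# Append an entry whose hash covers its own content plus the previous hash.
append_entry <- function(chain, reason) {
  prev_hash <- if (length(chain) == 0L) "GENESIS" else chain[[length(chain)]]$entry_hash
  entry <- list(reason = reason, prev_hash = prev_hash)
  entry$entry_hash <- hash_string(paste(reason, prev_hash, sep = "|"))
  c(chain, list(entry))
}

# Return the index of the first entry that fails verification, or NA.
first_broken <- function(chain) {
  prev_hash <- "GENESIS"
  for (i in seq_along(chain)) {
    e <- chain[[i]]
    expected <- hash_string(paste(e$reason, e$prev_hash, sep = "|"))
    if (!identical(e$prev_hash, prev_hash) || !identical(e$entry_hash, expected)) {
      return(i)
    }
    prev_hash <- e$entry_hash
  }
  NA_integer_
}

chain <- list()
chain <- append_entry(chain, "read adsl")
chain <- append_entry(chain, "fit model")
chain <- append_entry(chain, "export table")

first_broken(chain)             # NA: untouched chain verifies
chain[[2]]$reason <- "ALTERED"  # edit a stored field after the fact
first_broken(chain)             # 2: the altered entry no longer matches its hash
```

Editing any stored field changes the recomputed hash, so the first mismatch identifies the altered entry.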
verify_log(log)
#> regulog: Log intact: 14 entries, chain unbroken
The return value carries structured results:
result <- verify_log(log, verbose = FALSE)
cat("Intact: ", result$intact, "\n")
#> Intact: TRUE
cat("Entries checked:", result$n_entries, "\n")
#> Entries checked: 14
cat("First broken: ", result$first_broken, "\n")
#> First broken: NA
Tampering is reliably detected:
saved <- log$entries[[2L]]$reason
log$entries[[2L]]$reason <- "ALTERED REASON"
tamper_result <- suppressWarnings(verify_log(log, verbose = FALSE))
cat("Intact after tamper:", tamper_result$intact, "\n")
#> Intact after tamper: FALSE
cat("First broken entry: ", tamper_result$first_broken, "\n")
#> First broken entry: 2
log$entries[[2L]]$reason <- saved # restore
Verification from a file path requires no live session:
verify_log("logs/trial001_audit.rlog")
8. Query the log
filter_log() returns log entries as a
data.frame. All arguments are optional — omitting all
returns every entry.
all_entries <- filter_log(log)
all_entries[, c("entry_id", "type", "action", "user", "reason")]
#> entry_id type action user
#> 1 1 ACTION data_read jsmith
#> 2 2 ACTION model_fit jsmith
#> 3 3 ACTION export jsmith
#> 4 4 ACTION approved jsmith
#> 5 5 ACTION co_reviewed second.reviewer
#> 6 6 CHANGE <NA> jsmith
#> 7 7 CHANGE <NA> jsmith
#> 8 8 CHANGE <NA> jsmith
#> 9 9 CHANGE <NA> jsmith
#> 10 10 NOTE note jsmith
#> 11 11 NOTE note jsmith
#> 12 12 NOTE note jsmith
#> 13 13 NOTE note jsmith
#> 14 14 SIGNATURE signature jsmith
#> reason
#> 1 Reading subject-level dataset for primary efficacy analysis
#> 2 Fitting ANCOVA: CHG ~ TRT01P + BASE + SITEID per SAP section 6.1
#> 3 Primary efficacy table exported for clinical study report
#> 4 QC review complete — all outputs match SAP-specified formats
#> 5 Independent statistical review complete
#> 6 Significance level updated per protocol amendment 2 (2026-05-01)
#> 7 Transcription error — corrected per source CRF page 47, query Q-0192
#> 8 Unstructured covariance pre-specified in SAP section 6.1.2
#> 9 Protocol amendment 3: safety population requires confirmed dosing
#> 10 Outlier identified for subject 01-042 at Week 16 (AVAL = 98.4,\n upper fence = 62.1). Discussed with medical monitor on 2026-06-20.\n Retained in primary analysis per SAP section 8.3 — no protocol\n deviation recorded. Sensitivity analysis without outlier pre-specified\n in SAP section 10.4.
#> 11 Subject 01-007: visit window deviation at Week 8 (visited Day 61,\n window Day 50-58). Classified as minor deviation per deviation\n assessment log entry DEV-0031. Subject retained in ITT population.
#> 12 Data query Q-0047 resolved 2026-06-15: lab value for subject 01-019\n at Screening confirmed as 4.2 mmol/L per site laboratory report.\n Original value 42.0 was a decimal error.
#> 13 Missing baseline value for subject 01-033: LOCF imputation applied\n per SAP section 7.2 — previous non-missing value (Visit 1) used.\n Imputed value: 24.6.
#> 14 I certify that this primary analysis is accurate and complete,\n conducted in accordance with SAP version 2.0 dated 2026-05-01
Filter by entry type:
filter_log(log, type = "SIGNATURE")[, c("type", "user", "reason", "after")]
#> type user
#> 1 SIGNATURE jsmith
#> reason
#> 1 I certify that this primary analysis is accurate and complete,\n conducted in accordance with SAP version 2.0 dated 2026-05-01
#> after
#> 1 13
Filter by action value:
filter_log(log, action = "approved")[, c("action", "object", "reason")]
#> action object
#> 1 approved primary_results_v3
#> reason
#> 1 QC review complete — all outputs match SAP-specified formats
Filter by user:
filter_log(log, user = "jsmith")[, c("type", "action", "object")]
#> type action object
#> 1 ACTION data_read adsl.sas7bdat
#> 2 ACTION model_fit primary_ANCOVA
#> 3 ACTION export Table14_1.rtf
#> 4 ACTION approved primary_results_v3
#> 5 CHANGE <NA> alpha
#> 6 CHANGE <NA> subject_01042
#> 7 CHANGE <NA> model_config
#> 8 CHANGE <NA> analysis_population
#> 9 NOTE note <NA>
#> 10 NOTE note <NA>
#> 11 NOTE note <NA>
#> 12 NOTE note <NA>
#> 13 SIGNATURE signature jsmith
Filter by date range, which is useful when querying a long-running shared log:
# Entries from today onwards
filter_log(log, from = format(Sys.Date(), "%Y-%m-%d"))[, c("type", "action")]
#> type action
#> 1 ACTION data_read
#> 2 ACTION model_fit
#> 3 ACTION export
#> 4 ACTION approved
#> 5 ACTION co_reviewed
#> 6 CHANGE <NA>
#> 7 CHANGE <NA>
#> 8 CHANGE <NA>
#> 9 CHANGE <NA>
#> 10 NOTE note
#> 11 NOTE note
#> 12 NOTE note
#> 13 NOTE note
#> 14 SIGNATURE signature
# Entries before a cutoff (empty for new log)
filter_log(log, to = "2025-12-31")
#> [1] entry_id timestamp app app_version user type
#> [7] action object field before after reason
#> [13] text meaning entry_hash prev_hash
#> <0 rows> (or 0-length row.names)
Combine filters:
filter_log(log,
type = c("ACTION", "NOTE"),
user = "jsmith",
from = "2026-01-01"
)[, c("type", "action", "reason")]
#> type action
#> 1 ACTION data_read
#> 2 ACTION model_fit
#> 3 ACTION export
#> 4 ACTION approved
#> 5 NOTE note
#> 6 NOTE note
#> 7 NOTE note
#> 8 NOTE note
#> reason
#> 1 Reading subject-level dataset for primary efficacy analysis
#> 2 Fitting ANCOVA: CHG ~ TRT01P + BASE + SITEID per SAP section 6.1
#> 3 Primary efficacy table exported for clinical study report
#> 4 QC review complete — all outputs match SAP-specified formats
#> 5 Outlier identified for subject 01-042 at Week 16 (AVAL = 98.4,\n upper fence = 62.1). Discussed with medical monitor on 2026-06-20.\n Retained in primary analysis per SAP section 8.3 — no protocol\n deviation recorded. Sensitivity analysis without outlier pre-specified\n in SAP section 10.4.
#> 6 Subject 01-007: visit window deviation at Week 8 (visited Day 61,\n window Day 50-58). Classified as minor deviation per deviation\n assessment log entry DEV-0031. Subject retained in ITT population.
#> 7 Data query Q-0047 resolved 2026-06-15: lab value for subject 01-019\n at Screening confirmed as 4.2 mmol/L per site laboratory report.\n Original value 42.0 was a decimal error.
#> 8 Missing baseline value for subject 01-033: LOCF imputation applied\n per SAP section 7.2 — previous non-missing value (Visit 1) used.\n Imputed value: 24.6.
filter_log() also accepts a .rlog file path
directly, with no live session or regulog object required:
filter_log("logs/trial001_audit.rlog",
type = "SIGNATURE",
user = "jsmith"
)
9. Convert to data frame
as.data.frame() converts all non-genesis entries to a
flat data frame with the same column layout as
export_audit_trail(format = "csv"):
df <- as.data.frame(log)
names(df)
#> [1] "entry_id" "timestamp" "app" "app_version" "user"
#> [6] "type" "action" "object" "field" "before"
#> [11] "after" "reason" "text" "meaning" "entry_hash"
#> [16] "prev_hash"
nrow(df)
#> [1] 14
10. Export the audit trail
export_audit_trail() serialises the log to CSV or JSON.
Use signed = TRUE to run verification and stamp
chain_intact and verified_at on every row.
df_export <- export_audit_trail(log, format = "csv", signed = TRUE)
df_export[, c("entry_id", "type", "action", "user", "chain_intact", "verified_at")]
#> entry_id type action user chain_intact
#> 1 1 ACTION data_read jsmith TRUE
#> 2 2 ACTION model_fit jsmith TRUE
#> 3 3 ACTION export jsmith TRUE
#> 4 4 ACTION approved jsmith TRUE
#> 5 5 ACTION co_reviewed second.reviewer TRUE
#> 6 6 CHANGE <NA> jsmith TRUE
#> 7 7 CHANGE <NA> jsmith TRUE
#> 8 8 CHANGE <NA> jsmith TRUE
#> 9 9 CHANGE <NA> jsmith TRUE
#> 10 10 NOTE note jsmith TRUE
#> 11 11 NOTE note jsmith TRUE
#> 12 12 NOTE note jsmith TRUE
#> 13 13 NOTE note jsmith TRUE
#> 14 14 SIGNATURE signature jsmith TRUE
#> verified_at
#> 1 2026-07-01T20:00:47.575090Z
#> 2 2026-07-01T20:00:47.575090Z
#> 3 2026-07-01T20:00:47.575090Z
#> 4 2026-07-01T20:00:47.575090Z
#> 5 2026-07-01T20:00:47.575090Z
#> 6 2026-07-01T20:00:47.575090Z
#> 7 2026-07-01T20:00:47.575090Z
#> 8 2026-07-01T20:00:47.575090Z
#> 9 2026-07-01T20:00:47.575090Z
#> 10 2026-07-01T20:00:47.575090Z
#> 11 2026-07-01T20:00:47.575090Z
#> 12 2026-07-01T20:00:47.575090Z
#> 13 2026-07-01T20:00:47.575090Z
#> 14 2026-07-01T20:00:47.575090Z
# JSON envelope with metadata header
export_audit_trail(log,
format = "json",
signed = TRUE,
path = "outputs/audit_trail.json"
)
# CSV for regulatory submission or spreadsheet review
export_audit_trail(log,
format = "csv",
signed = TRUE,
path = "outputs/audit_trail_TRIAL001_PRIMARY.csv"
)
Date filtering is available on export too:
# Only entries from a specific analysis phase
export_audit_trail(log,
format = "csv",
from = "2026-06-01",
to = "2026-06-30",
signed = TRUE,
path = "outputs/audit_june2026.csv"
)
11. Entry type reference
| Type | Created by | Mandatory fields | Regulatory purpose |
|---|---|---|---|
| ACTION | log_action() | action, object, reason | Discrete events |
| CHANGE | log_change() | object, field, before, after, reason | Field modifications |
| NOTE | log_note() | text | Decisions and rationale |
| SIGNATURE | log_signature() | meaning | Sign-off |
12. Validation (regulated environments)
Any software used in a regulated environment — under 21 CFR Part 11,
EU Annex 11, or GAMP 5 — must be formally qualified before it can be
used to generate or sign electronic records that regulators may inspect.
regulog ships pre-written, executable IQ/OQ/PQ
qualification protocols that cover all three phases.
Running the protocols
Run each script in sequence in the target environment — the R installation that will be used for regulated work:
# Phase 1: Installation Qualification (10 tests)
# Verifies R version, package installation, dependency integrity,
# file system access, and namespace exports.
source(system.file("validation/IQ_regulog.R", package = "regulog"))
# Phase 2: Operational Qualification (26 tests)
# Tests every 21 CFR §11.10 requirement: hash chain integrity,
# tamper detection, user attribution, timestamps, export format,
# electronic signatures, and error isolation.
source(system.file("validation/OQ_regulog.R", package = "regulog"))
# Phase 3: Performance Qualification (7 tests)
# End-to-end clinical workflows: data review, regulatory export,
# multi-user session independence, 500-entry load test, and
# inspector query simulation.
source(system.file("validation/PQ_regulog.R", package = "regulog"))
Capturing the qualification record
Retain the output of each run as documented evidence of system qualification. The simplest approach is to capture it to a file:
sink("IQ_execution_record.txt")
source(system.file("validation/IQ_regulog.R", package = "regulog"))
sink()
sink("OQ_execution_record.txt")
source(system.file("validation/OQ_regulog.R", package = "regulog"))
sink()
sink("PQ_execution_record.txt")
source(system.file("validation/PQ_regulog.R", package = "regulog"))
sink()
Each execution record includes the timestamp, R version, platform, and the pass/fail result of every test against its acceptance criterion.
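One practical caveat with bare sink() calls is that an error inside the sourced script leaves the sink open, so later console output keeps going to the file. A minimal sketch of a more defensive pattern follows; capture_run() is a hypothetical helper, not part of regulog, and the throwaway script stands in for a qualification protocol.

```r
# Hypothetical wrapper: run a script and capture its console output to a
# file, restoring the console even if the script stops with an error.
capture_run <- function(script, record_file) {
  con <- file(record_file, open = "wt")
  sink(con)
  on.exit({
    sink()      # always undo the diversion
    close(con)  # and release the file handle
  }, add = TRUE)
  source(script, echo = TRUE, max.deparse.length = Inf)
  invisible(record_file)
}

# Demonstration with a throwaway script in place of a real protocol.
demo_script <- tempfile(fileext = ".R")
writeLines('cat("TEST-001: PASS\\n")', demo_script)
record <- tempfile(fileext = ".txt")

capture_run(demo_script, record)
readLines(record)   # echoed call followed by the script's output
sink.number()       # 0: no output diversion left active
```

This captures standard output only; messages and warnings are written to the message stream and would need sink(con, type = "message") to be recorded as well.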
Requirements traceability
The RTM maps every OQ test to the regulatory clause it addresses:
read.csv(system.file("validation/RTM_regulog.csv", package = "regulog"))
Logging the qualification itself
The qualification run is itself an activity in a regulated
environment and should be logged. Using regulog to audit
its own qualification produces a Part 11-compliant record of who ran it,
when, and the outcome:
log <- regulog_init(
app = "regulog-qualification",
version = "0.2.0",
user = "val.lead",
path = "qualification/audit_trail.rlog"
)
log_action(log,
action = "qualification_start",
object = "regulog 0.2.0",
reason = "IQ/OQ/PQ qualification initiated per SOP-VAL-007"
)
source(system.file("validation/IQ_regulog.R", package = "regulog"))
log_action(log,
action = "IQ_complete",
object = "IQ_regulog.R",
reason = "10 tests passed. Proceeding to OQ."
)
source(system.file("validation/OQ_regulog.R", package = "regulog"))
log_action(log,
action = "OQ_complete",
object = "OQ_regulog.R",
reason = "26 tests passed. Proceeding to PQ."
)
source(system.file("validation/PQ_regulog.R", package = "regulog"))
log_action(log,
action = "PQ_complete",
object = "PQ_regulog.R",
reason = "7 tests passed. Qualification complete."
)
log_signature(log,
"I certify that regulog 0.2.0 has been qualified in this environment
per SOP-VAL-007 and is approved for use in regulated R workflows."
)
verify_log(log)
export_audit_trail(log,
format = "csv",
signed = TRUE,
path = "qualification/audit_trail_export.csv"
)
Re-qualification
Any significant change — a new package version, a change to the R environment, or a platform migration — requires re-qualification. Re-run the three protocols in the updated environment and retain the new execution records as evidence that the qualified state has been re-established.
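One way to notice such changes is to store a small fingerprint of the environment at qualification time and compare it later. The sketch below is an illustration under assumptions, not a regulog feature: env_fingerprint() and changed_fields() are hypothetical helpers, the choice of fields is arbitrary, and the base package "stats" stands in for the qualified package so the example runs anywhere.

```r
# Hypothetical helper: snapshot facts that define the qualified state.
env_fingerprint <- function(pkg) {
  list(
    r_version   = R.version.string,
    platform    = R.version$platform,
    pkg_version = as.character(utils::packageVersion(pkg))
  )
}

# Compare a stored fingerprint with a current one and name what changed.
changed_fields <- function(qualified, current) {
  fields <- names(qualified)
  same <- vapply(
    fields,
    function(f) identical(qualified[[f]], current[[f]]),
    logical(1)
  )
  fields[!same]
}

qualified <- env_fingerprint("stats")  # "stats" stands in for the qualified package
current <- qualified
current$pkg_version <- "99.0.0"        # simulate a package upgrade

changed_fields(qualified, qualified)   # character(0): nothing changed
changed_fields(qualified, current)     # "pkg_version"
```

Any non-empty result would be a prompt to review whether the change is significant enough to re-run the protocols; that judgement remains with the validation lead.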
See also vignette("hash-chain") for a detailed
explanation of the tamper detection mechanism, and the qualification
guide on reprostats.org for a fuller discussion of the regulatory
context.
