Getting started with regulog

R analyses leave no audit trail by default. regulog adds one — a tamper-evident, hash-chained record of every action, change, decision, and signature. Every entry is attributed to a named user, time-stamped in UTC, and cryptographically linked to the previous entry so that any modification after the fact is detectable.

This vignette walks through the complete API from session initialisation to regulatory export.

1. Initialise a session

regulog_init() creates the session object. Every subsequent log call is attached to this object.

Argument	Required	Purpose
`app`	Yes	Application or system name
`version`	No	Application version (default: `"unknown"`)
`user`	No	Acting user (default: `Sys.info()[["user"]]`)
`path`	No	File path for persistent `.rlog` storage
`hash_algo`	No	Hashing algorithm (default: `"sha256"`)

log <- regulog_init(
  app     = "primary-analysis",
  version = "1.0.0",
  user    = "jsmith"
  # Provide path = "logs/audit.rlog" in production for persistent storage
)

log
#> <regulog>
#>   App:     primary-analysis v1.0.0
#>   User:    jsmith
#>   Entries: 0
#>   Path:    (in-memory only)

When path is omitted, the log lives in memory only — suitable for development and testing. In production, always supply a path so entries survive the R session.
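Whether regulog_init() creates missing parent directories is not stated here, so a cautious setup step is to create the directory before passing a path. The following is a minimal base R sketch; prepare_log_path() is a hypothetical helper, not part of regulog.

```r
# Hypothetical helper (not part of regulog): make sure the parent directory
# of a log file exists, then return the path with a normalised directory.
prepare_log_path <- function(path) {
  dir <- dirname(path)
  if (!dir.exists(dir)) {
    dir.create(dir, recursive = TRUE)
  }
  file.path(normalizePath(dir), basename(path))
}

# A temporary directory stands in for a project folder here.
log_path <- prepare_log_path(file.path(tempdir(), "logs", "audit.rlog"))
```

The resulting log_path could then be supplied as the path argument of regulog_init().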

The genesis record is written immediately on regulog_init(). Its hash (SHA-256 unless hash_algo is changed) anchors the entire chain; see vignette("hash-chain") for how the cryptographic linking works.

2. Log actions

log_action() records a discrete event. The reason argument is mandatory with no default — undocumented entries are rejected.

log_action(log,
  action = "data_read",
  object = "adsl.sas7bdat",
  reason = "Reading subject-level dataset for primary efficacy analysis"
)
#> regulog: logged action 'data_read' on 'adsl.sas7bdat'

The action and object fields accept any strings — choose a controlled vocabulary that suits your organisation. Common patterns:

# Analytical steps
log_action(log,
  action = "model_fit",
  object = "primary_ANCOVA",
  reason = "Fitting ANCOVA: CHG ~ TRT01P + BASE + SITEID per SAP section 6.1"
)
#> regulog: logged action 'model_fit' on 'primary_ANCOVA'

# Data exports
log_action(log,
  action = "export",
  object = "Table14_1.rtf",
  reason = "Primary efficacy table exported for clinical study report"
)
#> regulog: logged action 'export' on 'Table14_1.rtf'

# Review and approval events
log_action(log,
  action = "approved",
  object = "primary_results_v3",
  reason = "QC review complete — all outputs match SAP-specified formats"
)
#> regulog: logged action 'approved' on 'primary_results_v3'

# Override the session user for a single entry
log_action(log,
  action = "co_reviewed",
  object = "primary_results_v3",
  reason = "Independent statistical review complete",
  user   = "second.reviewer"
)
#> regulog: logged action 'co_reviewed' on 'primary_results_v3'
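Because action accepts any string, a typo such as "exprot" would be logged as-is. One way to enforce a controlled vocabulary is a thin wrapper that checks the action before delegating. The sketch below is an assumption-laden illustration: make_controlled_logger() is hypothetical, and fake_log_action() is a stand-in with the same argument shape as log_action() so the example runs without regulog installed.

```r
# Hypothetical wrapper factory: `log_fn` is any function with the same
# (log, action, object, reason, ...) shape as log_action().
make_controlled_logger <- function(log_fn, allowed_actions) {
  function(log, action, object, reason, ...) {
    if (!action %in% allowed_actions) {
      stop("Unknown action '", action, "'. Allowed: ",
           paste(allowed_actions, collapse = ", "), call. = FALSE)
    }
    log_fn(log, action = action, object = object, reason = reason, ...)
  }
}

# Stand-in for log_action(); it only echoes what it was given.
fake_log_action <- function(log, action, object, reason, ...) {
  paste0(action, ":", object)
}

log_checked <- make_controlled_logger(
  fake_log_action,
  allowed_actions = c("data_read", "model_fit", "export", "approved")
)

accepted <- log_checked(NULL, "export", "Table14_1.rtf", reason = "CSR table")
rejected <- tryCatch(
  log_checked(NULL, "exprot", "Table14_1.rtf", reason = "typo"),
  error = function(e) conditionMessage(e)
)
```

In real use, log_action would be passed in place of fake_log_action, and the allowed vector would come from the organisation's own SOP.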

3. Log field changes

log_change() captures a before/after modification — the primary mechanism for satisfying 21 CFR Part 11 §11.10(e) change documentation.

log_change(log,
  object = "alpha",
  field  = "value",
  before = "0.05",
  after  = "0.025",
  reason = "Significance level updated per protocol amendment 2 (2026-05-01)"
)
#> regulog: logged change to alpha$value

The before and after arguments are coerced to character, so they accept any R value:

# Data correction
log_change(log,
  object = "subject_01042",
  field  = "ae_onset_date",
  before = "2026-03-01",
  after  = "2026-03-11",
  reason = "Transcription error — corrected per source CRF page 47, query Q-0192"
)
#> regulog: logged change to subject_01042$ae_onset_date

# Configuration update
log_change(log,
  object = "model_config",
  field  = "covariance_structure",
  before = "compound_symmetry",
  after  = "unstructured",
  reason = "Unstructured covariance pre-specified in SAP section 6.1.2"
)
#> regulog: logged change to model_config$covariance_structure

# Population definition change
log_change(log,
  object = "analysis_population",
  field  = "SAFFL_definition",
  before = "RANDFL = 'Y'",
  after  = "RANDFL = 'Y' AND EXOCCUR = 'Y'",
  reason = "Protocol amendment 3: safety population requires confirmed dosing"
)
#> regulog: logged change to analysis_population$SAFFL_definition
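How regulog performs the coercion is not described here, so it can help to convert values explicitly before logging them. The sketch below shows what base R's as.character() yields for a few types and defines a hypothetical helper, to_audit_string(), that always produces a single string.

```r
# What base R's as.character() gives for some common value types.
as.character(0.025)                  # "0.025"
as.character(as.Date("2026-03-11"))  # "2026-03-11"
as.character(c("A", "B"))            # a length-2 vector, not one string

# Hypothetical helper: collapse an atomic vector to one explicit string
# suitable for passing as `before` or `after`.
to_audit_string <- function(x) {
  stopifnot(is.atomic(x))
  if (length(x) == 0L) return("<empty>")
  paste(ifelse(is.na(x), "NA", as.character(x)), collapse = "; ")
}

to_audit_string(c(0.05, NA, 0.025))     # "0.05; NA; 0.025"
to_audit_string(as.Date("2026-03-11"))  # "2026-03-11"
```

Making the conversion explicit keeps multi-value fields and missing values unambiguous in the recorded before/after pair.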

4. Log notes and decisions

log_note() captures free-text annotations — any rationale, observation, or decision that does not fit a discrete action verb or a before/after field change. Common uses:

# Outlier decision
log_note(
  log,
  "Outlier identified for subject 01-042 at Week 16 (AVAL = 98.4,
   upper fence = 62.1). Discussed with medical monitor on 2026-06-20.
   Retained in primary analysis per SAP section 8.3 — no protocol
   deviation recorded. Sensitivity analysis without outlier pre-specified
   in SAP section 10.4."
)
#> regulog: note logged

# Protocol deviation
log_note(
  log,
  "Subject 01-007: visit window deviation at Week 8 (visited Day 61,
   window Day 50-58). Classified as minor deviation per deviation
   assessment log entry DEV-0031. Subject retained in ITT population."
)
#> regulog: note logged

# Query resolved
log_note(
  log,
  "Data query Q-0047 resolved 2026-06-15: lab value for subject 01-019
   at Screening confirmed as 4.2 mmol/L per site laboratory report.
   Original value 42.0 was a decimal error."
)
#> regulog: note logged

# Analysis assumption documented
log_note(
  log,
  "Missing baseline value for subject 01-033: LOCF imputation applied
   per SAP section 7.2 — previous non-missing value (Visit 1) used.
   Imputed value: 24.6."
)
#> regulog: note logged
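A multi-line string literal keeps its line breaks and indentation, which is why the notes appear with embedded "\n   " sequences in the filter_log() output in section 8. If single-line notes are preferred, the text can be normalised first. squish_note() below is a hypothetical base R helper, not a regulog function.

```r
# Hypothetical helper: collapse runs of whitespace (including newlines and
# indentation) into single spaces and trim both ends.
squish_note <- function(text) {
  gsub("[[:space:]]+", " ", trimws(text))
}

note <- "Subject 01-007: visit window deviation at Week 8 (visited Day 61,
   window Day 50-58)."

clean_note <- squish_note(note)
clean_note
```

The cleaned string could then be passed to log_note() in place of the raw literal.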

5. Log data reads

Manually calling log_action() for every file read is error-prone and easy to forget. regulog provides two ways to log reads explicitly: rl_read() for a single call, and with_log() for a scoped block where multiple reads share the same logging context.

Single reads with `rl_read()`

rl_read(log, reader, ...) calls reader(...) and logs the result as a data_read ACTION entry — capturing the resolved file path, row count, and column count automatically.

adsl <- rl_read(log, haven::read_sas, "data/adsl.sas7bdat")
adae <- rl_read(log, haven::read_sas, "data/adae.sas7bdat")

rl_read() works with any reader function — haven::read_sas, readr::read_csv, data.table::fread, utils::read.csv, or a custom function — since it wraps the call explicitly rather than depending on a fixed list of patched functions.

The file path is resolved from a named argument (file, path, data_file, or input) if present, falling back to the first unnamed argument — so reordered named calls still record the correct path:

adae <- rl_read(log, readr::read_csv, col_types = "ccd", file = "data/adae.csv")
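The lookup rule described above can be sketched in a few lines of base R. This is a hypothetical re-implementation for illustration only: resolve_read_path() is an invented name, the priority order among the four argument names is an assumption, and regulog's internal code may differ.

```r
# Hypothetical re-implementation of the path lookup rule.
resolve_read_path <- function(...) {
  args <- list(...)
  arg_names <- names(args)
  if (is.null(arg_names)) arg_names <- rep("", length(args))

  # 1. Prefer a recognised named argument (order is an assumption).
  for (candidate in c("file", "path", "data_file", "input")) {
    if (candidate %in% arg_names) return(args[[candidate]])
  }

  # 2. Otherwise fall back to the first unnamed argument.
  unnamed <- which(arg_names == "")
  if (length(unnamed) > 0L) return(args[[unnamed[1L]]])

  NA_character_
}

resolve_read_path("data/adsl.sas7bdat")
resolve_read_path(col_types = "ccd", file = "data/adae.csv")
```

Both calls return the data file path, even though the second places the path after another named argument.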

Scoped logging with `with_log()`

For a block containing several reads, with_log() provides a local read() binding so the log argument doesn’t need to be repeated at every call:

with_log(log, {
  adsl   <- read(haven::read_sas, "data/adsl.sas7bdat")
  adae   <- read(haven::read_sas, "data/adae.sas7bdat")
  adlb   <- read(haven::read_sas, "data/adlb.sas7bdat")
  params <- read(readr::read_csv, "config/parameters.csv")
})

read() is only available inside the with_log() block — calling a reader function bare (without read(...)) is not logged. This is deliberate: every logged read is visible at its call site, with no implicit or hidden logging behaviour.

Each logged entry captures the file path, row count, and column count. For example:

action: data_read
object: data/adsl.sas7bdat
reason: haven::read_sas("data/adsl.sas7bdat") — 298 rows, 47 cols

with_log() guarantees expr is evaluated in an isolated scope: the read() binding for one with_log() call cannot interfere with another, even across concurrent sessions (for example, two users in the same Shiny application). If expr errors, the error propagates normally and any entries logged before the error remain intact in the chain.
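The general technique behind a scoped binding like this is to evaluate the block in a fresh child environment that holds the helper. The sketch below illustrates that idea in base R; with_reader() and its record callback are hypothetical and should not be read as regulog's implementation. In this sketch, assignments made inside the block stay in the child environment, so the block's value is returned instead; how with_log() exposes assignments is not covered here.

```r
# Hypothetical scoped helper: `read` exists only while `expr` is evaluated.
with_reader <- function(record, expr) {
  # Child of the caller's environment: the block sees the caller's
  # variables, but the `read` binding lives only in this scope.
  scope <- new.env(parent = parent.frame())
  scope$read <- function(reader, ...) {
    result <- reader(...)
    record(sprintf("%d rows, %d cols", nrow(result), ncol(result)))
    result
  }
  eval(substitute(expr), scope)
}

# A throwaway CSV stands in for a study dataset.
csv <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = 4:6), csv, row.names = FALSE)

seen <- character()
dat <- with_reader(function(msg) seen <<- c(seen, msg), {
  read(utils::read.csv, csv)
})

seen
```

Because each call creates its own environment, two blocks evaluated side by side cannot see each other's read() binding.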

6. Electronic signatures

log_signature() records a named, dated, meaningful sign-off. Two things happen automatically — no user input required:

Signer identity is resolved from the session user set at regulog_init(); it cannot be overridden at signing time.
Entries covered is captured as the count of prior entries in the session at the moment of signing.

log_signature(
  log,
  "I certify that this primary analysis is accurate and complete,
   conducted in accordance with SAP version 2.0 dated 2026-05-01"
)
#> regulog: signature applied by 'jsmith' covering 13 entries

Multiple signatures are supported — for example, a lead statistician and an independent reviewer:

log_signature(
  log,
  "Statistical analysis complete and accurate per SAP v2.0.
   All deviations documented."
)

# Second reviewer: start a session as a different user, either on a new
# log or against the same .rlog path
log2 <- regulog_init(
  app = "primary-analysis", version = "1.0.0",
  user = "second.reviewer",
  path = "logs/trial001_audit.rlog"
)

log_signature(
  log2,
  "Independent QC review complete. Results independently verified."
)

7. Verify chain integrity

verify_log() recomputes every entry hash and confirms that each prev_hash links correctly to its predecessor. It works on both a live regulog object and a .rlog file path.

verify_log(log)
#> regulog: Log intact: 14 entries, chain unbroken

The return value carries structured results:

result <- verify_log(log, verbose = FALSE)
cat("Intact:        ", result$intact, "\n")
#> Intact:         TRUE
cat("Entries checked:", result$n_entries, "\n")
#> Entries checked: 14
cat("First broken:  ", result$first_broken, "\n")
#> First broken:   NA

Tampering is reliably detected:

saved <- log$entries[[2L]]$reason
log$entries[[2L]]$reason <- "ALTERED REASON"

tamper_result <- suppressWarnings(verify_log(log, verbose = FALSE))
cat("Intact after tamper:", tamper_result$intact, "\n")
#> Intact after tamper: FALSE
cat("First broken entry: ", tamper_result$first_broken, "\n")
#> First broken entry:  2

log$entries[[2L]]$reason <- saved # restore
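To see why an edit to one field is enough to break verification, the toy chain below recomputes hashes in the same spirit. It is a simplified sketch, not regulog's implementation: the hashed fields and the helper names are hypothetical, and MD5 via tools::md5sum() stands in for SHA-256 purely so that the example runs in base R without extra packages.

```r
# Dependency-free string hash. MD5 is a stand-in for SHA-256 here.
hash_string <- function(x) {
  f <- tempfile()
  on.exit(unlink(f))
  writeBin(charToRaw(enc2utf8(x)), f)
  unname(tools::md5sum(f))
}

# Each entry hash covers the previous hash plus the entry's own content.
entry_hash <- function(entry, prev_hash) {
  hash_string(paste(prev_hash, entry$action, entry$reason, sep = "|"))
}

# Build a three-entry chain anchored on a genesis hash.
genesis <- hash_string("genesis")
entries <- list(
  list(action = "data_read", reason = "Read ADSL"),
  list(action = "model_fit", reason = "Fit ANCOVA"),
  list(action = "export",    reason = "Export table")
)
prev <- genesis
for (i in seq_along(entries)) {
  entries[[i]]$prev_hash  <- prev
  entries[[i]]$entry_hash <- entry_hash(entries[[i]], prev)
  prev <- entries[[i]]$entry_hash
}

# Recompute every hash; return the index of the first mismatch, or NA.
first_broken <- function(entries, genesis) {
  prev <- genesis
  for (i in seq_along(entries)) {
    e <- entries[[i]]
    if (e$prev_hash != prev || e$entry_hash != entry_hash(e, prev)) return(i)
    prev <- e$entry_hash
  }
  NA_integer_
}

intact_before <- first_broken(entries, genesis)   # NA: chain intact
entries[[2]]$reason <- "ALTERED REASON"
broken_after  <- first_broken(entries, genesis)   # 2: stored hash no longer matches
```

Altering entry 2 changes its recomputed hash, so the stored entry_hash no longer matches. An attacker who also rewrote that hash would then break the prev_hash link of entry 3.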

Verification from a file path requires no live session:

verify_log("logs/trial001_audit.rlog")

8. Query the log

filter_log() returns log entries as a data.frame. All arguments are optional — omitting all returns every entry.

all_entries <- filter_log(log)
all_entries[, c("entry_id", "type", "action", "user", "reason")]
#>    entry_id      type      action            user
#> 1         1    ACTION   data_read          jsmith
#> 2         2    ACTION   model_fit          jsmith
#> 3         3    ACTION      export          jsmith
#> 4         4    ACTION    approved          jsmith
#> 5         5    ACTION co_reviewed second.reviewer
#> 6         6    CHANGE        <NA>          jsmith
#> 7         7    CHANGE        <NA>          jsmith
#> 8         8    CHANGE        <NA>          jsmith
#> 9         9    CHANGE        <NA>          jsmith
#> 10       10      NOTE        note          jsmith
#> 11       11      NOTE        note          jsmith
#> 12       12      NOTE        note          jsmith
#> 13       13      NOTE        note          jsmith
#> 14       14 SIGNATURE   signature          jsmith
#>                                                                                                                                                                                                                                                                                                          reason
#> 1                                                                                                                                                                                                                                                   Reading subject-level dataset for primary efficacy analysis
#> 2                                                                                                                                                                                                                                              Fitting ANCOVA: CHG ~ TRT01P + BASE + SITEID per SAP section 6.1
#> 3                                                                                                                                                                                                                                                     Primary efficacy table exported for clinical study report
#> 4                                                                                                                                                                                                                                                  QC review complete — all outputs match SAP-specified formats
#> 5                                                                                                                                                                                                                                                                       Independent statistical review complete
#> 6                                                                                                                                                                                                                                              Significance level updated per protocol amendment 2 (2026-05-01)
#> 7                                                                                                                                                                                                                                          Transcription error — corrected per source CRF page 47, query Q-0192
#> 8                                                                                                                                                                                                                                                    Unstructured covariance pre-specified in SAP section 6.1.2
#> 9                                                                                                                                                                                                                                             Protocol amendment 3: safety population requires confirmed dosing
#> 10 Outlier identified for subject 01-042 at Week 16 (AVAL = 98.4,\n   upper fence = 62.1). Discussed with medical monitor on 2026-06-20.\n   Retained in primary analysis per SAP section 8.3 — no protocol\n   deviation recorded. Sensitivity analysis without outlier pre-specified\n   in SAP section 10.4.
#> 11                                                                                                  Subject 01-007: visit window deviation at Week 8 (visited Day 61,\n   window Day 50-58). Classified as minor deviation per deviation\n   assessment log entry DEV-0031. Subject retained in ITT population.
#> 12                                                                                                                        Data query Q-0047 resolved 2026-06-15: lab value for subject 01-019\n   at Screening confirmed as 4.2 mmol/L per site laboratory report.\n   Original value 42.0 was a decimal error.
#> 13                                                                                                                                             Missing baseline value for subject 01-033: LOCF imputation applied\n   per SAP section 7.2 — previous non-missing value (Visit 1) used.\n   Imputed value: 24.6.
#> 14                                                                                                                                                                             I certify that this primary analysis is accurate and complete,\n   conducted in accordance with SAP version 2.0 dated 2026-05-01

Filter by entry type:

filter_log(log, type = "SIGNATURE")[, c("type", "user", "reason", "after")]
#>        type   user
#> 1 SIGNATURE jsmith
#>                                                                                                                             reason
#> 1 I certify that this primary analysis is accurate and complete,\n   conducted in accordance with SAP version 2.0 dated 2026-05-01
#>   after
#> 1    13

Filter by action value:

filter_log(log, action = "approved")[, c("action", "object", "reason")]
#>     action             object
#> 1 approved primary_results_v3
#>                                                         reason
#> 1 QC review complete — all outputs match SAP-specified formats

Filter by user:

filter_log(log, user = "jsmith")[, c("type", "action", "object")]
#>         type    action              object
#> 1     ACTION data_read       adsl.sas7bdat
#> 2     ACTION model_fit      primary_ANCOVA
#> 3     ACTION    export       Table14_1.rtf
#> 4     ACTION  approved  primary_results_v3
#> 5     CHANGE      <NA>               alpha
#> 6     CHANGE      <NA>       subject_01042
#> 7     CHANGE      <NA>        model_config
#> 8     CHANGE      <NA> analysis_population
#> 9       NOTE      note                <NA>
#> 10      NOTE      note                <NA>
#> 11      NOTE      note                <NA>
#> 12      NOTE      note                <NA>
#> 13 SIGNATURE signature              jsmith

Filter by date range — useful when querying a long-running shared log:

# Entries from today onwards
filter_log(log, from = format(Sys.Date(), "%Y-%m-%d"))[, c("type", "action")]
#>         type      action
#> 1     ACTION   data_read
#> 2     ACTION   model_fit
#> 3     ACTION      export
#> 4     ACTION    approved
#> 5     ACTION co_reviewed
#> 6     CHANGE        <NA>
#> 7     CHANGE        <NA>
#> 8     CHANGE        <NA>
#> 9     CHANGE        <NA>
#> 10      NOTE        note
#> 11      NOTE        note
#> 12      NOTE        note
#> 13      NOTE        note
#> 14 SIGNATURE   signature

# Entries before a cutoff (empty for a new log)
filter_log(log, to = "2025-12-31")
#>  [1] entry_id    timestamp   app         app_version user        type       
#>  [7] action      object      field       before      after       reason     
#> [13] text        meaning     entry_hash  prev_hash  
#> <0 rows> (or 0-length row.names)
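The exact boundary semantics of from and to (for example, whether to includes the whole cutoff day) are not spelled out here. The base R sketch below shows one plausible interpretation on a stand-in data frame: both bounds are treated as inclusive UTC calendar dates. filter_by_date() and the sample timestamps are hypothetical; the timestamp format is assumed to match the verified_at values shown in section 10.

```r
# Stand-in entries with ISO 8601 UTC timestamps.
entries <- data.frame(
  entry_id  = 1:3,
  timestamp = c("2026-05-31T23:59:59.000000Z",
                "2026-06-01T00:00:00.000000Z",
                "2026-06-30T18:30:00.000000Z"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose UTC calendar date lies in [from, to].
filter_by_date <- function(df, from = NULL, to = NULL) {
  # The first 10 characters of a UTC timestamp are its calendar date.
  day  <- as.Date(substr(df$timestamp, 1L, 10L))
  keep <- rep(TRUE, nrow(df))
  if (!is.null(from)) keep <- keep & day >= as.Date(from)
  if (!is.null(to))   keep <- keep & day <= as.Date(to)
  df[keep, , drop = FALSE]
}

june_ids <- filter_by_date(entries, from = "2026-06-01", to = "2026-06-30")$entry_id
june_ids
```

When the boundary matters, for instance an entry logged late on the cutoff day, it is worth confirming the actual behaviour against a log with known timestamps.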

Combine filters:

filter_log(log,
  type   = c("ACTION", "NOTE"),
  user   = "jsmith",
  from   = "2026-01-01"
)[, c("type", "action", "reason")]
#>     type    action
#> 1 ACTION data_read
#> 2 ACTION model_fit
#> 3 ACTION    export
#> 4 ACTION  approved
#> 5   NOTE      note
#> 6   NOTE      note
#> 7   NOTE      note
#> 8   NOTE      note
#>                                                                                                                                                                                                                                                                                                         reason
#> 1                                                                                                                                                                                                                                                  Reading subject-level dataset for primary efficacy analysis
#> 2                                                                                                                                                                                                                                             Fitting ANCOVA: CHG ~ TRT01P + BASE + SITEID per SAP section 6.1
#> 3                                                                                                                                                                                                                                                    Primary efficacy table exported for clinical study report
#> 4                                                                                                                                                                                                                                                 QC review complete — all outputs match SAP-specified formats
#> 5 Outlier identified for subject 01-042 at Week 16 (AVAL = 98.4,\n   upper fence = 62.1). Discussed with medical monitor on 2026-06-20.\n   Retained in primary analysis per SAP section 8.3 — no protocol\n   deviation recorded. Sensitivity analysis without outlier pre-specified\n   in SAP section 10.4.
#> 6                                                                                                  Subject 01-007: visit window deviation at Week 8 (visited Day 61,\n   window Day 50-58). Classified as minor deviation per deviation\n   assessment log entry DEV-0031. Subject retained in ITT population.
#> 7                                                                                                                        Data query Q-0047 resolved 2026-06-15: lab value for subject 01-019\n   at Screening confirmed as 4.2 mmol/L per site laboratory report.\n   Original value 42.0 was a decimal error.
#> 8                                                                                                                                             Missing baseline value for subject 01-033: LOCF imputation applied\n   per SAP section 7.2 — previous non-missing value (Visit 1) used.\n   Imputed value: 24.6.

filter_log() also accepts a .rlog file path directly — no live session or regulog object required:

filter_log("logs/trial001_audit.rlog",
  type = "SIGNATURE",
  user = "jsmith"
)

9. Convert to data frame

as.data.frame() converts all non-genesis entries to a flat data frame — same column layout as export_audit_trail(format = "csv"):

df <- as.data.frame(log)
names(df)
#>  [1] "entry_id"    "timestamp"   "app"         "app_version" "user"       
#>  [6] "type"        "action"      "object"      "field"       "before"     
#> [11] "after"       "reason"      "text"        "meaning"     "entry_hash" 
#> [16] "prev_hash"
nrow(df)
#> [1] 14

10. Export the audit trail

export_audit_trail() serialises the log to CSV or JSON. Use signed = TRUE to run verification and stamp chain_intact and verified_at on every row.

df_export <- export_audit_trail(log, format = "csv", signed = TRUE)
df_export[, c("entry_id", "type", "action", "user", "chain_intact", "verified_at")]
#>    entry_id      type      action            user chain_intact
#> 1         1    ACTION   data_read          jsmith         TRUE
#> 2         2    ACTION   model_fit          jsmith         TRUE
#> 3         3    ACTION      export          jsmith         TRUE
#> 4         4    ACTION    approved          jsmith         TRUE
#> 5         5    ACTION co_reviewed second.reviewer         TRUE
#> 6         6    CHANGE        <NA>          jsmith         TRUE
#> 7         7    CHANGE        <NA>          jsmith         TRUE
#> 8         8    CHANGE        <NA>          jsmith         TRUE
#> 9         9    CHANGE        <NA>          jsmith         TRUE
#> 10       10      NOTE        note          jsmith         TRUE
#> 11       11      NOTE        note          jsmith         TRUE
#> 12       12      NOTE        note          jsmith         TRUE
#> 13       13      NOTE        note          jsmith         TRUE
#> 14       14 SIGNATURE   signature          jsmith         TRUE
#>                    verified_at
#> 1  2026-07-01T20:00:47.575090Z
#> 2  2026-07-01T20:00:47.575090Z
#> 3  2026-07-01T20:00:47.575090Z
#> 4  2026-07-01T20:00:47.575090Z
#> 5  2026-07-01T20:00:47.575090Z
#> 6  2026-07-01T20:00:47.575090Z
#> 7  2026-07-01T20:00:47.575090Z
#> 8  2026-07-01T20:00:47.575090Z
#> 9  2026-07-01T20:00:47.575090Z
#> 10 2026-07-01T20:00:47.575090Z
#> 11 2026-07-01T20:00:47.575090Z
#> 12 2026-07-01T20:00:47.575090Z
#> 13 2026-07-01T20:00:47.575090Z
#> 14 2026-07-01T20:00:47.575090Z

# JSON envelope with metadata header
export_audit_trail(log,
  format = "json",
  signed = TRUE,
  path   = "outputs/audit_trail.json"
)

# CSV for regulatory submission or spreadsheet review
export_audit_trail(log,
  format = "csv",
  signed = TRUE,
  path   = "outputs/audit_trail_TRIAL001_PRIMARY.csv"
)

Date filtering is available on export too:

# Only entries from a specific analysis phase
export_audit_trail(log,
  format = "csv",
  from   = "2026-06-01",
  to     = "2026-06-30",
  signed = TRUE,
  path   = "outputs/audit_june2026.csv"
)
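A downstream script that consumes a signed CSV export may want to refuse a file whose rows are not all stamped as intact. The sketch below is one way to do that in base R. It uses a stand-in CSV written to a temporary file; check_signed_export() is hypothetical, and the assumption that chain_intact round-trips as TRUE/FALSE through CSV has not been checked against regulog's actual output.

```r
# Stand-in for a signed CSV export, written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    entry_id     = 1:3,
    type         = c("ACTION", "CHANGE", "SIGNATURE"),
    chain_intact = c(TRUE, TRUE, TRUE),
    verified_at  = "2026-07-01T20:00:47.575090Z"
  ),
  csv_path,
  row.names = FALSE
)

# Hypothetical gate: stop unless every row is stamped as intact.
check_signed_export <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  stopifnot(all(c("chain_intact", "verified_at") %in% names(df)))
  intact <- all(as.logical(df$chain_intact))
  if (!isTRUE(intact)) stop("Export reports a broken chain: ", path)
  invisible(nrow(df))
}

n_rows <- check_signed_export(csv_path)
```

The stamp only records the result of verification at export time, so re-running verify_log() on the original .rlog file remains the authoritative check.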

11. Entry type reference

Type	Created by	Mandatory fields	Regulatory purpose
`ACTION`	`log_action()`	`action`, `object`, `reason`	Discrete events
`CHANGE`	`log_change()`	`object`, `field`, `before`, `after`, `reason`	Field modifications
`NOTE`	`log_note()`	`text`	Decisions and rationale
`SIGNATURE`	`log_signature()`	`meaning`	Sign-off
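The table can also be expressed as a lookup, which is handy when checking entries that were assembled by other code. The sketch below transcribes the mandatory fields from the table; missing_fields() is a hypothetical checker and says nothing about how regulog itself validates input.

```r
# Mandatory fields per entry type, transcribed from the table above.
mandatory_fields <- list(
  ACTION    = c("action", "object", "reason"),
  CHANGE    = c("object", "field", "before", "after", "reason"),
  NOTE      = "text",
  SIGNATURE = "meaning"
)

# Hypothetical checker: names of mandatory fields that are absent, NA or empty.
missing_fields <- function(entry) {
  required <- mandatory_fields[[entry$type]]
  if (is.null(required)) stop("Unknown entry type: ", entry$type)
  is_blank <- vapply(required, function(f) {
    value <- entry[[f]]
    is.null(value) || is.na(value) || !nzchar(value)
  }, logical(1))
  required[is_blank]
}

missing_fields(list(type = "ACTION", action = "export", object = "t.rtf"))
missing_fields(list(type = "NOTE", text = "Outlier retained"))
```

The first call reports "reason" as missing; the second returns an empty character vector because the only mandatory field is present.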

12. Validation (regulated environments)

Any software used in a regulated environment — under 21 CFR Part 11, EU Annex 11, or GAMP 5 — must be formally qualified before it can be used to generate or sign electronic records that regulators may inspect. regulog ships pre-written, executable IQ/OQ/PQ qualification protocols that cover all three phases.

Running the protocols

Run each script in sequence in the target environment — the R installation that will be used for regulated work:

# Phase 1: Installation Qualification (10 tests)
# Verifies R version, package installation, dependency integrity,
# file system access, and namespace exports.
source(system.file("validation/IQ_regulog.R", package = "regulog"))

# Phase 2: Operational Qualification (26 tests)
# Tests every 21 CFR Part 11 §11.10 requirement: hash chain integrity,
# tamper detection, user attribution, timestamps, export format,
# electronic signatures, and error isolation.
source(system.file("validation/OQ_regulog.R", package = "regulog"))

# Phase 3: Performance Qualification (7 tests)
# End-to-end clinical workflows: data review, regulatory export,
# multi-user session independence, 500-entry load test, and
# inspector query simulation.
source(system.file("validation/PQ_regulog.R", package = "regulog"))

Capturing the qualification record

Retain the output of each run as documented evidence of system qualification. The simplest approach is to capture it to a file:

sink("IQ_execution_record.txt")
source(system.file("validation/IQ_regulog.R", package = "regulog"))
sink()

sink("OQ_execution_record.txt")
source(system.file("validation/OQ_regulog.R", package = "regulog"))
sink()

sink("PQ_execution_record.txt")
source(system.file("validation/PQ_regulog.R", package = "regulog"))
sink()

Each execution record includes the timestamp, R version, platform, and the pass/fail result of every test against its acceptance criterion.
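A bare sink() pair leaves output diverted if the sourced script stops with an error before the closing sink() is reached. The sketch below wraps the capture in a function that restores the console on exit. record_run() is a hypothetical helper, and a throwaway script stands in for a protocol file so the example runs anywhere.

```r
# Hypothetical helper: run a script and capture its printed output to a
# file, restoring the console even if the script stops with an error.
record_run <- function(script, record_file) {
  con <- file(record_file, open = "wt")
  sink(con)
  on.exit({
    sink()
    close(con)
  }, add = TRUE)
  source(script, echo = FALSE)
  invisible(record_file)
}

# Demonstration with a throwaway script standing in for a protocol file.
demo_script <- tempfile(fileext = ".R")
writeLines('cat("TEST-001 PASS\\n")', demo_script)

record <- record_run(demo_script, tempfile(fileext = ".txt"))
readLines(record)
```

For a real run, the script argument would be system.file("validation/IQ_regulog.R", package = "regulog") and the record file "IQ_execution_record.txt". Note that sink() as used here diverts standard output only, so messages and warnings still go to the console.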

Requirements traceability

The requirements traceability matrix (RTM) maps every OQ test to the regulatory clause it addresses:

read.csv(system.file("validation/RTM_regulog.csv", package = "regulog"))

Logging the qualification itself

The qualification run is itself an activity in a regulated environment and should be logged. Using regulog to audit its own qualification produces a Part 11-compliant record of who ran it, when, and the outcome:

log <- regulog_init(
  app     = "regulog-qualification",
  version = "0.2.0",
  user    = "val.lead",
  path    = "qualification/audit_trail.rlog"
)

log_action(log,
  action = "qualification_start",
  object = "regulog 0.2.0",
  reason = "IQ/OQ/PQ qualification initiated per SOP-VAL-007"
)

source(system.file("validation/IQ_regulog.R", package = "regulog"))
log_action(log,
  action = "IQ_complete",
  object = "IQ_regulog.R",
  reason = "10 tests passed. Proceeding to OQ."
)

source(system.file("validation/OQ_regulog.R", package = "regulog"))
log_action(log,
  action = "OQ_complete",
  object = "OQ_regulog.R",
  reason = "26 tests passed. Proceeding to PQ."
)

source(system.file("validation/PQ_regulog.R", package = "regulog"))
log_action(log,
  action = "PQ_complete",
  object = "PQ_regulog.R",
  reason = "7 tests passed. Qualification complete."
)

log_signature(log,
  "I certify that regulog 0.2.0 has been qualified in this environment
   per SOP-VAL-007 and is approved for use in regulated R workflows."
)

verify_log(log)
export_audit_trail(log,
  format = "csv",
  signed = TRUE,
  path   = "qualification/audit_trail_export.csv"
)

Re-qualification

Any significant change — a new package version, a change to the R environment, or a platform migration — requires re-qualification. Re-run the three protocols in the updated environment and retain the new execution records as evidence that the qualified state has been re-established.
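One way to notice that the environment has drifted from its qualified state is to store a small fingerprint at qualification time and compare it later. The sketch below is a base R illustration only: env_fingerprint() is hypothetical, and the package list uses "stats" and "utils" as placeholders where a real check would name regulog and its dependencies.

```r
# Hypothetical helper: capture the facts that define the qualified environment.
env_fingerprint <- function(packages) {
  versions <- vapply(packages, function(p) {
    # packageVersion() errors for a package that is not installed.
    tryCatch(as.character(utils::packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))
  list(
    r_version = R.version.string,
    platform  = R.version$platform,
    packages  = versions
  )
}

# At qualification time: store the fingerprint alongside the records.
fingerprint_file <- tempfile(fileext = ".rds")
saveRDS(env_fingerprint(c("stats", "utils")), fingerprint_file)

# Later: compare the stored fingerprint with the current environment.
unchanged <- identical(readRDS(fingerprint_file),
                       env_fingerprint(c("stats", "utils")))
unchanged
```

A FALSE result would indicate that the R version, platform, or a listed package version differs from the stored state, which is a prompt to assess whether re-qualification is needed.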

See also vignette("hash-chain") for a detailed explanation of the tamper detection mechanism, and the qualification guide on reprostats.org for a fuller discussion of the regulatory context.