More Than Scan Rates and Retention

The historian configuration decision that engineers make without enough thought is treating every tag the same: 60-second scan, 2-year retention, done. On a non-GMP project that is fine. On a pharma project, each tag in the historian requires a deliberate set of decisions — and those decisions need to be documented in a controlled document before the system is commissioned.

Six things separate a GMP historian configuration from an industrial one: a tag list maintained as a version-controlled engineering document, storage type decisions made per tag with a justification, a quality status field on every GMP-critical measurement, data buffering when the historian link drops, gap detection with an alarm, and a two-tier retention architecture covering online access and long-term archive. Each of these affects both the data you produce and the audit evidence you can generate.

The Historian Tag List — A Controlled Document

The historian tag list — sometimes called the tag configuration export — is not an output of the SCADA platform. It is an engineering design document that specifies what should be configured, which the platform then implements. The sequence matters: design the tag list first, get it approved, then configure the historian from it. Configuring the historian and then exporting the tag list as documentation is backwards and is an audit finding waiting to happen.

The tag list must contain, at minimum, the following columns for each tag: tag name (matching the SCADA tag naming convention), description, engineering unit, scan interval, storage type, online retention period, archive retention period, data type, PLC source address, GMP-critical flag, and alert limits. There should also be a naming convention validation column — a simple check confirming the tag name conforms to the defined convention. Non-conforming names must be resolved before OQ.
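
The naming convention validation column can be filled by a script rather than by eye. The following is a minimal Python sketch, assuming a four-segment convention of the form [SYS]_[AREA]_[DESCRIPTION]_[TYPE] in which each segment is uppercase letters or digits. The exact segment rules, the pattern, the function names and the sample tag names are illustrative assumptions, not part of any platform.

```python
import re

# Assumed convention: SYS_AREA_DESCRIPTION_TYPE, each segment made of
# uppercase letters or digits. Adjust to the project's actual rules.
TAG_PATTERN = re.compile(r"[A-Z0-9]+_[A-Z0-9]+_[A-Z0-9]+_[A-Z0-9]+")


def conforms(tag_name: str) -> bool:
    """Return True if the whole tag name matches the assumed convention."""
    return TAG_PATTERN.fullmatch(tag_name) is not None


def non_conforming(tag_names: list[str]) -> list[str]:
    """Return the names that would need to be resolved before OQ."""
    return [name for name in tag_names if not conforms(name)]


if __name__ == "__main__":
    # Hypothetical tag names for illustration only.
    names = ["PW_LOOP_COND01_PV", "pw_loop_temp", "PW_TANK_LEVEL01_PV"]
    print(non_conforming(names))
```

Running the check against the approved tag list, rather than against the configured historian, keeps the design document as the source of truth.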

The tag list is version-controlled alongside the SCADA application. Any change to historian configuration — adding a tag, changing a scan rate, modifying a deadband — must go through change control and generate a new version of the tag list export. The hash of the approved tag configuration is part of the validated baseline.
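
One way to produce a hash for the validated baseline is to serialise the approved tag list deterministically and hash the result. The sketch below assumes the tag list is available as rows of column-name/value pairs and uses SHA-256. The column names, the serialisation format and the function names are assumptions for illustration; a real project would hash whatever export format its change control procedure defines.

```python
import csv
import hashlib
import io


def canonical_bytes(rows: list[dict[str, str]]) -> bytes:
    """Serialise rows deterministically: columns sorted, rows sorted by tag name."""
    columns = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in sorted(rows, key=lambda r: r["tag_name"]):
        writer.writerow(row)
    return buffer.getvalue().encode("utf-8")


def baseline_hash(rows: list[dict[str, str]]) -> str:
    """SHA-256 hex digest of the canonical tag list."""
    return hashlib.sha256(canonical_bytes(rows)).hexdigest()


if __name__ == "__main__":
    # Hypothetical rows; a real tag list carries all required columns.
    approved = [
        {"tag_name": "PW_LOOP_COND01_PV", "scan_interval_s": "60"},
        {"tag_name": "PW_LOOP_TEMP01_PV", "scan_interval_s": "10"},
    ]
    print(baseline_hash(approved))
```

Because the serialisation is order-independent, the hash changes only when a tag or one of its attributes changes, which is exactly the event that should trigger change control.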

Three Storage Types — Different Purposes

Every historian platform (Wonderware, OSIsoft PI, Ignition Historian, FactoryTalk Historian) offers multiple storage mechanisms. Choosing the wrong one for a GMP-critical tag creates either data gaps or excessive data volume that makes queries slow. The three types that matter for a pharma configuration:

Exception-Based Storage

A record is written to the historian only when the value changes beyond a defined deadband. For a conductivity measurement with a 0.01 µS/cm deadband: if the value stays within that window, no new record is written. When it moves outside, a record is written with the new value and timestamp. This produces efficient storage that still captures every meaningful change in the process. It is the correct choice for analogue process values like conductivity, temperature, and pressure.

The deadband requires deliberate setting. Too large and genuine process changes are not recorded. Too small and every noise excursion generates a record, effectively turning exception storage into time-based storage with excessive data volume. As a starting point, set the deadband to approximately 1% of the engineering range for most process measurements, then confirm it against the instrument's noise level and the smallest change that matters to the process.
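
The deadband logic described above can be sketched in a few lines of Python. This is an illustration of the principle only, not any vendor's algorithm: it assumes the first sample is always stored and that each new sample is compared against the last stored value. The `Sample` type and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since the start of the trace
    value: float      # engineering units, e.g. µS/cm


def exception_filter(samples: list[Sample], deadband: float) -> list[Sample]:
    """Keep a sample only if it differs from the last stored value by more than the deadband."""
    stored: list[Sample] = []
    for sample in samples:
        if not stored or abs(sample.value - stored[-1].value) > deadband:
            stored.append(sample)
    return stored


if __name__ == "__main__":
    raw = [Sample(t, v) for t, v in
           [(0, 0.50), (60, 0.505), (120, 0.495), (180, 0.52), (240, 0.525), (300, 0.49)]]
    for s in exception_filter(raw, deadband=0.01):
        print(s.timestamp, s.value)
```

With a 0.01 µS/cm deadband, the six raw samples above reduce to three stored records (0.50, 0.52 and 0.49); the small movements around each stored value are treated as noise.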

Time-Based Storage

A record is written at every scan interval regardless of whether the value has changed. This is appropriate for two categories: safety-critical measurements where you need to prove continuity of monitoring (the F-CPU safety temperature channel at 10-second intervals), and totalisers or counters where the absolute value matters at a point in time (a daily flow totaliser). It is also appropriate for the sanitization hold timer: the forensic record that the hold phase actually ran for the full duration needs a continuous time-based trace, not just the start and end points.

On-Change Storage

A record is written the moment the value changes — no deadband, no interval. This is the correct storage type for discrete and state tags: equipment status, mode states, setpoint values, alarm bits, system flags. For setpoints, on-change storage creates an automatic record whenever an operator modifies a setpoint — even if the audit trail captures the change event, having the setpoint value in the historian allows trend queries to show what setpoint was active at any given time.
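
The benefit of on-change setpoint records is that "which setpoint was active at time T" becomes a simple lookup: the active value is the most recent record at or before T. A minimal Python sketch, assuming records are (timestamp, value) pairs already sorted by timestamp; the function name and sample values are hypothetical.

```python
from bisect import bisect_right
from typing import Optional


def value_at(records: list[tuple[float, float]], query_time: float) -> Optional[float]:
    """Return the value of the latest on-change record at or before query_time."""
    timestamps = [timestamp for timestamp, _ in records]
    index = bisect_right(timestamps, query_time)
    if index == 0:
        return None  # no record exists yet at query_time
    return records[index - 1][1]


if __name__ == "__main__":
    # Hypothetical sanitization temperature setpoint changes (seconds, °C).
    setpoint_history = [(0, 80.0), (3600, 85.0), (7200, 80.0)]
    print(value_at(setpoint_history, 5000))
```

The same lookup applies to any on-change tag, such as system mode or pump run status.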

Figure: Storage type decision for GMP historian tag configuration

Exception-based: record when the value exceeds the deadband
✓ Conductivity PV (0.01 µS/cm DB)
✓ Temperature PV (0.5°C DB)
✓ Pressure PV (0.05 bar DB)
✓ TOC PV (5 ppb DB)
✓ Level PV (1% DB)
Efficient · captures meaningful changes

Time-based: record at every scan interval
✓ Safety temp. F-CPU (10 s)
✓ Sanitization hold timer (10 s)
✓ Flow totaliser (5 min)
✓ System availability (on change)
Continuous evidence · higher volume

On-change: record instantly on any change
✓ Pump run status (Bool)
✓ System mode (enum)
✓ Setpoints (audit trail)
✓ Divert valve command
✓ Alarm bits
Complete discrete history · low volume

Every tag needs a deliberate storage type decision documented in the historian tag list. "Default to exception-based" is not a decision; it is an absence of one.

Quality Status — The Field That Proves Data Trustworthiness

Every GMP-critical process value stored in the historian must carry a quality status field alongside the value. The quality status indicates whether the measurement was trustworthy at the time it was recorded. Without it, a historian record of 0.5 µS/cm is ambiguous — it could mean the water genuinely was 0.5 µS/cm, or it could mean the sensor was disconnected and the historian stored a zeroed or frozen value.

The three standard quality states are Good (measurement valid and within sensor operating range), Bad (sensor fault, signal loss, or instrument error), and Uncertain (intermediate state — sensor present but reading quality cannot be confirmed, for example during sensor warm-up or immediately after calibration). Any record with a Bad or Uncertain quality status is not usable as evidence of process conformance. The quality field makes this visible in historian queries rather than requiring an engineer to cross-reference alarm history to determine whether the measurement was valid.

Quality status is written by the PLC to the historian alongside the process value. The UDT for a GMP analogue input includes a quality word alongside the engineering unit value — the historian tag configuration must include both the PV tag and the quality tag for every GMP-critical measurement.
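
Once every process value carries a quality status, a query can separate usable evidence from everything else without cross-referencing alarm history. The sketch below assumes the three quality states named above and a simple record structure; the `Record` type and function names are hypothetical and do not correspond to a specific historian API.

```python
from dataclasses import dataclass
from enum import Enum


class Quality(Enum):
    GOOD = "Good"
    BAD = "Bad"
    UNCERTAIN = "Uncertain"


@dataclass(frozen=True)
class Record:
    timestamp: str  # ISO 8601 text, kept simple for the sketch
    value: float
    quality: Quality


def usable_as_evidence(records: list[Record]) -> list[Record]:
    """Only Good-quality records count as evidence of process conformance."""
    return [r for r in records if r.quality is Quality.GOOD]


def excluded(records: list[Record]) -> list[Record]:
    """Bad and Uncertain records, listed so that they can be investigated."""
    return [r for r in records if r.quality is not Quality.GOOD]


if __name__ == "__main__":
    trace = [
        Record("2024-03-01T10:00:00", 0.50, Quality.GOOD),
        Record("2024-03-01T10:01:00", 0.50, Quality.BAD),
        Record("2024-03-01T10:02:00", 0.51, Quality.UNCERTAIN),
    ]
    print(len(usable_as_evidence(trace)), len(excluded(trace)))
```

In the example, the second record shows the same 0.50 µS/cm value as the first, but its Bad status marks it as a frozen or invalid reading rather than a real measurement.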

What to Historise — The Full Tag Scope

The historian tag list must cover more than just the process values. A complete GMP historian configuration includes five categories of tags:

Data Buffering — Keeping the Record Complete

The "Complete" principle of ALCOA+ requires no data gaps. But historian communication links fail. A network switch reboots, a server goes down for maintenance, a power event takes the historian offline for 20 minutes. Without a buffer, those 20 minutes of process data are gone — and a gap in the GMP record is a data integrity finding.

The PLC must be configured to buffer GMP-critical process values in non-volatile memory when the historian link is unavailable. The buffer stores records with their original NTP-synchronised timestamps. When the link restores, the buffer synchronises to the historian in chronological order. The historian receives the buffered data and stores it in its correct position in the time series, so there is no gap in the query output.
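
The store-and-forward behaviour can be illustrated in Python, even though the real buffer lives in PLC non-volatile memory. This sketch assumes records are appended in timestamp order (so first-in, first-out is chronological), that a full buffer rejects new records and raises an overflow flag, and that `send` is a hypothetical callback returning True when the historian accepts a record.

```python
from collections import deque
from typing import Callable

BufferedRecord = tuple[float, str, float]  # (timestamp, tag name, value)


class StoreAndForwardBuffer:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.overflowed = False  # set when a record could not be buffered
        self._records: deque[BufferedRecord] = deque()

    def __len__(self) -> int:
        return len(self._records)

    def store(self, timestamp: float, tag: str, value: float) -> bool:
        """Buffer a record with its original timestamp; False if the buffer is full."""
        if len(self._records) >= self.capacity:
            self.overflowed = True
            return False
        self._records.append((timestamp, tag, value))
        return True

    def flush(self, send: Callable[[BufferedRecord], bool]) -> int:
        """Forward oldest-first; a record is removed only after send() accepts it."""
        sent = 0
        while self._records:
            if not send(self._records[0]):
                break  # link dropped again; keep the remaining records
            self._records.popleft()
            sent += 1
        return sent
```

Removing a record only after the historian has accepted it means a second link failure during synchronisation does not lose data. The overflow flag is the condition that should escalate, because it marks data that cannot be recovered.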

The buffer capacity must be sized for the worst realistic outage. For a system with 20 GMP-critical tags at 60-second scan intervals, that is approximately 20 records per minute. A 4-hour outage generates roughly 4,800 records. Buffer capacity should be specified to cover at least 8 hours — a full working shift — to handle planned maintenance windows without data loss.
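
The sizing arithmetic above is easy to reproduce and rerun when the tag count or scan rate changes. A minimal Python sketch, assuming every tag produces one record per scan interval (the worst case; exception-based tags will usually produce fewer). The function name is hypothetical.

```python
import math


def records_for_outage(tag_count: int, scan_interval_s: float, outage_hours: float) -> int:
    """Worst-case record count: one record per tag per scan interval."""
    records_per_tag = math.ceil(outage_hours * 3600 / scan_interval_s)
    return tag_count * records_per_tag


if __name__ == "__main__":
    print(records_for_outage(20, 60, 4))  # 4-hour outage → 4800
    print(records_for_outage(20, 60, 8))  # 8-hour shift  → 9600
```

For the 20-tag example, covering a full 8-hour shift means specifying a buffer of at least 9,600 records.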

The gap detection alarm closes the loop: the historian monitors for any gap exceeding 5 minutes in any GMP-critical tag. A gap that exceeds this threshold generates a High alarm and creates a gap event record in the historian itself, noting the start time, end time, and estimated number of records affected. This alarm is what triggers investigation — either confirming that buffer synchronisation is in progress, or escalating if buffered data cannot be recovered.
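
The gap check itself is a comparison of consecutive timestamps against the threshold. The sketch below assumes a time-based tag (an exception-based tag with a stable value would need a heartbeat record to avoid false gaps) and estimates the missing records from the scan interval. Function names and sample times are hypothetical.

```python
from datetime import datetime, timedelta

GAP_THRESHOLD = timedelta(minutes=5)


def find_gaps(timestamps: list[datetime],
              threshold: timedelta = GAP_THRESHOLD) -> list[tuple[datetime, datetime]]:
    """Return (start, end) pairs where consecutive records are more than threshold apart."""
    ordered = sorted(timestamps)
    return [(earlier, later)
            for earlier, later in zip(ordered, ordered[1:])
            if later - earlier > threshold]


def estimated_missing(start: datetime, end: datetime, scan_interval: timedelta) -> int:
    """Estimate how many scan-interval records are absent between start and end."""
    return max((end - start) // scan_interval - 1, 0)


if __name__ == "__main__":
    day = datetime(2024, 3, 1)
    stamps = [day.replace(hour=10, minute=m) for m in (0, 1, 2, 22, 23)]
    for start, end in find_gaps(stamps):
        print(start, end, estimated_missing(start, end, timedelta(seconds=60)))
```

In the example, the records jump from 10:02 to 10:22, which is reported as one gap with an estimated 19 missing records at a 60-second scan interval.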

The 7-Year Retention Architecture

EU GMP Annex 11 requires that data is backed up and remains accessible throughout the required retention period (Clause 7, Data Storage) and that archived data is checked for accessibility, readability and integrity (Clause 17, Archiving). For most pharmaceutical products, process records must be retained for at least the product shelf life plus one year; in practice this often means a minimum of 7 years for a process historian. The retention architecture must support full queryability throughout that period, not just storage.

A two-tier architecture is standard: 2 years online in the active historian database with full SQL query access and trend display; 5 additional years archived to a secondary storage location with SQL query capability via archive restore. The boundary between online and archive retention must be managed automatically — the historian should automatically archive records that age beyond the online window, not rely on a manual annual process that may be skipped.
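
The tier boundary can be expressed as a rule on record age, which is what an automatic archiving job would evaluate. A minimal Python sketch, assuming 2 years online and 7 years total retention as described above. The tier names and function names are hypothetical, and "past-retention" here only means the minimum period has elapsed, not that the record should be deleted.

```python
from datetime import date

ONLINE_YEARS = 2  # active historian database
TOTAL_YEARS = 7   # online plus archive


def add_years(day: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February when needed."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, day=28)


def tier(record_date: date, today: date) -> str:
    """Classify a record as 'online', 'archive' or 'past-retention' by its age."""
    if today < add_years(record_date, ONLINE_YEARS):
        return "online"
    if today < add_years(record_date, TOTAL_YEARS):
        return "archive"
    return "past-retention"


if __name__ == "__main__":
    today = date(2025, 6, 1)
    for recorded in (date(2024, 1, 1), date(2020, 1, 1), date(2015, 1, 1)):
        print(recorded, tier(recorded, today))
```

Running such a rule on a schedule, rather than once a year by hand, is what keeps the online window and the archive consistent.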

Critically, the audit trail records must be retained for the same period as the process records they document. If process data is retained for 7 years, the corresponding audit trail entries must also be retained for 7 years. The retention policy must explicitly prohibit purging audit trail records independently of the associated process records — this is specified in the FDS and verified as part of the write-once architecture test during OQ.

In the QLean Framework

The Engineering Lists workbook (EL-SYS-001) Sheet 4 is the Historian Tag List — a structured table containing all 44 historian tags for the reference system. Each entry specifies: tag name (validated against the [SYS]_[AREA]_[DESCRIPTION]_[TYPE] convention), description, engineering unit, scan interval, storage type (Exception/Time-based/On-Change with specific deadband values), online retention (2 years), archive retention (7 years), data type, PLC source DB address, GMP-critical flag (28 of 44 tags flagged), and alert limits. The tag categories cover quality parameters (conductivity with both sensor values and processed value, TOC), temperature at all monitoring points, pressure and level, setpoints stored on-change for audit trail purposes, sanitization sequence state with hold timer at 10-second time-based for forensic completeness, and system health tags (backup status, NTP status). The FDS (FDS-SYS-001) FUNC-DATA-001 through FUNC-DATA-004 define the full historian recording requirements including the data buffering specification and 5-minute gap detection alarm (ALM-DATA-001).