Operational analytics is only as reliable as the data that feeds it. Before any analysis, dashboard, or KPI program can deliver value, the organization must have a clear picture of what systems are generating operational data, what that data actually represents, and how to integrate it without introducing distortion through misalignment, latency, or inconsistent definitions.
This article maps the major operational data source categories, explains what each system records and what it does not, identifies the integration challenges that consistently derail operational analytics initiatives, and describes the architectural patterns that produce a reliable analytical foundation. For the KPIs this data supports, see Operational KPIs. For the analytical techniques applied to this data, see Techniques and Models.
The Operational Data Landscape
Most operations of meaningful scale run on five to eight distinct operational systems, each optimized for a specific transactional function rather than for analytical use. These systems were not designed to work together analytically. They use different identifiers, different timestamps, different granularity levels, and different data models. The integration challenge is not primarily technical (modern data platforms handle integration at scale); it is semantic. Getting the data right requires understanding what each system records, in what context, and with what limitations.
Enterprise Resource Planning (ERP)
ERP systems (SAP S/4HANA, Oracle ERP Cloud, Microsoft Dynamics 365, NetSuite) are the operational system of record for most manufacturing, distribution, and service organizations. They record the financial and transactional side of operations: production orders, purchase orders, work orders, inventory movements, goods receipts, invoicing, and cost allocations.
What ERP data captures well: Order-to-cash flows, procurement cycles, inventory valuation, standard cost structures, planned versus actual production quantities, and financial postings at the cost center level.
What ERP data does not capture well: Real-time production status, asset-level performance, quality event detail, process parameter values, and anything that happens between the start and end of a production or service transaction. ERP records that a work order was completed. It does not record what happened during the execution of that work order unless a human entered it.
Key data entities for operational analytics:
Production Orders contain planned quantities, planned start and end dates, actual quantities confirmed, and cost postings. They are the primary source for variance-to-plan analysis, cost-per-unit calculations, and delivery performance measurement.
Inventory movements (goods issues, goods receipts, transfer orders) provide the basis for material yield calculations, scrap recording, and inventory accuracy measurement.
Routing and operation confirmations, where operators post time against work order operations, generate the labor data used for labor efficiency and productivity metrics. The reliability of this data depends heavily on confirmation discipline, which varies significantly across operations.
Integration considerations: ERP data is structured but often delayed. Many operations confirm production once per shift or once per day rather than in real time, creating a reporting lag that makes ERP unsuitable as the sole data source for real-time operational monitoring. ERP data is best used as the system of record for planned values and financial actuals, integrated with higher-frequency sources for real-time monitoring.
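As a minimal sketch of how production order extracts feed these metrics, the following example computes quantity variance to plan and an on-time completion rate with pandas. The column names and values are hypothetical and not tied to any specific ERP schema.

```python
import pandas as pd

# Hypothetical production order extract; real ERP column names will differ.
orders = pd.DataFrame({
    "order_id":       ["PO-1001", "PO-1002", "PO-1003"],
    "planned_qty":    [500, 1200, 300],
    "confirmed_qty":  [480, 1200, 250],
    "planned_finish": pd.to_datetime(["2024-03-01", "2024-03-02", "2024-03-03"]),
    "actual_finish":  pd.to_datetime(["2024-03-01", "2024-03-04", "2024-03-03"]),
})

# Quantity variance to plan: shortfall (or overrun) relative to planned quantity.
orders["qty_variance_pct"] = (
    (orders["confirmed_qty"] - orders["planned_qty"]) / orders["planned_qty"] * 100
)

# On-time completion flag: actual finish on or before the planned finish date.
orders["on_time"] = orders["actual_finish"] <= orders["planned_finish"]

print(orders[["order_id", "qty_variance_pct", "on_time"]])
print(f"On-time completion rate: {orders['on_time'].mean():.0%}")
```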
Manufacturing Execution Systems (MES)
Where ERP records the plan and the financial result, MES records execution. Systems such as Siemens Opcenter, Rockwell FactoryTalk, AVEVA, and purpose-built applications capture the step-by-step execution of production: job starts and completions, material consumption at the operation level, quality check results, equipment assignments, and operator identifications.
What MES data captures well: Operation-level cycle times, actual vs. standard time by operation, real-time work-in-process location and status, quality results at each process step, genealogy and traceability, and labor assignments at the task level.
What MES data does not capture well: Equipment telemetry (which typically comes from SCADA or direct sensor integration), financial allocations (which belong to ERP), and aggregate business performance (which requires joining to ERP data).
Key data entities for operational analytics:
Process step logs provide the granular cycle time data needed for bottleneck analysis and variance decomposition. When MES records the start and end timestamp of each operation step, it becomes possible to identify not just that a production order was late, but which specific step in the process consumed excess time.
Quality records at the operation level enable First Pass Yield calculation at a granularity that plant-level quality records cannot support. Scrap and rework events recorded in MES with reason codes provide the starting point for root cause analysis.
Electronic batch records (in regulated industries) provide complete process history for traceability, audit, and quality investigation.
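To make the use of process step logs concrete, here is a minimal sketch of operation-level cycle time and bottleneck analysis, assuming a simplified step log with start and end timestamps per operation. All names and values are hypothetical.

```python
import pandas as pd

# Hypothetical MES process step log: one row per executed operation step.
steps = pd.DataFrame({
    "order_id":  ["PO-1001"] * 4 + ["PO-1002"] * 4,
    "operation": ["cut", "weld", "paint", "inspect"] * 2,
    "start": pd.to_datetime([
        "2024-03-01 08:00", "2024-03-01 08:40", "2024-03-01 10:30", "2024-03-01 11:10",
        "2024-03-01 12:00", "2024-03-01 12:35", "2024-03-01 15:20", "2024-03-01 16:05",
    ]),
    "end": pd.to_datetime([
        "2024-03-01 08:35", "2024-03-01 10:25", "2024-03-01 11:05", "2024-03-01 11:25",
        "2024-03-01 12:30", "2024-03-01 15:10", "2024-03-01 16:00", "2024-03-01 16:20",
    ]),
})

# Duration of each executed step, in minutes.
steps["duration_min"] = (steps["end"] - steps["start"]).dt.total_seconds() / 60

# Average and total time by operation: the operation accumulating the most time
# across orders is the first candidate for bottleneck investigation.
by_operation = (
    steps.groupby("operation")["duration_min"]
         .agg(["mean", "sum"])
         .sort_values("sum", ascending=False)
)
print(by_operation)
```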
Integration considerations: MES and ERP must be synchronized on production order identifiers, material master data, and routing structures. Mismatches between MES operation sequences and ERP routings produce gaps and double-counting in analytical models. This synchronization is often the most time-consuming part of a manufacturing analytics implementation.
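A sketch of one way to surface such mismatches: an outer join of the order and operation identifiers extracted from each system flags records that exist in only one of them. The table and column names are illustrative.

```python
import pandas as pd

# Hypothetical extracts of order/operation identifiers from the two systems.
erp_routing = pd.DataFrame({
    "order_id":  ["PO-1001", "PO-1001", "PO-1002", "PO-1002", "PO-1003"],
    "operation": ["0010", "0020", "0010", "0020", "0010"],
})
mes_steps = pd.DataFrame({
    "order_id":  ["PO-1001", "PO-1001", "PO-1002", "PO-1004"],
    "operation": ["0010", "0020", "0010", "0010"],
})

# Outer join with an indicator column flags operations present in only one system.
recon = erp_routing.merge(
    mes_steps, on=["order_id", "operation"], how="outer", indicator=True
)
mismatches = recon[recon["_merge"] != "both"]
# "left_only" rows exist only in the ERP routing (missing MES confirmations);
# "right_only" rows exist only in MES (orders or operations not mapped to ERP).
print(mismatches)
```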
IoT Sensors and SCADA Systems
Industrial IoT sensors and Supervisory Control and Data Acquisition (SCADA) systems provide the equipment-level telemetry that neither ERP nor MES captures: machine states, process parameters, energy consumption, vibration signatures, temperature profiles, and environmental conditions.
What IoT/SCADA data captures well: Continuous equipment state (running, idle, faulted, changeover), process parameter values at high frequency (temperature, pressure, speed, torque, flow rate), alarm and event histories, and energy consumption by asset.
What IoT/SCADA data does not capture well: Business context. The sensor does not know which production order was running, which product was being made, or which quality standard applies. This context must be joined from MES or ERP.
Key data entities for operational analytics:
Machine state data is the primary source for OEE Availability calculations. When equipment state is recorded in real time (running, stopped planned, stopped unplanned, changeover), Availability can be calculated accurately at the minute level rather than estimated from shift reports. See Techniques and Models for OEE decomposition methodology.
Process parameter time series provide the data for Statistical Process Control analysis and predictive maintenance models. A vibration signature that precedes a bearing failure, or a temperature drift that precedes a quality excursion, is only detectable if the time series data is captured at sufficient frequency and retained with adequate history.
Alarm and event logs are underutilized in most operations. Alarm frequency is itself a process health indicator: a well-controlled process generates alarms rarely. Alarm floods (when a large number of alarms fire in a short period) indicate either a process upset or a poorly configured alarm system, both of which warrant investigation.
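Returning to the machine state data described above, the following is a minimal sketch of the Availability calculation it enables, assuming changeover and unplanned stops both count against availability and the state log is already clean. All values are hypothetical.

```python
import pandas as pd

# Hypothetical machine state log: one row per state interval for one asset and shift.
states = pd.DataFrame({
    "state": ["running", "changeover", "running", "stopped_unplanned", "running"],
    "start": pd.to_datetime(["06:00", "08:10", "08:40", "10:15", "10:50"]),
    "end":   pd.to_datetime(["08:10", "08:40", "10:15", "10:50", "14:00"]),
})
states["minutes"] = (states["end"] - states["start"]).dt.total_seconds() / 60

shift_minutes = 8 * 60   # scheduled production time for the shift
planned_stops = 0        # e.g. breaks excluded from scheduled time; none here
run_minutes = states.loc[states["state"] == "running", "minutes"].sum()

# Availability = run time / (scheduled time - planned stops)
availability = run_minutes / (shift_minutes - planned_stops)
print(f"Availability: {availability:.1%}")
```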
Integration considerations: IoT data volume is large, latency requirements are often sub-second, and the data arrives in formats (time series, binary streams) that differ from the row-structured formats that ERP and MES produce. Time series databases and process historians (InfluxDB, TimescaleDB, OSIsoft PI) are commonly used to store this data before it is joined to contextual data from MES and ERP for analytical use. Context joins (matching sensor readings to production orders, products, and quality standards) are the core integration challenge.
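One common way to perform the context join is an as-of or interval join: attach to each sensor reading the production order whose execution window contains the reading's timestamp. A minimal sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical high-frequency sensor readings (asset-level, no business context).
readings = pd.DataFrame({
    "asset_id":  ["A1"] * 5,
    "timestamp": pd.to_datetime([
        "2024-03-01 08:05", "2024-03-01 08:20", "2024-03-01 09:10",
        "2024-03-01 09:40", "2024-03-01 10:30",
    ]),
    "temperature_c": [71.2, 72.8, 74.1, 73.5, 70.9],
})

# Hypothetical MES context: which production order ran on the asset, and when.
orders = pd.DataFrame({
    "asset_id":    ["A1", "A1"],
    "order_id":    ["PO-1001", "PO-1002"],
    "order_start": pd.to_datetime(["2024-03-01 08:00", "2024-03-01 09:30"]),
    "order_end":   pd.to_datetime(["2024-03-01 09:15", "2024-03-01 11:00"]),
})

# As-of join: attach the most recent order start at or before each reading,
# then keep only readings that fall inside that order's execution window.
joined = pd.merge_asof(
    readings.sort_values("timestamp"),
    orders.sort_values("order_start"),
    left_on="timestamp", right_on="order_start",
    by="asset_id", direction="backward",
)
joined = joined[joined["timestamp"] <= joined["order_end"]]
print(joined[["timestamp", "temperature_c", "order_id"]])
```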
Quality Management Systems (QMS)
Quality Management Systems capture the structured quality record: inspection results, non-conformance reports, corrective and preventive actions (CAPAs), supplier quality data, audit findings, and calibration records.
What QMS data captures well: Formal quality events with cause codes, disposition records, CAPA linkages, and regulatory documentation. QMS systems provide the quality history that compliance and continuous improvement programs depend on.
What QMS data does not capture well: High-frequency inline quality data (which comes from MES or dedicated inspection systems) and the relationship between quality events and process parameter values (which requires joining to IoT/SCADA data).
Key data entities for operational analytics:
Non-conformance records with cause codes are the primary source for Pareto analysis of defect types and root cause categories. When coded consistently, they enable analysis of which causes drive the largest share of cost of quality.
CAPA records, when linked to recurrence data, enable measurement of corrective action effectiveness, a quality metric that is highly informative but rarely tracked in practice.
Supplier quality records (incoming inspection results, supplier defect rates) feed material quality metrics and enable supplier performance scorecards.
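Building on the non-conformance records described above, here is a minimal sketch of a cost-weighted Pareto analysis by cause code. The cause codes and cost figures are hypothetical.

```python
import pandas as pd

# Hypothetical non-conformance records with coded causes and estimated cost impact.
ncrs = pd.DataFrame({
    "cause_code": ["material_defect", "setup_error", "material_defect", "operator_error",
                   "setup_error", "material_defect", "tooling_wear", "material_defect"],
    "cost":       [1200, 450, 900, 300, 500, 1500, 250, 700],
})

# Pareto view: cost by cause, largest first, with cumulative share of the total.
pareto = (
    ncrs.groupby("cause_code")["cost"].sum()
        .sort_values(ascending=False)
        .to_frame("total_cost")
)
pareto["cumulative_share"] = pareto["total_cost"].cumsum() / pareto["total_cost"].sum()
print(pareto)
```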
Workforce Management and Time Tracking Systems
Workforce management systems such as UKG (formerly Kronos), Workday, custom time-and-attendance systems, and field service management platforms capture labor availability, scheduling, attendance, and in many cases activity-level time tracking.
What workforce management data captures well: Scheduled versus actual attendance, overtime hours, labor cost by department and cost center, and in systems with activity tracking, time spent on specific tasks or work orders.
What workforce management data does not capture well: Quality of work performed, adherence to standard methods, and the relationship between labor assignments and output quality or cycle time (which requires joining to MES or production data).
Key data entities for operational analytics:
Attendance and schedule data enables the calculation of actual available labor hours, the denominator in labor productivity calculations. Operations that calculate productivity against a fixed headcount assumption rather than actual available hours produce distorted productivity metrics.
Activity logs, where available, enable labor efficiency analysis at the task level: time spent on productive work versus non-value-added activities, time in training, and time lost to system downtime or material unavailability.
For service operations, agent state data from workforce management or automatic call distribution (ACD) systems provides the equivalent of machine state data: when agents are available, on calls, in after-call work, in training, or absent. This feeds utilization and occupancy calculations directly.
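A minimal sketch of those calculations from daily agent state totals, using one common set of definitions (occupancy as handling time over logged-in time, utilization as logged-in time over paid time); definitions vary by organization, and all figures here are hypothetical.

```python
import pandas as pd

# Hypothetical daily agent state totals in minutes, e.g. from an ACD export.
agents = pd.DataFrame({
    "agent_id":   ["A01", "A02", "A03"],
    "talk":       [240, 210, 180],
    "after_call": [60, 75, 50],
    "available":  [90, 120, 160],   # logged in and waiting for work
    "paid_time":  [480, 480, 480],  # scheduled/paid minutes for the day
})

# Occupancy: share of logged-in time spent handling work (talk + after-call work).
logged_in = agents["talk"] + agents["after_call"] + agents["available"]
agents["occupancy"] = (agents["talk"] + agents["after_call"]) / logged_in

# Utilization: share of paid time spent logged in, handling or awaiting work.
agents["utilization"] = logged_in / agents["paid_time"]
print(agents[["agent_id", "occupancy", "utilization"]])
```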
Process Event Logs
Process event logs are the operational data source that receives the least attention in most frameworks but enables the most powerful class of analysis: process mining. Event logs are the digital footprints of business processes: the records in enterprise systems that capture what happened, when it happened, and who performed it.
In practice, every enterprise system generates event logs implicitly. An ERP system records when each production order was created, released, confirmed, and closed, along with timestamps and user IDs. A CRM records when leads are created, qualified, converted, and closed. A ticketing system records when cases are opened, assigned, escalated, resolved, and verified. A loan processing system records when applications move through each underwriting and approval step.
Process event logs consist of at minimum three elements: a case identifier (which process instance), an activity label (what happened), and a timestamp (when it happened). With these three elements, process mining algorithms can reconstruct the actual flow of work through a process and compare it to the intended flow, revealing rework loops, bypassed controls, excessive wait times, and bottleneck activities.
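A minimal sketch of what can be reconstructed from just those three elements, using pandas on a hypothetical event log: the variants (actual activity sequences) that cases follow, and the wait time preceding each activity.

```python
import pandas as pd

# Hypothetical event log with the three minimum elements: case, activity, timestamp.
log = pd.DataFrame({
    "case_id":  ["C1", "C1", "C1", "C2", "C2", "C2", "C2", "C3", "C3", "C3"],
    "activity": ["create", "approve", "close",
                 "create", "approve", "rework", "close",
                 "create", "approve", "close"],
    "timestamp": pd.to_datetime([
        "2024-03-01 09:00", "2024-03-01 10:00", "2024-03-01 12:00",
        "2024-03-01 09:30", "2024-03-01 11:00", "2024-03-01 13:00", "2024-03-01 16:00",
        "2024-03-02 08:00", "2024-03-02 09:30", "2024-03-02 10:00",
    ]),
})

log = log.sort_values(["case_id", "timestamp"])

# Variant = the ordered sequence of activities a case actually followed.
variants = (
    log.groupby("case_id")["activity"]
       .apply(lambda a: " -> ".join(a))
       .value_counts()
)
print(variants)  # shows which cases deviate from the intended create -> approve -> close flow

# Wait time between consecutive activities within each case (bottleneck signal).
# The first activity of each case has no predecessor, so its wait is NaN.
log["wait_hours"] = (
    log.groupby("case_id")["timestamp"].diff().dt.total_seconds() / 3600
)
print(log.groupby("activity")["wait_hours"].mean())
```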
For detailed coverage of how to use process event log data analytically, see Techniques and Models.
Inventory and Supply Chain Systems
Warehouse management systems (WMS), inventory management platforms, and transportation management systems (TMS) generate the supply chain operational data that feeds delivery performance, inventory accuracy, and material availability metrics.
Key data entities for operational analytics:
Inventory transaction records (receipts, issues, adjustments, cycle counts) provide the material for inventory accuracy measurement and shrinkage analysis.
Putaway and pick cycle times from the WMS enable warehouse labor efficiency analysis and throughput measurement.
Carrier performance data from the TMS feeds on-time delivery analysis at the shipment and carrier level, enabling carrier performance scorecards and service level compliance tracking.
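As a small illustration of the carrier-level view this data supports, the following sketch computes an on-time delivery rate per carrier from hypothetical shipment records.

```python
import pandas as pd

# Hypothetical shipment-level carrier performance extract from a TMS.
shipments = pd.DataFrame({
    "carrier":   ["FastFreight", "FastFreight", "RegionalExpress", "RegionalExpress", "FastFreight"],
    "promised":  pd.to_datetime(["2024-03-01", "2024-03-02", "2024-03-02", "2024-03-03", "2024-03-04"]),
    "delivered": pd.to_datetime(["2024-03-01", "2024-03-03", "2024-03-02", "2024-03-03", "2024-03-04"]),
})

# On-time flag per shipment, then on-time percentage by carrier.
shipments["on_time"] = shipments["delivered"] <= shipments["promised"]
scorecard = shipments.groupby("carrier")["on_time"].mean().rename("on_time_rate")
print(scorecard)
```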
Building an Integrated Operational Data Foundation
The value of operational analytics increases sharply when data from these systems is integrated rather than analyzed in silos. A plant that measures OEE in SCADA but tracks cost in ERP and quality in a QMS cannot answer the question “What is the cost of quality associated with the assets with the lowest OEE?” without manual reconciliation.
An integrated operational data architecture typically follows this pattern:
Source layer: Real-time or near-real-time extraction from each source system. ERP data via API or database replication; MES via direct database or API integration; IoT/SCADA via streaming protocols such as MQTT or OPC UA, or via historian export; QMS and workforce management via scheduled batch extraction.
Integration layer: Common identifier mapping across systems (production order IDs, asset IDs, material codes, employee IDs), timestamp normalization to a single time zone and precision standard, and semantic reconciliation of shared concepts (what counts as “downtime” in SCADA versus what counts as “downtime” in MES).
Analytical layer: A data model that organizes integrated data around the business entities that analysts query: assets, production orders, products, shifts, and quality events. Dimensional models (star schema or snowflake) work well for this purpose when queries are primarily aggregation-oriented. Wide denormalized tables work well when queries are primarily filter-and-retrieve-oriented. Platforms such as Plotono can serve as this analytical layer, connecting to operational data sources through managed pipelines and presenting the integrated data through role-appropriate dashboards without requiring a custom warehouse build.
Access layer: Dashboards and self-service analytics tools that access the analytical layer. The integration work done in lower layers ensures that when a user queries throughput by shift and asset, they get an answer that is consistent across reporting tools and consistent with the financial records in ERP. See Dashboards and Reporting for the dashboard patterns that use this integrated data.
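Tying the integration and analytical layers together, here is a minimal sketch of the pattern: MES confirmations are mapped to canonical order identifiers and normalized to UTC (integration layer), then joined to conformed asset and shift dimensions to answer a throughput query (analytical layer). All table names, column names, and values are hypothetical, and in practice this logic lives in a warehouse or analytics platform rather than in a script.

```python
import pandas as pd

# MES confirmations arrive in local plant time with MES-local order IDs.
mes = pd.DataFrame({
    "mes_order":  ["1001-A", "1002-A", "1003-A", "1004-A"],
    "asset_key":  [1, 1, 2, 2],
    "shift_key":  [10, 11, 10, 11],
    "good_units": [420, 390, 510, 470],
    "confirmed":  pd.to_datetime([
        "2024-03-01 07:10", "2024-03-01 19:20", "2024-03-01 07:40", "2024-03-01 19:50",
    ]),
})

# Integration layer: common identifier mapping and timestamp normalization to UTC.
id_map = pd.DataFrame({
    "mes_order": ["1001-A", "1002-A", "1003-A", "1004-A"],
    "erp_order": ["PO-1001", "PO-1002", "PO-1003", "PO-1004"],
})
fact_production = mes.merge(id_map, on="mes_order", how="left")
fact_production["confirmed_utc"] = (
    fact_production["confirmed"].dt.tz_localize("America/Chicago").dt.tz_convert("UTC")
)

# Analytical layer: conformed dimensions support the query "throughput by shift and asset".
dim_asset = pd.DataFrame({"asset_key": [1, 2], "asset_name": ["Line 1", "Line 2"]})
dim_shift = pd.DataFrame({"shift_key": [10, 11], "shift_name": ["Day", "Night"]})

throughput = (
    fact_production
    .merge(dim_asset, on="asset_key")
    .merge(dim_shift, on="shift_key")
    .groupby(["asset_name", "shift_name"], as_index=False)["good_units"].sum()
)
print(throughput)
```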
The investment in data integration is typically the largest portion of an operational analytics program implementation. It is also the portion most likely to be underestimated. Organizations that shortcut integration by building point-to-point connections between source systems and dashboards find that each metric exists in a subtly different version across reporting tools, creating reconciliation overhead that consumes the analytical capacity the program was meant to free up.