The quality of supply chain analytics is bounded by the quality, completeness, and timeliness of the underlying data. Supply chains are uniquely challenging data environments: data is fragmented across multiple systems that are owned by different teams, often managed by different vendors, and refreshed on very different cadences - from nightly batch ERP loads to sub-second IoT sensor streams.
Before an organization can measure inventory turnover accurately, it must have reliable, reconciled inventory data from its ERP and WMS. Before it can calculate OTIF, it must integrate order management data with carrier delivery confirmation data. Before it can build a supply chain risk index, it must have supplier financial signals, lead time history, and geographic exposure data all in the same analytical environment.
This article covers the eight primary supply chain data source categories, what each system contains, how to connect it for analytics, and what data quality issues to anticipate. For the KPIs these sources support, see Supply Chain KPIs. For the analytical techniques applied to this data, see Techniques and Models.
Enterprise Resource Planning (ERP) Systems
ERP systems are the system of record for financial transactions, purchase orders, sales orders, production orders, inventory valuation, and supplier master data. They are the most important and most complex data source in the supply chain analytics stack.
Primary systems: SAP S/4HANA, SAP ECC, Oracle E-Business Suite, Oracle Fusion Cloud, Microsoft Dynamics 365 Supply Chain, NetSuite, Infor CloudSuite, Epicor.
Key data domains within ERP:
Inventory management: Stock levels by plant, storage location, and material. Goods receipts and goods issues with timestamps. Inventory valuation (standard cost, moving average, FIFO). Batch and serial number tracking. Safety stock levels and reorder points configured in the system.
Procurement: Purchase orders by vendor, material, quantity, and requested delivery date. Goods receipts matched to purchase orders. Invoice receipts and payment status. Vendor master data including payment terms, currency, and sourcing category.
Sales and order management: Customer sales orders with requested and confirmed delivery dates. Delivery documents and goods issue records. Customer master data, pricing conditions, and credit limits.
Production: Production orders, bills of materials, routings, and work center data. Planned and actual production quantities and completion dates. Material requirements planning (MRP) run outputs and exception messages.
Connection patterns:
SAP systems can be accessed via several mechanisms depending on your infrastructure. The OData API layer (available in S/4HANA and newer ECC versions) is the preferred modern approach for real-time or near-real-time reads. For large-volume historical loads, direct database extraction via SAP’s CDS (Core Data Services) views or table-level extraction through tools like SAP Datasphere, Fivetran’s SAP connector, or Airbyte is common. Avoid direct table reads from custom ABAP programs without formal data governance approval - ERP schemas are complex and joins written outside the application layer frequently miss business rules.
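As a minimal sketch of the OData read path, the helper below builds a paged query URL with `$select`, `$top`/`$skip`, and an optional `$filter`. The service path and entity name shown are illustrative of SAP's published material-stock API style, not a guaranteed endpoint in your system; check your gateway's service catalog for the actual names.

```python
from urllib.parse import urlencode

def build_odata_query(base_url, entity, select, filter_expr=None, top=5000, skip=0):
    """Build a paged OData query URL for batch-friendly reads.
    Entity and field names passed in are assumptions, not a fixed schema."""
    params = {"$select": ",".join(select), "$top": top, "$skip": skip, "$format": "json"}
    if filter_expr:
        params["$filter"] = filter_expr
    return f"{base_url}/{entity}?{urlencode(params)}"

# Illustrative service/entity names in the style of SAP's stock APIs.
url = build_odata_query(
    "https://erp.example.com/sap/opu/odata/sap/API_MATERIAL_STOCK_SRV",
    "A_MatlStkInAcctMod",
    ["Material", "Plant", "MatlWrhsStkQtyInMatlBaseUnit"],
    filter_expr="Plant eq '1000'",
)
```

Incrementing `skip` by `top` until a page comes back short gives a simple full-extract loop; for ongoing loads, prefer a delta-enabled extractor over repeated full pulls.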
Data quality challenges:
ERP data quality problems fall into predictable categories. Master data issues - duplicate vendor records, inconsistent material numbering, missing attributes on recently created materials - are endemic and require MDM governance processes to address. Transactional timing issues arise when goods receipts are posted days after physical receipt, causing inventory positions to lag physical reality. Configuration changes in the ERP (new plant codes, reorganized storage locations, changed valuation methods) break historical time series comparisons without warning. Any analytics built on ERP data must account for these realities.
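Duplicate vendor screening is one place where a cheap automated check pays off before full MDM tooling is in place. The sketch below normalizes vendor names (lowercase, strip punctuation and common legal suffixes) and flags records whose normalized names collide; the suffix list and record layout are illustrative, and production matching would use fuzzy or probabilistic methods.

```python
import re
from collections import defaultdict

def normalize_vendor_name(name):
    """Crude normalization for duplicate-vendor screening.
    The legal-suffix list here is a starting assumption, not exhaustive."""
    n = re.sub(r"[^a-z0-9 ]", "", name.lower())
    n = re.sub(r"\b(inc|llc|ltd|gmbh|corp|co)\b", "", n)
    return " ".join(n.split())

def candidate_duplicates(vendors):
    """Group vendor records (id, name) whose normalized names collide."""
    groups = defaultdict(list)
    for vid, name in vendors:
        groups[normalize_vendor_name(name)].append(vid)
    return {k: v for k, v in groups.items() if len(v) > 1}

dupes = candidate_duplicates([
    ("V001", "Acme Industrial, Inc."),
    ("V002", "ACME INDUSTRIAL INC"),
    ("V003", "Baltic Freight GmbH"),
])
```

Collisions from a pass like this become review queues for the MDM governance process, not automatic merges.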
Warehouse Management Systems (WMS)
WMS systems manage the physical movement of goods within distribution centers and warehouses. They are the source of truth for physical inventory location, pick and pack activity, labor productivity, and order accuracy.
Primary systems: Manhattan Associates WMS, Blue Yonder Luminate Warehouse, SAP Extended Warehouse Management (EWM), Oracle WMS Cloud, HighJump (Korber), Infor WMS, 3PL Central (for third-party logistics providers).
Key data domains:
Inventory location: Bin-level inventory positions showing exactly where each unit is stored. Putaway and pick history. Cycle count results and inventory adjustment records. Expiry date and lot tracking for regulated industries.
Order fulfillment: Wave and batch picking performance. Pick confirmation rates and pick error rates. Packing and manifesting activity. Ship confirmation with carrier tracking numbers.
Labor and productivity: Task completion rates by associate, zone, and time period. Units per hour by task type (receiving, putaway, picking, packing). Exception events including system-directed task overrides.
Dock and yard management: Inbound appointment schedules and actual arrival times. Dock-to-stock cycle times for receiving. Outbound staging and loading times.
Connection patterns:
WMS systems typically expose data through scheduled database exports, REST APIs (increasingly common in cloud-native platforms), or integration middleware layers. For operational dashboards requiring near-real-time inventory visibility, direct database replication or event-driven integration via message queues (Kafka, AWS EventBridge) is preferable to scheduled batch extracts. The key latency requirement: inventory position data for stockout monitoring should be no more than 4 hours old; for high-velocity fulfillment operations, 15-30 minutes is the appropriate target.
Data quality challenges:
Cycle count frequency and coverage directly impact WMS inventory accuracy. Warehouses that cycle count less than 10% of locations per month will have meaningful location-level discrepancy rates that corrupt analytics. Adjustment transactions that lack a reason code obscure root cause analysis. WMS systems that have been heavily customized over time frequently have non-standard schemas that require significant reverse engineering to use for analytics.
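The coverage threshold is trivial to monitor once count records land in the warehouse. A minimal sketch, assuming count records carry a bin location ID:

```python
def cycle_count_coverage(counted_locations, total_locations):
    """Share of bin locations counted in the period. Distinct locations
    only - recounting the same bin does not raise coverage."""
    return len(set(counted_locations)) / total_locations

# Illustrative: 3 distinct bins counted out of 50 in the month.
coverage = cycle_count_coverage(["A-01", "A-02", "A-01", "B-07"], 50)
```

Tracked monthly per facility, this turns the 10% rule of thumb into an auditable data-quality KPI.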
Transportation Management Systems (TMS)
TMS systems manage freight procurement, carrier selection, route planning, load tendering, and shipment tracking. They are the primary source of data for logistics cost analytics, carrier performance measurement, and transit time analysis.
Primary systems: Oracle Transportation Management, SAP Transportation Management, Blue Yonder Transportation Management, MercuryGate, Uber Freight (formerly Transplace), Descartes Systems. Visibility platforms such as project44 complement the TMS rather than replace it.
Key data domains:
Shipment execution: Load tender details including carrier, mode, lane, and rate. Tendering acceptance rates by carrier. Pickup and delivery confirmation with timestamps. Accessorial charges (fuel surcharge, detention, liftgate).
Carrier performance: On-time pickup and delivery rates by carrier and lane. Claims frequency and resolution rates. Carrier capacity availability by lane and time period.
Freight cost: Rated cost per shipment by carrier, mode, lane, and shipment weight/volume. Accessorial cost breakdown. Benchmark rate comparison where TMS rate shopping data is captured.
Route optimization: Planned versus actual routes. Multi-stop routing efficiency metrics. Driver compliance with planned routes.
Connection patterns:
Modern TMS platforms provide REST APIs and webhook-based event notifications for shipment status events. For carriers without direct TMS integration, EDI-based tracking updates (EDI 214 transaction sets for motor carrier shipment status) remain the standard. Third-party visibility platforms (project44, FourKites, Descartes MacroPoint) aggregate carrier tracking data across hundreds of carriers into a single API, making them a practical shortcut for organizations with diverse carrier bases.
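Once shipment status events land in the warehouse, carrier scorecard metrics reduce to simple aggregations. A sketch of on-time delivery rate by carrier and lane; the field names (`carrier`, `lane`, `promised`, `delivered`) are an assumed layout, not a specific TMS schema.

```python
from collections import defaultdict
from datetime import date

def on_time_rate_by_carrier_lane(shipments):
    """On-time delivery rate per (carrier, lane): delivered on or before
    the promised date counts as on time."""
    tally = defaultdict(lambda: [0, 0])  # key -> [on_time, total]
    for s in shipments:
        key = (s["carrier"], s["lane"])
        tally[key][1] += 1
        if s["delivered"] <= s["promised"]:
            tally[key][0] += 1
    return {k: on_time / total for k, (on_time, total) in tally.items()}

rates = on_time_rate_by_carrier_lane([
    {"carrier": "CARR1", "lane": "ORD-DFW",
     "promised": date(2024, 1, 5), "delivered": date(2024, 1, 5)},
    {"carrier": "CARR1", "lane": "ORD-DFW",
     "promised": date(2024, 1, 6), "delivered": date(2024, 1, 8)},
])
```

The same aggregation keyed on pickup timestamps yields on-time pickup rates for the other half of the scorecard.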
Demand Planning and Forecasting Platforms
Demand planning systems generate the statistical forecasts and collaborative demand signals that drive replenishment, production scheduling, and capacity planning. Their output data is critical input to inventory optimization analytics.
Primary systems: Blue Yonder (formerly JDA) Luminate Planning, Kinaxis RapidResponse, o9 Solutions, SAP Integrated Business Planning (IBP), Oracle Demand Management Cloud, Logility, E2open.
Key data domains:
Statistical forecasts: Baseline statistical forecasts by SKU and location, generated by algorithm (ARIMA, exponential smoothing, machine learning models). Forecast accuracy history (MAPE, WMAPE, bias) by SKU and time horizon.
Collaborative overrides: Commercial overrides applied by sales or marketing teams to account for promotions, new product launches, or anticipated demand changes. Override history and approval workflows.
Consensus demand plan: The agreed-upon demand plan after statistical and commercial inputs are reconciled. Version history of plan changes. Approved plan versus unconstrained forecast.
Exception management: SKUs flagged for review due to high forecast error, unusual demand patterns, or missing data. MRP exception messages and planner actions.
Connection patterns:
Demand planning platforms typically export to a data warehouse through scheduled batch interfaces or dedicated analytics connectors. The critical field to export alongside the forecast quantity is the forecast creation date: without it, you cannot analyze accuracy at different time horizons or answer the question “how accurate was our 12-week forecast versus our 4-week forecast?”
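With the creation date captured, horizon-level accuracy is a straightforward grouping: the horizon is the gap between when the forecast was made and the period it targeted. A sketch using weekly buckets and MAPE; the row layout is illustrative.

```python
from collections import defaultdict

def mape_by_horizon(rows):
    """MAPE grouped by forecast horizon. Each row is
    (created_week, target_week, forecast_qty, actual_qty); the horizon
    target_week - created_week only exists if the creation date was kept."""
    errs = defaultdict(list)
    for created, target, fcst, actual in rows:
        if actual:  # skip zero-actual periods to avoid division by zero
            errs[target - created].append(abs(fcst - actual) / actual)
    return {h: sum(v) / len(v) for h, v in sorted(errs.items())}

result = mape_by_horizon([
    (1, 5, 90, 100),    # 4-week-ahead forecast, 10% error
    (1, 13, 130, 100),  # 12-week-ahead forecast, 30% error
])
```

Plotting these horizon buckets against each other is exactly the 4-week-versus-12-week comparison described above.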
Electronic Data Interchange (EDI) and Supplier Portals
EDI and supplier portals are the primary mechanisms for exchanging structured operational data with suppliers. They provide the ground truth for supplier performance analytics.
Key EDI transaction sets:
| Transaction Set | Description | Analytics Use |
|---|---|---|
| 850 | Purchase Order | Order placement timing |
| 855 | PO Acknowledgment | Supplier confirmation and date acceptance |
| 856 | Advance Ship Notice (ASN) | Advance visibility to inbound shipments |
| 997 | Functional Acknowledgment | EDI connectivity health |
| 810 | Invoice | AP matching and payment timing |
Supplier portal data: Supplier portals (Ariba, Coupa, SAP Business Network, Oracle Supplier Portal) capture broader supplier relationship data: capacity declarations, lead time updates, quality certificates, financial documents, and compliance certifications.
Analytics value: The gap between Purchase Order date and ASN date reveals supplier production lead time. The gap between ASN promised delivery date and actual goods receipt reveals transportation performance. Suppliers who consistently fail to send ASNs before shipment arrival cannot provide visibility for receiving planning - an operational cost that should be quantified and included in supplier scorecards.
Connection challenges: EDI data arrives in structured but non-relational formats through VAN (value-added network) providers like SPS Commerce, TrueCommerce, or direct AS2 connections. Parsing, normalizing, and linking EDI transactions to ERP purchase orders requires an integration layer. Many organizations outsource this complexity to EDI managed service providers, but the parsed data must flow into the analytics environment to be useful.
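As a minimal sketch of that parsing-and-linking step, the code below splits a raw X12 interchange into segments and pulls PO references from PRF segments of an 856 so the ASN can be joined to ERP purchase orders. It assumes default `~`/`*` separators and ignores the HL hierarchy; real parsers read separators from the ISA header and handle nesting.

```python
def parse_x12_segments(raw, seg_term="~", elem_sep="*"):
    """Split a raw X12 payload into segments and elements.
    Separators vary by trading partner - these defaults are an assumption."""
    return [seg.split(elem_sep) for seg in raw.strip().split(seg_term) if seg]

def po_numbers_from_856(raw):
    """Extract PO references from PRF segments of an 856 ASN
    (simplified: flat scan, no HL hierarchy handling)."""
    return [seg[1] for seg in parse_x12_segments(raw) if seg[0] == "PRF"]

# Illustrative fragment of an 856, not a complete valid interchange.
asn = "ST*856*0001~BSN*00*ASN123*20240105*1200~HL*1**S~PRF*4500012345~SE*5*0001~"
pos = po_numbers_from_856(asn)
```

The extracted PO numbers become the join key back to ERP goods receipts, which is what turns raw EDI traffic into supplier lead time and ASN compliance metrics.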
IoT Sensors and GPS Fleet Data
Connected hardware - from GPS trackers on trucks to environmental sensors in warehouses and cold chain monitoring devices on shipments - generates continuous streams of operational data that enable real-time visibility and proactive exception management.
Key data streams:
GPS fleet tracking: Vehicle location at defined intervals (typically every 30-60 seconds for in-transit trucks). Speed, idle time, and geofence events. Driver hours of service compliance data.
Cold chain monitoring: Temperature and humidity readings at defined intervals for temperature-sensitive cargo. Threshold breach events with timestamp and location. Chain of custody documentation.
Warehouse sensors: Dock door activity sensors that detect when doors open and close. Environmental monitoring (temperature, humidity) in cold storage areas. Forklift telematics for utilization and impact event tracking.
RFID and barcode events: Read events from fixed RFID portals at dock doors, conveyor systems, and staging areas. These create timestamped location history for inventory items without manual scanning.
Connection patterns:
IoT data is almost exclusively stream-based, requiring a streaming data pipeline rather than batch extraction. Common architectures route IoT messages through an MQTT broker or directly to a cloud event hub (AWS IoT Core, Azure IoT Hub, Google Cloud Pub/Sub), then into a stream processing layer (Apache Kafka, Apache Flink) for enrichment and routing, with materialized views or aggregated tables written to a data warehouse for analytics queries. For real-time control tower applications, a separate hot path delivers alerts directly to operational users without waiting for the warehouse load cycle.
Data volume consideration: A fleet of 500 trucks generating GPS pings every 60 seconds produces 720,000 position records per day. A cold chain monitoring network covering 1,000 shipments with 5-minute readings generates 288,000 temperature records per day. Analytics schemas must be designed for this volume, using partitioning and appropriate retention policies rather than treating IoT data like ERP transactional records.
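One common volume-control tactic is downsampling in the warm path while raw pings stay in cheap object storage: keep the last position per vehicle per time bucket. A sketch with 5-minute buckets; the `(vehicle_id, epoch_seconds, lat, lon)` layout is illustrative.

```python
def downsample_pings(pings, bucket_seconds=300):
    """Keep the latest GPS ping per (vehicle, time bucket) to cut
    warehouse row counts roughly by the bucket-to-ping-interval ratio."""
    latest = {}
    for vid, ts, lat, lon in pings:
        key = (vid, ts // bucket_seconds)
        if key not in latest or ts > latest[key][1]:
            latest[key] = (vid, ts, lat, lon)
    return sorted(latest.values())

sample = downsample_pings([
    ("T1", 100, 41.88, -87.63),
    ("T1", 160, 41.89, -87.64),  # same 5-minute bucket, supersedes the first
    ("T1", 400, 41.90, -87.65),  # next bucket
])
```

At 60-second pings and 5-minute buckets this cuts the warehoused fleet table by roughly 5x without losing the position history that lane and dwell analytics need.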
Order Management Systems (OMS)
Order management systems are the hub of customer-facing order processing, managing order capture, order routing, available-to-promise (ATP) logic, and post-order customer service.
Primary systems: IBM Sterling Order Management, Blue Yonder Order Management, Salesforce Order Management, SAP Order Management, custom-built OMS platforms common in large retailers.
Key data domains:
Order lifecycle: Full history of order status transitions from capture through fulfillment. Source channel (web, EDI, call center). Order lines including SKU, quantity, requested date, and ship-to location.
Fulfillment routing: Decisions on which warehouse or store fulfills each order line, with the routing logic and reason codes.
Available-to-promise: ATP check results showing inventory availability at time of order capture. Inventory reservations and reservation release events.
Post-order events: Customer cancellations and modification requests. Split shipment events when orders are fulfilled from multiple locations. Customer service contacts linked to order IDs.
Analytics value: OMS data enables analysis that neither ERP nor WMS can provide alone: the customer experience journey from order placement to delivery confirmation. Linking OMS data to WMS data enables measurement of order cycle time (order capture to ship confirmation). Linking OMS to carrier tracking enables end-to-end order-to-delivery cycle time, the metric customers care about most.
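The linking described above can be sketched as a join on sales order number across the three systems. Dict layouts (`{sales_order: datetime}`) and field names are illustrative assumptions, not any vendor's schema.

```python
from datetime import datetime

def order_cycle_metrics(oms_capture, wms_ship, carrier_delivery):
    """Join OMS order capture, WMS ship confirmation, and carrier delivery
    confirmation on sales order number; return both cycle times in hours."""
    metrics = {}
    for so, captured in oms_capture.items():
        shipped = wms_ship.get(so)
        delivered = carrier_delivery.get(so)
        metrics[so] = {
            "order_cycle_h": (shipped - captured).total_seconds() / 3600
            if shipped else None,
            "order_to_delivery_h": (delivered - captured).total_seconds() / 3600
            if delivered else None,
        }
    return metrics

m = order_cycle_metrics(
    {"SO1": datetime(2024, 1, 1, 8)},
    {"SO1": datetime(2024, 1, 1, 20)},
    {"SO1": datetime(2024, 1, 3, 8)},
)
```

Orders missing a WMS or carrier match surface as `None` values, which is itself a useful data-quality signal about the cross-system key mapping.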
Third-Party Logistics (3PL) Fulfillment Data
Organizations using 3PL providers for warehousing and fulfillment must extract operational data from 3PL systems, which frequently creates the most difficult integration challenge in the supply chain data architecture.
Data categories from 3PLs:
Inventory reporting: Inventory on-hand reports by SKU and location, typically provided daily (sometimes real-time via API for modern 3PLs). Inventory aging reports. Physical inventory count results.
Order fulfillment: Order pick, pack, and ship activity. Shipment details including carrier and tracking numbers. Freight cost billing details.
Receiving: Inbound receipt confirmations with quantities received and discrepancy notes. Receiving productivity and backlog metrics.
Integration challenges: 3PL data integration is hard for two reasons. First, 3PLs run diverse WMS platforms, and their willingness to provide API access varies widely - some offer modern REST APIs; others provide only weekly Excel reports via email. Second, 3PL data is governed by contractual terms, and requesting data access beyond what is contractually specified may require contract amendments. Organizations should negotiate data access requirements into 3PL contracts at the time of engagement, not after.
Best practice: Establish data exchange standards with 3PL partners in the master service agreement, specifying format, frequency, latency requirements, and API access obligations. Build data quality monitoring that flags when 3PL data feeds are delayed or contain anomalous values - a 3PL inventory report that stops arriving is often the first indication of an operational problem at that facility.
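That monitoring can start very simply: flag a daily feed that arrives late or whose total on-hand units deviate sharply from recent history. The 26-hour grace period and 3-sigma threshold below are illustrative policy choices, not standards.

```python
from statistics import mean, stdev

def feed_anomalies(daily_totals, latest_total, arrival_gap_hours, max_gap_hours=26):
    """Return flags for a daily 3PL inventory feed: late arrival, and/or a
    latest total far outside recent history (needs >= 5 days of history)."""
    flags = []
    if arrival_gap_hours > max_gap_hours:
        flags.append("feed_delayed")
    if len(daily_totals) >= 5:
        mu, sigma = mean(daily_totals), stdev(daily_totals)
        if sigma and abs(latest_total - mu) > 3 * sigma:
            flags.append("anomalous_total")
    return flags

# Illustrative: feed arrived 30 h after the last one, with a collapsed total.
flags = feed_anomalies([10000, 10200, 9900, 10100, 10050], 2500,
                       arrival_gap_hours=30)
```

Either flag should page the team that owns the 3PL relationship, since - as noted above - a broken feed is often the first sign of a facility-level problem.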
Building a Unified Supply Chain Data Model
The analytical value of any single source is limited. The transformative capability comes from integrating these sources into a unified data model where inventory positions, orders, shipments, supplier performance, and cost data are linked by common keys: SKU, location, supplier, order number, and time period.
Common integration keys:
| Key Type | Links These Sources |
|---|---|
| SKU / Material Number | ERP, WMS, OMS, Demand Planning, 3PL |
| Location / Plant Code | ERP, WMS, TMS, IoT |
| Purchase Order Number | ERP, EDI/Supplier Portal, TMS |
| Sales Order Number | ERP, OMS, WMS, TMS, Carrier |
| Supplier ID | ERP, EDI, Supplier Portal, Risk Data |
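In practice these keys rarely match verbatim across systems, so a conforming step precedes the join. A sketch joining ERP unit cost onto WMS bin positions on a conformed (SKU, location) key; the zero-padding convention and tuple layouts are illustrative assumptions.

```python
def conform_sku(raw):
    """Normalize SKU/material formats before joining - e.g. an ERP
    material '000000000012345' vs. a WMS SKU '12345'. The convention
    here is an assumption; confirm against your own master data."""
    return raw.lstrip("0").upper()

def join_inventory(erp_valuation, wms_positions):
    """Join ERP unit cost onto WMS bin positions on (SKU, location).
    erp_valuation: (material, location, unit_cost) tuples;
    wms_positions: (sku, location, bin, qty) tuples."""
    cost = {(conform_sku(sku), loc): unit_cost
            for sku, loc, unit_cost in erp_valuation}
    return [
        (conform_sku(sku), loc, bin_, qty, cost.get((conform_sku(sku), loc)))
        for sku, loc, bin_, qty in wms_positions
    ]

joined = join_inventory(
    [("000000000012345", "DC01", 4.50)],
    [("12345", "DC01", "A-01-03", 240)],
)
```

Rows where the cost lookup returns `None` are unmatched keys - worth counting as a standing metric, since key-mapping drift is how unified models silently decay.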
Data latency tiers:
Design the data architecture around three latency tiers with different technical implementations. The real-time tier (sub-minute to 15 minutes) serves operational alerting: stockout signals, shipment exception alerts, cold chain breach notifications. The near-real-time tier (1-4 hours) serves operational dashboards: current inventory positions, open order status, today’s fulfillment metrics. The batch tier (daily to weekly) serves analytical reporting: KPI trends, supplier scorecards, cost analysis, forecasting model training.
Attempting to run all analytics from a single real-time stream is unnecessarily complex and expensive. Attempting to run all analytics from a nightly batch is too slow for operational decision-making. The tiered architecture matches data freshness to decision urgency. Platforms like Plotono can operationalize this tiered model by managing data pipelines across latency tiers while surfacing unified views to each audience.
For guidance on how to present this integrated data to different audiences, see Dashboards and Reporting. For the analytical methods that transform this data into decisions, see Techniques and Models.