IT analytics is unusually data-rich compared to most business domains. A mid-sized organization running a few hundred servers generates millions of metric data points per hour from infrastructure monitoring alone, before accounting for logs, traces, ITSM records, and security events. The challenge is not finding data - it is building a coherent integration architecture that connects the right source to the right analytical purpose without creating a brittle, unmaintainable spaghetti of point-to-point connections.
This article catalogs the primary data source categories for IT analytics, the specific platforms in each category, what data each produces, and how to integrate them into an analytics architecture. The focus is practical: what fields matter, what are the integration options, and what are the common pitfalls that cause data quality problems downstream.
IT Service Management (ITSM) Platforms
ITSM platforms are the system of record for the human side of IT operations: incidents, service requests, changes, problems, and knowledge articles. They produce the structured, business-contextualized data that feeds service delivery KPIs and SLA reporting.
ServiceNow
ServiceNow is the dominant enterprise ITSM platform. Its data model is rich and well-documented, and it offers several integration pathways.
Key tables and fields:
- `incident` - the core incident table. Key fields: `sys_id` (UUID), `number` (INC0001234), `priority` (1-4), `state` (1=New, 2=In Progress, 6=Resolved, 7=Closed), `opened_at`, `resolved_at`, `closed_at`, `assigned_to`, `assignment_group`, `category`, `subcategory`, `business_impact`, `short_description`, `resolution_code`, `work_notes`.
- `change_request` - change records. Key fields: `type` (standard/normal/emergency), `state`, `start_date`, `end_date`, `risk`, `conflict_status`, `cab_required`, `close_code`.
- `task_sla` - SLA tracking table. Contains `task` (FK to incident/change), `sla` (FK to SLA definition), `has_breached`, `breach_time`, `business_time_left`.
- `cmdb_ci` - Configuration Item records for the CMDB. Links incidents to affected infrastructure.
Integration options:
- Table API (REST): ServiceNow exposes every table via a REST endpoint at `/api/now/table/{table_name}`. Suitable for incremental extraction using `sysparm_query=sys_updated_on>javascript:gs.dateGenerate('YYYY-MM-DD','00:00:00')`. Paginate with `sysparm_offset` and `sysparm_limit`.
- Export Sets: Scheduled exports to a file server or cloud storage bucket. Less flexible but lower API quota impact.
- MID Server + JDBC: For high-volume extracts, a MID Server can push data directly to a data warehouse using JDBC. Suitable for full historical loads.
- IntegrationHub: Native ServiceNow feature for event-driven integrations, useful for real-time incident feeds.
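To make the Table API pattern concrete, here is a minimal Python sketch of an incremental incident pull with offset pagination. The instance name, token handling, and field list are illustrative, and the watermark (`since`) would come from your pipeline's state store.

```python
import json
import urllib.parse
import urllib.request

def build_query(since: str, offset: int, limit: int = 1000) -> str:
    """Encode Table API params for an incremental pull ordered by update time."""
    return urllib.parse.urlencode({
        "sysparm_query": f"sys_updated_on>{since}^ORDERBYsys_updated_on",
        "sysparm_fields": "sys_id,number,priority,state,opened_at,resolved_at,assignment_group",
        "sysparm_limit": limit,
        "sysparm_offset": offset,
    })

def fetch_incidents_since(instance: str, token: str, since: str, limit: int = 1000):
    """Page through /api/now/table/incident until a short page signals the end."""
    rows, offset = [], 0
    while True:
        url = (f"https://{instance}.service-now.com/api/now/table/incident?"
               + build_query(since, offset, limit))
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            batch = json.load(resp)["result"]
        rows.extend(batch)
        if len(batch) < limit:  # short page => no more records
            return rows
        offset += limit
```

Ordering by `sys_updated_on` is what makes the watermark safe: the next run resumes from the maximum update timestamp seen in the previous run.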
Common data quality issues: Priority fields are often manually set and inconsistently applied across teams. Assignment group names change over time and require a dimension table for consistent grouping. Resolved and closed timestamps are frequently confused - use resolved_at for MTTR calculation, not closed_at.
Jira Service Management
Jira Service Management (formerly Jira Service Desk) is common in organizations that use Atlassian tooling. Its data model maps differently to ITSM concepts, requiring translation.
Key objects:
- Issues - the core record type. Relevant issue types: Incident, Service Request, Change, Problem. Key fields: `id`, `key` (e.g., HELP-1234), `status`, `priority`, `created`, `updated`, `resolutiondate`, `assignee`, `reporter`, `components`, `labels`, `customfield_*` (SLA fields are stored as custom fields).
- SLA fields - stored as custom fields with structured JSON values including `completedCycles` (array of breached/completed SLA windows) and `ongoingCycle` (current SLA status). Parsing requires JSON extraction logic in your ETL.
- Worklogs - time-tracking entries linked to issues. Useful for calculating labor cost per incident.
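The SLA extraction logic can be sketched as a small flattening function. The custom field ID (`customfield_10030` in the comment) is hypothetical and varies per instance; the field names inside the JSON follow the structure described above.

```python
def parse_sla_field(sla_value: dict) -> dict:
    """Flatten a Jira SLA custom field value into analytics-ready columns.
    `sla_value` is the JSON object Jira stores under the SLA custom field
    (e.g., customfield_10030 - the ID is instance-specific)."""
    completed = sla_value.get("completedCycles", [])
    ongoing = sla_value.get("ongoingCycle") or {}
    return {
        "completed_cycles": len(completed),
        "breached_cycles": sum(1 for c in completed if c.get("breached")),
        "ongoing_breached": bool(ongoing.get("breached", False)),
        "ongoing_remaining_ms": (ongoing.get("remainingTime") or {}).get("millis"),
    }
```

Flattening at extraction time means downstream SQL never has to parse JSON, which keeps SLA breach queries simple and fast.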
Integration options:
- REST API v3: `GET /rest/api/3/search` with JQL filters. JQL supports date range filtering: `project = HELP AND created >= "2024-01-01" AND created <= "2024-01-31"`. Paginate with `startAt` and `maxResults`.
- Jira Automation webhooks: Trigger outbound webhooks on issue events (created, updated, resolved). Suitable for near-real-time feeds.
- Atlassian Analytics: Native analytics product, suitable if you want managed reporting without building an ETL.
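A sketch of the JQL extraction path, combining month-sized query windows (a common way to chunk historical backfills) with `startAt`/`maxResults` pagination. The base URL, auth header, and field list are placeholders.

```python
import calendar
import json
import urllib.parse
import urllib.request

def monthly_jql(project: str, year: int, month: int) -> str:
    """Build a JQL window covering one calendar month, for chunked extraction."""
    last_day = calendar.monthrange(year, month)[1]
    return (f'project = {project} AND created >= "{year:04d}-{month:02d}-01" '
            f'AND created <= "{year:04d}-{month:02d}-{last_day:02d}"')

def search_issues(base_url: str, auth_header: str, jql: str, page_size: int = 100):
    """Page through /rest/api/3/search until the reported total is reached."""
    start_at, issues = 0, []
    while True:
        params = urllib.parse.urlencode(
            {"jql": jql, "startAt": start_at, "maxResults": page_size,
             "fields": "status,priority,created,resolutiondate,assignee"})
        req = urllib.request.Request(f"{base_url}/rest/api/3/search?{params}",
                                     headers={"Authorization": auth_header})
        with urllib.request.urlopen(req, timeout=30) as resp:
            data = json.load(resp)
        issues.extend(data["issues"])
        start_at += page_size
        if start_at >= data["total"]:  # total is reported on every page
            return issues
```

Windowing by month keeps each JQL result set bounded, which matters because deep pagination over very large result sets is slow and rate-limit-hungry.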
Infrastructure Monitoring Platforms
Infrastructure monitoring platforms collect continuous telemetry from hosts, containers, services, and networks. They are the primary source for availability, performance, and capacity KPIs.
Datadog
Datadog is a cloud-native monitoring platform with broad integration coverage. It ingests metrics, logs, and traces under a unified data model.
Key data types:
- Metrics - time series data points. Collected at configurable intervals (typically 10-60 seconds). Accessible via the Metrics Query API (`/api/v1/query`) and the Metrics Stream API for high-volume export. Key metric namespaces: `system.cpu.user`, `system.mem.used`, `system.disk.used`, `system.net.bytes_sent`, `kubernetes.*`, `aws.*`.
- Events - discrete events with a timestamp, title, text, and tags. Host restarts, deployment markers, and alert state changes appear here. Accessible via the Events API.
- Monitors - alert definitions. Monitor history shows when alerts fired and resolved, making it a synthetic source for availability calculations.
- Incidents - Datadog Incident Management records. Structured incident data including timeline, responders, and impact.
Integration options:
- Metrics API: Query time series data directly. Suitable for pulling specific metrics into a data warehouse on a scheduled basis. Rate limits apply; use metric rollups (hourly, daily) for historical extraction rather than raw resolution.
- Log Archives: Configure Datadog to archive logs to S3 or GCS in NDJSON format. Pull from the archive for log-based analytics without API rate limits.
- Datadog Forwarder (Lambda): For AWS integrations, the Datadog Forwarder Lambda can forward metrics and logs to external destinations.
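As a sketch of the rollup-based extraction approach, the helper below builds a Datadog query string with an hourly rollup and fetches one time window from `/api/v1/query`. API and application keys are placeholders, and the rollup interval is an assumption you would tune to your retention needs.

```python
import json
import urllib.parse
import urllib.request

def rollup_query(metric: str, scope: str = "*", interval_s: int = 3600) -> str:
    """Build a Datadog query string using hourly rollups to stay under rate limits."""
    return f"avg:{metric}{{{scope}}}.rollup(avg, {interval_s})"

def fetch_timeseries(api_key: str, app_key: str, query: str, start: int, end: int):
    """Call /api/v1/query for one unix-seconds window; returns the raw series list."""
    params = urllib.parse.urlencode({"query": query, "from": start, "to": end})
    req = urllib.request.Request(
        f"https://api.datadoghq.com/api/v1/query?{params}",
        headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp).get("series", [])
```

Rolling up server-side rather than pulling raw-resolution points cuts the payload size by orders of magnitude, which is what makes scheduled warehouse loads feasible under API quotas.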
New Relic
New Relic provides APM, infrastructure monitoring, and distributed tracing under a unified telemetry database (NRDB).
Key data types and NRQL queries:
- `SystemSample` - host metrics (CPU, memory, disk, network). Query: `FROM SystemSample SELECT average(cpuPercent), average(memoryUsedPercent) FACET hostname SINCE 7 days ago`.
- `ProcessSample` - per-process resource consumption.
- `Transaction` - APM transaction data including duration, error rate, and throughput.
- `Metric` - dimensional metrics from OpenTelemetry and custom instrumentation.
Integration options:
- NerdGraph API (GraphQL): New Relic’s primary query API. Execute NRQL queries programmatically and retrieve results. Supports cursor-based pagination for large result sets.
- Streaming Data Export: New Relic offers streaming export to AWS Kinesis Data Firehose or Azure Event Hubs. Suitable for high-volume, low-latency integration.
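A minimal sketch of running NRQL through NerdGraph: the GraphQL envelope wraps the NRQL string, and the result rows come back under `data.actor.account.nrql.results`. Account ID and API key are placeholders.

```python
import json
import urllib.request

NERDGRAPH_URL = "https://api.newrelic.com/graphql"

def nrql_payload(account_id: int, nrql: str) -> dict:
    """Wrap an NRQL string in the NerdGraph GraphQL envelope."""
    gql = ("{ actor { account(id: %d) { nrql(query: %s) { results } } } }"
           % (account_id, json.dumps(nrql)))
    return {"query": gql}

def run_nrql(api_key: str, account_id: int, nrql: str):
    """POST an NRQL query to NerdGraph and return the result rows."""
    body = json.dumps(nrql_payload(account_id, nrql)).encode()
    req = urllib.request.Request(
        NERDGRAPH_URL, data=body,
        headers={"API-Key": api_key, "Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        data = json.load(resp)
    return data["data"]["actor"]["account"]["nrql"]["results"]
```

Using `json.dumps` to embed the NRQL string handles quoting safely, since NRQL itself contains double-quoted literals.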
Prometheus and Grafana
Prometheus is the de facto standard for cloud-native infrastructure monitoring, particularly in Kubernetes environments.
Data model: Prometheus stores time series data as streams of timestamped float64 samples identified by a metric name and a set of labels. The query language PromQL is powerful for point-in-time and range vector calculations.
Integration for analytics: Prometheus is optimized for real-time querying, not historical analytics at scale. For long-term analytics:
- Thanos or Cortex - distributed Prometheus with object storage backend. Thanos Store Gateway enables querying data stored in S3/GCS via the Prometheus-compatible API.
- Remote Write to TimescaleDB or ClickHouse - configure Prometheus remote_write to push metrics to a columnar store optimized for time series analytics.
- Grafana Mimir - horizontally scalable long-term storage for Prometheus metrics with cost-efficient object-storage backends.
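The remote-write path can be sketched as a `prometheus.yml` fragment. The receiver URL is a placeholder for your TimescaleDB/ClickHouse remote-write adapter endpoint, and the queue and relabel settings shown are illustrative tuning points, not recommended values.

```yaml
# prometheus.yml - forward samples to a long-term analytics store.
remote_write:
  - url: "http://remote-store.example.internal:9201/write"   # placeholder endpoint
    queue_config:
      max_samples_per_send: 5000   # batch size per remote request
      capacity: 20000              # per-shard buffer before backpressure
    write_relabel_configs:
      - source_labels: [__name__]
        regex: "go_.*"             # drop noisy runtime metrics before shipping
        action: drop
```

The `write_relabel_configs` stage is worth noting: dropping high-cardinality or low-value series before they leave Prometheus is usually the cheapest place to control long-term storage cost.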
Security Information and Event Management (SIEM)
SIEM platforms aggregate security event data from across the environment and provide correlation, alerting, and audit capabilities.
Splunk
Splunk is the dominant enterprise SIEM. It ingests unstructured and structured data from virtually any source and provides SPL (Search Processing Language) for analysis.
Key data sources within Splunk:
- Security events - Windows Security Event Log (Event IDs 4624, 4625, 4648, 4768, etc.), Linux syslog, firewall logs, IDS/IPS events.
- Endpoint telemetry - Splunk UBA (User Behavior Analytics) and Splunk SOAR feed anomaly detections.
- Network flows - NetFlow, sFlow, and IPFIX data for traffic analysis.
Integration for analytics: Splunk supports data export via the REST API (/services/search/jobs and /services/search/jobs/export), the Splunk SDKs, and scheduled searches that persist results with the outputlookup command. For large-scale export, configure forwarders to route a copy of the data to external systems (note that HEC, the HTTP Event Collector, is an ingestion interface, not an export path). Splunk also supports ODBC/JDBC connections for direct query from BI tools.
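A sketch of the export endpoint approach, which streams results back in a single call and avoids job polling. The SPL string, time bounds, and auth header are placeholders.

```python
import json
import urllib.parse
import urllib.request

def export_search(spl: str, earliest: str, latest: str) -> bytes:
    """Encode the body for POST /services/search/jobs/export."""
    return urllib.parse.urlencode({
        "search": spl if spl.lstrip().startswith("search") else f"search {spl}",
        "earliest_time": earliest,
        "latest_time": latest,
        "output_mode": "json",
    }).encode()

def run_export(base_url: str, auth_header: str, spl: str, earliest: str, latest: str):
    """Stream results from Splunk's export endpoint as newline-delimited JSON."""
    req = urllib.request.Request(
        f"{base_url}/services/search/jobs/export",
        data=export_search(spl, earliest, latest),
        headers={"Authorization": auth_header})
    with urllib.request.urlopen(req, timeout=300) as resp:
        return [json.loads(line) for line in resp if line.strip()]
```

Note the `search ` prefix handling: the REST endpoints require it on plain searches, and omitting it is a common first-run failure.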
Microsoft Sentinel
Microsoft Sentinel (formerly Azure Sentinel) is a cloud-native SIEM/SOAR built on Azure Log Analytics.
Key tables:
- `SecurityEvent` - Windows security events from Azure Monitor Agent.
- `Syslog` - Linux system logs.
- `SecurityAlert` - alerts from connected security products.
- `SecurityIncident` - incident records with status, severity, and responder assignment.
- `CommonSecurityLog` - CEF-format events from firewalls and IDS.
Integration options:
- Log Analytics API: Query tables using KQL (Kusto Query Language) via the Log Analytics REST API. Results return as JSON.
- Azure Monitor Data Export: Configure workspace data export rules to stream table data to Azure Storage Account or Azure Event Hub in real time.
- Azure Data Factory: Native connectors for moving Sentinel data to Azure Synapse or Azure Data Lake Storage.
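A sketch of the Log Analytics API path: build a KQL query (the daily alert rollup below is illustrative), POST it to the workspace query endpoint with a bearer token, and unpack the tabular JSON response.

```python
import json
import urllib.request

def daily_alert_counts_kql(days: int = 7) -> str:
    """KQL for daily SecurityAlert counts by severity over the last `days` days."""
    return (f"SecurityAlert | where TimeGenerated > ago({days}d) "
            "| summarize count() by AlertSeverity, bin(TimeGenerated, 1d)")

def query_workspace(workspace_id: str, bearer_token: str, kql: str,
                    timespan: str = "P7D"):
    """Run a KQL query via the Log Analytics API; returns (columns, rows)."""
    body = json.dumps({"query": kql, "timespan": timespan}).encode()
    req = urllib.request.Request(
        f"https://api.loganalytics.io/v1/workspaces/{workspace_id}/query",
        data=body,
        headers={"Authorization": f"Bearer {bearer_token}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=60) as resp:
        table = json.load(resp)["tables"][0]
    cols = [c["name"] for c in table["columns"]]
    return cols, table["rows"]
```

The `timespan` parameter (ISO 8601 duration) bounds the query server-side, which is cheaper than filtering only inside the KQL.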
Application Performance Monitoring (APM)
APM tools capture service-level performance data with distributed tracing context. They are the primary source for application performance KPIs.
Key platforms: Datadog APM, New Relic APM, Dynatrace, AppDynamics, Elastic APM, Jaeger (open source).
Key data types common across platforms:
- Traces - end-to-end request flows across services. Each trace consists of spans representing individual service calls. Key attributes: trace ID, span ID, parent span ID, service name, operation name, duration, status (success/error).
- Service metrics - derived from traces: request rate (rpm/rps), error rate (%), latency (P50, P95, P99).
- Apdex scores - aggregated user satisfaction scores per service.
- Deployment markers - annotations on performance timelines marking when code was deployed. Critical for change impact analysis.
Integration patterns: Most APM platforms expose a metrics API and a traces search API. For analytics, derive service-level aggregate metrics from traces using the platform’s metrics API rather than raw trace export, which can involve terabytes of data for high-traffic services.
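When platform-side aggregation is unavailable and you must work from raw spans, the derivation is straightforward. A sketch, assuming each span record carries `service`, `duration_ms`, `status`, and `parent_id` fields (field names vary by platform):

```python
from collections import defaultdict

def service_latency_stats(spans: list[dict]) -> dict[str, dict]:
    """Aggregate root-span durations into per-service request count,
    error rate, and P95 latency. Root spans (parent_id is None) stand
    in for inbound requests."""
    by_service = defaultdict(list)
    for span in spans:
        if span.get("parent_id") is None:
            by_service[span["service"]].append(span)
    stats = {}
    for service, roots in by_service.items():
        durations = sorted(s["duration_ms"] for s in roots)
        p95_idx = max(0, int(round(0.95 * len(durations))) - 1)  # nearest-rank P95
        errors = sum(1 for s in roots if s["status"] == "error")
        stats[service] = {
            "requests": len(roots),
            "error_rate": errors / len(roots),
            "p95_ms": durations[p95_idx],
        }
    return stats
```

In practice you would compute this per time bucket (e.g., per 5-minute window) rather than over the whole span set, and on a sampled trace stream the error rate needs correction for the sampling rate.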
Cloud Provider Monitoring
AWS, Azure, and GCP each provide native monitoring services that are the authoritative source for cloud infrastructure metrics.
AWS CloudWatch
Key namespaces:
- `AWS/EC2` - instance CPU, network, disk metrics.
- `AWS/RDS` - database CPU, connections, latency, IOPS.
- `AWS/ApplicationELB` - Application Load Balancer request count, target response time, HTTP error rates.
- `AWS/Lambda` - invocation count, duration, error rate, throttles.
- `AWS/S3` - request metrics, bucket size.
Integration options:
- CloudWatch Metrics API (`GetMetricData`): Pull specific metrics with arbitrary time ranges and statistics. Supports batch requests for multiple metrics.
- CloudWatch Metric Streams: Near-real-time streaming of CloudWatch metrics to Amazon Kinesis Data Firehose, enabling sub-minute latency delivery to S3, Redshift, or third-party destinations.
- AWS Cost and Usage Report (CUR): For financial analytics, the CUR is the authoritative source for detailed cloud spend data, delivered daily to S3 in CSV or Parquet format.
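A sketch of a batched `GetMetricData` pull. The query-builder is pure; the fetch function takes an already-constructed boto3 CloudWatch client so the example stays dependency-light. Namespace, metric, and dimension values are illustrative.

```python
def metric_query(query_id: str, namespace: str, metric: str, dimensions: dict,
                 stat: str = "Average", period: int = 3600) -> dict:
    """Build one GetMetricData query entry (hourly stats keep result sets small)."""
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": namespace,
                "MetricName": metric,
                "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
            },
            "Period": period,
            "Stat": stat,
        },
    }

def fetch_metrics(cloudwatch, queries, start, end):
    """Page through GetMetricData results (boto3 CloudWatch client assumed)."""
    results, token = [], None
    while True:
        kwargs = {"MetricDataQueries": queries, "StartTime": start, "EndTime": end}
        if token:
            kwargs["NextToken"] = token
        resp = cloudwatch.get_metric_data(**kwargs)
        results.extend(resp["MetricDataResults"])
        token = resp.get("NextToken")
        if not token:
            return results
```

Batching many `metric_query` entries into one `GetMetricData` call is the key cost lever: the API is billed per metric requested, not per call, but round trips and throttling both drop sharply.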
Azure Monitor
Key data types:
- Metrics - numeric time series for Azure resources (VM CPU, storage latency, App Service response time).
- Logs - structured log data in Log Analytics workspace, queryable via KQL.
- Activity Log - audit trail of control-plane operations (who created/modified/deleted what resource, when).
Integration options: Azure Monitor Metrics REST API for metric extraction. Log Analytics API for log queries. Azure Monitor Data Export for continuous streaming to Event Hub or Storage Account.
Endpoint Management
Endpoint management platforms track the state and compliance of managed devices, feeding patch compliance and security posture metrics.
Microsoft Intune: Exposes device compliance state, OS version, patch status, and app deployment status via the Microsoft Graph API (/deviceManagement/managedDevices, /deviceManagement/deviceCompliancePolicies).
Jamf Pro: macOS and iOS endpoint management. Exposes inventory, patch status, and policy compliance via the Jamf Pro Classic API and the newer Jamf Pro API. Key endpoints: /v1/computers-inventory, /v1/patch-management-software-titles.
Key fields for analytics: device ID, OS version, last check-in timestamp, compliance status, patch status by severity, encryption status, antivirus status.
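As a sketch of the Intune path: page through the Graph API `managedDevices` collection by following `@odata.nextLink`, then roll the results up by compliance state. Token acquisition is out of scope here, and the summary helper is a hypothetical illustration of the KPI rollup.

```python
import json
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"

def fetch_managed_devices(bearer_token: str):
    """Page through /deviceManagement/managedDevices via @odata.nextLink."""
    url, devices = f"{GRAPH}/deviceManagement/managedDevices", []
    while url:
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {bearer_token}"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            page = json.load(resp)
        devices.extend(page["value"])
        url = page.get("@odata.nextLink")  # absent on the last page
    return devices

def compliance_summary(devices: list[dict]) -> dict[str, int]:
    """Count devices by complianceState for a compliance KPI rollup."""
    counts: dict[str, int] = {}
    for d in devices:
        state = d.get("complianceState", "unknown")
        counts[state] = counts.get(state, 0) + 1
    return counts
```

The `@odata.nextLink` pattern is the standard Graph API paging contract, so the same loop works for the compliance policy endpoints as well.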
Network Monitoring
Network monitoring tools provide bandwidth utilization, latency, packet loss, and device health data.
Key platforms: SolarWinds NPM, PRTG, Nagios, Zabbix, Kentik, Cisco ThousandEyes.
Key data types:
- SNMP polls - device health (CPU, memory, interface statistics) queried via Simple Network Management Protocol at regular intervals.
- NetFlow/sFlow - traffic flow records showing source/destination IP, protocol, bytes, and packets. Primary source for bandwidth utilization analytics.
- ICMP probes - latency and availability checks. Nagios and similar tools store probe results in RRD files or PostgreSQL databases.
Integration: Most network monitoring platforms expose REST APIs for current and historical data. SolarWinds NPM supports SWQL (SolarWinds Query Language) for database-level access. For flow analytics, store NetFlow data in ClickHouse or Apache Pinot for high-throughput time series queries.
Building a Coherent Integration Architecture
The most important design decision in IT analytics data architecture is the separation of real-time from historical paths. Real-time dashboards for NOC operators cannot tolerate the latency of a nightly batch load from a data warehouse. Historical capacity planning cannot depend on data retained only in monitoring tools with 30-day retention windows.
Recommended architecture pattern:
- Streaming layer: Monitoring platforms push or stream metrics to an event bus (Kafka, AWS Kinesis, Azure Event Hubs). Real-time dashboards consume from this layer with sub-minute latency.
- Historical storage: The event bus fans out to columnar storage (ClickHouse, BigQuery, Redshift, Snowflake) for long-term retention and analytical queries. ITSM data loads on a scheduled cadence (hourly for operational reporting, nightly for historical analysis).
- Metadata layer: A centralized metadata store maps monitoring IDs to ITSM CIs, business service names, and cost centers. This is the join key that makes cross-source analytics possible - linking a Datadog host metric to a ServiceNow service record to an AWS cost center.
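The metadata-layer join can be sketched as a simple enrichment step. The lookup shapes below are hypothetical; the important design choice is that unmapped hosts are flagged rather than dropped, so coverage gaps in the metadata layer stay visible.

```python
def enrich_metrics(metric_rows: list[dict], ci_lookup: dict, cost_lookup: dict):
    """Join raw host metrics to CMDB service names and cost centers.
    ci_lookup: hostname -> {"ci_id": ..., "business_service": ...}
    cost_lookup: ci_id -> cost_center."""
    enriched = []
    for row in metric_rows:
        ci = ci_lookup.get(row["host"])
        enriched.append({
            **row,
            "ci_id": ci["ci_id"] if ci else None,
            "business_service": ci["business_service"] if ci else "UNMAPPED",
            "cost_center": cost_lookup.get(ci["ci_id"]) if ci else None,
        })
    return enriched
```

Tracking the share of `UNMAPPED` rows over time is itself a useful data quality KPI: it measures how well the CMDB keeps up with the actual fleet.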
Analytics platforms such as Plotono can sit on top of this architecture, connecting to multiple data sources through managed pipelines and presenting unified dashboards without requiring teams to build custom ETL for each integration.
For the analytical techniques that use these data sources - capacity forecasting, SLA breach prediction, root cause analysis - see IT Techniques & Models. For KPI definitions that rely on these sources, see IT KPIs.