SCADA Historian vs Cloud Data Storage
Key Takeaway
This guide compares on-premises SCADA historians with cloud-based data storage for industrial time-series data, covering architecture, latency, cost, data sovereignty, hybrid approaches, and migration strategies for energy and industrial operations.
The Historian's Role in SCADA Systems
A SCADA historian is a specialized database optimized for storing and retrieving time-series process data. Unlike general-purpose databases, historians use compression algorithms designed for industrial data patterns, achieving 10:1 to 20:1 compression ratios while preserving data fidelity. A single historian server can store billions of data points spanning years or decades, supporting trend analysis, regulatory reporting, process optimization, and root cause investigation.
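The compression behind those 10:1 to 20:1 ratios can be illustrated with a simple deadband (exception) filter; production historians such as PI use the more sophisticated swinging-door algorithm, but the principle is the same. A minimal Python sketch over an assumed synthetic drifting signal:

```python
from dataclasses import dataclass

@dataclass
class Point:
    t: float   # timestamp (seconds)
    v: float   # process value

def deadband_compress(points, deadband):
    """Keep a point only if it differs from the last *stored* value
    by more than the deadband; always keep the first and last points."""
    if not points:
        return []
    stored = [points[0]]
    for p in points[1:-1]:
        if abs(p.v - stored[-1].v) > deadband:
            stored.append(p)
    if len(points) > 1:
        stored.append(points[-1])
    return stored

# A slowly drifting signal (synthetic example) compresses heavily:
raw = [Point(t, 100.0 + 0.001 * t) for t in range(1000)]
kept = deadband_compress(raw, deadband=0.05)
ratio = len(raw) / len(kept)   # well above 20:1 for this signal
```

The deadband is chosen per tag relative to instrument accuracy, which is why slowly moving process values compress far better than noisy ones.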
Traditional SCADA historians run on dedicated on-premises servers within the OT network, typically at Purdue Model Level 3. Leading historian platforms include AVEVA Historian (formerly Wonderware), OSIsoft PI (now AVEVA PI), FactoryTalk Historian (Rockwell), and Ignition Tag Historian. These platforms have decades of proven reliability in industrial environments.
Cloud Data Storage for Industrial Data
Cloud Time-Series Databases
Cloud platforms offer time-series database services purpose-built for IoT and industrial data:
- AWS Timestream: Serverless time-series database with automatic scaling and built-in analytics
- Azure Data Explorer (ADX): High-performance analytics engine optimized for time-series and log data
- Google BigQuery: Powerful SQL analytics on stored data, with IoT ingestion typically via Cloud Pub/Sub (Google retired Cloud IoT Core in 2023)
- InfluxDB Cloud: Open-source time-series database available as managed cloud service
- AVEVA Data Hub: AVEVA's cloud historian, designed as a natural extension of on-premises PI or Historian
Cloud Advantages
Cloud storage offers compelling benefits for certain use cases:
- Scalability: Storage and compute scale automatically without hardware procurement cycles
- Advanced analytics: Integration with machine learning, anomaly detection, and predictive maintenance services
- Multi-site aggregation: Central data lake consolidating data from multiple facilities and SCADA systems
- Disaster recovery: Geographic redundancy and high availability without building a secondary data center
- Reduced IT burden: No server hardware to maintain, patch, or replace on lifecycle schedules
On-Premises Historian Advantages
On-premises historians retain critical advantages for operational technology environments:
- Latency: Sub-millisecond query response for real-time operator trend displays and reports. Cloud queries introduce 50-500ms network latency that impacts operator experience.
- Availability independence: Historian continues to collect and serve data during internet outages. Cloud-dependent systems lose historical data access if connectivity drops.
- Data sovereignty: Process data remains within the OT network, simplifying cybersecurity compliance with IEC 62443 and reducing the attack surface
- Regulatory compliance: Some regulations require data retention on systems the operator controls, not third-party cloud infrastructure
- Bandwidth: A SCADA system generating 50,000 data points per second would require substantial bandwidth to stream all raw data to the cloud
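The bandwidth point can be made concrete with back-of-envelope arithmetic, assuming roughly 16 bytes per uncompressed sample (an assumption; actual wire size depends on protocol framing and compression):

```python
points_per_sec = 50_000
bytes_per_point = 16   # assumed: 8 B timestamp + 4 B float value + 4 B quality/flags

# Sustained upstream bandwidth, before protocol overhead
mbps = points_per_sec * bytes_per_point * 8 / 1_000_000     # ~6.4 Mbps

# Raw daily volume to ingest and store
gb_per_day = points_per_sec * bytes_per_point * 86_400 / 1e9  # ~69 GB/day
```

Even this modest-sounding 6.4 Mbps must be sustained 24/7, and the ~69 GB/day figure is what drives cloud ingestion and storage bills before any downsampling.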
Hybrid Architecture: The Practical Approach
Most industrial organizations are adopting hybrid architectures that combine the reliability of on-premises historians with the analytical power of cloud platforms:
- Edge-to-cloud replication: On-premises historian handles real-time operations while a subset of data replicates to cloud for enterprise analytics, typically downsampled or aggregated to reduce bandwidth and cost
- DMZ data broker: A server in the industrial DMZ (Level 3.5) extracts data from the OT historian and pushes it to the cloud via HTTPS, maintaining the security boundary
- MQTT/Sparkplug B: Edge devices publish data to both local historian and cloud MQTT broker simultaneously, with the cloud receiving a parallel data stream without touching the OT network
- Store and forward: Edge historian caches data locally and forwards to cloud when bandwidth is available, handling intermittent connectivity gracefully
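The store-and-forward pattern can be sketched in a few lines of Python; `upload` is a hypothetical callable standing in for the real cloud client, and the batch sizes are illustrative:

```python
import collections
import time

class StoreAndForward:
    """Minimal store-and-forward buffer (sketch): cache samples locally,
    forward them to the cloud uploader when connectivity allows."""
    def __init__(self, upload, max_buffer=100_000):
        self.upload = upload   # callable(batch) -> bool, True on success
        self.buffer = collections.deque(maxlen=max_buffer)

    def record(self, tag, value, ts=None):
        self.buffer.append((tag, value, ts if ts is not None else time.time()))

    def flush(self, batch_size=500):
        """Forward buffered samples in batches; stop on the first failure
        and keep unsent samples (in order) for the next attempt."""
        sent = 0
        while self.buffer:
            batch = [self.buffer.popleft()
                     for _ in range(min(batch_size, len(self.buffer)))]
            if self.upload(batch):
                sent += len(batch)
            else:
                self.buffer.extendleft(reversed(batch))  # requeue at the front
                break
        return sent

# usage: buffer while offline, flush once the uploader succeeds
sent_batches = []
saf = StoreAndForward(upload=lambda batch: sent_batches.append(batch) is None)
for i in range(1200):
    saf.record("FLOW_01", float(i))
forwarded = saf.flush(batch_size=500)   # 3 batches: 500 + 500 + 200
```

Real edge historians add persistence to disk so the cache survives a reboot, but the queue-and-retry logic is the core of the pattern.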
Cost Comparison
On-Premises Costs
On-premises historians require capital expenditure for server hardware (typically refreshed every 5-7 years), historian software licensing (ranging from $10,000 to $100,000+ depending on platform and tag count), storage hardware (SAN or NAS for large datasets), IT staff time for maintenance and patching, and physical infrastructure (rack space, power, cooling, UPS). Total 5-year cost for a mid-size historian (50,000 tags) typically ranges from $100,000 to $300,000.
Cloud Costs
Cloud storage uses an operational expenditure model with costs driven by data ingestion rate (cost per million data points written), storage volume (cost per GB-month retained), query volume (cost per GB scanned or compute time), and egress (cost to download data). For high-volume SCADA data (millions of data points per day), cloud costs can exceed on-premises costs within 2-3 years, particularly for long retention periods. Careful data tiering (hot/warm/cold storage) and aggregation strategies are essential for cost management.
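A rough cost model ties these drivers together; all prices below are illustrative assumptions for the sketch, not vendor quotes:

```python
def monthly_cloud_cost(points_per_day,
                       bytes_per_point=16,            # assumed sample size
                       price_per_million_writes=0.50, # assumed, USD
                       price_per_gb_month=0.03,       # assumed, USD
                       retained_months=12):
    """Rough monthly cost: ingestion of a month's writes plus storage of
    the retained window. Query and egress costs are omitted."""
    writes = points_per_day * 30 / 1e6 * price_per_million_writes
    gb_retained = points_per_day * 30 * retained_months * bytes_per_point / 1e9
    storage = gb_retained * price_per_gb_month
    return writes + storage

full = monthly_cloud_cost(170e6)           # raw 5-second data
downsampled = monthly_cloud_cost(170e6 / 12)  # 1-minute averages (12:1)
```

Under these assumed prices, raw ingestion of 170 million points per day lands around $2,600/month, while 1-minute downsampling cuts the bill by roughly an order of magnitude, which is why tiering and aggregation dominate cloud cost management.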
Migration Strategies
Migrating historical data from on-premises to cloud requires careful planning. Key considerations include determining which data to migrate (not all historical data may have analytical value in the cloud), choosing a migration method (bulk export/import vs. continuous replication), mapping historian tag names and metadata to cloud data structures, validating data integrity after migration, and training staff on cloud analytics tools. NFM Consulting helps Texas energy companies design hybrid historian architectures that maintain operational reliability while unlocking cloud analytics capabilities.
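Post-migration integrity validation can be automated by comparing per-tag summaries between the source export and the cloud copy; a minimal sketch, assuming rows arrive as (tag, timestamp, value) tuples:

```python
import math

def validate_migration(source_rows, cloud_rows, rel_tol=1e-9):
    """Sketch of an integrity check: per tag, compare record count,
    min/max timestamps, and the sum of values between the on-prem
    export and the cloud copy. Returns the list of mismatched tags."""
    def summarize(rows):
        stats = {}
        for tag, ts, value in rows:
            s = stats.setdefault(tag, {"n": 0, "t_min": ts,
                                       "t_max": ts, "v_sum": 0.0})
            s["n"] += 1
            s["t_min"] = min(s["t_min"], ts)
            s["t_max"] = max(s["t_max"], ts)
            s["v_sum"] += value
        return stats

    src, dst = summarize(source_rows), summarize(cloud_rows)
    mismatches = []
    for tag in src.keys() | dst.keys():
        a, b = src.get(tag), dst.get(tag)
        if (a is None or b is None or a["n"] != b["n"]
                or a["t_min"] != b["t_min"] or a["t_max"] != b["t_max"]
                or not math.isclose(a["v_sum"], b["v_sum"], rel_tol=rel_tol)):
            mismatches.append(tag)
    return mismatches
```

Checksums over sorted exports are stricter, but count/range/sum comparisons per tag catch the common failure modes (dropped batches, duplicate loads, tag-mapping errors) at a fraction of the cost.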
Frequently Asked Questions
Should we replace our on-premises historian with cloud storage entirely?
Most industrial operations should maintain an on-premises historian for real-time operations and replicate a subset of data to the cloud for enterprise analytics. Completely replacing an on-premises historian with cloud storage introduces latency and connectivity dependency risks that are unacceptable for operator trend displays and real-time decision making.
How much does cloud data storage cost for SCADA data?
Cloud costs depend on data volume, retention period, and query frequency. A SCADA system with 10,000 tags at 5-second scan rates generates approximately 170 million data points per day. At typical cloud pricing, ingestion and storage can cost $500-3,000 per month. Downsampling data to 1-minute averages before cloud upload reduces costs by 90% or more.
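The downsampling described above is straightforward to sketch in plain Python: bucket raw samples by minute and keep one average per bucket (assuming (timestamp_seconds, value) pairs):

```python
from collections import defaultdict

def downsample_minute_averages(samples):
    """Reduce raw (ts_seconds, value) samples to one average per minute."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[int(ts // 60)].append(value)
    # one (minute_start_ts, mean) pair per bucket, in time order
    return [(m * 60, sum(vs) / len(vs)) for m, vs in sorted(buckets.items())]
```

At a 5-second scan rate this is a 12:1 reduction before any compression, consistent with the 90%+ savings figure; production pipelines usually also keep min/max per bucket so excursions are not averaged away.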
How do we send OT historian data to the cloud securely?
Use a DMZ-based data broker or MQTT gateway that pulls data from the OT historian and pushes it to the cloud via outbound-only HTTPS or MQTT connections. Never allow inbound connections from the cloud to the OT network. AVEVA Data Hub, AWS IoT SiteWise, and Azure IoT Hub all support secure unidirectional data ingestion patterns.