Amazon Data Firehose

Reliably load real-time streams into data lakes, warehouses, and analytics services

Benefits

Easily capture, transform, and load streaming data. Create a delivery stream, select your destination, and start streaming real-time data with just a few clicks.

Automatically provision and scale compute, memory, and network resources without ongoing administration.

Transform raw streaming data into formats like Apache Parquet, and dynamically partition streaming data without building your own processing pipelines.

How it works

Amazon Data Firehose provides the easiest way to acquire, transform, and deliver data streams within seconds to data lakes, data warehouses, and analytics services. To use Amazon Data Firehose, you set up a stream with a source, a destination, and any required transformations. Amazon Data Firehose continuously processes the stream, automatically scales with the volume of incoming data, and delivers the data within seconds.

Select the source for your data stream, such as a topic in Amazon Managed Streaming for Apache Kafka (Amazon MSK) or a stream in Kinesis Data Streams, or write data directly using the Firehose Direct PUT API. Amazon Data Firehose is integrated with more than 20 AWS services, so you can set up a stream from sources such as databases (preview), Amazon CloudWatch Logs, AWS WAF web ACL logs, AWS Network Firewall logs, Amazon SNS, or AWS IoT.
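As an illustration of the Direct PUT path, the sketch below batches JSON events and sends them through the `PutRecordBatch` operation. It is a minimal example under stated assumptions, not a reference implementation: `send_events` and `chunk` are hypothetical helper names, the `client` argument is assumed to behave like a boto3 Firehose client, and the per-call request size limit is not enforced here.

```python
import json

# PutRecordBatch accepts at most 500 records per call.
MAX_BATCH_RECORDS = 500


def chunk(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_events(client, stream_name, events):
    """Send JSON-serializable events and return the records that were rejected.

    `client` is assumed to expose put_record_batch() like a boto3 Firehose client.
    """
    # Newline-delimit each record so downstream readers can split them again.
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    failed = []
    for batch in chunk(records, MAX_BATCH_RECORDS):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        if response.get("FailedPutCount", 0) > 0:
            # RequestResponses is ordered like the request; failures carry ErrorCode.
            for record, result in zip(batch, response["RequestResponses"]):
                if "ErrorCode" in result:
                    failed.append(record)
    return failed
```

With boto3 installed and credentials configured, `boto3.client("firehose")` could be passed as `client`; the returned list holds the records a caller may want to retry.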

Specify if you want to convert your data stream into formats such as Parquet or ORC, decompress the data, perform custom data transformations using your own AWS Lambda function, or dynamically partition input records based on attributes to deliver into different locations.
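A custom transformation of this kind can be sketched as a Lambda handler. Firehose passes the function a batch of base64-encoded records and expects each one back with the same `recordId` and a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The filtering rule (dropping `DEBUG` events) and the added `processed` field are illustrative assumptions, not part of the service.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a batch of Firehose records; every input record gets one output."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Invalid base64 or JSON: return the original data marked as failed.
            output.append({"recordId": record_id,
                           "result": "ProcessingFailed",
                           "data": record["data"]})
            continue

        # Hypothetical rule: discard debug-level events before delivery.
        if payload.get("level") == "DEBUG":
            output.append({"recordId": record_id,
                           "result": "Dropped",
                           "data": record["data"]})
            continue

        payload["processed"] = True  # illustrative enrichment field
        encoded = base64.b64encode(
            (json.dumps(payload) + "\n").encode("utf-8")
        ).decode("utf-8")
        output.append({"recordId": record_id, "result": "Ok", "data": encoded})
    return {"records": output}
```

Returning the original data for dropped and failed records keeps the response shape uniform, and the trailing newline keeps delivered objects line-delimited.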

Select a destination for your stream, such as Amazon S3, Amazon OpenSearch Service, Amazon Redshift, Splunk, Snowflake, Apache Iceberg Tables, Amazon S3 Tables (preview), or a custom HTTP endpoint.
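For an Amazon S3 destination, the request for a Direct PUT stream might look like the sketch below. It only builds the parameters for the `CreateDeliveryStream` operation. The helper name `s3_stream_request`, the default prefix, and the buffering values are assumptions chosen for illustration, and the ARNs passed in are placeholders supplied by the caller.

```python
def s3_stream_request(stream_name, role_arn, bucket_arn,
                      prefix="events/", buffer_mb=5, buffer_seconds=300):
    """Build keyword arguments for a Direct PUT stream that delivers to S3."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,        # IAM role Firehose assumes to write to S3
            "BucketARN": bucket_arn,    # target bucket
            "Prefix": prefix,           # key prefix for delivered objects
            # Firehose flushes when either buffering threshold is reached first.
            "BufferingHints": {
                "SizeInMBs": buffer_mb,
                "IntervalInSeconds": buffer_seconds,
            },
            "CompressionFormat": "GZIP",
        },
    }
```

With boto3, the result could be passed as `boto3.client("firehose").create_delivery_stream(**request)`. Keeping the request as a plain dictionary makes it easy to inspect or adjust before any resource is created.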

For more information about Amazon Data Firehose, see Amazon Data Firehose Documentation.

Use cases

Stream data into Amazon S3 and convert data into required formats for analysis without building processing pipelines.

Monitor network security in real time and create alerts when potential threats arise using supported Security Information and Event Management (SIEM) tools.

Enrich your data streams with machine learning (ML) models, calling inference endpoints to analyze data and generate predictions as streams move to their destination.

