# Kinesis
#aws #cloud #serverless #dataanalytics
A family of AWS services for collecting, processing, and analyzing streaming data in real time.
Streaming data is data generated continuously by many sources, each sending small amounts of information at the same time. (Think of harvesting rainwater.)
Deals with data in motion (streaming data), similar to Kafka.
# Kinesis Streams
Lets us stream data and video (Kinesis Data Streams and Kinesis Video Streams).
(E.g. recording online meeting video, game scores in real time.)
Uses shards, retains data, and requires consumers.
# Kinesis Data Stream
# Kinesis Video Streams
# Kinesis Data Firehose
- Load streaming data into AWS data stores for near-real-time analytics.
- No shards, no data retention, optional consumer, and automatic capacity scaling.
- Consumers are optional (can choose only one consumer).
- Data is not persisted.
- Can use Lambda to process data.
- Data is delivered directly to the AWS data store.
- Inexpensive
It can be used to load data directly into destinations such as S3, Redshift, Elasticsearch, etc.
# Kinesis Data Analytics
For real-time analytics using SQL. It reads from a Kinesis Data Stream or Kinesis Data Firehose as its source, and you need to specify a Data Firehose, Data Stream, or Lambda as the destination.
Allows running SQL queries on the stream and stores the results in an AWS data store.
# Kinesis Shards
Kinesis streams are made up of shards.
The data capacity of a stream is determined by its number of shards.
Each shard is a sequence of one or more data records and provides a fixed unit of capacity.
(Read: up to 5 transactions per second and 2 MB per second, per shard.) (Write: up to 1,000 records per second and 1 MB per second, per shard.)
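
Since each shard is a fixed unit of capacity, the per-shard limits above can be turned into a rough shard-count estimate. The sketch below is only an illustration: the function name `shards_needed` and its parameters are made up for these notes, and it ignores the limit of 5 read calls per second.

```python
import math

# Per-shard limits taken from the notes above.
WRITE_MB_PER_SHARD = 1.0        # 1 MB per second written
WRITE_RECORDS_PER_SHARD = 1000  # 1,000 records per second written
READ_MB_PER_SHARD = 2.0         # 2 MB per second read


def shards_needed(write_mb_per_s, write_records_per_s, read_mb_per_s):
    """Estimate how many shards a stream needs (hypothetical helper).

    The stream must satisfy every limit at once, so the answer is the
    largest of the per-limit requirements, and never less than 1.
    """
    by_write_size = math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD)
    by_write_count = math.ceil(write_records_per_s / WRITE_RECORDS_PER_SHARD)
    by_read_size = math.ceil(read_mb_per_s / READ_MB_PER_SHARD)
    return max(1, by_write_size, by_write_count, by_read_size)


if __name__ == "__main__":
    # Many tiny records: the record-count limit dominates, not the byte limit.
    print(shards_needed(write_mb_per_s=0.2, write_records_per_s=4500, read_mb_per_s=0.4))
```

The point of taking the maximum is that a stream of many tiny records can run out of the record-count limit long before it gets close to the MB limit.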
# Kinesis Client Library
- Each shard is processed by exactly one KCL worker and has exactly one corresponding record processor.
- One worker can process any number of shards, so it’s fine if the number of shards exceeds the number of instances.
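
The two rules above (one worker per shard, any number of shards per worker) can be pictured with a toy assignment. This is a plain round-robin sketch, not how the KCL actually coordinates workers; `assign_shards` and the shard and worker names are made up for these notes.

```python
def assign_shards(shard_ids, worker_ids):
    """Spread shards over workers round-robin (hypothetical helper).

    Every shard ends up with exactly one worker, while a worker may
    hold several shards when there are more shards than workers.
    """
    if not worker_ids:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in worker_ids}
    for index, shard in enumerate(shard_ids):
        worker = worker_ids[index % len(worker_ids)]
        assignment[worker].append(shard)
    return assignment


if __name__ == "__main__":
    shards = ["shard-0", "shard-1", "shard-2", "shard-3", "shard-4"]
    workers = ["worker-a", "worker-b"]
    for worker, owned in assign_shards(shards, workers).items():
        print(worker, owned)
```

With five shards and two workers, no shard is left unprocessed and none is processed twice, which is the property the notes describe.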