Shrestha Rajat

Last updated Jul 9, 2023

# AWS Glue

#aws #cloud

A fully managed ETL service from AWS, used to prepare data for analytics. It runs ETL jobs in a fully managed, scale-out Apache Spark environment. AWS Glue discovers data and stores its metadata (e.g. table definitions and schemas) in the Data Catalog. It works with data lakes (e.g. on Amazon S3), data warehouses like Redshift and Snowflake, and data stores like RDS or a database running on EC2.

You can use a crawler to populate the AWS Glue Data Catalog with tables. A crawler can crawl multiple data stores in a single run. Upon completion, the crawler creates or updates one or more tables in your Data Catalog. The ETL jobs that you define in AWS Glue then use these tables as sources and targets.
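
The crawler flow described above can be sketched with the boto3 Glue client. This is a minimal illustration, not a production setup: the helper names `build_targets` and `run_crawler` are made up for this note, and the crawler name, IAM role, database, paths, and connection name are all placeholders you would supply. It assumes the crawler reports the `READY` state again once its run has finished, and it reads only the first page of `get_tables` results.

```python
import time


def build_targets(s3_paths, jdbc_targets=()):
    """Build the Targets argument for create_crawler.

    One crawler can cover several data stores in a single run, so both
    S3 paths and (connection name, path) JDBC pairs are accepted.
    """
    return {
        "S3Targets": [{"Path": path} for path in s3_paths],
        "JdbcTargets": [
            {"ConnectionName": conn, "Path": path} for conn, path in jdbc_targets
        ],
    }


def run_crawler(glue, name, role_arn, database, targets, poll_seconds=15):
    """Create a crawler, run it once, and return the table names it produced.

    `glue` is expected to be a boto3 Glue client (or any object exposing the
    same four methods used below).
    """
    glue.create_crawler(
        Name=name, Role=role_arn, DatabaseName=database, Targets=targets
    )
    glue.start_crawler(Name=name)

    # Assumption: the state moves RUNNING -> STOPPING -> READY, so READY
    # after start_crawler means the run has completed.
    while glue.get_crawler(Name=name)["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)

    # Only the first page of results is read here; large catalogs would
    # need to follow NextToken.
    tables = glue.get_tables(DatabaseName=database)["TableList"]
    return [table["Name"] for table in tables]
```

With boto3 installed and AWS credentials configured, a call would look like `run_crawler(boto3.client("glue"), "my-crawler", role_arn, "my_database", build_targets(["s3://my-bucket/raw/"]))`, where every argument value is a placeholder. Passing the client in as a parameter keeps the helper easy to exercise with a stub.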

# Data Catalog