Prerequisites: This lab requires the prerequisites to be completed first. Continue only if you have already completed them.
PART B: Verify the data using Amazon Athena. Change the SerDe for the tables.
1. Once the AWS Glue crawler has crawled the source data, 5 tables are added to the database c360_workshop_db. Go to the AWS Glue console and click Tables. Select source_auto_address, open “Action” and choose “view data”. This takes you to the Athena console.
2. Once the Athena console opens:
Use the new Athena console.
Click Settings, then Manage.
For “Location of query result”, browse to the S3 bucket s3://mod-xxxx-s3bucketstack-xxxxx-s3bucket-xxxx/ . Note: add "/" at the end of the bucket name.
3. Click Save.
4. Go back to the Editor in Athena. Choose c360_workshop_db as the database, then preview the data for the table “source_auto_address”.
5. The output will look like the example below. Because of the embedded comma (,) in the data, the zipcode column is empty.
6. To handle the embedded comma correctly, navigate to the AWS Glue console.
7. Go to Data catalog -> Databases -> “c360_workshop_db” -> “Tables in c360_workshop_db”. Select the table “source_auto_address” -> Action -> “Edit Table details”.
Edit the following fields and replace them with the values below:
Serde serialization lib: org.apache.hadoop.hive.serde2.OpenCSVSerde
Key: quoteChar
Value: "
Key: separatorChar
Value: ,
8. Click Apply. Repeat step 7 for the “source_auto_customer” and “source_property_customers” tables.
9. Check the data preview in Athena again (select the table “source_auto_address” and choose “view data”). The zipcode column should now be populated.
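Optional: the console edits in steps 7 and 8 could also be scripted. The block below is only a minimal sketch, not part of the lab. It assumes a boto3 Glue client and uses the Glue get_table and update_table calls. The helper names with_open_csv_serde and apply_open_csv_serde, and the TABLE_INPUT_KEYS subset, are hypothetical and chosen for illustration. The SerDe library and the quoteChar and separatorChar values are the ones from step 7.

```python
import copy

# Assumption: a subset of the keys that Glue's UpdateTable accepts in TableInput.
# get_table also returns read-only fields (DatabaseName, CreateTime, ...) that
# are dropped here before the table is sent back.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "TableType", "Parameters",
}

OPEN_CSV_SERDE = "org.apache.hadoop.hive.serde2.OpenCSVSerde"


def with_open_csv_serde(table, quote_char='"', separator_char=","):
    """Return a TableInput dict for `table` with its SerDe set to OpenCSVSerde.

    `table` is the dict found under the "Table" key of a get_table response.
    The input dict is not modified; existing SerDe parameters are kept.
    """
    table_input = {
        key: copy.deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    storage = table_input.setdefault("StorageDescriptor", {})
    serde = storage.setdefault("SerdeInfo", {})
    serde["SerializationLibrary"] = OPEN_CSV_SERDE
    params = serde.setdefault("Parameters", {})
    params["quoteChar"] = quote_char
    params["separatorChar"] = separator_char
    return table_input


def apply_open_csv_serde(glue, database, table_names):
    """Apply the SerDe change to each table through a Glue client."""
    for name in table_names:
        table = glue.get_table(DatabaseName=database, Name=name)["Table"]
        glue.update_table(
            DatabaseName=database,
            TableInput=with_open_csv_serde(table),
        )
```

With AWS credentials configured, a call such as apply_open_csv_serde(boto3.client("glue"), "c360_workshop_db", ["source_auto_address", "source_auto_customer", "source_property_customers"]) would cover the three tables from steps 7 and 8. Verify the result in Athena as in step 9.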