November 08, 2018
This lab shows how to use Hadoop on Amazon EMR to analyze a provided CloudFront log file. The provided script creates a Hive table, parses the log file with the regular expression serializer/deserializer (RegEx SerDe), writes the parsed results to the table, submits a HiveQL query to retrieve the total requests per OS for a given time frame, and writes the query result to an S3 bucket.
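To make the parse-then-aggregate idea concrete, here is a minimal Python sketch of the same logic on a single machine: a regular expression splits each log line into fields, and a counter tallies requests per OS inside a date range. This is not the lab's Hive script. The field layout, the regular expression, the function name `requests_per_os`, and the sample lines are all assumptions made up for illustration.

```python
import re
from collections import Counter
from datetime import date

# Assumed, simplified log layout (hypothetical, not the lab's exact format):
# date time location bytes ip method host uri status referrer user-agent
# The OS is taken as the text between "(" and the first ";" in the user agent.
LINE_RE = re.compile(
    r"^(?!#)"                 # skip comment/header lines starting with '#'
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+"   # date, time, location, bytes, ip
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+"   # method, host, uri, status, referrer
    r"[^(]+\(([^;]+)"         # user agent prefix, then the OS token
)

# Made-up sample lines for illustration only.
SAMPLE_LINES = [
    "#Version: 1.0",
    "2014-07-05 20:00:00 LHR3 4260 10.0.0.15 GET example.cloudfront.net /a.jpeg 200 - Mozilla/5.0%20(Windows;%20U;%20en-US)%20Chrome/36.0",
    "2014-07-06 20:00:01 SEA4 1024 10.0.0.16 GET example.cloudfront.net /b.jpeg 200 - Mozilla/5.0%20(Linux;%20U;%20en-US)%20Firefox/30.0",
    "2014-07-07 20:00:02 LHR3 2048 10.0.0.17 GET example.cloudfront.net /c.jpeg 304 - Mozilla/5.0%20(Windows;%20U;%20en-US)%20Chrome/36.0",
    "2014-09-01 20:00:03 MIA3 512 10.0.0.18 GET example.cloudfront.net /d.jpeg 200 - Mozilla/5.0%20(MacOS;%20U;%20en-US)%20Safari/7.0",
]


def requests_per_os(lines, start, end):
    """Count requests per OS for lines dated between start and end (inclusive)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # header line or a line that does not fit the assumed layout
        day = date.fromisoformat(match.group(1))
        if start <= day <= end:
            counts[match.group(11)] += 1
    return counts


if __name__ == "__main__":
    result = requests_per_os(SAMPLE_LINES, date(2014, 7, 5), date(2014, 8, 5))
    for os_name, total in sorted(result.items()):
        print(os_name, total)
```

In Hive, the regular expression would live in the table's SerDe properties and the counting would be a `GROUP BY` query, so the cluster does the same work in parallel across the log files.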
QwikLab: Analyze Big Data with Hadoop
In the General Configuration section:
In the Hardware configuration section:
In the Security and access section:
Create a step
Written by Warren, who studies distributed systems at George Washington University. You might wanna follow him on GitHub.