Saturday, January 26, 2013

My presentation on "Introduction to Pig"

A few months back, I conducted a 2-hour workshop on "Introduction to Pig" at Fifth Elephant, Bangalore, India, on 26th July, 2012. Fifth Elephant is a community-powered conference on the Big Data ecosystem.

As part of this workshop, I touched briefly on Hadoop, MapReduce and Hive, but as the title says, the focus was on Apache Pig. I also demoed a few use cases run with Java MapReduce, Hive and Pig, and gave a brief overview and demo of Twitter's Ambrose UI for visualizing Pig MapReduce jobs.

Here are the slides from my presentation. They give a basic understanding of:
  1. Big Data
  2. Basics of Hadoop and MapReduce
  3. Landscape of Hadoop ecosystem
  4. Introduction to Apache Pig
  5. Basics of Pig and Pig Latin
  6. Pig vs. Hadoop MR
  7. Pig vs. SQL and Pig vs. Hive
  8. Twitter Ambrose for visualizing Pig MR Jobs




I have also posted the same slides on Speaker Deck.
The code developed for the demos in this workshop can be found on GitHub.


Update on 5th April, 2015: After a fair bit of time here, I have moved on to a GitHub-hosted Octopress blog. Please find me at http://P7h.org henceforth for all new updates.