Apache Pig interview questions

46. I have a relation R. How can I get the top 10 tuples from R?
The TOP() function returns the top N tuples from a bag of tuples or a relation. It takes three parameters: N, the column whose values are to be compared (given as a zero-based index), and the relation R. TOP() is usually applied to the bag produced by grouping R.
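In Pig Latin, this is typically written along the lines of top10 = FOREACH (GROUP R ALL) GENERATE FLATTEN(TOP(10, 1, R)); assuming the values to compare are in field 1. The sketch below is not Pig code. It is a plain Python model of what TOP computes, using a hypothetical sample relation R and a hypothetical helper named top.

```python
import heapq

def top(n, column, bag):
    """Return the n tuples of `bag` with the largest values in field `column` (0-based)."""
    return heapq.nlargest(n, bag, key=lambda t: t[column])

# Hypothetical sample relation R with fields (name, score).
R = [("a", 7), ("b", 42), ("c", 13), ("d", 99), ("e", 1)]

top3 = top(3, 1, R)
print(top3)  # [('d', 99), ('b', 42), ('c', 13)]
```

The Python helper returns its result sorted in descending order. The bag returned by Pig's TOP should not be assumed to be sorted, so add an ORDER step if the order matters.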

47. What are the commonalities between Pig and Hive?
Both HiveQL and Pig Latin convert their commands into MapReduce jobs.
Neither is suited to online transaction processing (OLTP) or other interactive workloads, because queries run as batch jobs and low-latency execution is difficult.

48. What are the different types of Java UDFs supported by Apache Pig?
Algebraic, Eval and Filter functions are the types of UDFs supported in Pig.
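Of these, the Algebraic type is the least obvious. An algebraic function is one that can be computed in three stages (initial, intermediate and final), which lets Pig run part of the work in the combiner. In Java this is expressed through the Algebraic interface. The sketch below only models the idea in plain Python for a COUNT-style function. The function names and the way the input is split are assumptions made for illustration, not Pig's API.

```python
def initial(record):
    """Map side: turn one input tuple into a partial count."""
    return 1

def intermed(partials):
    """Combiner: merge partial counts. May run zero or more times."""
    return sum(partials)

def final(partials):
    """Reduce side: merge the remaining partial counts into the result."""
    return sum(partials)

def algebraic_count(splits):
    """Model a count over several input splits, combining each split locally first."""
    combined = [intermed([initial(rec) for rec in split]) for split in splits]
    return final(combined)

# Hypothetical input: three splits of single-field tuples.
splits = [[("a",), ("b",)], [("c",)], []]
print(algebraic_count(splits))  # 3
```

Because the partial counts can be merged in any grouping, only one small number per split has to cross the network instead of every tuple.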

49. What is a UDF in Pig?
If the built-in operators do not provide the functionality that is needed, programmers can implement it themselves by writing user-defined functions (UDFs) in other programming languages such as Java, Python or Ruby.
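As a minimal sketch, here is what a Python UDF might look like. The file name udfs.py, the alias myudfs and the function to_upper are hypothetical. The outputSchema decorator tells Pig the schema of the return value. The fallback definition is only a stand-in so that the file can also be imported and tried outside Pig.

```python
try:
    # Provided by Pig for CPython (streaming) UDFs; assumed unavailable elsewhere.
    from pig_util import outputSchema
except ImportError:
    def outputSchema(schema):
        """Stand-in decorator: record the schema string on the function."""
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap

@outputSchema("upper_name:chararray")
def to_upper(name):
    """Return the upper-case form of a chararray, passing nulls through."""
    if name is None:
        return None
    return name.upper()

print(to_upper("pig"))
```

In a Pig script, such a file would typically be registered with something like REGISTER 'udfs.py' USING jython AS myudfs; and then called as myudfs.to_upper(name) inside a FOREACH ... GENERATE statement.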

50. What are the components of Pig Execution Environment?
Pig Scripts: Pig scripts are written in Pig Latin using built-in operators, can embed UDFs, and are submitted to the Apache Pig execution environment.
Parser: The parser checks the syntax of the script and performs type checking. Its output is a DAG (directed acyclic graph) that represents the Pig Latin statements as logical operators and the data flows between them.
Optimizer: The optimizer performs optimizations such as splitting, merging, transforming and reordering operators. It gives Apache Pig its automatic optimization feature, and its main aim is to reduce the amount of data in the pipeline.
Compiler: The compiler automatically converts the optimized logical plan into a series of MapReduce jobs.
Execution Engine: Finally, the MapReduce jobs are submitted to the execution engine, which runs them and produces the required result.
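To illustrate what "reducing the amount of data in the pipeline" can mean, here is a toy sketch in Python. It is not Pig's optimizer. The plan representation, the push_filters_first rule and the sample data are all assumptions made for illustration. The rule moves a filter ahead of a sort, which is safe here only because filtering and sorting give the same result in either order.

```python
def run(plan, rows):
    """Apply each (name, op) step in order; record how many rows each step receives."""
    rows_in = []
    for name, op in plan:
        rows_in.append((name, len(rows)))
        rows = op(rows)
    return rows, rows_in

def push_filters_first(plan):
    """Toy rewrite rule: move FILTER steps ahead of the other steps."""
    filters = [step for step in plan if step[0] == "FILTER"]
    others = [step for step in plan if step[0] != "FILTER"]
    return filters + others

# Hypothetical logical plan: sort everything, then keep values greater than 2.
plan = [
    ("ORDER", lambda rows: sorted(rows)),
    ("FILTER", lambda rows: [r for r in rows if r > 2]),
]
data = [5, 1, 3, 2, 4]

result, cost = run(plan, data)
opt_result, opt_cost = run(push_filters_first(plan), data)

print(result == opt_result)  # True: both plans give the same answer
print(cost)                  # [('ORDER', 5), ('FILTER', 5)]
print(opt_cost)              # [('FILTER', 5), ('ORDER', 3)]
```

After the rewrite the sort receives three rows instead of five. On a real cluster the saving would show up as less data being shuffled between the map and reduce phases.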

Author: user
