Copying files from Hadoop’s HDFS (Hadoop Distributed File System) to your local machine

To copy files from Hadoop’s HDFS (Hadoop Distributed File System) to your local machine, you can use the hadoop fs or hdfs dfs command, both of which provide a simple way to interact with HDFS. Here’s a step-by-step guide with examples:

Step 1: Open a terminal

Open a terminal window on a machine that has the Hadoop client installed and configured (for example, a cluster edge node, or your local machine with Hadoop set up). This is where you’ll run the HDFS commands.
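Before going further, you can confirm that the Hadoop client is on your PATH and see which release you are running:

hdfs version

If this prints a version string, the client is available and the commands below should work.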

Step 2: Use the hadoop fs or hdfs dfs Command

You can use either the hadoop fs or the hdfs dfs command to interact with HDFS. For HDFS the two behave identically (hadoop fs is the generic filesystem shell, while hdfs dfs targets HDFS specifically), so choose the one you prefer. Below, I’ll use hdfs dfs.
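For example, these two commands are interchangeable for listing an HDFS directory (using the same example user directory, /user/freshers_in, that appears throughout this guide):

hadoop fs -ls /user/freshers_in
hdfs dfs -ls /user/freshers_in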

Step 3: Copying files from HDFS to local

To copy files from HDFS to your local machine, use the -copyToLocal or -get subcommand with the HDFS source path and the local destination path.

Here’s the basic syntax:

hdfs dfs -copyToLocal <HDFS_PATH> <LOCAL_DESTINATION_PATH>

Or using -get:

hdfs dfs -get <HDFS_PATH> <LOCAL_DESTINATION_PATH>

<HDFS_PATH> is the path to the file or directory on HDFS that you want to copy.
<LOCAL_DESTINATION_PATH> is the local directory where you want to copy the file(s).
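Depending on your Hadoop version, both subcommands also accept options such as -f to overwrite an existing file at the destination; run hdfs dfs -help get on your cluster to see exactly what your release supports. A typical invocation might look like:

hdfs dfs -get -f <HDFS_PATH> <LOCAL_DESTINATION_PATH>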

Examples

Let’s look at a few examples:

To copy a file named myfile.txt from HDFS to your current local directory:

hdfs dfs -copyToLocal /user/freshers_in/myfile.txt .

The . at the end specifies the current directory as the destination.
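You can then verify the result with a normal local listing:

ls -l myfile.txt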

To copy a file named mydata.csv from HDFS to a specific local directory:

hdfs dfs -copyToLocal /user/freshers_in/data/mydata.csv /path/to/local/directory/

Replace /path/to/local/directory/ with the actual path to your desired local directory.
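If the destination directory does not exist yet, create it first with the standard mkdir command (reusing the placeholder path from the example above):

mkdir -p /path/to/local/directory/
hdfs dfs -copyToLocal /user/freshers_in/data/mydata.csv /path/to/local/directory/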

To copy an entire directory from HDFS to your local machine:

hdfs dfs -copyToLocal /user/freshers_in/mydirectory/ /path/to/local/directory/

This recursively copies the HDFS directory and everything in it. If the local destination directory already exists, mydirectory is created inside it, much like a local cp -r.
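The HDFS shell also understands glob patterns, which lets you copy only part of a directory. For example, assuming the hypothetical layout above contains CSV files, this copies just those files (the quotes keep your local shell from expanding the pattern before Hadoop sees it):

hdfs dfs -copyToLocal '/user/freshers_in/mydirectory/*.csv' /path/to/local/directory/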
