Hadoop File Management Tasks

Implement the following file management tasks in Hadoop:

a) Adding files and directories

b) Retrieving files

c) Deleting files

Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS using one of Hadoop's command line utilities.

Program:

The most common file management tasks in Hadoop include:

Adding files and directories to HDFS
Retrieving files from HDFS to local filesystem
Deleting files from HDFS

Hadoop file commands take the following form:

hadoop fs -cmd <args>

Where cmd is the specific file command and <args> is a variable number of arguments. The command cmd is usually named after the corresponding Unix equivalent. For example, the command for listing files is ls as in Unix.

a) Adding Files and Directories to HDFS

Creating Directory in HDFS

$ hadoop fs -mkdir foldername (syntax)

$ hadoop fs -mkdir cse

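In older Hadoop releases, the mkdir command automatically creates parent directories if they don't already exist. In Hadoop 2.x and later, pass the -p option (hadoop fs -mkdir -p) to get the same behavior. Now that we have a working directory, we can put a file into it.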

Adding File in HDFS

Create a text file called example.txt on your local filesystem. The Hadoop command put copies files from the local filesystem into HDFS.

Syntax

$ hadoop fs -put source destination

For example:

$ hadoop fs -put example.txt cse
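The two steps above (create a directory, then copy a file into it) can be combined into a small shell function. This is only a sketch: the function name upload_to_hdfs and the HADOOP_CMD variable are hypothetical names introduced here for illustration, and the -p flag assumes Hadoop 2.x or later.

```shell
# Hypothetical variable: which launcher to call; defaults to the hadoop CLI.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# upload_to_hdfs <local-file> <hdfs-dir>   (hypothetical helper name)
# Creates the HDFS directory if needed, then copies the local file into it.
upload_to_hdfs() {
  src="$1"
  dest_dir="$2"
  # -p (assumed Hadoop 2.x+) creates missing parent directories
  # and does not fail if the directory already exists
  "$HADOOP_CMD" fs -mkdir -p "$dest_dir" || return 1
  # -put copies from the local filesystem into HDFS
  "$HADOOP_CMD" fs -put "$src" "$dest_dir"
}
```

With the function defined, upload_to_hdfs example.txt cse would run the same two commands shown in this section.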

b) Retrieving files from HDFS

The Hadoop command get copies data from HDFS to the local filesystem.

Syntax

$ hadoop fs -get source destination

$ hadoop fs -get cse/example.txt Desktop (copies cse/example.txt from HDFS to the Desktop directory on the local filesystem)
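As a rough sketch of the retrieval step, the function below makes sure the local destination directory exists before calling get. The names fetch_from_hdfs and HADOOP_CMD are hypothetical and not part of Hadoop.

```shell
# Hypothetical variable: which launcher to call; defaults to the hadoop CLI.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# fetch_from_hdfs <hdfs-path> <local-dir>   (hypothetical helper name)
fetch_from_hdfs() {
  hdfs_path="$1"
  local_dir="$2"
  # Create the local destination directory if it is missing
  mkdir -p "$local_dir" || return 1
  # -get copies from HDFS to the local filesystem
  "$HADOOP_CMD" fs -get "$hdfs_path" "$local_dir"
}
```

For example, fetch_from_hdfs cse/example.txt Desktop would copy the file into a local Desktop directory, creating that directory first if needed.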

c) Deleting Files

The Hadoop command rm removes files from HDFS.

$ hadoop fs -rm cse/example.txt (removes the file from HDFS)
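Deleting a path that does not exist makes rm report an error, so one possible approach is to check for the file first with the -test -e option. The sketch below assumes a Hadoop release that provides hadoop fs -test; remove_from_hdfs and HADOOP_CMD are hypothetical names.

```shell
# Hypothetical variable: which launcher to call; defaults to the hadoop CLI.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# remove_from_hdfs <hdfs-file>   (hypothetical helper name)
remove_from_hdfs() {
  target="$1"
  # -test -e exits with status 0 when the path exists in HDFS
  if "$HADOOP_CMD" fs -test -e "$target"; then
    # Plain -rm removes files; removing a directory needs -rm -r
    "$HADOOP_CMD" fs -rm "$target"
  else
    echo "skip: $target not found in HDFS"
  fi
}
```

Calling remove_from_hdfs cse/example.txt deletes the file if it is present and prints a skip message otherwise.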

Viva Questions

What is the command to list all the files in an HDFS directory?

Ans: $ hadoop fs -ls cse (where cse is the directory)

What is the command to copy data from the local filesystem to HDFS using copyFromLocal?

Ans: $ hadoop fs -copyFromLocal <source> <destination> (similar to the put command, except that the source is restricted to a local file reference)

What is the command to copy data from HDFS to the local filesystem using copyToLocal?

Ans: $ hadoop fs -copyToLocal <source> <destination> (similar to the get command, except that the destination is restricted to a local file reference)

What is the command to display the contents of a file in HDFS?

Ans: $ hadoop fs -cat cse/A.java

What is the command to display the last 1 KB of a particular file in HDFS?

Ans: $ hadoop fs -tail <filename>

What is the command used to change the replication factor of files or directories in HDFS?

Ans: $ hadoop fs -setrep -w <value> <filename or directory> (the -w flag makes the command wait until replication has completed)

What is the command to show disk usage in bytes for all files and directories under a path in HDFS?

Ans: $ hadoop fs -du <path>

What is the command used to display free space in HDFS?

Ans: $ hadoop fs -df -h

What is the command used to create a new zero-length file at a path in HDFS, with the current time as its timestamp?

Ans: $ hadoop fs -touchz <path>

What is the command that takes a source file from HDFS and outputs it in text format?

Ans: $ hadoop fs -text <source>

What is the command used to display information about a path?

Ans: $ hadoop fs -stat <path>

What is the command used to apply permissions to a file in HDFS?

Ans: $ hadoop fs -chmod <value> <file or directory> (e.g., value=777, where owner, group, and others can all read, write, and execute; read=4, write=2, execute=1)
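The octal value is built by adding read=4, write=2, and execute=1 separately for the owner, the group, and others. The short sketch below works through that arithmetic for one example mode; perm_digit is a hypothetical helper name, and the script only computes the value without calling Hadoop.

```shell
# perm_digit <r|-> <w|-> <x|->   (hypothetical helper name)
# Adds read=4, write=2, execute=1 for one class of user.
perm_digit() {
  digit=0
  if [ "$1" = "r" ]; then digit=$((digit + 4)); fi
  if [ "$2" = "w" ]; then digit=$((digit + 2)); fi
  if [ "$3" = "x" ]; then digit=$((digit + 1)); fi
  echo "$digit"
}

# Owner: rwx (4+2+1), group: r-x (4+1), others: r-- (4)
mode="$(perm_digit r w x)$(perm_digit r - x)$(perm_digit r - -)"
echo "$mode"   # prints 754
```

The resulting value could then be passed to the command above, for example hadoop fs -chmod 754 cse/example.txt.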

What is the command used to count the number of directories, files, and bytes under a path?

Ans: $ hadoop fs -count <path>

Sample Output

1 3 1050120 cse (1 is the number of directories, 3 is the number of files, 1050120 is the number of bytes, and cse is the directory)
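To make the four columns of the sample output easier to read, they can be labeled with awk. This is a small sketch that assumes the default column order shown above (directories, files, bytes, path); parse_count is a hypothetical helper name.

```shell
# parse_count   (hypothetical helper name)
# Reads `hadoop fs -count` output on stdin and labels each column.
# Assumed column order: directory count, file count, bytes, path.
parse_count() {
  awk '{ printf "dirs=%s files=%s bytes=%s path=%s\n", $1, $2, $3, $4 }'
}

# Demonstrated on the sample line from this section
echo "1 3 1050120 cse" | parse_count
# prints: dirs=1 files=3 bytes=1050120 path=cse
```

On a running cluster the same function could be fed directly, as in hadoop fs -count cse | parse_count.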

What is the command used to get the usage of a particular command?

Ans: $ hadoop fs -usage <commandname>

What is the command used to empty the trash?

Ans: $ hadoop fs -expunge

