Implement the following file management tasks in Hadoop:
a) Adding files and directories
b) Retrieving files
c) Deleting files
Hint: A typical Hadoop workflow creates data files (such as log
files) elsewhere and copies them into HDFS using one of the above command line
utilities.
Program:
The most common file management tasks in Hadoop include:
- Adding files and directories to HDFS
- Retrieving files from HDFS to the local filesystem
- Deleting files from HDFS
Hadoop file commands take the following form:
hadoop fs -cmd <args>
where cmd is the specific file command and <args> is a variable number of arguments. The command cmd is usually named after the corresponding Unix equivalent. For example, the command for listing files is ls, as in Unix.
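The general form can be tried without a cluster. The following is a minimal sketch, assuming a POSIX shell. The wrapper hfs and its DRY_RUN variable are hypothetical helpers, not part of Hadoop: with DRY_RUN left at 1 the wrapper only prints the hadoop fs command it would run.

```shell
#!/bin/sh
# hfs is a hypothetical wrapper, not a Hadoop tool.
# DRY_RUN=1 (the default here) prints the command instead of running it,
# so the sketch is safe on a machine without Hadoop installed.
DRY_RUN="${DRY_RUN:-1}"

hfs() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hadoop fs $*"
  else
    hadoop fs "$@"
  fi
}

# cmd = ls, <args> = one path: list the contents of the cse directory
hfs -ls cse

# cmd = mkdir, named after the Unix command of the same name
hfs -mkdir cse
```

Setting DRY_RUN=0 on a machine with Hadoop on the PATH would run the same commands for real.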
a) Adding Files and Directories to HDFS
Creating a Directory in HDFS
Syntax
$ hadoop fs -mkdir foldername
Example
$ hadoop fs -mkdir cse
In older Hadoop releases, the mkdir command automatically creates parent directories if they don’t already exist; from Hadoop 2 onward it does so only when the -p option is given. Now that we have a working directory, we can put a file into it.
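Creating a nested directory can be sketched as follows, assuming a Hadoop release whose mkdir accepts -p and whose ls accepts -R (Hadoop 2 or later). The path cse/lab1/input and the dry-run wrapper hfs are illustrative assumptions, not part of the original exercise.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the command by default (DRY_RUN=1);
# set DRY_RUN=0 on a machine with Hadoop to actually run it.
DRY_RUN="${DRY_RUN:-1}"

hfs() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hadoop fs $*"
  else
    hadoop fs "$@"
  fi
}

# Create a nested directory; -p also creates cse and cse/lab1 if missing.
# cse/lab1/input is an illustrative path.
hfs -mkdir -p cse/lab1/input

# List the tree recursively to confirm that the parents were created
hfs -ls -R cse
```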
Adding a File to HDFS
Create a text file called example.txt on your local filesystem. The Hadoop command put copies files from the local filesystem into HDFS.
Syntax
$ hadoop fs -put source destination
For example, to copy example.txt into the cse directory:
$ hadoop fs -put example.txt cse
b) Retrieving Files from HDFS
The Hadoop command get copies data from HDFS to the local filesystem.
Syntax
$ hadoop fs -get source destination
$ hadoop fs -get cse/example.txt Desktop
(copies data from HDFS (cse/example.txt) to the local filesystem (Desktop))
c) Deleting Files
The Hadoop command rm removes files from HDFS.
$ hadoop fs -rm cse/example.txt
(removes the file from HDFS)
Viva Questions
Ans: $ hadoop fs -ls cse (where cse is a directory)
Ans: $ hadoop fs -copyFromLocal <source> <destination> (similar to the put command, except that the source is restricted to a local file reference)
Ans: $ hadoop fs -copyToLocal <source> <destination> (similar to the get command, except that the destination is restricted to a local file reference)
Ans: $ hadoop fs -cat cse/A.java
Ans: $ hadoop fs -tail <filename>
Ans: $ hadoop fs -setrep -w <value> <filename or directory> (the -w flag requests that the command wait for replication to complete)
Ans: $ hadoop fs -du <path>
Ans: $ hadoop fs -df -h
Ans: $ hadoop fs -touchz <path>
Ans: $ hadoop fs -text <source>
Ans: $ hadoop fs -stat <path>
Ans: $ hadoop fs -chmod <value> <file or directory> (e.g. value=777 lets owner, group, and others read, write, and execute; read=4, write=2, execute=1)
Ans: $ hadoop fs -count <path>
Sample output: 1 3 1050120 cse (1 is the number of directories, 3 is the number of files, 1050120 is the number of bytes, and cse is the directory)
Ans: $ hadoop fs -usage <commandname>
15. What is the command used to empty the trash?
Ans: $ hadoop fs -expunge
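The three tasks from sections a) to c) can be chained into one round trip. The following is a minimal sketch, assuming a POSIX shell and a Hadoop release whose mkdir accepts -p. The helpers hfs and round_trip and the DRY_RUN variable are hypothetical; with DRY_RUN left at 1 the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the command by default (DRY_RUN=1);
# set DRY_RUN=0 on a machine with Hadoop to actually run it.
DRY_RUN="${DRY_RUN:-1}"

hfs() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hadoop fs $*"
  else
    hadoop fs "$@"
  fi
}

# round_trip LOCAL_FILE HDFS_DIR LOCAL_DEST (hypothetical helper)
# Adds a file to HDFS, retrieves it, then deletes the HDFS copy.
round_trip() {
  rt_file=$1
  rt_dir=$2
  rt_dest=$3
  rt_name=$(basename "$rt_file")

  # a) add: create the directory (-p tolerates an existing one), copy the file in
  hfs -mkdir -p "$rt_dir" || return 1
  hfs -put "$rt_file" "$rt_dir" || return 1
  hfs -ls "$rt_dir" || return 1

  # b) retrieve: copy the file back to the local filesystem
  #    (in a real run, LOCAL_DEST must already exist locally)
  hfs -get "$rt_dir/$rt_name" "$rt_dest" || return 1

  # c) delete: remove the copy stored in HDFS
  hfs -rm "$rt_dir/$rt_name"
}

round_trip example.txt cse Desktop
```

Each step stops the round trip if it fails, so a failed put never leads to a get or rm on a file that is not there.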