
Introduction to Hadoop


Popular posts from this blog

Hadoop file Management Tasks

Implement the following file management tasks in Hadoop:
a) Adding files and directories
b) Retrieving files
c) Deleting files

Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS using one of the Hadoop command line utilities.

Program: The most common file management tasks in Hadoop include:
- Adding files and directories to HDFS
- Retrieving files from HDFS to the local filesystem
- Deleting files from HDFS

Hadoop file commands take the following form:

    hadoop fs -cmd <args>

where cmd is the specific file command and <args> is a variable number of arguments. The command cmd is usually named after the corresponding Unix equivalent. For example, the command for listing files is ls, as in Unix.

a) Adding Files and Directories to HDFS

Creating a directory in HDFS:

    $ hadoop fs -mkdir foldername (syntax)
    $ ha...
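As a rough illustration of the `hadoop fs -cmd <args>` form, the following Python sketch builds the command lines for the three tasks (adding, retrieving, deleting). The helper names `hadoop_fs_command` and `run_hadoop_fs` and all paths are hypothetical and chosen only for this example; `-mkdir`, `-put`, `-get`, and `-rm` are standard `hadoop fs` subcommands. Actually running the commands assumes a working Hadoop installation with `hadoop` on the PATH.

```python
import subprocess


def hadoop_fs_command(cmd, *args):
    """Build the argument list for `hadoop fs -<cmd> <args>` (hypothetical helper)."""
    return ["hadoop", "fs", f"-{cmd}", *args]


def run_hadoop_fs(cmd, *args):
    """Run the command; this only works where Hadoop is installed and on PATH."""
    return subprocess.run(hadoop_fs_command(cmd, *args), check=True)


# All paths below are made up for illustration.
add_dir = hadoop_fs_command("mkdir", "/user/demo/logs")                 # add a directory
add_file = hadoop_fs_command("put", "access.log", "/user/demo/logs")    # copy local file into HDFS
retrieve = hadoop_fs_command("get", "/user/demo/logs/access.log", "access_copy.log")  # HDFS to local
delete = hadoop_fs_command("rm", "/user/demo/logs/access.log")          # delete a file from HDFS

# Print the command lines instead of executing them, so the sketch runs anywhere.
for command in (add_dir, add_file, retrieve, delete):
    print(" ".join(command))
```

Building the argument list separately from executing it keeps the sketch runnable on a machine without Hadoop; on a cluster node, `run_hadoop_fs("mkdir", "/user/demo/logs")` would execute the first command.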

How to Install Parrot Operating System in Virtual Box using OVA

Step by Step Process of Parrot OS Installation

What is Parrot OS

Parrot is a free and open-source Linux system based on Debian that is popular among security researchers, security experts, developers, and privacy-conscious users. It comes with a fully portable arsenal of cyber security and digital forensics tools. It also includes everything you'll need to build your own apps and protect your online privacy. Parrot is offered in Home and Security Editions, as well as a virtual machine and a Docker image, featuring the KDE and MATE desktop environments.

Features of Parrot OS

The following are some of the features of Parrot OS that set it apart from other Debian distributions: Development, forensics, and anonymity applications come pre-installed, including Tor, Tor chat, I2P, Anonsurf, and Zulu Crypt, which are popular among developers, security researchers, and privacy-conscious individuals. It has a separate "Forensics Mode" that does not mount any of the system's hard...

Binning Method by Data smoothing in python

Binning Method

Binning is a technique for smoothing noisy data. The data is sorted first, and the sorted values are then distributed into a number of buckets, or bins. Binning methods perform local smoothing because they consult the neighborhood of each value, that is, the other values in the same bin.

Smoothing can be accomplished in three ways:
- Smoothing by bin means: each value in a bin is replaced by the bin's mean value.
- Smoothing by bin medians: each value in a bin is replaced by the bin's median value.
- Smoothing by bin boundaries: the minimum and maximum values in a bin are taken as the bin boundaries, and each value is replaced by the nearest boundary value.

Example: Sorted data for price (in dollars): 4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34

Smoothing by bin means (the values shown are consistent with equal-frequency bins of four values each, with means rounded to the nearest integer):
      - Bin 1: 9, 9, 9, 9
      - Bin 2: 23, 23, 23, 23
  ...
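The three smoothing strategies can be sketched in plain Python as follows. This is a minimal illustration, not the post's own program: the function names are invented for the example, the bins are assumed to be equal-frequency bins of four values, bin means are rounded to the nearest integer to match the example figures, and ties in boundary smoothing are assumed to go to the lower boundary.

```python
from statistics import mean, median


def equal_frequency_bins(sorted_values, bin_size):
    """Split already-sorted values into consecutive bins of `bin_size` items."""
    return [sorted_values[i:i + bin_size]
            for i in range(0, len(sorted_values), bin_size)]


def smooth_by_means(bins):
    """Replace every value with its bin's mean (rounded; an assumption here)."""
    return [[round(mean(b))] * len(b) for b in bins]


def smooth_by_medians(bins):
    """Replace every value with its bin's median."""
    return [[median(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the nearer of its bin's min and max."""
    smoothed = []
    for b in bins:
        low, high = min(b), max(b)
        # Ties go to the lower boundary (an arbitrary choice for this sketch).
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]  # sorted price data
bins = equal_frequency_bins(prices, 4)

print("Bins:      ", bins)
print("Means:     ", smooth_by_means(bins))
print("Medians:   ", smooth_by_medians(bins))
print("Boundaries:", smooth_by_boundaries(bins))
```

For the first two bins, the mean-smoothed output reproduces the values 9 and 23 from the example above.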