Posts

Showing posts from March, 2022

What is Changing in the Realms of Big Data

Today is an era of a tight handshake between business, IT, and yet another class of professionals: data scientists. Here are three vital reasons why companies should consider leveraging big data.
- Competitive advantage: The most important resource of any organization today is its data. What the organization does with that data will determine its fate in the market.
- Decision making: Decision making has shifted from the hands of the elite few to the empowered many. Good decisions play a significant role in furthering customer engagement, improving operating margins in retail, and cutting costs and other expenditures in the health sector.
- Value of data: The value of data continues to rise steeply. Because data is such an important resource, it is time to look at newer architectures, tools, and practices to leverage it.

Binning Method for Data Smoothing in Python

Binning Method

Binning is a technique for smoothing noisy data. The data is sorted first, and the sorted values are then distributed into a number of buckets, or bins. Because each value is smoothed using only the values in its own bin, that is, its neighborhood, binning performs local smoothing.

Smoothing can be accomplished in three ways:
- Smoothing by bin means: each value in a bin is replaced by the bin's mean value.
- Smoothing by bin medians: each value in a bin is replaced by the bin's median value.
- Smoothing by bin boundaries: the minimum and maximum values in a bin are taken as the bin boundaries, and each value in the bin is replaced by the nearest boundary value.

Example:
Sorted data for price (in dollars): 4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34

Smoothing by bin means (three bins of four values each, with the means rounded to the nearest dollar):
      - Bin 1: 9, 9, 9, 9
      - Bin 2: 23, 23, 23, 23
      - Bin 3: 29, 29, 29, 29
Smoothing by bin boundaries:
      - Bin 1
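The three smoothing variants described in this post can be sketched in a few lines of Python. This is a minimal illustration, not the post's own implementation. It assumes equal-frequency bins (the twelve prices split into three bins of four), and the function names are made up for this example. When a value is equally close to both boundaries, the sketch picks the lower one, which is also an assumption.

```python
from statistics import mean, median


def equal_frequency_bins(values, n_bins):
    """Sort the values and split them into n_bins bins of equal size."""
    data = sorted(values)
    if len(data) % n_bins != 0:
        raise ValueError("this sketch assumes len(values) is divisible by n_bins")
    size = len(data) // n_bins
    return [data[i:i + size] for i in range(0, len(data), size)]


def smooth_by_means(bins):
    """Replace every value in a bin with the (unrounded) mean of that bin."""
    return [[mean(b)] * len(b) for b in bins]


def smooth_by_medians(bins):
    """Replace every value in a bin with the median of that bin."""
    return [[median(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the nearest bin boundary (min or max of the bin)."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # each bin is already sorted
        # Assumption: a value equally far from both boundaries goes to the lower one.
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
bins = equal_frequency_bins(prices, 3)

print(bins)                        # [[4, 8, 9, 15], [21, 21, 24, 25], [26, 28, 29, 34]]
print(smooth_by_means(bins))
print(smooth_by_medians(bins))
print(smooth_by_boundaries(bins))  # [[4, 4, 4, 15], [21, 21, 25, 25], [26, 26, 26, 34]]
```

The unrounded bin means are 9, 22.75, and 29.25. Rounding them to the nearest whole number gives the 9, 23, and 29 shown in the example.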

Hadoop installation in Windows

Installing Hadoop 3.2.1 on Windows

1. Prerequisites
First, we need to make sure that the following prerequisites are installed:
- Java 8 runtime environment (JRE): Hadoop 3 requires a Java 8 installation. I prefer using the offline installer.
- Java 8 development kit (JDK)
- 7zip, to unzip the downloaded Hadoop binaries

I will create a folder on my local machine to store the downloaded Hadoop files: "Drive_Name:\Hadoop_evn"

2. Download Hadoop Binaries
The first step is to download the Hadoop binaries from the official website. The binary package size is about 342 MB. Alternatively, you can download the Hadoop binaries from an unofficial source and use those files directly.

After the download finishes, we should unpack the package using 7zip in two steps. First, we extract the hadoop-3.2.1.tar.gz archive, and then we unpack the extracted tar file.

3. Setting up Environment Variables
After installing Hadoop and its prerequisites, we should configure the envi
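As an alternative to the two-step 7zip unpacking in step 2, the same result can be sketched with Python's standard `tarfile` module, which decompresses the gzip layer and unpacks the tar archive in one pass. This is an illustrative sketch rather than part of the original walkthrough. The helper name `unpack_tar_gz` and the target folder name are hypothetical, and the archive is assumed to sit in the current working directory.

```python
import tarfile
from pathlib import Path


def unpack_tar_gz(archive_path, target_dir):
    """Decompress and unpack a .tar.gz archive into target_dir in one pass."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # "r:gz" opens the archive for reading with gzip decompression,
    # covering both the extract and unpack steps described above.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        archive.extractall(path=target)
    # Return the top-level names that were created, for a quick sanity check.
    return sorted(p.name for p in target.iterdir())


# Assumption: the downloaded archive is in the current directory.
archive = Path("hadoop-3.2.1.tar.gz")
if archive.exists():
    # "Hadoop_unpacked" is a hypothetical target folder name.
    print(unpack_tar_gz(archive, "Hadoop_unpacked"))
```

Only unpack archives obtained from a source you trust, because `extractall` writes whatever paths the archive contains.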