Big Data Hadoop Training Institute in Noida

Big Data Introduction:
What is Big Data
Evolution of Big Data
Need for Big Data Analytics
Benefits of Big Data
Big Data Challenges
Operational vs Analytical Big Data

YARN:
Introduction to YARN
Why YARN
ResourceManager
Classic MapReduce vs YARN
Advantages of YARN
YARN Architecture
NodeManager
ApplicationMaster
Application submission in YARN
ResourceManager components
YARN applications
Scheduling in YARN
NodeManager containers
Fair Scheduler
Capacity Scheduler
Fault tolerance

Hadoop Cluster:
Secondary NameNode
Master Nodes
NameNode
Client Nodes
Slaves
JobTracker
Hadoop configuration
Setting up a Hadoop cluster

Hadoop Fundamentals:
What is Hadoop
History of Hadoop
How does Hadoop work
Hadoop Architecture
Why Hadoop & Big Data
Hadoop Cluster introduction
Cluster Modes: Standalone, Pseudo-distributed, Fully distributed
HDFS Overview
Hadoop Ecosystem Components
Introduction to MapReduce
Hadoop in demand

HDFS:
Introduction to HDFS
HDFS Features
Blocks
Goals of HDFS
The NameNode & DataNode
Secondary NameNode
The JobTracker
HDFS Federation
The Process of a File Read
Data Replication
Rack Awareness
Configuring HDFS
HDFS Architecture
Fault tolerance
How does a File Write work
NameNode failure management
Access HDFS from Java
HDFS Web Interface

MapReduce:
What is MapReduce
Why MapReduce
How MapReduce works
Difference between Hadoop 1 & Hadoop 2
Identity mapper & reducer
Data flow in MapReduce
Input Splits
Relation Between Input Splits and HDFS Blocks
Flow of Job Submission in MapReduce
Job submission & Monitoring
MapReduce algorithms: Sorting, Searching, Indexing, TF-IDF

HDFS Command Reference:
Listing the contents of a directory
Displaying and printing disk usage
Displaying file contents
Moving files & directories
Copying files and directories

MapReduce Programming:
Hadoop data types
The Mapper Class
The Reducer Class
Shuffle Phase
How Combiner works
Sort Phase
Secondary Sort
Reduce Phase
The Job class
Job class constructor
JobContext interface
Combiner Class
Partitioner Task
Map method
Record Reader
Map Phase
Combiner Phase
Reducer Phase
Record Writer
Partitioners
Input Data
Map Tasks
Reduce Task
Compilation & Execution

Hadoop Ecosystem (Pig):
What is Apache Pig?
Why Apache Pig?
Pig features
Where should Pig be used
Pig Latin Statements
Where not to use Pig
The Pig Architecture
Pig components
Pig vs MapReduce
Pig vs SQL
Pig Installation
Pig Execution Modes & Mechanisms
Grunt Shell Commands
Pig vs Hive
Pig Latin Data Model
Pig Latin operators

Spark:
Introduction to Apache Spark
Features of Spark
Spark built on Hadoop
Components of Spark
Resilient Distributed Datasets
Data Sharing using Spark RDD
Iterative Operations on Spark RDD
Interactive Operations on Spark RDD
Spark shell
RDD transformations
Actions
Programming with RDD
Start Shell
Create RDD
Execute Transformations

Live project

Big Data Hadoop Course modules:
Big Data Introduction
YARN
Hadoop Cluster
Hadoop Fundamentals
HDFS
MapReduce
HDFS Command Reference
MapReduce Programming
Hadoop Ecosystem (Pig)
Spark
Live Project