1) Easy but not recommended way
Step 1: Stop all cluster services: $HADOOP_HOME/sbin/stop-all.sh
Step 2: Edit the $HADOOP_HOME/etc/hadoop/slaves file and delete the entry for the node to be removed from the cluster.
Step 3: Start all services: $HADOOP_HOME/sbin/start-all.sh
This can cause data loss if replicas of the blocks stored on the removed datanode are not present on any live datanode.
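The slaves-file edit in Step 2 can be sketched on a scratch copy like this; the hostnames dn1-dn3 and the file location are placeholders, not values from a real cluster:

```shell
# Sketch of Step 2 on a scratch copy of the slaves file.
# Hostnames dn1-dn3 are assumed placeholders.
printf 'dn1\ndn2\ndn3\n' > slaves      # sample slaves file
grep -v '^dn3$' slaves > slaves.tmp    # drop the node being removed (dn3)
mv slaves.tmp slaves
cat slaves                             # now lists only dn1 and dn2
```

On a real cluster the file would be edited in place under $HADOOP_HOME, with the services stopped first as described above.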
2) Safe way
Add the dfs.hosts.exclude property to hdfs-site.xml.
Add the mapred.hosts.exclude property to mapred-site.xml.
Both properties point to the path of an exclude file; this file lists the hostnames/IP addresses of the datanodes to be removed.
Add the IP addresses of the nodes to be decommissioned to the exclude file.
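A minimal sketch of the two properties; the exclude-file path /home/hadoop/excludes is a hypothetical example, not a required location:

```xml
<!-- In hdfs-site.xml -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hadoop/excludes</value> <!-- hypothetical path to the exclude file -->
</property>

<!-- In mapred-site.xml -->
<property>
  <name>mapred.hosts.exclude</name>
  <value>/home/hadoop/excludes</value>
</property>
```

The exclude file itself is plain text, one hostname or IP address per line.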
Refresh the NameNode so it re-reads the exclude file (no service restart is required):
# hdfs dfsadmin -refreshNodes
Check the cluster status; the node being decommissioned is reported as "Decommission in progress" and finally "Decommissioned":
# hdfs dfsadmin -report
Once decommissioning has completed, remove the nodes from the slaves file.