Writing to HDFS from Java, getting “could only be replicated to 0 nodes instead of minReplication”









I’ve downloaded and started up Cloudera's Hadoop Demo VM for CDH4 (running Hadoop 2.0.0). I’m trying to write a Java program that will run from my Windows 7 machine (the same machine that hosts the VM). I have a sample program like:



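import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteTest {
    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            conf.addResource("config.xml"); // only sets fs.default.name
            FileSystem fs = FileSystem.get(conf);
            // Create (or overwrite) the file and write a short test string
            FSDataOutputStream fdos = fs.create(new Path("/testing/file01.txt"), true);
            fdos.writeBytes("Test text for the txt file");
            fdos.flush();
            fdos.close();
            fs.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}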





My config.xml file only has one property defined: fs.default.name=hdfs://CDH4_IP:8020.



When I run it I’m getting the following exception:



org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /testing/file01.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
at org.apache.hadoop.ipc.Client.call(Client.java:1160)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy9.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:290)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1150)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1003)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)


I’ve looked around the internet and it seems that this happens when disk space is low, but that’s not the case for me. When I run "hdfs dfsadmin -report" I get the following:



Configured Capacity: 25197727744 (23.47 GB)
Present Capacity: 21771988992 (20.28 GB)
DFS Remaining: 21770715136 (20.28 GB)
DFS Used: 1273856 (1.21 MB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (localhost.localdomain)
Hostname: localhost.localdomain
Decommission Status : Normal
Configured Capacity: 25197727744 (23.47 GB)
DFS Used: 1273856 (1.21 MB)
Non DFS Used: 3425738752 (3.19 GB)
DFS Remaining: 21770715136 (20.28 GB)
DFS Used%: 0.01%
DFS Remaining%: 86.4%
Last contact: Fri Jan 11 17:30:56 EST 2013


I can also run this code just fine from within the VM. I’m not sure what the problem is or how to fix it. This is my first time using Hadoop, so I’m probably missing something basic. Any ideas?



Update



The only thing I see in the logs is an exception similar to the one I get on the client:



java.io.IOException: File /testing/file01.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)


I tried changing the permissions on the data directory (/var/lib/hadoop-hdfs/cache/hdfs/dfs/data) and that didn't fix it (I went so far as giving full access to everyone).



When I browse HDFS via the Hue web app, I can see that the folder structure was created and that the file exists, but it is empty. I also tried putting the file under the default user directory by using



FSDataOutputStream fdos=fs.create(new Path("testing/file04.txt"), true); 


instead of



FSDataOutputStream fdos=fs.create(new Path("/testing/file04.txt"), true);


This makes the file path become "/user/dharris/testing/file04.txt" ('dharris' is my Windows user), but that gave me the same kind of error.










  • Check the permissions of your data directory (on the local disk). Also check the logs from the data node.
    – Chris White
    Jan 12 '13 at 0:13










  • Thanks Chris, I tried your suggestions but still no luck. I've added more info to my question based on what you said.
    – David Harris
    Jan 14 '13 at 16:48










  • Have you solved that?
    – Denis
    Jun 27 '14 at 14:02










  • I never did solve it. This was on a VM I was using for my own personal learning, so I ended up blowing it away and starting from scratch. Sorry I don't have more for you.
    – David Harris
    Jun 30 '14 at 15:07










  • It seems you are connecting to a pseudo-distributed cluster from a remote machine. If so, replace 127.0.0.1 with the machine's IP address in all Hadoop configuration files and try again.
    – Kumar
    May 10 '16 at 3:46














java hadoop hdfs














edited Dec 22 '17 at 14:56 by Виталий Олегович
asked Jan 11 '13 at 23:43 by David Harris























11 Answers






























I had the same problem.

In my case, the key to the problem was the following part of the error message:

There are 1 datanode(s) running and 1 node(s) are excluded in this operation.



It means that your HDFS client couldn't connect to your datanode on port 50010.
Because you connected to the HDFS namenode, you could get the datanode's status. But your HDFS client then failed to connect to the datanode itself.



(In HDFS, a namenode manages the file directories and the datanodes. When an HDFS client connects to a namenode, it looks up the target file path and the address of the datanode that holds the data. The HDFS client then communicates with that datanode directly. You can check those datanode addresses using netstat, because the HDFS client will try to reach the datanodes at the addresses reported by the namenode.)



I solved that problem by:



  1. opening 50010 port in a firewall.

  2. adding propertiy "dfs.client.use.datanode.hostname", "true"

  3. adding hostname to hostfile in my client PC.

I'm sorry for my poor English skill.
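Before changing any configuration, it can help to confirm from the client machine whether the datanode port is reachable at all. The following is a minimal sketch using only the JDK, not Hadoop; the class name DataNodePortCheck, the method canConnect and the 3-second timeout are assumptions made for illustration, and 50010 is simply the port named in the answer above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class DataNodePortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or the host name did not resolve
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: java DataNodePortCheck <datanode-host> [port]");
            return;
        }
        // 50010 is the datanode port mentioned in the answer; pass another port if yours differs
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50010;
        System.out.println(args[0] + ":" + port + " reachable: " + canConnect(args[0], port, 3000));
    }
}
```

If this reports false for the datanode but true for the namenode port, the symptom matches the situation described above: the namenode answers, but the client cannot open a connection to the datanode.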



































    up vote
    2
    down vote













Go to the Linux VM and check its hostname and IP address (use the ifconfig command).
Then, in the Linux VM, edit the /etc/hosts file and add a line of the form

IPADDRESS (space) hostname

Example:
192.168.110.27 clouderavm

Then change all your Hadoop configuration files, such as

core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml

replacing localhost, localhost.localdomain, or 0.0.0.0 with your hostname.

Then restart Cloudera Manager.

On the Windows machine, edit C:\Windows\System32\Drivers\etc\hosts and add one line at the end with your VM's IP address and hostname (the same as in the /etc/hosts file in the VM):

VMIPADDRESS VMHOSTNAME

Example:
192.168.110.27 clouderavm

Then check again; it should work. For detailed configuration, see the following YouTube video:

https://www.youtube.com/watch?v=fSGpYHjGIRY
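To verify that the hosts entry described above is actually in place, the file contents can be checked programmatically. This is a rough sketch that works on hosts-file lines passed in as strings; the class name HostsFileLookup and the method findAddress are hypothetical, and the sample data reuses the example address from the answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class HostsFileLookup {
    /** Returns the IP address mapped to hostname in the given hosts-file lines, if any. */
    static Optional<String> findAddress(List<String> lines, String hostname) {
        for (String raw : lines) {
            // Drop any trailing comment and surrounding whitespace
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            // Entry format: IP address, whitespace, then one or more names
            String[] fields = line.split("\\s+");
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    return Optional.of(fields[0]);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "127.0.0.1 localhost localhost.localdomain",
                "# Hadoop VM",
                "192.168.110.27 clouderavm");
        System.out.println(findAddress(sample, "clouderavm").orElse("no entry"));
    }
}
```

On a real machine the lines could be loaded with Files.readAllLines from the hosts file path given in the answer; the sample list here only keeps the sketch self-contained.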

































      up vote
      1
      down vote













Add the following property to hdfs-site.xml:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

and also add this file as a resource in your program:

conf.addResource("hdfs-site.xml");

Stop Hadoop:

stop-all.sh

then start it again:

start-all.sh
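To double-check which value a given site file actually contains, the property can be read back from the XML. The sketch below uses only the JDK's XML parser rather than Hadoop's own Configuration class, so it ignores Hadoop's override and default rules and simply returns the first matching property; the class name SiteXmlProperty and the method read are hypothetical.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SiteXmlProperty {
    /** Returns the value of the first <property> whose <name> equals the given name. */
    static Optional<String> read(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    return Optional.of(values.item(0).getTextContent().trim());
                }
            }
            return Optional.empty();
        } catch (Exception e) {
            // Malformed XML or parser setup failure
            throw new IllegalArgumentException("not a readable site file", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>dfs.replication</name><value>1</value></property>"
                + "</configuration>";
        System.out.println(read(xml, "dfs.replication").orElse("not set"));
    }
}
```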
































        up vote
        1
        down vote













I ran into a similar issue and have two pieces of information that may help you.

1. The first thing I realized is that I was using an SSH tunnel to access the name node. When the client code tried to access the data node, it could not find it, because the tunnel somehow interfered with the communication. I then ran the client on the same box as the Hadoop name node, and that solved the problem. In short, a non-standard network configuration prevented Hadoop from finding the data node.

2. The reason I used an SSH tunnel was that I couldn't access the name node remotely. I thought this was due to a port restriction set by the admin, so I used the tunnel to bypass it. But it turned out to be a misconfiguration of Hadoop.

In core-site.xml, after I changed

<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>

to

<value>hdfs://host_name:9000</value>

I no longer needed the SSH tunnel and could access HDFS remotely.
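A quick way to spot this kind of misconfiguration is to check whether the filesystem URI points at a loopback name or address, which only works from the cluster machine itself. The following is a small sketch using java.net.URI; the class name DefaultFsCheck and the method pointsAtLoopback are hypothetical, and the list of loopback names is an assumption rather than anything Hadoop defines.

```java
import java.net.URI;

class DefaultFsCheck {
    /** True if the URI's host is a loopback name or a 127.x.x.x address. */
    static boolean pointsAtLoopback(String defaultFs) {
        String host = URI.create(defaultFs).getHost();
        if (host == null) {
            return false; // URI has no host component that java.net.URI could parse
        }
        return host.equalsIgnoreCase("localhost")
                || host.equalsIgnoreCase("localhost.localdomain")
                || host.matches("127\\.\\d+\\.\\d+\\.\\d+");
    }

    public static void main(String[] args) {
        System.out.println(pointsAtLoopback("hdfs://localhost:9000"));
        System.out.println(pointsAtLoopback("hdfs://clouderavm:9000"));
    }
}
```

Note that java.net.URI is strict about host syntax, so a placeholder such as host_name (with an underscore) may come back with no host at all; the sketch treats that case as not loopback.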

































          up vote
          1
          down vote













Since I found many questions like this one while searching for a fix to the exact same issue, I thought I would share what finally worked for me. I found this forum post on Hortonworks: https://community.hortonworks.com/questions/16837/cannot-copy-from-local-machine-to-vm-datanode-via.html

The answer was truly understanding what calling new Configuration() means and setting the correct parameters as I needed them. In my case it was exactly the one mentioned in that post. So my working code looks like this:

// HadoopProperties, pd, processWLR and log are defined elsewhere in my project
try {
    Configuration config = new Configuration();
    // Have the client connect to datanodes by hostname
    config.set("dfs.client.use.datanode.hostname", "true");
    Path pdFile = new Path("stgicp-" + pd);
    FileSystem dFS = FileSystem.get(
            new URI("hdfs://" + HadoopProperties.HIVE_HOST + ":" + HadoopProperties.HDFS_DEFAULT_PORT),
            config, HadoopProperties.HIVE_DEFAULT_USER);
    // Remove any leftover file from a previous run
    if (dFS.exists(pdFile)) {
        dFS.delete(pdFile, false);
    }

    FSDataOutputStream outStream = dFS.create(pdFile);
    for (String sjWLR : processWLR.get(pd)) {
        outStream.writeBytes(sjWLR);
    }
    outStream.flush();
    outStream.close();

    // The file is deleted again once it has been written
    dFS.delete(pdFile, false);
    dFS.close();
} catch (IOException | URISyntaxException | InterruptedException e) {
    log.error("WLR file processing error: " + e.getMessage());
}

































            up vote
            0
            down vote













In the Hadoop configuration, the default replication factor is 3. Check it and change it according to your requirements.

































              up vote
              0
              down vote













You can try deleting the data (dfs/data) folder manually and formatting the namenode. You can then start Hadoop.

































                up vote
                0
                down vote













From the error message, the replication factor seems to be fine, i.e. 1.
It seems the datanode is not functioning properly or has permission issues.
Check the permissions, and check the status of the datanode as the user you are running Hadoop as.

































                  up vote
                  0
                  down vote













I had a similar problem. In my case I just emptied the following folder: ${hadoop.tmp.dir}/nm-local-dir/usercache/hdfs_user/appcache/

































                    up vote
                    0
                    down vote













It appears to be some issue with the FS.
Either the parameters in core-site.xml do not match the file it is trying to read

OR

there is some mismatch in the path (I see a Windows reference).

You can use the Cygwin tool to set up the path and place it where the datanodes and temp file locations are; that should do the trick.
Location: $/bin/cygpath.exe

P.S. In my opinion, replication does NOT seem to be the primary issue here.




































                      up vote
                      0
                      down vote













Here is how I create files in HDFS:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

FileSystem hdfs = FileSystem.get(context.getConfiguration());
Path outFile = new Path("/path to store the output file");

// Start from an empty string so that concat() below never runs on null
String line1 = "";

if (!hdfs.exists(outFile)) {
    // The file does not exist yet: create it and write the first line
    OutputStream out = hdfs.create(outFile);
    BufferedWriter br = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
    br.write("whatever data" + "\n");
    br.close();
    hdfs.close();
} else {
    // The file exists: read its contents, then rewrite it with a new line appended
    String line2 = null;
    BufferedReader br1 = new BufferedReader(new InputStreamReader(hdfs.open(outFile), "UTF-8"));
    while ((line2 = br1.readLine()) != null) {
        line1 = line1.concat(line2) + "\n";
    }
    br1.close();
    hdfs.delete(outFile, true);
    OutputStream out = hdfs.create(outFile);
    BufferedWriter br2 = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
    br2.write(line1 + "new data" + "\n");
    br2.close();
    hdfs.close();
}


























Answer with 10 votes (datanode port 50010): answered Apr 4 '16 at 2:36 by kook; edited Apr 4 '16 at 2:59 by psyco






















Answer with 2 votes (hosts file entries): answered Dec 2 '15 at 7:34 by Chennakrishna




















Answer with 1 vote (dfs.replication): answered Jul 29 '15 at 6:28 by Kishore





























                                                    I ran into the similar issue and have two pieces of information may help you.



                                                    1. The first thing I realized is I was using ssh tunnel to access the name node and when the client code tries to access data node it can not find the data node due to the tunnel somehow messed up the communication. I then run the client on the same box as the hadoop name node and it solved the problem. In short, non-standard network configuration confused hadoop to find the data node.


                                                    2. The reason I used ssh tunnel is I can't access name node remotely and I thought it was due to port restriction by admin, so I used ssh tunnel to bypass the restriction. But it turns out to be a misconfiguration of hadoop.


                                                    In core-site.xml after I changed



                                                    <name>fs.defaultFS</name>
                                                    <value>hdfs://localhost:9000</value>


                                                    to



                                                    <value>hdfs://host_name:9000</value>


                                                    I no longer need the ssh turnnel and I can access the hdfs remotely.






                                                    share|improve this answer












                                                    I ran into the similar issue and have two pieces of information may help you.



                                                    1. The first thing I realized is I was using ssh tunnel to access the name node and when the client code tries to access data node it can not find the data node due to the tunnel somehow messed up the communication. I then run the client on the same box as the hadoop name node and it solved the problem. In short, non-standard network configuration confused hadoop to find the data node.


                                                    2. The reason I used ssh tunnel is I can't access name node remotely and I thought it was due to port restriction by admin, so I used ssh tunnel to bypass the restriction. But it turns out to be a misconfiguration of hadoop.


                                                    In core-site.xml after I changed



                                                    <name>fs.defaultFS</name>
                                                    <value>hdfs://localhost:9000</value>


                                                    to



                                                    <value>hdfs://host_name:9000</value>


                                                    I no longer need the ssh turnnel and I can access the hdfs remotely.







                                                    share|improve this answer












                                                    share|improve this answer



                                                    share|improve this answer










                                                    answered Dec 2 '15 at 0:39









                                                    zfy


































I found many questions like this one while searching for a fix for the exact same issue, so I thought I would share what finally worked for me. I found this forum post on Hortonworks: https://community.hortonworks.com/questions/16837/cannot-copy-from-local-machine-to-vm-datanode-via.html



The key was understanding what calling new Configuration() actually does and setting the parameters I needed. In my case it was exactly the one mentioned in that post. My working code looks like this:



try {
    Configuration config = new Configuration();
    // Connect to DataNodes by hostname rather than by the IP address the NameNode reports.
    config.set("dfs.client.use.datanode.hostname", "true");
    Path pdFile = new Path("stgicp-" + pd);
    FileSystem dFS = FileSystem.get(
            new URI("hdfs://" + HadoopProperties.HIVE_HOST + ":" + HadoopProperties.HDFS_DEFAULT_PORT),
            config, HadoopProperties.HIVE_DEFAULT_USER);
    if (dFS.exists(pdFile)) {
        dFS.delete(pdFile, false);
    }

    FSDataOutputStream outStream = dFS.create(pdFile);
    for (String sjWLR : processWLR.get(pd)) {
        outStream.writeBytes(sjWLR);
    }
    outStream.flush();
    outStream.close();

    // As in my original code, the file is removed again after it has been written.
    dFS.delete(pdFile, false);
    dFS.close();
} catch (IOException | URISyntaxException | InterruptedException e) {
    log.error("WLR file processing error: " + e.getMessage());
}
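
With dfs.client.use.datanode.hostname set to true, the client contacts each DataNode by hostname, so that hostname has to resolve on the client machine and the DataNode's data transfer port has to be reachable from it. The following is a rough, Hadoop-independent sketch for checking both from the client side. The class name DataNodeReachability and the method canConnect are hypothetical, and the host and port you pass in are assumptions about your own cluster (50010 is the default DataNode transfer port in Hadoop 2.x).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical diagnostic, independent of Hadoop: can this machine open a TCP connection to host:port? */
class DataNodeReachability {

    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // InetSocketAddress resolves the hostname; if resolution fails,
            // connect() throws UnknownHostException, which is an IOException.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // unresolved name, refused connection, or timeout
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: java DataNodeReachability <host> <port>");
            return;
        }
        boolean ok = canConnect(args[0], Integer.parseInt(args[1]), 3000);
        System.out.println(args[0] + ":" + args[1] + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

If the check fails for the DataNode's hostname but succeeds for the NameNode, the problem is likely name resolution or a blocked port on the client side rather than anything in the write code itself.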
















                                                            answered Nov 3 '16 at 12:58









                                                            Eva Donaldson


































In the Hadoop configuration, the default replication factor (dfs.replication) is 3. Check it and change it according to your requirements.
















                                                                    answered Mar 14 '13 at 9:52









                                                                    srikayala


































You can try deleting the data (dfs/data) folder manually and formatting the NameNode. You can then start Hadoop. Note that this erases everything stored in HDFS.
















                                                                            answered May 2 '13 at 8:35









                                                                            Jickson T George


































Judging from the error message, the replication factor seems to be fine, i.e. 1.
It seems the DataNode is either not functioning properly or has permission issues.
Check the permissions, and check the status of the DataNode as the user you are running Hadoop as.
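
To show how the numbers in the error message can be read, here is a small sketch that pulls them out with a regular expression. The class name ReplicationErrorParser and its methods are hypothetical, and the pattern only assumes the message wording quoted in the question. When every running DataNode is also excluded, the replication setting is probably not the cause, which is consistent with the point above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the four counters from the "could only be replicated" message. */
class ReplicationErrorParser {

    private static final Pattern MESSAGE = Pattern.compile(
            "could only be replicated to (\\d+) nodes instead of minReplication \\(=(\\d+)\\)\\.\\s+"
            + "There are (\\d+) datanode\\(s\\) running and (\\d+) node\\(s\\) are excluded");

    /** Returns {replicatedTo, minReplication, running, excluded}, or null if the text does not match. */
    static int[] parse(String message) {
        Matcher m = MESSAGE.matcher(message);
        if (!m.find()) {
            return null;
        }
        int[] values = new int[4];
        for (int i = 0; i < 4; i++) {
            values[i] = Integer.parseInt(m.group(i + 1));
        }
        return values;
    }

    /** True when at least one DataNode is running and all running DataNodes were excluded. */
    static boolean allRunningNodesExcluded(String message) {
        int[] v = parse(message);
        return v != null && v[2] > 0 && v[2] == v[3];
    }

    public static void main(String[] args) {
        String msg = "File /testing/file01.txt could only be replicated to 0 nodes instead of "
                + "minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded "
                + "in this operation.";
        System.out.println(Arrays.toString(parse(msg)));   // [0, 1, 1, 1]
        System.out.println(allRunningNodesExcluded(msg));  // true
    }
}
```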
















                                                                                    answered Jul 30 '13 at 10:11









                                                                                    Neha Milak


































I had a similar problem; in my case I just emptied the following folder: ${hadoop.tmp.dir}/nm-local-dir/usercache/hdfs_user/appcache/
















                                                                                            answered Jan 8 '15 at 14:20









                                                                                            bachr


































It appears to be an issue with the file system configuration.
Either the parameters in core-site.xml do not match the file it is trying to read,



OR



there is a mismatch in the path (I see a Windows reference).



You can use the Cygwin tool cygpath to set up the path, placing it where the DataNode and temp file locations are, and that should do the trick.
Location: $/bin/cygpath.exe




P.S. In my opinion, replication does NOT seem to be the primary issue here.







                                                                                                share|improve this answer








































It appears to be an issue with the file system. Either the parameters in core-site.xml do not match the file it is trying to read, or there is a mismatch in the path (I see a Windows reference).

You can use the Cygwin tool cygpath to set up the path and point it to where the datanode and temp file locations are placed; that should be sufficient.
Location: $/bin/cygpath.exe

P.S. In my opinion, replication does not seem to be the primary issue here.
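To check the first possibility, it can help to print what the client-side configuration file actually contains before handing it to Hadoop. The following is only a minimal sketch that uses the JDK's XML parser and assumes the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children; the class name `ConfigCheck` and the method `readProperty` are hypothetical, and the sample value is the placeholder from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ConfigCheck {
    /** Returns the value of the named property, or null if it is not defined. */
    static String readProperty(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            String propName = prop.getElementsByTagName("name").item(0).getTextContent().trim();
            if (propName.equals(name)) {
                return prop.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Assumed content: the single property described in the question.
        String xml = "<configuration><property><name>fs.default.name</name>"
                + "<value>hdfs://CDH4_IP:8020</value></property></configuration>";
        System.out.println(readProperty(xml, "fs.default.name"));
    }
}
```

If the printed value differs from the NameNode address the cluster reports, the client and cluster configurations are out of step.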
















answered Aug 10 '15 at 5:13 by Yunus Khan
edited Aug 10 '15 at 5:38 by Meenesh Jain
































Here is how I create files in HDFS:

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Runs inside a MapReduce task, where `context` supplies the job configuration.
    FileSystem hdfs = FileSystem.get(context.getConfiguration());
    Path outFile = new Path("/path to store the output file");

    if (!hdfs.exists(outFile)) {
        // The file does not exist yet: create it and write the first line.
        OutputStream out = hdfs.create(outFile);
        BufferedWriter br = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
        br.write("whatever data" + "\n");
        br.close();
    } else {
        // The file exists: read its content, delete it, then rewrite it with the new data appended.
        StringBuilder existing = new StringBuilder();
        String line;
        BufferedReader br1 = new BufferedReader(new InputStreamReader(hdfs.open(outFile), "UTF-8"));
        while ((line = br1.readLine()) != null) {
            existing.append(line).append("\n");
        }
        br1.close();
        hdfs.delete(outFile, true);
        OutputStream out = hdfs.create(outFile);
        BufferedWriter br2 = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
        br2.write(existing + "new data" + "\n");
        br2.close();
    }
    hdfs.close();
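The create-or-rewrite branching above can be tried without a cluster. The sketch below is a stand-in only: it applies the same logic to the local file system with `java.nio.file` (note that `Path` here is `java.nio.file.Path`, not Hadoop's `Path`), and the class name `RewriteAppend` and method `writeOrAppend` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RewriteAppend {
    /** Creates the file with firstData, or rewrites it as old content plus newData. */
    static void writeOrAppend(Path file, String firstData, String newData) throws IOException {
        if (!Files.exists(file)) {
            // First call: the file is missing, so create it.
            Files.write(file, (firstData + "\n").getBytes(StandardCharsets.UTF_8));
        } else {
            // Later calls: read everything, delete, and write it back with the new line.
            String existing = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
            Files.write(file, (existing + newData + "\n").getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("demo").resolve("out.txt");
        writeOrAppend(file, "whatever data", "new data"); // creates the file
        writeOrAppend(file, "whatever data", "new data"); // rewrites it with one more line
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

This only exercises the control flow; it says nothing about whether the DataNode is reachable, which is what the exception in the question is about.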

















answered Oct 12 '15 at 6:45 by Punit Naik


























