Writing to HDFS from Java, getting “could only be replicated to 0 nodes instead of minReplication”

I’ve downloaded and started up Cloudera's Hadoop Demo VM for CDH4 (running Hadoop 2.0.0). I’m trying to write a Java program that will run from my Windows 7 machine (the same machine/OS that hosts the VM). I have a sample program like:

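import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void main(String[] args) {
    try {
        // Load the client settings (only the namenode address, see below)
        Configuration conf = new Configuration();
        conf.addResource("config.xml");
        FileSystem fs = FileSystem.get(conf);
        // Create the file, overwriting it if it already exists
        FSDataOutputStream fdos = fs.create(new Path("/testing/file01.txt"), true);
        fdos.writeBytes("Test text for the txt file");
        fdos.flush();
        fdos.close();
        fs.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}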

My config.xml file has only one property defined: fs.default.name=hdfs://CDH4_IP:8020.

When I run it I’m getting the following exception:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /testing/file01.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
 at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
 at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
 at org.apache.hadoop.ipc.Client.call(Client.java:1160)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
 at $Proxy9.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
 at $Proxy9.addBlock(Unknown Source)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:290)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1150)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1003)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)

I’ve looked around the internet and it seems that this happens when disk space is low, but that’s not the case for me. When I run "hdfs dfsadmin -report" I get the following:

Configured Capacity: 25197727744 (23.47 GB)
Present Capacity: 21771988992 (20.28 GB)
DFS Remaining: 21770715136 (20.28 GB)
DFS Used: 1273856 (1.21 MB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (localhost.localdomain)
Hostname: localhost.localdomain
Decommission Status : Normal
Configured Capacity: 25197727744 (23.47 GB)
DFS Used: 1273856 (1.21 MB)
Non DFS Used: 3425738752 (3.19 GB)
DFS Remaining: 21770715136 (20.28 GB)
DFS Used%: 0.01%
DFS Remaining%: 86.4%
Last contact: Fri Jan 11 17:30:56 EST 2013

I can also run this code just fine from within the VM. I’m not sure what the problem is or how to fix it. This is my first time using Hadoop, so I’m probably missing something basic. Any ideas?

Update

The only thing I see in the logs is an exception similar to the one I get on the client:

java.io.IOException: File /testing/file01.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
 at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
 at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)

I tried changing the permissions on the data directory (/var/lib/hadoop-hdfs/cache/hdfs/dfs/data) and that didn't fix it (I went so far as giving full access to everyone).

I notice that when I'm browsing the HDFS via the HUE web app I see that the folder structure was created and that the file does exist but it is empty. I tried putting the file under the default user directory by using

FSDataOutputStream fdos=fs.create(new Path("testing/file04.txt"), true);

instead of

FSDataOutputStream fdos=fs.create(new Path("/testing/file04.txt"), true);

This makes the file path become "/user/dharris/testing/file04.txt" ('dharris' is my Windows user), but that gave me the same kind of error.

edited Dec 22 '17 at 14:56

Виталий Олегович

asked Jan 11 '13 at 23:43

David Harris

Check the permissions of your data directory (on the local disk). Also check the logs from the data node.
– Chris White
Jan 12 '13 at 0:13

Thanks Chris, I tried your suggestions but still no luck. I've added more info to my question based on what you said.
– David Harris
Jan 14 '13 at 16:48

Have you solved that?
– Denis
Jun 27 '14 at 14:02

I never did solve it, this was on a VM I was using for my own personal learning so I ended up blowing it away and starting from scratch. Sorry I don't have more for you.
– David Harris
Jun 30 '14 at 15:07

It seems you are connecting to a pseudo-node cluster from a remote machine. If so, replace 127.0.0.1 with your IP address in all of the Hadoop configuration and try again.
– Kumar
May 10 '16 at 3:46

java hadoop hdfs


11 Answers

I had the same problem.

In my case, the key to the problem was the following part of the error message:

There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

It means that your HDFS client couldn't connect to your datanode on port 50010.
Because you could connect to the HDFS namenode, you could get the datanode's status, but your HDFS client then failed to connect to the datanode itself.

(In HDFS, the namenode manages the file directories and the datanodes. When an HDFS client connects to the namenode, it looks up the target file path and the address of the datanode that holds the data. The client then communicates with that datanode directly. You can check these datanode addresses with netstat, because the HDFS client tries to reach the datanodes at the addresses reported by the namenode.)

I solved the problem by:

opening port 50010 in the firewall.

adding the property "dfs.client.use.datanode.hostname" with the value "true".

adding the datanode's hostname to the hosts file on my client PC.
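Before changing any configuration, it may help to confirm from the client machine whether the datanode port is reachable at all. The following is only a minimal sketch using plain java.net sockets, not Hadoop's own client code. The class name, the default host "localhost" and the 3 second timeout are assumptions made for illustration. Port 50010 is the datanode port shown in the question's dfsadmin report.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class DataNodePortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or the hostname could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults: pass your datanode's hostname and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50010;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

If this reports false for the datanode's hostname while the namenode port (8020 in the question) is reachable, the firewall or the hosts file is a likely place to look first.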

I'm sorry for my poor English skill.

edited Apr 4 '16 at 2:59

psyco

answered Apr 4 '16 at 2:36

kook

Go to the Linux VM and check its hostname and IP address (use the ifconfig command).
Then, in the Linux VM, edit the /etc/hosts file and add a line of the form

IPADDRESS (space) hostname

Example:
192.168.110.27 clouderavm

Then change all your Hadoop configuration files, such as

core-site.xml

hdfs-site.xml

mapred-site.xml

yarn-site.xml

replacing localhost, localhost.localdomain or 0.0.0.0 with your hostname.

Then restart Cloudera Manager.

On the Windows machine, edit C:\Windows\System32\drivers\etc\hosts

and add one line at the end with

your VM's IP address and hostname (the same as you added to /etc/hosts in the VM):

VMIPADDRESS VMHOSTNAME

Example:

192.168.110.27 clouderavm

Then try again; it should work. For detailed configuration, see the following YouTube video:

https://www.youtube.com/watch?v=fSGpYHjGIRY
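To illustrate why the addresses matter: the dfsadmin report in the question lists the datanode as 127.0.0.1:50010, and a client on another machine that is handed a loopback address will end up trying to connect to itself. The sketch below only uses java.net.InetAddress to classify an IP literal. The class and method names are made up for this example, and 192.168.110.27 is the example address from the steps above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AdvertisedAddressCheck {

    /** True if the IP literal is neither a loopback (127.x.x.x) nor a wildcard (0.0.0.0) address. */
    static boolean usableFromRemoteClient(String ipLiteral) {
        try {
            // For an IP literal, getByName only validates the format; no DNS lookup is made.
            InetAddress address = InetAddress.getByName(ipLiteral);
            return !address.isLoopbackAddress() && !address.isAnyLocalAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + ipLiteral, e);
        }
    }

    public static void main(String[] args) {
        for (String ip : new String[] {"127.0.0.1", "0.0.0.0", "192.168.110.27"}) {
            System.out.println(ip + " usable from a remote client: " + usableFromRemoteClient(ip));
        }
    }
}
```

Only the last of the three addresses passes the check, which is consistent with the advice to replace localhost and 0.0.0.0 in the configuration files with a hostname that resolves to the VM's real IP address.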

answered Dec 2 '15 at 7:34

Chennakrishna


Add the following property to hdfs-site.xml:

<property>
 <name>dfs.replication</name>
 <value>1</value>
</property>

and also add this file as a resource in your program:

conf.addResource("hdfs-site.xml");

Stop Hadoop:

stop-all.sh

Then start it again:

start-all.sh

answered Jul 29 '15 at 6:28

Kishore


I ran into a similar issue and have two pieces of information that may help you.

The first thing I realized is that I was using an SSH tunnel to access the name node, and when the client code tried to access the data node it could not find it, because the tunnel interfered with the communication. I then ran the client on the same box as the Hadoop name node, and that solved the problem. In short, a non-standard network configuration prevented Hadoop from finding the data node.

The reason I used an SSH tunnel is that I couldn't access the name node remotely. I thought this was due to a port restriction set by the admin, so I used the tunnel to bypass it. But it turned out to be a misconfiguration of Hadoop.

In core-site.xml, after I changed

<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>

to

<value>hdfs://host_name:9000</value>

I no longer needed the SSH tunnel and could access HDFS remotely.
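A quick way to spot this kind of misconfiguration is to look at the host part of the fs.defaultFS value. The sketch below is plain Java, not a Hadoop API; the class name DefaultFsCheck and the list of names treated as local-only are assumptions for illustration.

```java
import java.net.URI;

class DefaultFsCheck {
    /** Returns true if the URI's host is a loopback or wildcard name that remote clients cannot use. */
    static boolean pointsAtLoopback(String defaultFs) {
        String host = URI.create(defaultFs).getHost();
        if (host == null) {
            // java.net.URI yields no host for names it considers invalid
            // (for example, names containing an underscore); nothing to conclude.
            return false;
        }
        return host.equals("localhost")
                || host.equals("localhost.localdomain")
                || host.equals("0.0.0.0")
                || host.startsWith("127.");
    }

    public static void main(String[] args) {
        System.out.println(pointsAtLoopback("hdfs://localhost:9000"));            // true
        System.out.println(pointsAtLoopback("hdfs://namenode.example.com:9000")); // false
    }
}
```

A true result for the value in core-site.xml would be consistent with the situation described above, where the name node was only reachable from the same machine.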

answered Dec 2 '15 at 0:39 by zfy

up vote
1
down vote

Since I found many questions like this one while searching for a solution to the exact same issue, I thought I would share what finally worked for me. I found this forum post on Hortonworks: https://community.hortonworks.com/questions/16837/cannot-copy-from-local-machine-to-vm-datanode-via.html

The answer was truly understanding what calling new Configuration() means and setting the correct parameters as I needed them. In my case it was exactly the one mentioned in that post. So my working code looks like this:

// pd, processWLR, HadoopProperties and log are defined elsewhere in the surrounding code.
try {
    Configuration config = new Configuration();
    // Make the client connect to datanodes by hostname instead of the IP reported by the namenode.
    config.set("dfs.client.use.datanode.hostname", "true");
    Path pdFile = new Path("stgicp-" + pd);
    FileSystem dFS = FileSystem.get(
            new URI("hdfs://" + HadoopProperties.HIVE_HOST + ":" + HadoopProperties.HDFS_DEFAULT_PORT),
            config,
            HadoopProperties.HIVE_DEFAULT_USER);
    // Remove any previous copy of the file before writing.
    if (dFS.exists(pdFile)) {
        dFS.delete(pdFile, false);
    }

    FSDataOutputStream outStream = dFS.create(pdFile);
    for (String sjWLR : processWLR.get(pd)) {
        outStream.writeBytes(sjWLR);
    }
    outStream.flush();
    outStream.close();

    dFS.delete(pdFile, false);
    dFS.close();
} catch (IOException | URISyntaxException | InterruptedException e) {
    log.error("WLR file processing error: " + e.getMessage());
}

answered Nov 3 '16 at 12:58 by Eva Donaldson

up vote
0
down vote

In the Hadoop configuration, the default replication factor is set to 3. Check it and change it according to your requirements.

answered Mar 14 '13 at 9:52 by srikayala

up vote
0
down vote

You can try deleting the data (dfs/data) folder manually and formatting the namenode. You can then start Hadoop.

answered May 2 '13 at 8:35 by Jickson T George

up vote
0
down vote

From the error message, the replication factor seems to be fine, i.e. 1.
It seems the datanode is either not functioning properly or has permission issues.
Check the permissions, and check the status of the datanode as the user you are running Hadoop with.

answered Jul 30 '13 at 10:11 by Neha Milak

up vote
0
down vote

I had a similar problem; in my case I just emptied the following folder: $hadoop.tmp.dir/nm-local-dir/usercache/hdfs_user/appcache/

answered Jan 8 '15 at 14:20 by bachr

up vote
0
down vote

It appears to be some issue with the file system.
Either the parameters in core-site.xml do not match the file it is trying to read,

or there is a mismatch in the path (I see a Windows reference).

You can use the Cygwin tool to set up the path and place it where the datanode and temp file locations are; that should do the trick.
Location : $/bin/cygpath.exe

P.S. In my opinion, replication does NOT seem to be the primary issue here.

edited Aug 10 '15 at 5:38 by Meenesh Jain

answered Aug 10 '15 at 5:13 by Yunus Khan

up vote
0
down vote

Here is how I create files in HDFS:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// context is the job context available in the surrounding code.
FileSystem hdfs = FileSystem.get(context.getConfiguration());
Path outFile = new Path("/path to store the output file");

// Start from an empty string so that concat() below cannot hit a null.
String line1 = "";

if (!hdfs.exists(outFile)) {
    // File does not exist yet: create it and write the first line.
    OutputStream out = hdfs.create(outFile);
    BufferedWriter br = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
    br.write("whatever data" + "\n");
    br.close();
    hdfs.close();
} else {
    // File exists: read the old content, then rewrite it with the new data appended.
    String line2 = null;
    BufferedReader br1 = new BufferedReader(new InputStreamReader(hdfs.open(outFile), "UTF-8"));
    while ((line2 = br1.readLine()) != null) {
        line1 = line1.concat(line2) + "\n";
    }
    br1.close();
    hdfs.delete(outFile, true);
    OutputStream out = hdfs.create(outFile);
    BufferedWriter br2 = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
    br2.write(line1 + "new data" + "\n");
    br2.close();
    hdfs.close();
}

answered Oct 12 '15 at 6:45 by Punit Naik


