April 26th, 2016 15:00

ECS: Remote I/O error when writing a file

I'd really appreciate some help on this. I've submitted a GitHub ticket, but I haven't gotten an answer. This is what I've done.

I installed a CentOS 7 server and followed the single-node Docker install instructions on GitHub.

I ran both the step1 and step2 scripts.

I logged in to the web UI.

I created an object user named nfs_user.

I created a bucket named nfs_bucket. I added nfs_user as the owner and added nfs_group with r,w,exec permissions.

I went to File and created users named nfs_user (10001) and nfs_group (10002) with the corresponding IDs.

I created an export and used nfs_user under anonuser and root squash, and nfs_group under anon group.

This is what happens on my client.

[root@localhost /]# showmount -e 10.44.236.152

Export list for 10.44.236.152:

/ns1/nfs_bucket *

[root@localhost /]# mount -t nfs -o vers=3,sec=sys,proto=tcp,async 10.44.236.152:/ns1/nfs_bucket /nfsshare/

[root@localhost /]# cd /nfsshare/

[root@localhost nfsshare]# ls -al

total 1

drwxrwxrwx. 3 10001 10002 96 Apr 26 16:28 .

[root@localhost nfsshare]# mkdir test_dir

[root@localhost nfsshare]# ls -al

total 1

drwxrwxrwx. 3 10001 10002 96 Apr 26 16:29 .

drwxr-xr-x. 3 10001 10002 96 Apr 26 16:29 test_dir

[root@localhost nfsshare]# cd test_dir/

[root@localhost test_dir]# ls -al

total 1

drwxr-xr-x. 3 10001 10002 96 Apr 26 16:29 .

[root@localhost test_dir]# touch file

touch: cannot touch ‘file’: Remote I/O error

[root@localhost test_dir]# echo 'help me' > file

-bash: file: Remote I/O error

[root@localhost test_dir]# cd ..

[root@localhost nfsshare]# echo 'why am I getting this IO error' > file

-bash: file: Remote I/O error

Any ideas here?

April 27th, 2016 15:00

On the CE installation in our lab, I'm able to do everything, but I'm not mounting to a subdirectory. How are you creating /nfsshare above? Can you please try removing /nfsshare and mounting directly to the root of the bucket?

[root@localhost mnt]# showmount -e 10.1.83.115

Export list for 10.1.83.115:

/nfs/data *

[root@localhost mnt]# mount -t nfs -o vers=3,sec=sys,proto=tcp,async 10.1.83.115:/nfs/data

mount: can't find 10.1.83.115:/nfs/data in /etc/fstab

[root@localhost mnt]# mount -t nfs -o vers=3,sec=sys,proto=tcp,async 10.1.83.115:/nfs/data /mnt/data/

[root@localhost mnt]# cd /mnt/data/

[root@localhost data]# ls

[root@localhost data]# echo 'test data on the new mount.' > test.txt

[root@localhost data]# ls

test.txt

[root@localhost data]# cat test.txt

test data on the new mount.

[root@localhost data]# mkdir dir1

[root@localhost data]# cd dir1/

[root@localhost dir1]# ls

[root@localhost dir1]# cd ..

[root@localhost data]# ls -la

total 2

drwxrwxrwx. 3 500 501 96 Apr 27 18:07 .

drwxr-xr-x. 3 500 501 96 Apr 27 18:07 dir1

-rw-r--r--. 1 500 501 28 Apr 27 18:07 test.txt

[root@localhost data]# cd dir1/

[root@localhost dir1]# touch file

[root@localhost dir1]# echo 'data into test file in test dir1' > dir.txt

[root@localhost dir1]# ls

dir.txt  file

[root@localhost dir1]# cat dir.txt

data into test file in test dir1

[root@localhost dir1]# cat file

[root@localhost dir1]#

[Attachments: 1.png, 2.png, 3.png, 4.png, 5.png, 6.png]

April 27th, 2016 16:00

OK, so a couple of things.

You keep saying that I'm mounting into a subdirectory, but I'm not. I'm mounting the root of the bucket, which is "/ns1/nfs_bucket"; the rest of the command is my local mount point, which is /nfsshare. Here is my command again. Look at it closely and you can see the space.

mount -t nfs -o vers=3 lon6dvvcsecs002:/ns1/nfs_bucket /nfsshare/

My about screen looks different. Are you installing the Docker version from GitHub?

version.JPG.jpg

I just noticed that in your export host photo you have nfs_user in all three fields (anonuser, anongroup and squashuser), but if I look closely at the export summary it looks like you are actually using dgp for the anongroup.

If you are installing the Docker version, are you using virtual machines? The way I have this set up is a VMware virtual machine that has a second vmdk added. This is the only thing I can think of that could be causing my problem.

The good news is I'm not getting any permissions errors. It's only this IO error when I try to write a file.

April 28th, 2016 08:00

The way we have it in our lab is two VMs, each running the single-node 2.2:latest docker image (the default; so it should be the same image you're using, despite the oddity in your GUI).

The second node we added as a separate VDC to the first through the web interface (VDC provisioning can be skipped with an optional flag in step2).

April 28th, 2016 08:00

dancaps, sorry for the confusion. Yes, I see, you are mounting directly to the root bucket. Also, the GUI has a bug when viewing an existing host configuration where the anongid value gets populated with the anonuid value. Would you be available to look further via WebEx? Let me know.

Final thought, can you tail the file service logs and run the commands that give you IO errors?  Maybe that will give us a little more information about what's happening.

-Ben

April 28th, 2016 09:00

Sure, I would be more than happy to do a WebEx with you. Should we take the conversation offline through email or something?

April 28th, 2016 10:00

Hey Dan,

You there?

Ben

April 28th, 2016 10:00

Sure, shoot me an email.  I'm not sure how to access contact information on the community site.

April 28th, 2016 11:00

I just did another fresh install. These are my steps.

- I installed a minimal copy of CentOS

- I installed docker, wget, development-tools, git, tar and a couple other basic packages

- I added another vmdk to the vm

- I verified all the system requirements. I have 4 CPUs, 16GB of memory, 200GB raw disk as sdb, CentOS Linux release 7.2.1511, Docker Version: 1.9.1

- I cloned the github repo https://github.com/EMCECS/ECS-CommunityEdition.git

- I ran the step1 script and I was able to hit the web ui page.

- I ran the step2 script.

- I logged in as root and changed the password.

- I looked at the about page and it looked like your picture. So that's good news.

- I tried clicking on other pages and the web UI was basically frozen and unresponsive.

- I restarted the docker container and waited 10 minutes. I was able to log in again

- The about page looks broken again.

- I went to "File" --> "User/Group Mapping" and created nfs_user (500) and nfs_group (501)

- I went to "Users" and created an object user named nfs_user

- I went to "Buckets" and created a new bucket with the following parameters

  Name: nfs_bucket

  Bucket Owner: nfs_user

  File System: Enabled

  Default Bucket Group: nfs_group

  Group File Permissions: Read Write

  Group Directory Permissions: Read Write

  Bucket Retention: 1 second

- I went to "File" and created a new Export on the nfs_bucket

- I clicked "Add" to create a new Export Host Option with the following parameters

  Export Host: *

  Permissions: Read/Write

  Authentication: Sys

  AnonUser: nfs_user

  AnonGroup: nfs_group

  RootSquash: nfs_user

- I saved the export

I went to my Linux client and ran these commands:

[root@ric1pdvcsmgt02 ~]# mkdir /nfs_bucket

[root@ric1pdvcsmgt02 ~]# chmod 777 /nfs_bucket

[root@ric1pdvcsmgt02 ~]# showmount -e 10.72.236.55

Export list for 10.72.236.55:

/ns1/nfs_bucket *

[root@ric1pdvcsmgt02 ~]# mount -t nfs -o vers=3,sec=sys,proto=tcp,async 10.72.236.55:/ns1/nfs_bucket /nfs_bucket

[root@ric1pdvcsmgt02 ~]# cd /nfs_bucket/

[root@ric1pdvcsmgt02 nfs_bucket]# ls

[root@ric1pdvcsmgt02 nfs_bucket]# echo "Test data into my file" > test

-bash: test: Remote I/O error

[root@ric1pdvcsmgt02 nfs_bucket]# ls -al

total 1

drwxrwxrwx. 3 500 501 96 Apr 28 12:49 .

[root@ric1pdvcsmgt02 nfs_bucket]#

I've literally done this same process 25 times using various ECS versions, Linux flavors, and NFS clients. I've installed it in 3 different data centers all over the world, and I tried different VMware SCSI adapters and VMware NIC adapters.

I can't help but conclude this software is a POS and shouldn't be used for NFS. The documentation is terrible, and frankly it feels like NFS was an afterthought, with only 4 or 5 sentences in the manual about it. It by no means explains how to configure the exports.

I appreciate your help Ben. Without you I'd be all alone out here. Do you work for EMC? Contact me at danny.caperton @ gmail

April 28th, 2016 14:00

I see the problem ...  Edit your bucket configuration and set your bucket retention to 0 (zero) seconds.  I have to dig more, but we must have a bug with NFS and retention.  Let me know if that gets you past the IO error.

-Ben
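
To check whether a retention change like the one above clears the error, the manual `echo ... > file` test from earlier in the thread can be wrapped in a small helper. This is only a sketch: `probe_write` and the probe file name are made-up names for illustration, not part of ECS or nfs-utils, and it assumes bash on the NFS client.

```shell
# Hypothetical helper: try to create a file in a directory and report
# the shell's error text (for example "Remote I/O error") if that fails.
probe_write() {
    local dir="$1"
    local probe="$dir/.write_probe.$$"   # assumed scratch file name
    local err
    # Run the write inside a group so the redirection error is captured.
    if err=$( { echo "probe" > "$probe"; } 2>&1 ); then
        rm -f -- "$probe"                # clean up the scratch file
        echo "OK: $dir is writable"
    else
        echo "FAIL: $dir: $err"
        return 1
    fi
}
```

On the client in this thread it would be called as `probe_write /nfs_bucket`, once before and once after changing the bucket retention.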

April 28th, 2016 15:00

Ben!!! You sir are the man! I can't tell you how happy I am you found that. It's all working for me now. I have the export mounted on 2 servers and I'm writing data.

Thanks for sticking with me on this. It's been a beat down to say the least.

April 28th, 2016 18:00

Alright, that's great news. Glad to hear you got it working. I'd like to point out that you're currently running a pre-release version of NFS on ECS and that the officially supported release will be available in the coming weeks. This upcoming release will include many enhancements and bug fixes. I highly recommend you try the officially supported version once it's available.

-Ben

January 12th, 2018 03:00

Hello all,

I know this topic is a little bit old but I'm facing the same issue.

I'm testing ECS and it's working great for the S3 connection, but I'm not able to use the NFS one.

My version of ECS is:

version.png

Here is my Bucket Configuration:

bucket.png

File1.png

File2.png

File3.png

I'm able to mount the NFS share:

[DOC:root]/usr/mtc/bin>mount -t nfs -o vers=3,sec=sys,proto=tcp,async my_ip:/ns1/Bucket2/ /mnt/ecs_share

[DOC:root]/usr/mtc/bin>mount |grep ecs

my_ip:/ns1/Bucket2/ on /mnt/ecs_share type nfs (rw,vers=3,sec=sys,proto=tcp,addr=my_ip)

[DOC:root]/usr/mtc/bin>cd /mnt/ecs_share/

But if I try to write a file to it, I also get the Remote I/O error:

[DOC:root]/mnt/ecs_share>echo "Test data into my file" > test.txt

Do you have any idea what the issue is?

Thanks,

January 15th, 2018 11:00

Hi,

Why are you accessing the mount as root? Can you switch to the unix user with ID 30024 and then try testing write access to the ecs_share directory? I believe this is happening because you don't have a mapping created on ECS for UID/GID 0, so if you switch to the unix user (with ID 30024) it should work, because you have that UID mapped to the bucket owner on ECS.

Also, can you ls -la on the /mnt directory so I can see the permissions on ecs_share?

Thanks,

Ben
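
Before switching users as suggested above, it can help to confirm that the client actually has an account with that numeric UID. The sketch below assumes a Linux client with the standard `id` utility; `name_for_uid` is a hypothetical helper name.

```shell
# Hypothetical helper: print the local account name for a numeric UID,
# or return non-zero if the client has no account with that UID.
name_for_uid() {
    local uid="$1"
    # id looks the argument up in the local user database
    id -un -- "$uid" 2>/dev/null
}
```

For example, `name_for_uid 30024` prints the account name to pass to `su`, and a non-zero exit status means a local user with that UID would have to be created first.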

January 16th, 2018 02:00

Hi Ben,

Thanks for your support.

I was wondering what the ID of this Linux user was in the ECS configuration. Now I understand.

So I changed my configuration in ECS a little:

On my Linux system, I also have a user called user_ecs with UID 1003.

/mnt/ecs_share>cat /etc/passwd |grep user_ecs

user_ecs:x:1003:1003::/home/user_ecs:/bin/bash

Now, when I mount the share with this user, I get the following error:

/mnt>sudo mount -t nfs -o vers=3,sec=sys,proto=tcp my_ip:/ns1/Bucket2/ /mnt/ecs_share/

password for user_ecs:

/mnt>mount

/dev/sda2 on / type ext4 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

tmpfs on /dev/shm type tmpfs (rw)

/dev/sda1 on /boot type ext4 (rw)

/dev/sdb1 on /mnt/CACHE type ext4 (rw,user_xattr)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

//nas01/Software on /mnt/MOUNT_POINTS/Software type cifs (rw)

sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

nas02:/Q/shares/LTFS_NFS3 on /mnt/MOUNT_POINTS/NFS_Share type nfs (rw,vers=3,rsize=8192,wsize=8192,timeo=14,intr,addr=10.3.1.80)

nfsd on /proc/fs/nfsd type nfsd (rw)

core_fuse on /mnt/FUSE type fuse.core_fuse (rw,nosuid,nodev,default_permissions,allow_other)

my_ip:/ns1/Bucket2/ on /mnt/ecs_share type nfs (rw,vers=3,sec=sys,proto=tcp,addr=my_ip)

/mnt>ls -la

total 25

drwxr-xr-x. 8 root root 4096 Jan 12 09:49 .

dr-xr-xr-x. 23 root root 4096 Jan 16 09:11 ..

drwxrwxrwx 9 root root 4096 Jan 8 14:32 CACHE

drwx---rwx 3 30024 2147483647 96 Jan 12 12:30 ecs_share

drwxrwxrwx 2 root root 4096 Jan 16 09:50 FUSE

drwxrwxrwx. 12 root root 4096 Jan 5 13:19 LTFS

drwxrwxrwx. 4 root root 4096 Jan 5 13:19 MOUNT_POINTS

/mnt>cd ecs_share/

/mnt/ecs_share>ls

ls: reading directory .: Permission denied

/mnt/ecs_share>echo "Test" > test.txt

-bash: test.txt: Permission denied

/mnt/ecs_share>

Any other suggestions?

Thanks,

January 16th, 2018 06:00

Are you mapping the bucket (Bucket2) owner (object_user_1) to UID 1003? It doesn't look like it. It appears you're still mapping UID 30024 to the bucket owner. In ECS, map 1003 to object_user1 and 1004 to nfs_group. Then, on the Linux box, run:

mkdir /mnt/ecs_share

chown 1003:1004 /mnt/ecs_share

sudo mount -t nfs -o vers=3,sec=sys,proto=tcp my_ip:/ns1/Bucket2/ /mnt/ecs_share

su user_ecs

Then, you should be able to work in the mounted directory. Also, when you ls /mnt, you should see the correct user/group instead of the numbers you're seeing above.

Ben
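
One way to confirm that the mapping took effect is to compare the numeric owner the client sees on the mount point with the expected UID:GID. This is a minimal sketch assuming GNU `stat` (as shipped with CentOS); `owner_of` and `owner_matches` are hypothetical helper names.

```shell
# Hypothetical helper: print the numeric "uid:gid" owner of a path.
owner_of() {
    stat -c '%u:%g' -- "$1"
}

# Hypothetical helper: succeed only if the path is owned by the
# expected "uid:gid" pair.
owner_matches() {
    [ "$(owner_of "$1")" = "$2" ]
}
```

With the IDs from this thread, `owner_matches /mnt/ecs_share 1003:1004 && echo "mapping looks right"` would be run after mounting. If `owner_of /mnt/ecs_share` still prints 30024 as the UID, the old mapping is presumably still in place on the ECS side.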
