4 Operator

 • 

8.6K Posts

June 5th, 2008 03:00

Hmmh,
sounds very much like a bug in the interaction between that adodb library and the Linux NFS client.

In its early days, Linux NFS was implemented a bit "loosely" in terms of features and interoperability, and IIRC file locking came later.

If you want to test file locking itself, you can use the lock tests from the Connectathon test suite: http://www.connectathon.org/nfstests.html
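For a quick check outside Connectathon, a small probe along these lines might help. It is only a sketch: the function name `try_flock` and its timeout and poll values are made up for illustration, and it assumes a Linux client where `flock` on the NFS mount is passed through to the server's lock manager.

```python
import errno
import fcntl
import os
import time


def try_flock(path, timeout=5.0, poll=0.1):
    """Try to take an exclusive flock on path.

    Returns (acquired, seconds_waited). The lock is released again
    immediately; this only probes whether it can be obtained.
    try_flock is a hypothetical helper, not part of any library.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    start = time.monotonic()
    try:
        while True:
            try:
                # Non-blocking request, so a lock held elsewhere shows
                # up as EAGAIN/EACCES instead of blocking this call.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                fcntl.flock(fd, fcntl.LOCK_UN)
                return True, time.monotonic() - start
            except OSError as exc:
                if exc.errno not in (errno.EAGAIN, errno.EACCES):
                    raise
            if time.monotonic() - start >= timeout:
                return False, time.monotonic() - start
            time.sleep(poll)
    finally:
        os.close(fd)
```

Pointing it at a file on the NFS mount from two clients at once (one holding the lock, one probing) shows whether a lock that should be free is still reported as held.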

What I would try:
- ask in PHP and Linux forums
- update to the latest versions of Linux, NFS, PHP
- see if there are any Linux or PHP options to change the locking

You could also go troubleshooting and take a tcpdump - if you catch it in the act, you can at least find out whether that flock never gets sent on the wire or whether it is sent and never gets a reply.

If you don't need shared access, an alternative might be iSCSI - there you put a native file system on the iSCSI LUN and get native locking behaviour.

What DART version are you using?
It's always worth looking at the release notes to see if something in that area was fixed in newer code.

Rainer
P.S.: your mount options look fine

10 Posts

June 5th, 2008 07:00

Thanks for your reply. Comments are inline.

sounds very much like a Linux bug between that adodb
library and Linux NFS.

With Linux in the beginning NFS was implemented a bit
"loose" in terms of features/interoperability and
IIRC file locking came later.


That's true, Linux's NFS implementation is not known to be the best around ;-). However, our problem now lies with locking on the NS20, not on Linux. As far as I know, the NFS client on Linux is not known to be buggy.


If you want to test file locking itself you can use
the lock tests from the Connectathon test suite
http://www.connectathon.org/nfstests.html


I tried those tests during the testing period of our NS20, a few months ago. They didn't show any problems.

You could also go troubleshooting and take a tcpdump
- if you catch it in the act you can at least find
out if that flock never gets sent on the wire or if
there is no return.


I was afraid you were going to say that ;-). Ah well, first step is to create a situation to easily reproduce the error.
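One way to build such a reproducer is a counter test: several processes each take the lock, increment a number in a shared file, and release. If locking works, the final value equals workers times iterations; a lower value or a hung worker points at the lock. This is only a sketch, and `run_contention`, the worker script and all the numbers in it are hypothetical, not anything taken from adodb.

```python
import subprocess
import sys

# Worker run in separate processes: lock, read, increment, write, unlock.
WORKER = r"""
import fcntl, sys
path, n = sys.argv[1], int(sys.argv[2])
for _ in range(n):
    with open(path, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        value = int(f.read() or 0)
        f.seek(0)
        f.write(str(value + 1))
        f.truncate()
        f.flush()
        fcntl.flock(f, fcntl.LOCK_UN)
"""


def run_contention(path, workers=4, iterations=50, timeout=60):
    """Return (final_count, expected_count, hung_workers)."""
    with open(path, "w") as f:
        f.write("0")
    procs = [
        subprocess.Popen([sys.executable, "-c", WORKER, path, str(iterations)])
        for _ in range(workers)
    ]
    hung = 0
    for proc in procs:
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # A worker stuck in flock() is itself a useful symptom.
            proc.kill()
            proc.wait()
            hung += 1
    with open(path) as f:
        final = int(f.read() or 0)
    return final, workers * iterations, hung
```

Running it with `path` on the NFS export, ideally from several web servers at the same time, gives a repeatable load to capture against.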

If you don't need shared access an alternative might
be using iSCSI - there you put a native file system
on the iSCSI LUN and get native locking behaviour


We need shared access: we use a bunch of web servers behind load balancers.


What DART version are you using ?


[nasadmin@ns20 nasadmin]$ server_version server_2
server_2 : Product: EMC Celerra File Server Version: T5.5.32.4

It's always worth a try looking at the release notes
if something in that area was fixed in a newer code.


I'll check them out.

P.S.: your mount options look fine


Thanks.

4 Operator

 • 

8.6K Posts

June 5th, 2008 13:00

That's true, Linux' NFS implementation is not known
to be the best around ;-). However, our problem lies
with locking on the NS20 now, not on Linux. As far as
I know, the NFS-client on Linux is not known to be
buggy.


Well, NFS locking is tricky to get right. Not every detail is documented in the RFCs.
My experience is that most established vendors used the Sun reference implementation for interoperability in grey areas, whereas Linux was mostly tested against Linux :-)

I've tried those tests during the testing period of our
NS20, a few months ago. They didn't show any
problems.


ok - knowing about Connectathon does make you an advanced user

I was afraid you were going to say that ;-). Ah well,
first step is to create a situation to easily
reproduce the error.


I know - and the tricky thing is how to trigger on the offending packets so that you don't have to capture and go through GBs of data.
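One way to keep the capture small is to filter on the lock manager port only. NLM is RPC program 100021 (`nlockmgr`) and its port is assigned dynamically, so it has to be looked up with `rpcinfo -p` first. The sketch below parses that output and builds a tcpdump filter plus a ring-buffer command line. The helper names, the sample listing and the port 4045 in it are made up for illustration and are not taken from the NS20.

```python
# Illustrative `rpcinfo -p <server>` output; the ports here are invented.
SAMPLE_RPCINFO = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100003    3   tcp   2049  nfs
    100021    4   tcp   4045  nlockmgr
    100021    4   udp   4045  nlockmgr
"""

NLM_PROGRAM = "100021"  # RPC program number of the network lock manager


def nlm_ports(rpcinfo_text):
    """Return the sorted, de-duplicated ports registered for NLM."""
    ports = set()
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == NLM_PROGRAM and fields[3].isdigit():
            ports.add(int(fields[3]))
    return sorted(ports)


def capture_filter(server, rpcinfo_text):
    """Build a tcpdump filter matching only lock traffic with server."""
    ports = nlm_ports(rpcinfo_text)
    if not ports:
        raise ValueError("no nlockmgr entry found in rpcinfo output")
    port_expr = " or ".join(f"port {p}" for p in ports)
    return f"host {server} and ({port_expr})"


def tcpdump_argv(interface, server, rpcinfo_text, outfile="nlm.pcap"):
    """Command line for a ring buffer of 10 files of roughly 100 MB each."""
    return [
        "tcpdump", "-i", interface, "-s", "0",
        "-C", "100", "-W", "10", "-w", outfile,
        capture_filter(server, rpcinfo_text),
    ]
```

With the ring buffer, the capture can run until the problem shows up and only the last files need to be examined. Replies to blocked lock requests are sent as callbacks to the client's own `nlockmgr` port, so it may be worth adding the ports from `rpcinfo -p` on the client as well.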


[nasadmin@ns20 nasadmin]$ server_version server_2
server_2 : Product: EMC Celerra File Server Version:
T5.5.32.4


OK, that's fairly recent.

Of course there is always the chance of a bug - however, the Celerra lockd code is quite stable. The last bug fix I could find was for HP-UX clients in 5.5.30.
I would say the chances of it being in the Linux NFS client code are greater :-)

I would still open a service request with EMC support - maybe they can provide some debugging options.