Solaris Troubleshooting: NFS Troubleshooting

This article walks through some basic troubleshooting steps for NFS problems on Solaris.

1. Determine the NFS version:

To determine what version and transport of NFS is currently available, run rpcinfo on the NFS server.

 

# rpcinfo -p | grep 100003

100003 2 udp 0.0.0.0.8.1 nfs superuser

100003 3 udp 0.0.0.0.8.1 nfs superuser

100003 2 tcp 0.0.0.0.8.1 nfs superuser

100003 3 tcp 0.0.0.0.8.1 nfs superuser

The second column above is the NFS version; the third column is the transport protocol.
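As a rough sketch, the rpcinfo output above can be reduced to a list of version/transport pairs. The function name nfs_versions is a hypothetical helper, not a Solaris command, and it only assumes that the first three columns are program number, version and transport, as in the output shown above.

```shell
# nfs_versions: hypothetical helper, not part of Solaris.
# Reads rpcinfo output on stdin and prints "NFSv<version> <transport>"
# for every line whose first column is RPC program 100003 (NFS).
nfs_versions() {
    awk '$1 == "100003" { print "NFSv" $2, $3 }' | sort -u
}

# Typical use on the NFS server (assumes the column layout shown above):
#   rpcinfo -p | nfs_versions
```

Lines for other RPC programs (mountd, nlockmgr and so on) are ignored, so the whole rpcinfo output can be piped in without a separate grep.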

Sun has implemented the following versions of NFS on its operating systems, for both client and server:

 

OS Version                   NFSv2            NFSv3            NFSv4
SunOS                        UDP              -                -
Solaris[TM] 2.4 and below    UDP              -                -
Solaris[TM] 2.5,2.6,7,8,9    UDP and/or TCP   UDP and/or TCP   -
Solaris[TM] 10               UDP and/or TCP   UDP and/or TCP   TCP*

 

*The UDP transport is not supported in NFSv4, because UDP does not provide the required congestion control.

2. Check connectivity to the NFS server from the NFS client:

1.   Check that the NFS server is reachable from the client by running (replace server_name with the host name of the NFS server):
# /usr/sbin/ping server_name

2.   If the server is not reachable from the client, make sure that the local name service is running. For NIS+ clients:
# /usr/lib/nis/nisping -u

3.   If the name service is running, make sure that the client has received the correct host information:
# /usr/bin/getent hosts server_name

4.   If the host information is correct, but the server is not reachable from the client, run the ping command from another client.

5.   If the server is reachable from the second client, use ping to check connectivity of the first client to other systems on the local network. If this fails, check the networking configuration on the client. Check the following files:
/etc/hosts, /etc/netmasks, /etc/nsswitch.conf,
/etc/nodename, /etc/net/*/hosts etc.

6.   If the software is correct, check the networking hardware.
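The first few steps above can be sketched as one small shell function. This is a minimal sketch, not a tested tool: check_nfs_host is a hypothetical name, and the ping call assumes the Solaris syntax "ping host [timeout]" (on other systems ping takes different options).

```shell
# check_nfs_host: hypothetical helper that mirrors steps 1 and 3 above.
# Assumes Solaris ping syntax ("ping host [timeout]").
check_nfs_host() {
    host=$1

    # Step 3: does the name service return an entry for the host?
    getent hosts "$host" >/dev/null 2>&1 || {
        echo "$host: no host entry from the name service"
        return 1
    }

    # Step 1: is the host reachable? (timeout of 5 seconds on Solaris)
    ping "$host" 5 >/dev/null 2>&1 || {
        echo "$host: host entry found, but ping failed"
        return 2
    }

    echo "$host: reachable"
}

# Typical use on the client:
#   check_nfs_host server_name
```

A return code of 1 points at the name service (steps 2 and 3), while a return code of 2 points at the network path or the server itself (steps 4 to 6).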

Additionally, you can refer to the article “NFS Hard Mounts vs Soft Mounts”.

3.  From the Server, Verify Service Daemons are running

a) Confirm that the Solaris 10 SMF NFS server services are online:

# svcs -a | grep nfs

b) The statd, lockd, mountd and nfsd processes should be running:
# ps -elf | grep nfs

c) Compare the times when nfsd and mountd started with the time when rpcbind started. rpcbind MUST have started before the NFS daemons.

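d) Verify that the NFS programs have been registered with rpcbind:

# rpcinfo -s

To confirm a specific RPC service, pass the host name and the program number (replace server_name with the host name of the NFS server):

# rpcinfo -t server_name 100003

# rpcinfo -t server_name 100005

# rpcinfo -t server_name 100021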

e) logging may be enabled (not for NFSv4).
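The service check in step a) can be narrowed down to the services that need attention. The sketch below assumes the three-column "svcs -a" layout (STATE, STIME, FMRI) and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk); nfs_not_online is a hypothetical name.

```shell
# nfs_not_online: hypothetical helper, not part of Solaris.
# Reads "svcs -a" output on stdin (columns: STATE STIME FMRI) and prints
# "<fmri> <state>" for each NFS or rpcbind service that is not online.
nfs_not_online() {
    awk '($3 ~ /network\/nfs\// || $3 ~ /network\/rpc\/bind/) && $1 != "online" {
        print $3, $1
    }'
}

# Typical use:
#   svcs -a | nfs_not_online
```

Not every line it prints is a fault: svc:/network/nfs/server is normally disabled on a machine that only acts as a client.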

On the client:

a) Confirm that the Solaris 10 SMF NFS client services are online:

# svcs -a | grep nfs

b) statd and lockd should be running:
# ps -elf | grep nfs

c) You can verify from the client side that the server is working (replace server_name with the host name of the NFS server):

# rpcinfo -s server_name | egrep 'nfs|mountd|lock'

# rpcinfo -u server_name 100003

# rpcinfo -u server_name 100005

# rpcinfo -u server_name 100021
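The registration check can also be scripted against the "rpcinfo -s" listing. This is a sketch under the assumption that the first column of that listing is the RPC program number; missing_rpc_programs is a hypothetical name.

```shell
# missing_rpc_programs: hypothetical helper, not part of Solaris.
# Reads "rpcinfo -s" output on stdin and prints each NFS-related RPC
# program (nfs, mountd, nlockmgr) that does not appear in the listing.
missing_rpc_programs() {
    awk '
        { seen[$1] = 1 }    # remember every program number in column 1
        END {
            if (!("100003" in seen)) print "nfs (100003)"
            if (!("100005" in seen)) print "mountd (100005)"
            if (!("100021" in seen)) print "nlockmgr (100021)"
        }'
}

# Typical use from the client:
#   rpcinfo -s server_name | missing_rpc_programs
```

No output means all three programs are registered. A missing mountd is only a problem for NFSv2 and NFSv3 mounts, since NFSv4 does not use it.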

4. Confirm proper syntax of  dfstab share entries on NFS server.

Solaris OS defines shared (or exported) filesystems in the /etc/dfs/dfstab  file.  The standard syntax of lines in that file is:

share [-F fstype] [-o options] [-d "<text>"] <pathname> [resource]

For example, the following /etc/dfs/dfstab file is for a server that makes available the filesystems /usr, /var/spool/mail and /home:

share -F nfs /usr
share -F nfs /var/spool/mail
share -F nfs /home

You can add normal mount options to these lines, such as ro, rw and root. This is done by preceding the options with the -o flag. The following example shows our /etc/dfs/dfstab file, with all filesystems shared read-only:

share -F nfs -o ro /usr
share -F nfs -o ro /var/spool/mail
share -F nfs -o ro /home

To add new shares to existing ones, simply run the shareall command:
# shareall

This shares ALL filesystems listed in the /etc/dfs/dfstab file. If you have never shared filesystems from this machine before, you must run the nfs.server script:

# /etc/init.d/nfs.server start
This runs the shareall(1M) command and starts the NFS daemons mountd(1M) and nfsd. The “nfs.server start” procedure is also run at boot, when the system enters run level 3. On Solaris 10, where NFS is managed by SMF, the equivalent is “svcadm enable network/nfs/server”.
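To review what a dfstab file will share before running shareall, the entries can be summarised with a few lines of awk. This is a minimal sketch: list_dfstab_shares is a hypothetical name, and the parsing assumes entries like the examples above, without a -d description (a quoted description containing spaces would confuse it).

```shell
# list_dfstab_shares: hypothetical helper, not part of Solaris.
# Reads dfstab-style lines on stdin and prints "<pathname> <options>".
# Prints "(none)" when an entry has no -o options.
# Assumption: no -d "description" argument is present.
list_dfstab_shares() {
    awk '
        $1 == "share" {
            opts = "(none)"
            path = ""
            for (i = 2; i <= NF; i++) {
                if ($i == "-o") { opts = $(i + 1); i++ }      # option list follows -o
                else if ($i == "-F") { i++ }                    # skip the fstype value
                else if (path == "" && substr($i, 1, 1) == "/") { path = $i }
            }
            if (path != "") print path, opts
        }'
}

# Typical use on the server:
#   list_dfstab_shares < /etc/dfs/dfstab
```

Comment lines and blank lines are skipped because their first field is not "share".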

5. Confirm file system is shared as seen on both ends.

The NFS server is the system that shares a file system. The “showmount -e” or “dfshares” command displays what is being shared. From the client, run the same command with the NFS server name as an argument.

# showmount -e

Note that NFSv4 does not use mountd. If mountd is not running, showmount will not work.

6. Verify mount point exists and is in use

To display statistics for each NFS-mounted file system, use the command “nfsstat -m”. This command also shows which options were used when the file system was mounted. You can also check the contents of /etc/mnttab, which shows what is currently mounted. Lastly, compare the date and time on the server and the client. A clock mismatch can make files appear to have been created in the future, which causes confusion.
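The /etc/mnttab check can be reduced to the NFS entries only. The sketch below assumes the usual mnttab field order (resource, mount point, file system type, options, mount time); nfs_mounts is a hypothetical name.

```shell
# nfs_mounts: hypothetical helper, not part of Solaris.
# Reads mnttab-style lines on stdin (fields: resource, mount point,
# file system type, options, mount time) and prints the mount point,
# resource and mount options of every entry whose type is "nfs".
nfs_mounts() {
    awk '$3 == "nfs" { print $2, $1, $4 }'
}

# Typical use on the client:
#   nfs_mounts < /etc/mnttab
```

If an expected mount point is missing from the output, the file system is not currently mounted over NFS on this client.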

Ramdev


I started unixadminschool.com (aka gurkulindia.com) in 2009 as my own personal reference blog, and later realized that my learnings might be helpful to other Unix admins if I managed my knowledge base in a more user-friendly format. The result is today's unixadminschool.com. You can connect with me at https://www.linkedin.com/in/unixadminschool/

13 Responses

  1. Chetan says:

    Hi,

    I am trying to set up NFS on a server, but on the client side, when I check NFS, the client service goes to the “offline*” state.
    ROOT@SAPTEST4# svcs | grep -i nfs —————– (CLIENT SIDE O/P)
    online Apr_15 svc:/network/nfs/status:default
    online Apr_15 svc:/network/nfs/cbd:default
    online Apr_15 svc:/network/nfs/mapid:default
    online Apr_15 svc:/network/nfs/nlockmgr:default
    online Apr_15 svc:/network/nfs/rquota:default
    disable Apr_20 svc:/network/nfs/server:default
    offline* Apr_15 svc:/network/nfs/client:default

    What does offline* mean?
    I restarted the svc:/network/nfs/server:default service, but within 2 minutes it goes to the disabled state again.

    When I try to mount the FS on the client it completes successfully, but when I try to create any file or directory I get a permission denied error.

    Please help me.

    • Ramdev Ramdev says:

      @Chetan, we can start the client services using svcadm enable nfs/client, not by restarting the nfs/server service.

      As per your comment, I assume that the client service is running fine on the client machine, which is why you are able to mount it.

      About the permission denied error: please check the permissions on the shared directory at the source, change them to 777, and test writing a file. If that allows you to write, then the problem is with permissions. If it still doesn't allow you to write, then the problem is the sharing options, which means you have to share the directory with proper rw access for the client. Please check section 4 of this post.

  2. chetan says:

    Hi sir,

    Following steps that i taken on nfs server end

    ===============================================================================================
    ######### SERVER SIDE #############
    ===============================================================================================

    bash-3.2# share -F nfs /jumpstart
    bash-3.2# shareall
    bash-3.2# exportfs -va
    shareall -F nfs
    bash-3.2# showmount -e
    export list for APPLE:
    /jumpstart (everyone)
    bash-3.2# ps -ef | grep -i nfs
    daemon 1502 1 0 00:05:07 ? 0:00 /usr/lib/nfs/lockd
    daemon 1486 1 0 00:05:06 ? 0:00 /usr/lib/nfs/statd
    daemon 1483 1 0 00:05:06 ? 0:00 /usr/lib/nfs/nfs4cbd
    root 1643 1 0 00:05:14 ? 0:00 /usr/lib/nfs/mountd
    daemon 1484 1 0 00:05:06 ? 0:00 /usr/lib/nfs/nfsmapid
    daemon 1649 1 0 00:05:14 ? 0:00 /usr/lib/nfs/nfsd
    bash-3.2# svcs | grep -i nfs
    online 0:05:05 svc:/network/nfs/mapid:default
    online 0:05:06 svc:/network/nfs/cbd:default
    online 0:05:06 svc:/network/nfs/status:default
    online 0:05:07 svc:/network/nfs/nlockmgr:default
    online 0:05:12 svc:/network/nfs/client:default
    online 0:05:12 svc:/network/nfs/rquota:default
    online 0:05:14 svc:/network/nfs/server:default
    bash-3.2# dfshares
    RESOURCE SERVER ACCESS TRANSPORT
    APPLE:/jumpstart APPLE - -
    bash-3.2# svcs | grep -i RPC
    online 0:05:05 svc:/network/rpc/bind:default
    online 0:05:11 svc:/network/rpc/gss:default
    online 0:05:12 svc:/network/rpc/cde-calendar-manager:default
    online 0:05:12 svc:/network/rpc/cde-ttdbserver:tcp
    online 0:05:12 svc:/network/rpc/smserver:default
    online 0:05:12 svc:/network/rpc-100235_1/rpc_ticotsord:default
    bash-3.2# ping 192.168.254.11
    192.168.254.11 is alive
    ===============================================================================================
    ######### CLIENT SIDE #############
    ===============================================================================================

    bash-3.2# ps -ef | grep -i nfs
    daemon 1457 1 0 00:08:32 ? 0:00 /usr/lib/nfs/statd
    daemon 1481 1 0 00:08:33 ? 0:00 /usr/lib/nfs/lockd
    daemon 1462 1 0 00:08:33 ? 0:00 /usr/lib/nfs/nfs4cbd
    daemon 1463 1 0 00:08:33 ? 0:00 /usr/lib/nfs/nfsmapid
    bash-3.2# svcs | grep -i nfs
    online 0:08:32 svc:/network/nfs/status:default
    online 0:08:32 svc:/network/nfs/cbd:default
    online 0:08:33 svc:/network/nfs/mapid:default
    online 0:08:33 svc:/network/nfs/nlockmgr:default
    online 0:08:43 svc:/network/nfs/rquota:default
    bash-3.2# svcs /network/nfs/server
    STATE STIME FMRI
    disabled 0:08:47 svc:/network/nfs/server:default
    bash-3.2# svcs /network/nfs/client
    STATE STIME FMRI
    disabled 0:07:41 svc:/network/nfs/client:default
    bash-3.2# svcadm enable /network/nfs/server
    bash-3.2# svcs /network/nfs/server
    STATE STIME FMRI
    disabled 0:11:39 svc:/network/nfs/server:default
    bash-3.2# svcadm enable svc:/network/nfs/client:default
    bash-3.2# svcs /network/nfs/client
    STATE STIME FMRI
    offline* 0:13:49 svc:/network/nfs/client:default
    bash-3.2# mount -F nfs 192.168.254.21:/jumpstart /mnt
    bash-3.2# df -kh /mnt
    Filesystem size used avail capacity Mounted on
    192.168.254.21:/jumpstart
    5.9G 2.2G 3.6G 39% /mnt
    bash-3.2# ls -ld /mnt
    drwxr-xr-x+ 7 root root 512 Apr 29 23:28 /mnt
    bash-3.2# cd /mnt
    bash-3.2# ls -la
    total 62
    drwxr-xr-x+ 7 root root 512 Apr 29 23:28 .
    drwxr-xr-x 40 root root 1024 May 4 12:31 ..
    -r-xr-xr-x 1 root root 17375 Jan 14 2005 analyze_patches
    drwxr-xr-x+ 5 root root 512 Apr 29 20:52 boot
    dr-xr-xr-x+ 3 root root 512 Apr 29 23:24 config
    dr-xr-xr-x+ 2 root root 512 Mar 25 2008 database
    drwx------+ 2 root root 8192 Apr 29 20:50 lost+found
    drwxr-xr-x+ 6 root root 512 Apr 29 21:04 os
    bash-3.2# mkdir test
    mkdir: Failed to make directory “test”; Permission denied
    bash-3.2# dfshares
    nfs dfshares:MANGO: RPC: Program not registered
    bash-3.2# dfshares
    nfs dfshares:MANGO: RPC: Program not registered
    bash-3.2# dfmounts
    nfs dfmounts: can’t contact server: MANGO: RPC: Program not registered
    bash-3.2# svcs | grep -i RPC
    online 0:08:31 svc:/network/rpc/bind:default
    online 0:08:43 svc:/network/rpc/gss:default
    online 0:08:43 svc:/network/rpc/cde-calendar-manager:default
    online 0:08:43 svc:/network/rpc/cde-ttdbserver:tcp
    online 0:08:43 svc:/network/rpc/smserver:default
    online 0:08:43 svc:/network/rpc-100235_1/rpc_ticotsord:default
    bash-3.2# ping 192.168.254.21
    192.168.254.21 is alive
    bash-3.2# showmount -e MANGO
    showmount: MANGO: RPC: Unknown host
    bash-3.2# rpcinfo -s
    program version(s) netid(s) service owner
    100000 2,3,4 udp,tcp,ticlts,ticotsord,ticots rpcbind superuser
    1073741824 1 tcp - 1
    100024 1 ticots,ticotsord,ticlts,tcp,udp status superuser
    100133 1 ticots,ticotsord,ticlts,tcp,udp - superuser
    100021 4,3,2,1 tcp,udp nlockmgr 1
    100234 1 ticotsord gssd superuser
    100424 1 ticotsord - superuser
    100068 5,4,3,2 ticlts - superuser
    100083 1 ticotsord - superuser
    100155 1 ticotsord smserverd superuser
    100134 1 ticotsord ktkt_warnd superuser
    100011 1 udp,ticlts rquotad superuser
    100235 1 ticotsord - superuser
    100099 4 ticotsord - superuser
    100231 1 ticots,ticotsord,ticlts - superuser
    100005 3,2,1 ticots,ticotsord,tcp,ticlts,udp mountd superuser
    100003 4,3,2 tcp,udp nfs 1
    100227 3,2 tcp,udp nfs_acl 1
    100169 1 ticots,ticotsord,ticlts - superuser
    =============================================******==================================================
    Sir, you said “client service running fine on the client machine”, but it is in the offline* state, not online. One more thing: on the client side, the /network/nfs/server service stays disabled even after enabling or restarting it.
    Despite all this I am able to mount the FS, but when I check ls -ld /mnt, why does it show “drwxr-xr-x+” in the output? I think the “+” sign means an ACL is in place.

    On the server side I also put the share command in the /etc/dfs/dfstab file, with the same output.

    Thanks
    Chetan

    • Ramdev Ramdev says:

      @chetan —

      On client side:

      >> You don't need this service running on the client. nfs/server is required only on the server side, and it comes online only if there is an entry in /etc/dfs/dfstab.

      bash-3.2# svcadm enable /network/nfs/server
      bash-3.2# svcs /network/nfs/server
      STATE STIME FMRI
      disabled 0:11:39 svc:/network/nfs/server:default

      >> you just need the below

      bash-3.2# svcadm enable svc:/network/nfs/client:default
      bash-3.2# svcs /network/nfs/client
      STATE STIME FMRI
      offline* 0:13:49 svc:/network/nfs/client:default

      The offline* state means the service is still in transition (that is what the asterisk marks), which usually points to a problem during startup. Just run the command svcs -xv and you will find the log file path for the nfs/client service. That log will tell you why it is not starting.

      >> about the + symbol below

      bash-3.2# ls -ld /mnt
      drwxr-xr-x+ 7 root root 512 Apr 29 23:28 /mnt

      Yes, you have ACLs in place which are restricting you from writing there. Just run “ls -ldv /jumpstart” on the server and see what permissions are actually set. You can also run “getfacl /jumpstart”.

      You can remove the ACL (setfacl -d …… command) if you are on a test machine, and then just try to configure NFS.

      >> Finally, I am a little surprised that you are able to mount the directory on the client when the NFS client service is offline.

      Can you run the command “showmount -e” with the server's IP address instead of the hostname MANGO? If that shows the server's shares, then rpcbind is able to reach the server successfully. It may still fail after you restart the box, so I would like to see the output after you restart your client machine.

  3. chetan says:

    Sir, as you said, I got the log file from the svcs -xv command, so I killed that process and ran the “/lib/svc/method/nfs-client start” command again, and it worked successfully. I also started the “mountd” daemon on the client side and finally rebooted the client.
    After that the offline* issue was resolved; now all server, client and RPC services are online on both sides.
    bash-3.2# svcs -a | grep -i nfs
    disabled 23:57:11 svc:/network/nfs/server:default
    online 23:58:13 svc:/network/nfs/cbd:default
    online 23:58:13 svc:/network/nfs/mapid:default
    online 23:58:14 svc:/network/nfs/status:default
    online 23:58:16 svc:/network/nfs/nlockmgr:default
    online 23:58:21 svc:/network/nfs/rquota:default
    online 0:03:15 svc:/network/nfs/client:default
    =========================================================================
    as u said plz find “ls -ldv /jumpstart” o/p
    bash-3.2# ls -ldv /jumpstart
    drwxr-xr-x 8 root root 512 May 5 00:27 /jumpstart
    0:user::rwx
    1:group::r-x #effective:r-x
    2:mask:r-x
    3:other:r-x
    bash-3.2# showmount -e ——————————— at client side
    no exported file systems for MANGO
    =========================================================================
    The ACL issue is still pending, and if I create any file or directory it shows an error:

    bash-3.2# mount -F nfs 192.168.254.21:/jumpstart /mnt
    bash-3.2#
    bash-3.2#
    bash-3.2# showmount -e
    no exported file systems for MANGO
    bash-3.2# cd /mnt/
    bash-3.2#
    bash-3.2#
    bash-3.2# ls -ldv /mnt
    drwxr-xr-x+ 8 root root 512 May 5 00:27 /mnt
    0:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
    /append_data/execute/delete_child/read_attributes/write_attributes
    /read_acl/write_acl/synchronize:allow
    1:owner@::deny
    2:group@:add_file/write_data/add_subdirectory/append_data/delete_child
    /write_attributes/write_acl:deny
    3:group@:list_directory/read_data/execute/read_attributes/read_acl
    /synchronize:allow
    4:group@:add_file/write_data/add_subdirectory/append_data/delete_child
    /write_attributes/write_acl:deny
    5:everyone@:list_directory/read_data/execute/read_attributes/read_acl
    /synchronize:allow
    6:everyone@:add_file/write_data/add_subdirectory/append_data
    /delete_child/write_attributes/write_acl:deny
    bash-3.2# uptime
    12:33am up 37 min(s), 2 users, load average: 0.01, 0.01, 0.20
    bash-3.2# getfacl /jumpstart
    /jumpstart: No such file or directory
    bash-3.2# getfacl /mnt

    # file: /mnt
    # owner: root
    # group: root
    user::rwx
    group::r-x #effective:r-x
    mask:r-x
    other:r-x
    bash-3.2# cd /mnt
    bash-3.2#
    bash-3.2#
    bash-3.2# ls -la
    total 64
    drwxr-xr-x+ 8 root root 512 May 5 00:27 .
    drwxr-xr-x 40 root root 1024 May 7 23:56 ..
    -r-xr-xr-x 1 root root 17375 Jan 14 2005 analyze_patches
    drwxr-xr-x+ 5 root root 512 Apr 29 20:52 boot
    dr-xr-xr-x+ 3 root root 512 Apr 29 23:24 config
    dr-xr-xr-x+ 2 root root 512 Mar 25 2008 database
    drwx------+ 2 root root 8192 Apr 29 20:50 lost+found
    drwxr-xr-x+ 6 root root 512 Apr 29 21:04 os
    drwxrwxrwx+ 2 root root 512 May 5 00:27 test1
    bash-3.2# mkdie test1
    bash: mkdie: command not found
    bash-3.2# mkdiee test1
    bash: mkdiee: command not found
    bash-3.2# mkdier test1
    bash: mkdier: command not found
    bash-3.2# mkdir tet
    mkdir: Failed to make directory “tet”; Permission denied
    bash-3.2# touch tet
    touch: cannot create tet: Permission denied

    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    One more request: please write a blog post on ACLs and RBAC (from basics to troubleshooting), because this kind of issue comes up daily for sysadmins, so it would be really helpful.

    Many thanks for replying,
    Chetan

  4. Ramdev Ramdev says:

    Glad to know that the client service problem is resolved. I will post about ACLs sometime soon.

  5. Chetan says:

    Ramdev@

    Many thanks for your help. I think there is a patch issue as well, because we logged a case with Oracle and they say there is a bug in the current patch, which is why the ACL issue appeared on the client side. I think the same issue exists in the sol 10_U10 patch too.

    Anyway, I am waiting for the ACL and RBAC (user management) blog.

    Again, many thanks.

    Chetan

  6. Aravind says:

    We have more than 30 groups which should pass through NFS v3. We have tried all possibilities, but the NFS services crash frequently on Solaris 10.
    We need help to overcome the default NFS group limit (16 groups supported by default).

    • Ramdev Ramdev says:

      Hi Arvind, you have hit a well-known limitation of NFS with AUTH_SYS (the authentication method used to authenticate client connections). AUTH_SYS cannot handle users who belong to more than 16 groups. Setting the kernel parameter NGROUPS_UMAX=32 won't help in this case.

      As a workaround, I have noticed that Oracle officially recommends using ACLs for user access instead of groups.

      If you have the option of using Linux as the NFS server, one workaround is to run rpc.mountd with the “-g” option (refer to the man page for more info).

  7. Aravind says:

    Hi Ramdev,
    Thanks for the reply. I received the same input from Sun support, and as per their updates it will be resolved in Solaris 11 (delivered in s11u1_04). I don't know whether this will solve it or not.

  8. Laxxi says:

    Hi Ram,

    I would like to give user-level access to one of the shared filesystems.
    Suppose server1 is my master, server2 is the client, /test is the filesystem shared from server1 to server2, and /data is the mount point on the client. I want only the root user and the DBA to access the FS, and to restrict it for all others.

    Could you please help me with the syntax here?

    Laxxi

  9. Siva says:

    How can I bring this service back online? What is the impact if it is in maintenance?
    svc:/network/nfs/cbd:default (NFS callback service)
    State: maintenance since Sun Jun 30 07:37:01 2013
    Reason: Restarting too quickly.
    See: http://sun.com/msg/SMF-8000-L5
    See: man -M /usr/share/man -s 1M nfs4cbd
    See: /var/svc/log/network-nfs-cbd:default.log
    Impact: This service is not running.

    /root# svcs -a |grep -i svc:/network/nfs/cbd:default
    maintenance 7:37:01 svc:/network/nfs/cbd:default

  10. venkat says:

    hi ram,

    On the Solaris client side we are unable to write to the mount point; it shows access denied. On the server side we can write.

    We checked the mount point permissions:
    ls -ld /xyz

    drwxrw-r-- oracle dba
    Services are online on both sides. We tried remounting the FS on the client side.
    We tried sharing and unsharing on the server side.
    We checked /etc/dfs/dfstab and /etc/vfstab.

    We checked dfmounts, dfshares and showmount -e servername; all are fine.

    So we are unable to find the solution. Please help me out with this issue.

    Thanks in advance.
