Solaris 10 Troubleshooting SMF: Debugging Services Controlled by Solaris SMF

The Solaris 10 Operating System introduced the Service Management Facility (SMF), which manages system services such as daemons. Debugging a daemon traditionally involves killing it and restarting it with debugging options. Under SMF a killed daemon is restarted immediately, so additional steps are needed.

 

 

Steps to follow: there are three options:

1. Disable the service and start the daemon(s) manually

2. Modify the service method

3. Modify the service manifest

1. Disable the service and start the daemon(s) manually.

This is the simplest option and is generally recommended. The only complications arise when other services depend on the service being debugged, or when the service itself starts multiple daemons. Because SMF service state persists across reboots, a plain disable can have undesirable consequences: the service stays disabled after the next boot. Using "svcadm disable -t" is therefore recommended, as it disables the service only temporarily and the service is enabled again on reboot.
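Before disabling a service it can help to list what depends on it, and afterwards to confirm that the disable really is temporary. The following is a minimal sketch, not part of SMF: the helper name is_temp_disabled is made up for illustration, and it assumes that svcs -l prints an "enabled      false (temporary)" line, as in the keyserv example in this article.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if SERVICE is disabled only until the next
# reboot, judged from the "enabled" line of `svcs -l` (format assumed from
# the keyserv example in this article).
is_temp_disabled() {
    svcs -l "$1" | grep '^enabled  *false (temporary)' >/dev/null
}

# Typical use (run as root):
#   svcs -D keyserv              # list the services that depend on keyserv
#   svcadm disable -t keyserv
#   is_temp_disabled keyserv && echo "keyserv will be enabled again at boot"
```

A plain "enabled      false" line, without "(temporary)", makes the helper fail, which is the case to watch out for before a reboot.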

The steps involved are:

a. The service is temporarily disabled with svcadm

b. The relevant daemons are restarted with the required options

c. Debug the problem

d. If still running, kill the relevant daemons

e. The service is enabled with svcadm

Example – keyserv

 

a. The service is temporarily disabled with svcadm

# svcs keyserv

STATE          STIME    FMRI

online         16:21:12 svc:/network/rpc/keyserv:default

# svcadm disable -t keyserv

# svcs keyserv

STATE          STIME    FMRI

disabled       16:26:55 svc:/network/rpc/keyserv:default

# svcs -l keyserv

fmri         svc:/network/rpc/keyserv:default

name         RPC encryption key storage

enabled      false (temporary)

state        disabled

next_state   none

state_time   Thu May 17 16:26:55 2007

logfile      /var/svc/log/network-rpc-keyserv:default.log

restarter    svc:/system/svc/restarter:default

contract_id

dependency   require_all/restart svc:/network/rpc/bind (online)

dependency   require_all/restart svc:/system/identity:domain (online)

#

b. The relevant daemons are restarted with the required options

# pgrep -lf keyserv

# keyserv -D 2>/var/tmp/keyserv.debug 1>&2 &

[1]     3072

# pgrep -lf keyserv

3072 keyserv -D

#

c. Debug the problem

# tail -f /var/tmp/keyserv.debug

default disk cache size: 1MB

supported mechanisms:

alias           disk cache size

=====           ===============

dh192-0         0MB

d. If still running, kill the relevant daemons

# pgrep -lf keyserv

3072 keyserv -D

# pkill -x keyserv

# pgrep -lf keyserv

[1] + Terminated       keyserv -D 2>/var/tmp/keyserv.debug 1>&2 &

#

e. The service is enabled with svcadm

# svcs keyserv

STATE          STIME    FMRI

disabled       16:26:55 svc:/network/rpc/keyserv:default

# svcadm enable keyserv

# svcs keyserv

STATE          STIME    FMRI

online         16:30:33 svc:/network/rpc/keyserv:default

#
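The manual session above can be collected into a small function. This is only a sketch of steps a to e for a single daemon: the function name debug_daemon and its arguments are illustrative, and it assumes the daemon's process name equals the basename of the command given.

```shell
#!/bin/sh
# Sketch of steps a-e for one daemon. Names are illustrative, not part of SMF.

# debug_daemon SERVICE LOGFILE COMMAND [ARGS...]
debug_daemon() {
    svc=$1
    log=$2
    shift 2

    # a. Temporary disable: the service is enabled again at the next boot.
    #    svcadm may return before the daemon has stopped; check with svcs
    #    or pgrep if in doubt.
    svcadm disable -t "$svc" || return 1

    # b. Start the daemon by hand, sending stdout and stderr to the log.
    "$@" >"$log" 2>&1 &

    # c. Debug the problem; this sketch simply waits for the operator.
    echo "Debug output is in $log; press Return when finished."
    read answer

    # d. If still running, kill the daemon (exact name match, as pkill -x above).
    name=`basename "$1"`
    pkill -x "$name"

    # e. Enable the service again.
    svcadm enable "$svc"
}

# Typical use (run as root):
#   debug_daemon keyserv /var/tmp/keyserv.debug keyserv -D
```

While the function waits at step c, the log can be followed from a second terminal with tail -f, as in the example.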

—————————————————————————


2. Modify the service method

This is more complex, but because the service stays under SMF control it avoids the dependency problems of option 1, and the service is still available after a reboot. Most (but not all) services are started from scripts found in /lib/svc/method. Editing these scripts is not supported unless Sun Services explicitly instructs you to do so in the course of an investigation.

 

The steps involved are:

a. The service is disabled with svcadm

b. The relevant method is edited

c. The service is enabled with svcadm

d. Debug the problem

e. The service is disabled with svcadm

f. The relevant method is restored

g. The service is enabled with svcadm
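If these steps are scripted rather than typed, bear in mind that svcadm normally returns before the state change has completed, which is why the transcripts in this article check svcs after every svcadm call. A script can poll in the same way. The helper below is a sketch with a made-up name (wait_for_state); it assumes the service has a single instance, so that svcs -H -o state prints exactly one word.

```shell
#!/bin/sh
# Hypothetical helper: poll `svcs` until SERVICE reaches STATE.

# wait_for_state SERVICE STATE [TRIES]   (one try per second, default 30)
wait_for_state() {
    svc=$1
    want=$2
    tries=${3:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -H omits the header line, -o state prints only the STATE column.
        state=`svcs -H -o state "$svc"`
        [ "$state" = "$want" ] && return 0
        sleep 1
        i=`expr "$i" + 1`
    done
    return 1
}

# Typical use (run as root):
#   svcadm disable nisplus && wait_for_state nisplus disabled
#   svcadm enable nisplus  && wait_for_state nisplus online
```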

Example – the NIS+ cache manager

a. The service is disabled with svcadm

# svcs nisplus

STATE          STIME    FMRI

online         16:37:12 svc:/network/rpc/nisplus:default

# svcadm disable nisplus

# svcs nisplus

STATE          STIME    FMRI

disabled       16:47:16 svc:/network/rpc/nisplus:default

#

b. The relevant method is edited

# grep nis_cachemgr /lib/svc/method/nisplus

/usr/sbin/nis_cachemgr $cachemgr_flags || exit $?

# vi /lib/svc/method/nisplus

… the start options are changed, e.g. by adding '-v'

c. The service is enabled with svcadm

# svcs nisplus

STATE          STIME    FMRI

disabled       16:47:16 svc:/network/rpc/nisplus:default

# svcadm enable nisplus

# svcs nisplus

STATE          STIME    FMRI

online         16:49:57 svc:/network/rpc/nisplus:default

#

d. Debug the problem

# pgrep -lf nis_cachemgr

647 /usr/sbin/nis_cachemgr -v

#

e. The service is disabled with svcadm

# svcs nisplus

STATE          STIME    FMRI

online         16:49:57 svc:/network/rpc/nisplus:default

# svcadm disable nisplus

# svcs nisplus

STATE          STIME    FMRI

disabled       16:50:50 svc:/network/rpc/nisplus:default

#

f. The relevant method is restored

# grep nis_cachemgr /lib/svc/method/nisplus

/usr/sbin/nis_cachemgr -v $cachemgr_flags || exit $?

# vi /lib/svc/method/nisplus

… the start options are restored, e.g. by removing '-v'

g. The service is enabled with svcadm

# svcs nisplus

STATE          STIME    FMRI

disabled       16:50:50 svc:/network/rpc/nisplus:default

# svcadm enable nisplus

# svcs nisplus

STATE          STIME    FMRI

online         16:52:13 svc:/network/rpc/nisplus:default

#
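Editing with vi works, but it is easy to forget what the method looked like originally. As a hedged alternative, the edit and the restore can be scripted so that a backup is always kept. The function names and the .orig/.tmp file suffixes below are illustrative choices, and the sed expression assumes the daemon path is followed by a space, as in the nisplus method shown above.

```shell
#!/bin/sh
# Sketch: add a debug flag to a daemon's command line in a method script,
# keeping a backup so the original can be restored. Names are illustrative.

# add_daemon_flag METHOD DAEMON FLAG
add_daemon_flag() {
    method=$1
    daemon=$2
    flag=$3
    # Keep only the first backup, so a repeated run cannot overwrite it.
    [ -f "$method.orig" ] || cp -p "$method" "$method.orig" || return 1
    # Insert FLAG directly after the daemon path. The result is written with
    # cat into the existing file, so its owner and mode are preserved.
    sed "s|$daemon |$daemon $flag |" "$method.orig" >"$method.tmp" &&
        cat "$method.tmp" >"$method" &&
        rm -f "$method.tmp"
}

# restore_method METHOD
restore_method() {
    [ -f "$1.orig" ] && cat "$1.orig" >"$1" && rm -f "$1.orig"
}

# Typical use (run as root, with the service disabled):
#   add_daemon_flag /lib/svc/method/nisplus /usr/sbin/nis_cachemgr -v
#   ... enable, debug, disable ...
#   restore_method /lib/svc/method/nisplus
```

The same support caveat applies as for manual edits: changing the delivered method scripts is only appropriate when instructed.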

3. Modify the service manifest

This is more complex again but, like option 2, it avoids the dependency problems and keeps the service available after a reboot. This approach could also be used to define an alternative service (e.g. a service foo-debug for service foo): the original service is disabled and the debug service is enabled in its place. The problem is that anything depending on the original service takes no account of the new service name. For this reason, defining alternative services is not advised.

The manifest defines the start method, and it can be changed to use a different one. The method can be changed either directly with editprop inside svccfg or, again using svccfg, by exporting the manifest, editing it and importing it again. Modifying the standard manifests is not supported unless Sun Services explicitly instructs you to do so in the course of an investigation.
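For the direct route, a non-interactive variant is sketched here. It uses svccfg setprop in place of the interactive editprop mentioned above, followed by svcadm refresh so that the running configuration picks up the change. The helper name set_start_exec is hypothetical, and the sketch assumes start/exec is defined at the service level, as the listprop output in the keyserv example suggests.

```shell
#!/bin/sh
# Sketch: switch a service's start method without an export/import cycle.
# set_start_exec is a hypothetical helper name.

# set_start_exec SERVICE METHOD_PATH
set_start_exec() {
    # Print the current value first, so it can be put back afterwards.
    svcprop -p start/exec "$1"
    # Change the property in the repository ...
    svccfg -s "$1" setprop start/exec = astring: "$2" || return 1
    # ... and have the service re-read its configuration.
    svcadm refresh "$1"
}

# Typical use (run as root, with the service disabled):
#   set_start_exec keyserv /var/tmp/keyserv     # switch to the debug method
#   set_start_exec keyserv /usr/sbin/keyserv    # put the original back
```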

The steps involved are:

a. The service is disabled with svcadm

b. The manifest is edited (eg exported, edited, re-imported)

c. A new method is written

d. The service is enabled with svcadm

e. Debug the problem

f. The service is disabled with svcadm

g. The original manifest is restored

h. The service is enabled with svcadm

Example – keyserv

 

a. The service is disabled with svcadm

# svcs keyserv

STATE          STIME    FMRI

online         16:43:44 svc:/network/rpc/keyserv:default

# svcadm disable keyserv

# svcs keyserv

STATE          STIME    FMRI

disabled       16:58:34 svc:/network/rpc/keyserv:default

#

b. The manifest is edited (eg exported, edited, re-imported)

# svccfg

svc:> export keyserv > /var/tmp/keyserv.xml

svc:> exit

# cp /var/tmp/keyserv.xml /var/tmp/keyserv.xml.orig

# grep sbin/keyserv /var/tmp/keyserv.xml

<… exec='/usr/sbin/keyserv' …>

# vi /var/tmp/keyserv.xml

… change the start method, eg /var/tmp/keyserv

 

# svccfg

svc:> delete keyserv

svc:> select *keyserv*

Pattern '*keyserv*' doesn't match any instances or services

svc:> import /var/tmp/keyserv.xml

svc:> select *keyserv*

svc:/network/rpc/keyserv> listprop

start/exec                astring  /var/tmp/keyserv

svc:/network/rpc/keyserv> exit

#

c. A new method is written

We need to create the method, in this case /var/tmp/keyserv (the file must be executable):

#!/sbin/sh

#

/usr/sbin/keyserv -D 2>/var/tmp/keyserv.debug 1>&2 &

exit 0

d. The service is enabled with svcadm

# svcs keyserv

STATE          STIME    FMRI

disabled       17:04:26 svc:/network/rpc/keyserv:default

# svcadm enable keyserv

# svcs keyserv

STATE          STIME    FMRI

online         17:05:51 svc:/network/rpc/keyserv:default

# pgrep -lf keyserv

688 /usr/sbin/keyserv -D

#

e. Debug the problem

# tail -f /var/tmp/keyserv.debug

default disk cache size: 1MB

supported mechanisms:

alias           disk cache size

=====           ===============

dh192-0         0MB

f. The service is disabled with svcadm

# svcs keyserv

STATE          STIME    FMRI

online         17:05:51 svc:/network/rpc/keyserv:default

# svcadm disable keyserv

# svcs keyserv

STATE          STIME    FMRI

disabled       17:07:25 svc:/network/rpc/keyserv:default

#

g. The original manifest is restored

# svccfg

svc:> delete *keyserv*

svc:> select *keyserv*

Pattern '*keyserv*' doesn't match any instances or services

svc:> import /var/tmp/keyserv.xml.orig

svc:> select *keyserv*

svc:/network/rpc/keyserv> listprop

start/exec                astring  /usr/sbin/keyserv

svc:/network/rpc/keyserv> exit

#

NOTE: The original manifests in /var/svc/manifest can also be used to restore the service.

h. The service is enabled with svcadm

# svcs keyserv

STATE          STIME    FMRI

disabled       17:08:17 svc:/network/rpc/keyserv:default

# svcadm enable keyserv

# svcs keyserv

STATE          STIME    FMRI

online         17:09:22 svc:/network/rpc/keyserv:default

# pgrep -lf keyserv

705 /usr/sbin/keyserv

#
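The export, delete and import cycle used in steps b and g does not need the interactive svccfg prompt; the same subcommands can be given on the command line. The two functions below are a sketch with made-up names (save_manifest, load_manifest), and they assume the service has already been disabled, as in the example.

```shell
#!/bin/sh
# Sketch: the export / delete / import cycle from steps b and g, driven from
# the shell instead of the interactive svccfg prompt. Names are illustrative.

# save_manifest SERVICE DIR
# Export the manifest to DIR/SERVICE.xml and keep a pristine .orig copy.
save_manifest() {
    svccfg export "$1" >"$2/$1.xml" &&
        cp "$2/$1.xml" "$2/$1.xml.orig"
}

# load_manifest SERVICE FILE
# Replace the service definition in the repository with the one in FILE.
load_manifest() {
    svccfg delete "$1" &&
        svccfg import "$2"
}

# Typical use (run as root, with the service disabled):
#   save_manifest keyserv /var/tmp
#   vi /var/tmp/keyserv.xml                        # change the start method
#   load_manifest keyserv /var/tmp/keyserv.xml
#   ... enable, debug, disable ...
#   load_manifest keyserv /var/tmp/keyserv.xml.orig
```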

 


Ramdev
