Software Distributor Administration Guide: HP-UX 11i v1, 11i v2, and 11i v3 > Appendix B Troubleshooting

Common Problems

This section presents a selection of problems you might encounter and how to resolve them:

Table B-2 Common Problems

Problem

  • Cannot contact target host’s daemon or agent

  • GUI won’t start or missing support files

  • Access to an object is denied

  • Slow network performance

  • Connection timeouts and other WAN problems

  • Disk space analysis is incorrect

  • The packager fails

  • Daemon logfile is too long

  • Cannot read a tape depot

  • Installation fails

  • swinstall or swremove fails with a lock error

 

Cannot Contact Target Host’s Daemon or Agent

If you see the following error message:

ERROR: Could not contact host <hostname>. Make sure the hostname is correct.

it means that the hostname you specified could not be found in the hosts database. Make sure you have typed the hostname correctly (you can use the nslookup command to verify hostnames). If the target hostname is not in the hosts database, but you know its network address, you can use it (in standard “dot” notation) in place of the hostname.
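As a rough illustration of the "dot" notation mentioned above, the sketch below checks whether a string looks like a dotted-decimal network address before you substitute it for a hostname. The function name is_dotted_quad is a hypothetical helper, not part of SD-UX, and the check assumes plain IPv4 addresses.

```shell
# is_dotted_quad ADDRESS  (hypothetical helper, not an SD-UX command)
# Succeeds when ADDRESS consists of four decimal fields separated by
# dots, each in the range 0-255.
is_dotted_quad() {
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        {
            for (i = 1; i <= 4; i++)
                if ($i !~ /^[0-9]+$/ || $i + 0 > 255) exit 1
        }'
}
```

If the check succeeds, the address can be given wherever a target hostname is expected; otherwise, verify the name with nslookup first.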

If you see this error message:

ERROR: A Remote Procedure Call to a daemon has failed. Could not start a management session for <target>. Make sure the host is accessible from the network, and that its daemon, swagentd, is running. If the daemon is running see the daemon logfile on this target for more information.

it means SD-UX could not contact the daemon program on a specific target system. Note that this may occur even if you haven’t specified any targets, for example, if the daemon on your local host is not running.

Resolution

If the SD-UX daemon/agent is not installed on a given target system, you must install it before you can use SD-UX.

If you’ve verified that the daemon/agent component has been installed on a target system and you still have trouble contacting it, check to see that the daemon is running:

  1. On the target system, type:

    ps -e | grep swagentd

  2. If the daemon does not appear to be running, you can start it by typing (as root on the target system):

    /usr/sbin/swagentd

  3. If you attempt to start a daemon when one is already running, you will see a message about the other daemon; this is harmless.

    You can also kill and restart a currently running daemon by typing:

    /usr/sbin/swagentd -r

Other possible causes for this problem are listed in the section “Connection Timeouts and Other WAN Problems”.
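The daemon check described in the steps above could be scripted roughly as follows. This is a minimal sketch: swagentd_listed is a hypothetical helper name, and it assumes that the command name is the last column of ps -e output.

```shell
# swagentd_listed  (hypothetical helper)
# Reads `ps -e` output on stdin and succeeds only if some process has
# the command name "swagentd" in the last column (an exact match, so
# other names that merely contain the string are ignored).
swagentd_listed() {
    awk '$NF == "swagentd" { found = 1 } END { exit !found }'
}

# Intended use on the target system, as root (not executed here):
#   ps -e | swagentd_listed || /usr/sbin/swagentd
```

The commented usage line starts the daemon only when no running swagentd is listed, which avoids the harmless but noisy message about a second daemon.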

TIP: An easy way to determine if a target system has the SD-UX daemon installed and running is to type:

/usr/sbin/swlist -l depot @ <one or more target hostnames>

which will attempt to contact each target to get a list of registered depots. Those targets which have the SD-UX daemon installed will report either:

# Initializing...
# Target <hostname> has the following depot(s):
# <...insert list of depots...>

or

# Initializing...
WARNING: No depot was found for <hostname>.

For more information on daemon activity, see the daemon logfile in /var/adm/sw/swagentd.log.
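When the tip above is used against many targets, a small filter can list which hosts answered. The sketch below is only illustrative: hosts_that_answered is a hypothetical name, and it assumes the two replies appear exactly as quoted above, one message per line. Real output may differ in spacing or quoting.

```shell
# hosts_that_answered  (hypothetical helper)
# Reads the output of `swlist -l depot @ host...` on stdin and prints
# the hostname from each line that matches one of the two replies a
# running daemon is expected to produce.
hosts_that_answered() {
    sed -n \
        -e 's/^# Target \(.*\) has the following depot(s):.*$/\1/p' \
        -e 's/^WARNING: No depot was found for \(.*\)\.$/\1/p'
}

# Intended use (not executed here):
#   /usr/sbin/swlist -l depot @ hostA hostB 2>&1 | hosts_that_answered
```

Any target missing from the printed list is a candidate for the daemon checks described earlier in this section.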

GUI Won’t Start or Missing Support Files

You can start the GUI in these ways:

  • For swinstall, swcopy, or swremove, type the command with no additional options or arguments.

  • Include the -i option with any other options and arguments when you type the command on the command line. (Required for swlist.)

  • For the Job Browser, type sd on the command line.

When using the GUI, you might encounter these problems:

  • Can’t open the display or display is set incorrectly

  • Missing GUI support files

Resolution

If you have invoked the GUI on a remote system, you may see the following error messages:

Xlib: connection to <display> refused by server
Xlib: Client is not authorized to connect to Server
Error: Can’t Open display.

Check that you have set the $DISPLAY environment variable correctly on the remote system to identify your display. If it is correct, you may have to enable the remote host to make connections to your X server via the xhost(1) command or by modifying your /etc/X*.hosts file.
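Before looking at X server access control, it can help to confirm that the DISPLAY value is at least well formed. The following is a minimal sketch; display_is_plausible is a hypothetical helper, and the pattern assumes the usual [host]:display[.screen] form.

```shell
# display_is_plausible VALUE  (hypothetical helper)
# Succeeds when VALUE has the form [host]:display[.screen], for
# example "myws:0.0" or ":0".
display_is_plausible() {
    printf '%s\n' "$1" | grep -E '^[^:]*:[0-9]+(\.[0-9]+)?$' >/dev/null
}

# Intended use on the remote system (not executed here):
#   display_is_plausible "$DISPLAY" || echo "DISPLAY looks wrong: $DISPLAY" >&2
```

If the value is well formed and the connection is still refused, running xhost +<remote host> on the system that owns the display is one way to grant access, as described above.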

If you see the error message:

swinstall: Error: cannot read file: /usr/lib/sw/ui/smc_install_copy.ui

or

swremove: Error: cannot read file: /usr/lib/sw/ui/smc_remove.ui

the system is telling you that a GUI support file is missing: /usr/lib/sw/ui/smc_install_copy.ui must be installed to run swinstall or swcopy interactively, and /usr/lib/sw/ui/smc_remove.ui must be installed to run swremove interactively. Make sure that the directory /usr/lib/sw/ui exists and includes the requested file. If the file does not exist, you must reinstall the SD-CMDS fileset from your OS media.
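A quick way to check both support files at once might look like the sketch below. The function name missing_ui_files is hypothetical; the two file names and the default directory are the ones named in the error messages above.

```shell
# missing_ui_files [DIR]  (hypothetical helper)
# Prints the path of each GUI support file that is absent or unreadable
# under DIR (default: /usr/lib/sw/ui). Returns 0 only when both files
# are readable.
missing_ui_files() {
    dir=${1:-/usr/lib/sw/ui}
    rc=0
    for f in smc_install_copy.ui smc_remove.ui; do
        if [ ! -r "$dir/$f" ]; then
            printf '%s\n' "$dir/$f"
            rc=1
        fi
    done
    return "$rc"
}
```

Any path it prints points to a file that would have to be restored by reinstalling the SD-CMDS fileset.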

Access To An Object Is Denied

Denial of access to SD-UX objects may have a number of causes, including:

  • ACL permissions

  • Inter-host secrets

  • Working with image copies of depots

Resolution

Generally, when SD-UX denies access to an object, a message tells you that you do not have the required access permission. Yet, it may be unclear which object is not accessible. For example, when you use swcopy to copy a product from system A to a depot, SD-UX checks these ACLs:

  1. If the destination depot does NOT exist, the host ACL is checked to verify that the user has “insert” permission.

  2. If the destination depot does exist, the depot ACL is checked to verify that the user has write permission.

  3. The source depot’s ACL is checked to make sure the user has read permission on the source depot.

  4. The source product’s ACL is also checked to make sure that the user and the destination system both have read access to the product.

If any of these access permissions is absent, the whole operation is disallowed, and you must read the error message carefully to understand the exact cause. To see more about what type of security or access problems exist, see the daemon log file on the target system: /var/adm/sw/swagentd.log

The Effects of ACL Modifications

The default ACLs make it fairly easy to administer ACLs, but do not always give the desired level of access control. When you change an ACL to restrict access, especially by removing the any_other read permission, you may restrict access in unexpected ways. Host entries are required for any destination systems for swcopy and swinstall operations.

See Chapter 9: “SD-UX Security” for a full discussion of the access tests performed for each operation.

Do Not Modify ACL Files Without swacl

Since SD-UX stores ACLs in the file system as plain text files, you may be tempted to edit them with a conventional editor. This can lead to unexpected corruption of the ACL. Most cases of this corruption simply result in a message indicating the corruption, but adding entries to the ACL file without updating the num_entries value can result in unreported problems and cause SD-UX to deny access. A common failure could occur, for instance, if you inserted a user entry in the ACL file. This could push the any_other entry down beyond the num_entries limit. The ACL manager would never read the any_other entry, and you would have access problems. The best guard against this situation is to always use the swacl command to manipulate ACLs.

Inter-host Secrets

The default /var/adm/sw/security/secrets file contains a single entry:

default -sdu-

If you wish to explicitly name all hosts from which controllers can be run, you must replace the -sdu- with a different default secret, or eliminate the entire entry. See Chapter 9: “SD-UX Security” for a thorough discussion of the secrets file.

The controller (for swinstall, swcopy, etc.) looks up the secret for the system on which it runs and passes it in an encrypted form to its agent. The agent receiving a request from the controller looks up the secret for the host from which the call comes, encrypts it, and compares the encryption to that provided by the controller. If the two secrets do not match, access is denied. If you have problems with this mechanism, make sure that all systems have matching entries. You can also revert to the old secrets file (/etc/newconfig/sd/secrets on 9.x and /usr/newconfig/var/adm/sw/security/secrets on 10.x) on all hosts, or simply copy a single secrets file to all hosts.
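To check that two systems have matching entries, you could look up the secret each one would use for the controller's host and compare the results. The sketch below is illustrative only: lookup_secret is a hypothetical helper, and it assumes that each line of the secrets file holds a host name (or the word default) followed by a secret, as in the default entry shown above.

```shell
# lookup_secret FILE HOST  (hypothetical helper)
# Prints the secret that FILE assigns to HOST, falling back to the
# "default" entry when HOST has no entry of its own. Fails when the
# file contains neither.
lookup_secret() {
    awk -v host="$2" '
        $1 == host      { exact = $2; have_exact = 1 }
        $1 == "default" { fallback = $2; have_default = 1 }
        END {
            if (have_exact)        print exact
            else if (have_default) print fallback
            else                   exit 1
        }' "$1"
}
```

Running it against copies of the secrets file from the controller host and from the target, with the controller's hostname as HOST, should print the same value on both if the entries match.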

Working With Depot Images

You may encounter a problem in using cp, tar, cpio, dd, and other commands to copy images of depots for use on other systems. Depot and product ACLs in the image have built-in knowledge of the host on which the depot originated. In particular, an ACL default realm will be wrong and local users will be confused with users on the originating host. For example, attempts to add local users to the access list will, in fact, grant access to remote users. There is no way to alter the default realm of an ACL from that set when it is created.

Another common problem with such images occurs if you import them to systems that cannot resolve all the hostnames (see resolver(4) and nslookup(1)) that exist in the ACLs.

If your purpose is to create a “staged” installation, use swcopy to propagate the depot. This creates new ACLs, based on local templates, for each instance of the depot.

If the sole intent of a depot is for such image distribution, you may wish to set the swpackage create_target_acls option to false to prevent ACL creation on the depot and products during the swpackage operation. This option creates tape and CD-ROM images. Depots and products without ACLs grant the local superuser all privileges, while all other users and systems have read access. Note that when you copy or install this ACL-less depot with swcopy or swinstall, the copies (installations) are automatically protected by ACLs based on templates on the destination host.

Slow Network Performance

When using swinstall or swcopy in an environment where network bandwidth is the “bottleneck,” the file transfer rate between source and target can become very slow.

Resolution

The compress_files=true option compresses files transferred from a source depot to a target. This can reduce network usage by approximately 50%; the exact amount of compression depends on the type of files. Binary files compress less than 50%, text files more.

The greatest throughput improvements are seen when transfers are across a slow network (approximately 50 kbyte/sec or less), and the source depot server is serving a few target hosts at a time.

NOTE: This option should be set to true only when network bandwidth is clearly restricting total throughput. If this option is used with a fast network or with a depot server simultaneously connected to many target hosts, this option can actually reduce overall throughput or performance, unless the source depot is already compressed.

If it is not clear that this option will help in your situation, compare the throughput of a few install or copy tasks (both with and without compression) before changing this option value.
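One way to make that comparison concrete is to turn each test run into a transfer rate. The helper below is a sketch under stated assumptions: kbytes_per_sec is a hypothetical name, 1 kbyte is taken as 1024 bytes, and you supply the bytes transferred and the elapsed seconds from your own measurements.

```shell
# kbytes_per_sec BYTES SECONDS  (hypothetical helper)
# Prints the transfer rate in kbyte/sec (1 kbyte = 1024 bytes) with one
# decimal place. Fails when SECONDS is not positive.
kbytes_per_sec() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        if (s <= 0) exit 1
        printf "%.1f\n", b / 1024 / s
    }'
}
```

For example, kbytes_per_sec 10485760 200 prints 51.2, which is close to the slow-network threshold mentioned above. Comparing the figure for a run with compress_files=true against one with compress_files=false indicates whether compression helps in your environment.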

See Chapter 8: “Reliability and Performance” for more information about performance options.

Connection Timeouts and Other WAN Problems

Low-throughput, wide-area networks can cause SD-UX to encounter time-out problems when establishing and maintaining network connections with remote agents on other systems.

If you see the following messages:

ERROR: A Remote Procedure Call to a daemon has failed. Could not start a management session for <target>. Make sure the host is accessible from the network, and that its daemon, swagentd, is running. If the daemon is running see the daemon logfile on this target for more information.

or

ERROR: Could not perform the requested operation for <target>, possibly due to a network communications failure. Check that the host is still accessible from the network.

and you have verified that the system is up and the daemon program (swagentd) is running on it, it may be that network delays are causing the connection to time-out.

Resolution

Increase the time-out value used by SD-UX when performing Remote Procedure Calls (RPCs) by specifying a higher value for the rpc_timeout option, either via the command line or in the defaults file. RPC time-out values range from 0 to 9, with 9 being the longest time-out. The default RPC time-out value is 5. Note that these values do not represent any specific time units. See Appendix A for more information on the rpc_timeout option.
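Because only the values 0 through 9 are meaningful, a wrapper script might validate the value before passing it on. This is a minimal sketch; rpc_timeout_option is a hypothetical helper, and the depot and target names in the usage comment are placeholders.

```shell
# rpc_timeout_option VALUE  (hypothetical helper)
# Prints "-x rpc_timeout=VALUE" when VALUE is a single digit 0-9, the
# range stated above. Otherwise writes a message to stderr and fails.
rpc_timeout_option() {
    case "$1" in
        [0-9]) printf '%s\n' "-x rpc_timeout=$1" ;;
        *)     echo "rpc_timeout must be 0-9, got: $1" >&2
               return 1 ;;
    esac
}

# Intended use (placeholder names, not executed here):
#   swinstall $(rpc_timeout_option 7) -s depothost:/depot Product @ target
```

Raising the value in small steps from the default of 5 makes it easier to see which setting is enough for a given link.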

Increasing the rpc_timeout can also help in situations where the target agents in an install or copy session are timing out when trying to contact the source agent. This problem is indicated by the following error messages in the agent log file:

ERROR: Could not open remote depot/root <path> due to an RPC or network I/O error.
ERROR: Cannot open source. Check above for errors, as well as the daemon logfile on the source host (default location:/var/adm/sw/swagentd.log).
ERROR: Cannot continue the Analysis Phase until the previous errors are corrected.

Another factor that can affect RPC timeouts on a slow network is the choice of network protocol. SD-UX supports both UDP- and TCP-based communication (the default is TCP). TCP communication is more reliable on a WAN because it is connection-based. SD will fall back to a UDP connection if the TCP connection fails for some reason. The default binding can be set with the -x rpc_binding_info option.

Note that the daemon program (swagentd) listens for both UDP- and TCP-based RPCs by default. See Appendix A for more information on the rpc_binding_info option.

A final WAN-related issue may arise when using the interactive GUI. During the analysis and execution phases of an interactive session, each target agent is periodically polled for up-to-date status information. The polling_interval option can be used to control the number of seconds that elapse between successive status polls of a given target system. On networks where even this minor data transfer is a problem, you can increase this polling interval, thus decreasing the frequency of polling, and reducing an interactive session’s overall demands on the network. See Appendix A for more information on the polling_interval option.

Disk Space Analysis Is Incorrect

Your installation or copy operation runs out of space even though the disk space analysis succeeded. Upon further checking, you find that the results of the disk space analysis differ from the actual space available.

Resolution

Possible causes of this problem:

  • A control script associated with the installation has consumed disk space by creating or copying additional files that aren’t accounted for during analysis.

  • Your target systems were not idle when the analysis was done and some other activity (unrelated to SD-UX) was consuming disk space.

  • The depot from which the product was installed or copied was created by swpackage with the package_in_place option set to true, and source files have been modified since the product was packaged. The swverify command can be used to diagnose this problem.

Packager Fails

A swpackage operation may fail because of the incorrect use of the end keyword in the Product Specification File (PSF).

Resolution

The end keyword marks the end of a depot, vendor, product, subproduct or fileset specification in a PSF. It requires no value and is optional. However, if you use it and it is incorrectly placed, the specification will fail. If you use the end keyword, check that there is one for every object specification (especially the last one).
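A rough first check for this problem is to count object keywords and end keywords in the PSF. The sketch below is a heuristic, not a PSF parser: check_psf_ends is a hypothetical helper, it assumes each keyword is the first word on its line, and it recognizes only the five object types listed above. It cannot tell whether an end keyword is in the right place, only whether the counts agree.

```shell
# check_psf_ends FILE  (hypothetical helper)
# Counts lines in FILE that begin with an object keyword (depot, vendor,
# product, subproduct, fileset) and lines that begin with "end".
# Prints both counts. Succeeds when "end" is not used at all or when
# the two counts are equal.
check_psf_ends() {
    awk '
        $1 ~ /^(depot|vendor|product|subproduct|fileset)$/ { objects++ }
        $1 == "end" { ends++ }
        END {
            printf "objects=%d ends=%d\n", objects, ends
            exit (ends > 0 && ends != objects)
        }' "$1"
}
```

A mismatch such as objects=2 ends=1 usually means that one specification, often the last, is missing its end keyword.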

Command Logfile Grows Too Large

If you want to reduce the contents of an SD-UX command logfile, follow this procedure:

Resolution

To reduce messages to a minimum, set the verbose command option to 0 in one of the option files or by using the -x option on the command line. For example, entering -x swpackage.verbose=0 on the command line when you run swpackage would reduce the number of entries in the swpackage log to a minimum. See Appendix A for details about setting options.

Daemon Logfile Is Too Long

If you want to shorten (truncate) the SD-UX daemon logfile because it is getting too long, follow this procedure:

Resolution

If the daemon is currently running, DO NOT remove its logfile. The running daemon continues to log messages to its logfile even after you’ve removed it, causing any subsequent information to be lost. Also, the disk space used by the logfile will not be freed as long as the daemon is running.

Instead, truncate the logfile by typing (as root):

echo > /var/adm/sw/swagentd.log

This replaces the previous data in the log with a single empty line.
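If you prefer a logfile of exactly zero bytes, a bare redirection also truncates the file in place without removing it. The sketch below wraps that in a hypothetical helper, truncate_log; it is an alternative to the echo command above, not a documented SD-UX procedure.

```shell
# truncate_log FILE  (hypothetical helper)
# Empties FILE in place with a redirection. The file itself is kept, so
# a running daemon that has it open continues to log to the same file.
truncate_log() {
    : > "$1"
}

# Intended use, as root (not executed here):
#   truncate_log /var/adm/sw/swagentd.log
```

The only difference from echo > file is that the echo form leaves one newline character in the file, while this form leaves none.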

If you inadvertently remove the daemon logfile while the daemon is running, you must kill and restart the daemon if you want to see subsequent daemon log messages and free up the disk space used by the logfile. You can stop (kill) a daemon by typing:

/usr/sbin/swagentd -k

You can also kill and restart a currently running daemon by typing:

/usr/sbin/swagentd -r

Cannot Read a Tape Depot

If you are trying to access a tape depot and see the following error message in the daemon logfile, it means that the tape is either corrupt or is not in SD-UX format.

ERROR: The INDEX file on the source did not exist or could not be read.
ERROR: The target <depot_path> could not be opened.

Resolution

Make sure that you have correctly specified the tape device and that the correct tape is in the drive. SD-UX only reads tapes that are in SD-UX format. For example, SD-UX does not read update format tapes.

Installation Fails

An installation may fail while only part way through the process.

Resolution

SD-UX gives you several restart options:

  • Re-execute the same command from the command line.

  • Recall the session file swinstall.last that was automatically saved for you. (See “Session Files”.)

  • Reset the checkpointing options.

    By default, SD-UX checkpoints to the fileset level, meaning that the operation will start transferring files with the last fileset to be attempted. If you set the reinstall_files option to false, SD-UX restarts distribution and installation with the file that was last attempted. (SD-UX does not support checkpointing below the file level.)

    You can override all checkpointing by setting both the reinstall and reinstall_files options to true. See Appendix A for more information.

swinstall or swremove Fails With a Lock Error

swinstall or swremove fails with the following message:

Cannot lock “/” because another command holds a conflicting lock. The process id of that command is ####.

Resolution

Another SD command is running that prevents the swinstall or swremove command from running. Wait for that command to finish and try again.
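Waiting for the other command can be automated along the following lines. Both function names are hypothetical, and the sketch assumes that the message appears on one line in the form shown above and that you have permission to signal the process (for example, as root).

```shell
# lock_holder_pid  (hypothetical helper)
# Reads swinstall/swremove error text on stdin and prints the process id
# reported in the "conflicting lock" message, if there is one.
lock_holder_pid() {
    sed -n 's/.*process id of that command is \([0-9][0-9]*\).*/\1/p'
}

# wait_for_pid PID [SECONDS]  (hypothetical helper)
# Polls until process PID no longer exists, sleeping SECONDS (default 30)
# between checks. Signal 0 tests for existence without affecting PID.
wait_for_pid() {
    while kill -0 "$1" 2>/dev/null; do
        sleep "${2:-30}"
    done
}
```

Before waiting, ps -f -p <pid> shows which command holds the lock, which helps you decide whether it is safe to wait for it.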

© 1997, 2000-2003, 2006, 2007, 2008 Hewlett-Packard Development Company, L.P.