Warning: This HTML rendition of the RFC is experimental. It is programmatically generated, and small parts may be missing, damaged, or badly formatted. However, it is much more convenient to read via web browsers, however. Refer to the PostScript or text renditions for the ultimate authority.

OSF DCE SIG M. Hubbard (IBM)
Request For Comments: 12.0 August 1992

DCE SERVICEABILITY STRATEGY/DESIGN

DISCUSSION PAPER

INTRODUCTION

Serviceability refers to those functional enhancements that improve the diagnostic features of software. These enhancements involve (1) message logging, (2) tracing and (3) error logging.

This paper stems from the Serviceability Requirements document issued within the DCE SIG by the Serviceability workgroup (see [RFC 11.0]). It is recommended that the reader be familiar with the requirements document before reviewing this paper.

This document is not an official IBM proposal to OSF. It represents the ideas of the author and is only an informal starting point intended to spur discussion on how to implement serviceability within DCE. The proposals/descriptions are not necessarily complete or final. Comments, questions and disagreements are welcome.

Proposed Minimum V1.1 Content

  1. General
    1. Replace \*(sBprintf()\*(sE calls with serviceability interfaces (hooks) proposed within this document to facilitate better integration with the base platform.
    2. Establish guidelines for the placement of these serviceability hooks.
  2. Messaging
    1. Move all message format strings into message catalogs.
    2. Define a common format for all messages.
  3. Error Logging
    1. Define error scenarios where special error logging (beyond just logging a message) is required.
    2. Define a common format for all local error log records.
    3. Enable remote notification of errors via the DME event logger.
    4. Define contents of remote notification events.
  4. Tracing
    1. Enable the remote activation/deactivation of tracing.
    2. Define a set of trace types.
    3. Define trace points within each DCE component.
    4. Define a common format for all trace records.
  5. Documentation
    1. Create a Problem Determination Guide that at a minimum explains all messages, trace options and remote error notification events.

Terminology

  1. log entry

    A data structure containing a header of identifying information plus several bytes of defined data (e.g., error entry, trace entry) which is recorded to disk, screen or memory.

  2. log point

    The location at which a (group of) code statement(s) that generate an entry is placed within a software program (e.g., trace point, error point).

  3. hook

    An interface called to record an entry at a given point in a software program.

  4. hook code

    The code that implements the hook.

  5. ID

    A unique combination of bytes that identifies the component and error/trace logging point within the software program.

MESSAGE/ERROR LOGGING

Purpose

Message and error logging serve a common purpose which is to notify the end user or system administrator about software program activity.

Messages are logged to give immediate local notification of an error or to provide informational feedback. In the case of an error situation, they concisely describe the error that occurred. The administrator/user will then refer to a Problem Determination Guide for a list of causes for the error and the suggested action to be taken. For usability reasons, messages are localized (it is the responsibility of each DCE system vendor to translate messages).

The main purpose of error logging is to locally log control information when an error occurs. This information supplements any error message and is aimed at reducing/avoiding the need for any diagnostic investigation requiring tracing (recreating serious errors should be avoided and tracing can impact system performance).

A secondary element of message/error logging is remote notification. Remote notification is intended to allow:

  1. centralized problem management
  2. unattended operation of DCE server nodes

Activation/Deactivation

  1. Control over the message/error log and the filtering/routing of messages/errors is controlled by the local operating system (and any end user interface it offers). A generalized server that exports an RPCable serviceability control interface (which DME may define) for remotely activating local commands that control serviceability related features could be created.
  2. Logging within client/server stubs and runtimes require special consideration:
    1. Multi-user systems will need to distinguish between executables running as production system services and executables running as end user applications (or system applications under development) in order to avoid centralized system logs from being overrun with user error activity (vs production services activity). The logging service would need to distinguish between these two cases (and route to the appropriate private/central log).
    2. User applications may have to explicitly initialize/terminate the various system logging facilities unless this can be made part of RPC (and DUA) initialization/termination.
    3. Hooks within stubs will likely be manually inserted unless the IDL compiler can be updated.
  3. A way to set the minimum severity level for message/error logging is required.
  4. Catalogs should automatically be opened within the message logging hook code.
  5. Catalogs may exist for each component or for each executable.
  6. The error logging hook must handle the case where the local error log is filled. One possible solution is to maintain two files where logging can alternate (i.e., when one file is filled, logs entries can be written to the secondary file while the first can be archived and cleared becoming the new secondary file).
  7. A serviceability profile file, a set of environment variables or new/extended runtime verbs (RPC and DUA) is required to identify the local log file(s) and to set the minimum recording severity.
  8. Local error logging can be considered optional (leaving just message logging and remote error notification). However, this decision depends on the implementation of the hook, not the code calling it.

General Message/Error Logging Guidelines

  1. Each DCE executable should always log the following general informational messages:
    1. process has successfully started (i.e., ready for requests)
    2. tracing is active (lists all trace flags set)
    3. process has normally/abnormally ended
    4. \&...(TBD)
  2. Other informational messages specific to a server/component that would be of interest to an administrator should also be logged:
    1. deletion of directory entries
    2. login requests
    3. successful completion of replication activity
    4. \&...(TBD)
  3. Two levels of informational severities should exist to allow suppression of detailed informational messages but still allowing key general informational messages to be seen.
  4. A message must be logged whenever an error occurs.
  5. Errors must be logged whenever an error situation results that is not part of normal processing (e.g., cds returning an entry not found error is considered normal whereas an out of memory error is considered abnormal). Places where unexpected errors are detected include:
    1. system/library calls returning negative return codes (or \*(sBNULL\*(sE pointers)
    2. DCE client API calls returning negative status codes
    3. signal handlers
    4. thread recovery environments (established via \*(sBpthread_cleanup_pop()\*(sE and \*(sBpthread_cleanup_push()\*(sE)
    5. process recovery environments (established via \*(sBatexit()\*(sE)
    6. code segment recovery environments established using \*(sBCATCH\*(sE/\*(sBCATCHALL\*(sE exception handling clauses
    7. points where non-zero RPC status and comm failures are received by the runtime

    Note: It is assumed that the establishing of recovery environments will increase as part of the V1.1 code cleanup.

  6. Errors must be logged whenever manual (immediate or eventual) intervention is required.
  7. Errors must be logged whenever a server or server function becomes unavailable.
  8. Multiple error log entries may result from a single error condition.
  9. Error log entries should be filled with keywords and data information rather than conversational language descriptions (a stand-alone formatter can be used to generate easy-to-read reports).
  10. No message text should be sent remotely to the event logger. The logger should be able to reconstruct a text message based on the hex code points it receives.
  11. The actual recording of error data to memory/disk may be done inline or by issuing requests to a separate thread/daemon depending on the particular operating system. This decision depends on the implementation of the error logging hook code.

Logging Interfaces

The proposed message/error logging interfaces are the following:

    \ int logmsg( nl_catd cd, /* catalog descriptor (optional) */ unsigned32 id, /* message id */ unsigned32 severity, /* message severity */ uuid_t *e_uuid, /* error correlator */ unsigned32 line, /* line number where error occurred */ char *file, /* source file where error occurred */ ... /* message substitution variables */ );

    int logerr( unsigned32 id, /* error id */ unsigned32 severity, /* error severity */ unsigned32 e_info, /* error codepoints */ unsigned32 e_status, /* errno/status/exception/signal */ uuid_t *e_uuid, /* error correlator */ unsigned32 e_len, /* length of error data */ char *e_data /* hex error data stream */ );

More discussion and design effort may result in these 2 interfaces being combined:

    \ int lognotice( nl_catd cd, /* catalog descriptor (optional) */ unsigned32 id, /* message/error id */ unsigned32 severity, /* message severity */ unsigned32 e_info, /* error codepoints */ unsigned32 e_status, /* errno/status/exception/signal */ uuid_t *e_uuid, /* error correlator */ unsigned32 line, /* line number where error occurred */ char *file, /* source file where error occurred */ ... /* message substitution variables */ /* or error log hex data */ );

New RPC interfaces should be defined to allow passing of error uuids between RPC and user code portions of executor threads. For example:

    \ rpc_get_err_uuid( uuid_t *e_uuid, /* creates a uuid if one */ unsigned32 *status /* doesn't already exist */ ); /* else returns existing one */ rpc_test_err_uuid( uuid_t *e_uuid, /* tests if a uuid has been */ unsigned32 *status /* created and returns it */ );

Format

  1. Message/error identifiers (passed into the hooks) need to be architected to allow for vendor specific errors (as part of DCE extensions or operating system differences). These identifiers are used for remote notification as well as catalog access.

    \ --------------------------------------------------------------- Bits Usage --------------------------------------------------------------- 00-04: DCE executable identifier 05-08: DCE subcomponent identifier 09-16: Vendor identifier (OSF is the vendor of base DCE code) 17-32: Message/error number ---------------------------------------------------------------

  2. Informational messages need only include:

    1. message identifier
    2. message text (retrieved from a catalog) and user substitution variables
  3. Error messages must include:

    1. message identifier
    2. message text (retrieved from a catalog) and user substitution variables
    3. error UUID (to correlate with other messages and local/remote error logs)
    4. where the error occurred (\*(sB__FILE__\*(sE, \*(sB__LINE__\*(sE)
  4. Messages may be prefaced by various system information depending on the logging implementation:

    1. date and time (locale specific)
    2. process id
    3. severity indicator
    4. vendor specific local message identifiers
  5. Messages may require multiple lines.
  6. Catalogs should store I18N enabled format strings (see the XPG3 guidelines regarding ordered substitution in format strings). Following is an example message format (`\e' at the end of a line indicates line-continuation):

    \ --------------------------------------------------------------- 02/21/92 9:29:45 [1948382] rpcd: RPC0001I: \e RPC daemon successfully started. 02/21/92 10:29:31 [1948382] cdsd: RPC0034E: \e Error receiving IP packet, uuid=<8888...>. 02/21/92 10:29:45 [1948382] cdsd: RPC0034I: \e uuid=<8888...>,line=32,file=<dce/rpc/runtime/comnlsn.c> 02/21/92 10:45:30 [1948383] cdsclerk: CDS0002E: \e No more heap memory available, uuid=<9999...>. 02/21/92 10:45:42 [1948382] cdsclerk: CDS0002I: \e uuid=<9999...>,line=99,file=<dce/directory/cds/server/db_xyz.c> ---------------------------------------------------------------

    Note: RPC0001I, RPC0034E, CDS0002E, ..., are vendor specific prefixes that represent the message id (I=informational, E=error). They are part of the message text in the catalog.

  7. Error log entries must include:

    1. error identifier (defines the error that occurred)
    2. severity
    3. error UUID (to correlate local and remote error logs)
    4. an architected hex value indicating the error type
    5. an architected hex value indicating the likely cause
    6. an architected hex value indicating the suggested action
    7. the errno/status/exception value that led to the error
  8. Suggested architected error code points (type, cause and action):

    \ --------------------------------------------------------------- Value Code Point Description --------------------------------------------------------------- [Error Types:] S_NO_ERR x'00000000' No error occured. S_SYSTEM_ERR x'01000000' Operating system failure. S_DEVICE_ERR x'02000000' Device failure. S_NETWORK_ERR x'03000000' Failure during communications. S_DATA_ERR x'04000000' Failure during data processing. \&... [Causes:] S_UNKNOWN x'00010000' Cause is unknown. S_CONFIG_ERR x'00020000' Improper network configuration. \&... [Suggested Actions:] S_NO_ACTION x'00000000' No action required. S_ATTEMPT_RETRY x'00000100' Attempt to retry. S_CORRECT_AND_RETRY x'00000200' Correct problem and retry. S_CONTACT_REP x'00000300' Contact service representative and report keywords. S_CONTACT_ADMIN x'00000400' Contact system administrator. \&... ---------------------------------------------------------------

  9. Error UUIDs should be passed back to the client (this may not be possible in V1.1 since it affects DCE protocols, but it is a good recommendation for new servers).
  10. Example error log format:

    \ --------------------------------------------------------------- Date Time ID Sev CodePts Status UUID \e Len Data 02/21/92 9:29:45 00003 03 02000100 0000002A 123456782003FA10 \e 005 002003FA38 02/21/92 9:29:45 00423 04 02000200 00000042 123456783003FA10 \e 004 12345678 ---------------------------------------------------------------

Notes

  1. All messages should be reviewed for consistent terminology.
  2. Systems with multiple destinations can route/filter the message based on severity or message number.
  3. It is assumed that the DME event logger has architected event types to support the creation of specialized servers to separately handle error events, security events (e.g., centralized auditing), ...
  4. OSF may choose to define the format of an error report to be used as a guide by vendors creating error log formatters (error logs only contain raw hex data).
  5. No design of the remote catcher or agent of remote events is provided within this document.
  6. No design of a local serviceability command catcher or its RPCable interface is provided within this document.

TRACING

Purpose

Tracing offers a way to achieve further information to isolate an error or irregular behavior. There are several uses for tracing:

  1. port verification (aimed at DCE system vendors)
  2. application debugging (aimed at application developers)
  3. diagnostic investigation (aimed at service personnel)
  4. performance monitoring (aimed at system administrators)

Each type of tracing has a different main audience (certainly each could be used in conjunction with another) and attempts to record different information (and sometimes, in a different way).

Activation/Deactivation

  1. Port verification / application debugging
    1. These types of tracing are activated using a compile option in combination with a series of event flags controlled via a serviceability profile file (or optionally via environment variables). For example:

      \ #ifdef _DEBUG if (tracing_on & debug_flag1) /* optional to allow for scoped debug tracing */ fprintf(dbgfh, "cds: about to bind to the server\en"); #endif #ifdef _DEBUG if (tracing_on & api_check) /* print out all input parameters */ fprintf(dbgfh,"cds: %d %x %x %c %s\en"); #endif

    2. The debug/event flags should be settable during runtime by using environment variables or a separate event file loaded during initialization. The code itself can also set these flags.
    3. The CDS code has created a flexible trace utility that implements the above concepts using a set of macro hooks which could be used by all DCE components (processes and runtimes):

      1. \*(sBDEBUG_EVENT(event_flag, event_message_string)\*(sE -- if tracing is on, \*(sBfprintf()\*(sE's a message to the debug file
      2. \*(sBDEBUG_TRACE(message_string)\*(sE -- if tracing is on, checks an event flag and conditionally \*(sBfprintf()\*(sE's to debug file
      3. \*(sBLOG_EVENT(event_flag, event_message_string)\*(sE -- same as \*(sBDEBUG_TRACE\*(sE but can't be compiled out
      4. \*(sBLOG_TRACE(message_string)\*(sE -- same as \*(sBDEBUG_EVENT\*(sE but can't be compiled out

      \*(sBLOG_EVENT\*(sE and \*(sBLOG_TRACE\*(sE should still permit other system vendors to compile them out (perhaps using a \*(sB#define _DEBUG2\*(sE flag).

    4. \*(sBAssert()\*(sE calls should be used for all input parameters within DCE client stubs (possible IDL compiler enhancement).
  2. Diagnostic investigation / performance monitoring
    1. A new set of rpc management interfaces should be created for controlling tracing (since tracing can affect multiple users, it should be treated as an authorized interface):

      \ /* turns trace on/off */ void rpc_mgmt_set_trc_flags( rpc_binding_handle_t binding, unsigned32 flags, unsigned32 options, unsigned32 *status ); /* queries trace flags */ void rpc_mgmt_inq_trc_flags( rpc_binding_handle_t binding, unsigned32 *flags, unsigned32 *status );

      Notes:

      1. The meaning of each bit in the \*(sBflags\*(sE variable is defined by each server that supports this interface.
      2. Both these interfaces must be added to the \*(sBrpc_mgmt_set_authorization_fn\*(sE routine.
      3. Non-RPCable processes (e.g., cdsclerk) can only be controlled via local trace commands (using local sockets??).
      4. What about the GDS processes?
    2. Each of the DCE server control programs or admin packages should support a new \*(sBtrace\*(sE command, to set trace flags and trace options:

      \ trace <component> on -a -d -p -all -e <flag1,flag2,...> trace <component> off -all -e <flag1,flag2,...>

      where:

      -a -- archive wraparound buffer when full -d -- defer recording until a subsequent \*(sBtrace on\*(sE command is issued -p -- trace flags should be persistent across server restarts -all -- set all trace flags on/off -e -- list of event flags

    3. A set of trace flags common to all DCE components may be created in addition to the component specific trace points.
    4. A serviceability profile file (or a set of environment variables) is required to store the on/off state of trace points (so they are persistent across restarts) as well as identifying the trace file (for trace data archiving).
    5. The actual recording of trace data to memory/disk may be done inline or by issuing requests to a separate thread or daemon depending on the particular operating system. This decision depends on the implementation of the trace hook code.

    General Trace Point Guidelines

    1. Port verification:
      1. anywhere/everywhere
    2. Application debugging:
      1. Values of input parameters upon entry to key subroutines should be recorded.
      2. Values of return parameters and return code upon exit should be recorded.
      3. Contents of de-referenced storage (and not just the value of pointers) should be recorded.
      4. Key low level algorithm processing should be recorded.
    3. Diagnostic investigation:
      1. process to process communications within a component
      2. cross component interactions (e.g., login requests)
      3. database interactions
      4. network interactions
      5. replication activity
      6. major events
      7. flow of control (e.g., subroutine entry/exit)
      8. thread creation/termination
      9. synchronization events (mutexes, condition variables, semaphores)
      10. internal API boundary crossings (e.g., cdsapi)
      11. executor thread activity
      12. RPC runtime system threads activity (e.g., listener thread)
    4. Performance monitoring:
      1. event counting (e.g., number of cache hits/misses)
      2. frequency measurements (e.g., number of requests per hour)
      3. elapsed time measurements (e.g., lifespan of executor threads)

    Logging Interfaces

    1. For port verification, \*(sBfprintf()\*(sE may be used (or better yet, exploitation of the CDS debug facility).
    2. For application debugging, \*(sBfprintf()\*(sE may be used (or better yet, exploitation of the CDS debug facility).
    3. For performance monitoring involving counts, a data structure of performance data can be created where no interface is required (updates to the variables occur directly). Logging of activity can then occur on every increment/decrement or at various threshhold levels.
    4. For performance and diagnostic tracing, the following trace logging interfaces should be used:

      \ /* create a trace buffer - multiple buffers are supported */ int crttrcbuf ( int num_entries, int entry_length, int *trcbh ); /* destroy a trace buffer */ int dsttrcbuf ( int trcbh );

      /* log a diagnostic/performance trace record */ int logtrc ( int trcbh, unsigned16 compid, unsigned16 trctype, unsigned32 trcword, unsigned32 trclen, char *trcbuf );

      Notes:

      1. A buffer may be global to a process or local to a thread.
      2. For performance reasons, all entries are the same length (allows for efficient locking of a single entry by a single thread rather than locking the entire trace buffer).

    Format

    1. Port verification / application debugging
      1. I18N considerations are not mandatory for this type of tracing.
      2. Each entry should include:
        1. component prefix string
        2. where the trace entry was recorded (\*(sB__FILE__\*(sE, \*(sB__LINE__\*(sE)
        3. thread id
        4. event name

      Following is an example debug tracing format:

      \ --------------------------------------------------------------- pfx: <event_name> <__FILE__> <__LINE__ > \e <thread:xx> <debug_text> CDS: <db_trc> </dce/src/cds/cdsd.c> <line:123> \e <thread:22> <updating database> RPC: <listener_trc> </dce/src/cds/rpcrt.c> <line:999> \e <thread:01> <connection accepted from 111.222.333.444> ---------------------------------------------------------------

    2. Diagnostic investigation / performance monitoring

      The structure of each entry in the diagnostic trace buffer would be:

      \ --------------------------------------------------------------- Date Time Comp/Type Word Len Data 02/21/92 9:29:45 02100006 12345678 004 01020304 02/21/92 9:32:41 02100099 87654321 008 0102030405060708 ---------------------------------------------------------------

      Notes

      1. Other than port verification tracing, message-like text phrases should not be used.
      2. This interface can be a macro vs a subroutine hook.
      3. OSF may choose to define the format of a trace report to be used as a guide by vendors creating trace log formatters (trace logs only contain raw hex data).

      PROBLEM DETERMINATION GUIDE

      The following figure represents an example message description within the PDG:

      \ --------------------------------------------------------------------- RPC00001 FAILURE SENDING DATAGRAM TO NETWORK ADDRESS xxx.xxx.xxx.xxx Explanation: The RPC runtime was unable to send a datagram to the specified network address xxx.xxx.xxx.xxx Severity: WARNING Problem Determination: Likely cause is that the node indicated by the network address is unavailable. User Response: Contact the network administrator and report the failure. Operator Response: Check the error log for the cause of the node becoming unavailable and make appropriate adjustments. System Action: The system is not halted. The application program that discovered the error does not need to be restarted. ---------------------------------------------------------------------

      Note: The following severities exist:

      1. \*(sBALERT\*(sE -- Immediate intervention required. System or DCE service is unusable until action is taken. Usually reserved for permanent errors (non-retriable conditions).
      2. \*(sBCRITICAL\*(sE -- Severe problem exists that makes some DCE services unavailable. Intervention is required. Usually reserved for permanent errors (non-retriable conditions).
      3. \*(sBERROR\*(sE -- Failure occurred fulfilling a user request. Service is still available but without intervention, any retries of a similar request may fail.
      4. \*(sBWARNING\*(sE -- Eventual intervention required. System is still usable.
      5. \*(sBNOTICE\*(sE -- No error condition exists but some action needs to be taken.
      6. \*(sBINFORMATIONAL\*(sE -- Informational message that requires no action to be taken.

      In addition, the following items should be described:

      1. Diagnostic/Performance trace flags common to all DCE processes.
      2. Diagnostic/Performance trace flags specific to each component.
      3. Error type, cause and suggested action code points.
      4. Control program trace command syntax.
      5. Trace log entry contents.
      6. Error log entry contents.
      7. RPC runtime status and comm failure codes.
      8. List of errors/messages that result in remote notification.
      9. Location of logs (operating system specific).
      10. Description of formatting tools (operating system specific).

      LOGGING INTERFACE MANPAGES

      To be done...

      AIX IMPLEMENTATION EXAMPLE

      This section describes a possible AIX V3 implementation using existing facilities. This example is only meant as a means to more clearly demonstrate what the placement of hooks within DCE code would achieve.

      Existing AIX Services

      Within AIX, the existing set of serviceability related functions include:

      1. \*(sBopenlog()\*(sE, \*(sBsyslog()\*(sE, \*(sBcloselog()\*(sE
      2. \*(sBcatopen()\*(sE, \*(sBcatgets()\*(sE, \*(sBcatclose()\*(sE
      3. \*(sBprintf()\*(sE
      4. \*(sBperror()\*(sE
      5. \*(sBerrlog()\*(sE
      6. \*(sBassert()\*(sE
      7. \*(sBtrcstart()\*(sE, \*(sBtrchk()\*(sE, \*(sBtrcgen()\*(sE, \*(sBtrcon()\*(sE, \*(sBtrcoff()\*(sE, ...

      The following serviceability daemons exist:

      1. \*(sBsyslogd\*(sE
      2. \*(sBtrace\*(sE
      3. \*(sBerrdemon\*(sE

      Serviceability Header File

      \
      /* DCE serviceability header file: <dce/service.h> */
      
      #include <sys/syslog.h>
      #include <dce/uuid.h>
      
      /* prototypes */
      int  loginit(char *);
      int  logterm();
      int  logmsg(nl_catd, unsigned32, unsigned32, uuid_t *, unsigned32,
                  char *, ...);
      int  logerr(unsigned32, unsigned32, unsigned32, unsigned32, uuid_t *,
                  unsigned32, char *);
      int  logtrc(int, unsigned16, unsigned16, unsigned32,
                  unsigned32, char *);
      int  crttrcbuf(unsigned32, unsigned32, int *);
      int  dsttrcbuf(int);
      void settrcflags(unsigned32, unsigned32);
      
      /* limits and defaults */
      #define  S_MAX_MSG_LTH              132         /* maximum message
                                                         length */
      #define  S_MAX_ERR_LTH              132         /* maximum error
                                                         string length */
      #define  S_CATALOG_ERRMSG           "Error accessing catalog!\en
      #define  S_DEFAULT_MSG              "Message not found!\en
      
      /* example message/error identifiers */
      #define RPC00001                    0x11000001  /* RPC R/T Error */
      #define RPC00002                    0x12000002  /* RPCD Error    */
      #define RPC00003                    0x12000003  /* RPCD Error    */
      #define CDS00001                    0x21000001  /* CDSD Error    */
      
      #define S_MSG_CATALOG_NUMBER(x)     (x)>>28     /* first nibble
                                                         determines the
                                                         catalog containing
                                                         the message */
      
      #define S_MSG_COMP_MASK             0xF0000000
      #define S_MSG_SUBCOMP_MASK          0x0F000000
      #define S_MSG_VENDOR_MASK           0x00FF0000
      #define S_MSG_NUMBER_MASK           0x0000FFFF
      
      /* message severities */
      #define  S_INFO1                    LOG_INFO
      #define  S_INFO2                    LOG_INFO
      #define  S_NOTICE                   LOG_NOTICE
      #define  S_WARNING                  LOG_WARNING
      #define  S_ERROR                    LOG_ERR
      #define  S_CRITICAL                 LOG_CRIT
      #define  S_ALERT                    LOG_ALERT
      
      #define  S_MSG_NOTIFY_MASK          0x80000000
      
      /* message string prefixes */
      #define  S_RPCD_STR                 "rpcd:
      #define  S_RPCRT_STR                "rpcrt:
      #define  S_SECD_STR                 "secd:
      #define  S_SEC_CLIENTD_STR          "sec_clientd:
      #define  S_CDSD_STR                 "cdsd:
      #define  S_CDS_CLERK_STR            "cds_clerk:
      #define  S_CDSADV_STR               "cds_adv:
      
      /* error types */
      #define  S_NO_ERR                   0x00000000
      #define  S_SYSTEM_ERR               0x01000000
      #define  S_DEVICE_ERR               0x02000000
      #define  S_NETWORK_ERR              0x03000000
      #define  S_DATA_ERR                 0x04000000
      
      /* error causes */
      #define  S_UNKNOWN                  0x00000000
      #define  S_CONFIG_ERR               0x00010000
      
      /* suggested actions */
      #define  S_NO_ACTION                0x00000000
      #define  S_ATTEMPT_RETRY            0x00000100
      #define  S_CORRECT_AND_RETRY        0x00000200
      #define  S_CONTACT_REP              0x00000300
      #define  S_CONTACT_SYS_ADMIN        0x00000400
      
      /* trace component ids */
      #define  S_RPCD_ID                  0x0110
      #define  S_RPCRT_ID                 0x0120
      #define  S_CDSD_ID                  0x0210
      #define  S_CDSCLERK_ID              0x0220
      #define  S_CSDADV_ID                0x0230
      #define  S_SECCLIENTD_ID            0x0310
      #define  S_SECSERVER_ID             0x0320
      
      #define  S_TRC_COMP_MASK            0x0F00
      #define  S_TRC_SUBCOMP_MASK         0x00F0
      
      /* general trace types */
      #define  S_SUBROUTINE_FLOW          0x0001
      #define  S_DATABASE_TRACE           0x0002
      #define  S_NETWORK_TRACE            0x0003
      #define  S_THREADS_TRACE            0x0004
      #define  S_SYNCH_TRACE              0x0005
      #define  S_REPLICATION_TRACE        0x0006
      #define  S_SECURITY_TRACE           0x0007
      #define  S_MARSHALLING_TRACE        0x0008
      
      /* specific component trace types */
      #define  S_SECTRC_LOGINS            0x0200
      #define  S_CDSTRC_CACHE             0x0100
      
      /* example component trace words */
      #define S_CDS_REPLICATION_STARTED   0x0001
      #define S_CDS_REPLICATION_FINISHED  0x0002
      #define S_SEC_LOGIN_REQUEST         0x0003
      #define S_SEC_TGT_ISSUED            0x0004
      

      Source Code

      The following section provides a simple implementation of message/error logging and tracing. Since the AIX operating system already has a serviceability infrastructure, there is no need to implement a trace or error logging daemon. It is also assumed that the underlying services used by these hook implementations are thread-safe.

      This example is NOT a recommendation of how these hooks should be implemented on AIX, but rather as an illustration of what the hooks should do.

      For simplicity, thread-safing and use of a serviceability profile have not been incorporated.

      Mainline

      \
      #include <nl_types.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <dce/uuid.h>
      #include <dce/service.h>
      
      main()
      {
        nl_catd    cd;
        uuid_t     e_uuid;
        char       *string1="string1";
        char       *string2="string2";
        unsigned32 errnum =42;
        unsigned32 status;
      
        printf("msg: start\en");
        uuid_create(&e_uuid, &status);
        loginit(S_RPCD_STR);
      
        printf("\enmsg: logging a message\en\en");
        logmsg(0, RPC00003, (S_INFO1 | S_MSG_NOTIFY_MASK), &e_uuid,
          __LINE__, __FILE__, string1, string2);
      
        printf("\enmsg: logging an error\en\en");
        logerr(RPC00003, S_ERROR, S_DEVICE_ERR | S_UNKNOWN |
          S_ATTEMPT_RETRY, errnum, &e_uuid, 4, 0x01020304);
      
        printf("\enmsg: logging a trace entry\en\en");
        logtrc(0, S_CDSD_ID,
          S_REPLICATION_TRACE, S_CDS_REPLICATION_STARTED, 4,
          0x01020304);
      
        logterm();
        printf("\enmsg: end\en");
      }
      

      Hook code

      \
      #include <nl_types.h>
      #include <stdio.h>
      #include <stdarg.h>
      #include <sys/syslog.h>
      #include <sys/errids.h>
      #include <sys/trchkid.h>
      #include <dce/service.h>
      #include <dce/uuid.h>
      
      #define  MAX_DTM_LTH  80
      #define  ON            1
      #define  OFF           0
      #define  YES           1
      #define  NO            0
      
      static void _getdtm(char *);
      
      static struct {
        int  cd;
        char *name;
      } dce_catalogs[] =
      {
        0, "mymsgs.cat",
        0, "mymsgs.cat",
        0, "mymsgs.cat
      };
      
      static unsigned32 trcglobalflags;
      static unsigned32 trcstate=OFF;
      
      /* things normally in the serviceability profile file */
      static int production=YES;
      static int min_severity;
      static int msgfh;
      static int errfh;
      static int trcfh;
      
      /*=================================================================*/
      /* routine:     loginit()                                          */
      /* description: Initializes all resources and services related to  */
      /*              logging.                                           */
      /*=================================================================*/
      int loginit(
        char *name     /* string name of executable */
      )
      {
        production     = YES;
        min_severity   = S_INFO1;
        trcstate       = ON;
        trcglobalflags = 0xFFFFFFFF;
        msgfh          = stdout;
        errfh          = stdout;
        trcfh          = stdout;
      
        return(openlog(name, LOG_PID, LOG_USER));
      
        /* Optionally, this routine would:                               */
        /* -open/read the serviceability profile file (/dev/service ??)  */
        /* -open the specified log files (message, error and tracing)    */
        /* -set the specified minumum severity level for message/error   */
        /*   logging                                                     */
        /* -initialize the global trace flags                            */
        /* -initialize the remote logging service                        */
        /* -set the production logging flag                              */
      }
      
      /*=================================================================*/
      /* routine:     logterm()                                          */
      /* description: Cleans up all resources and services related to    */
      /*              logging.                                           */
      /*=================================================================*/
      int logterm()
      {
        return(closelog());
      }
      
      /*=================================================================*/
      /* routine:     logmsg()                                           */
      /* description: Logs a message to the local message log.           */
      /*=================================================================*/
      int logmsg(
        nl_catd     cd,       /* catalog descriptor               */
        unsigned32  id,       /* message identifier               */
        unsigned32  severity, /* message severity                 */
        uuid_t      *e_uuid,  /* error correlator                 */
        unsigned32  line,     /* line number where error occurred */
        char        *file,    /* source file where error occurred */
        ...                   /* caller's substitution variables  */
      )
      {
        va_list arg_ptr;               /* caller's substitution
                                          variables             */
        char    *fmtstr;               /* format string         */
        char    msg[S_MAX_MSG_LTH];    /* formatted message     */
        char    errstr[S_MAX_ERR_LTH]; /* error entry string    */
        char    dtmstr[MAX_DTM_LTH];   /* date and time string  */
        int     catnum;                /* catalog array index   */
      
        /* check if message is to be suppressed */
        if (severity < min_severity)
          return(0);
      
        /* automagically open the catalog based on the identifier */
        if (cd == 0)
        {
          catnum = S_MSG_CATALOG_NUMBER(id & S_MSG_COMP_MASK);
          if (dce_catalogs[catnum].cd == 0)
          {
            cd = catopen(dce_catalogs[catnum].name, 0);
            if (cd == -1)
            {
              fprintf(msgfh, S_CATALOG_ERRMSG);
              return(-1);
            }
            dce_catalogs[catnum].cd = cd;
          }
        }
      
        /* log the message */
        fmtstr = catgets(cd, 1, (id & S_MSG_NUMBER_MASK), S_DEFAULT_MSG);
        va_start(arg_ptr, file);
        vsprintf(msg, fmtstr, arg_ptr);
        va_end(arg_ptr);
        if (production)
        {
          syslog(severity, "%s", msg);
          syslog(LOG_INFO, "uuid=<%.16X>,line=%d,file=<%s>\en", e_uuid,
            line, file);
        }
      else
        {
           _getdtm(dtmstr);
           fprintf(msgfh, "%s: %s", dtmstr, msg);
           fprintf(msgfh, "%s: uuid=<%.16X>,line=%d,file=<%s>\en",
                dtmstr,e_uuid,line,file);
        }
      
        /* send a remote error notification to the event handler */
        if (severity & S_MSG_NOTIFY_MASK);
          /* NLNotify(); */       /* DME notification mechanism */
      }
      
      /*=================================================================*/
      /* routine:     logerr()                                           */
      /* description: Logs error data to the local error log as well as  */
      /*              sending a remote notification event to the DME     */
      /*              event logger.                                      */
      /*=================================================================*/
      int logerr(
        unsigned32  id,       /* message identifier            */
        unsigned32  severity, /* message severity              */
        unsigned32  e_info,   /* error codepoints              */
        unsigned32  e_status, /* errno/status/exception/signal */
        uuid_t      *e_uuid,  /* error correlator              */
        unsigned32  e_len,    /* length of user data           */
        char        *e_data   /* hex error data stream         */
      )
      {
        char        errstr[S_MAX_ERR_LTH];     /* error log entry      */
        char        dtmstr[MAX_DTM_LTH];       /* date and time string */
      
        /* log error information to the local error log */
        _getdtm(dtmstr);
        sprintf(errstr,"%s: %.5d %.2X %.8X %.8X %.16X %.3d %.*X", dtmstr,
          id, severity, e_info, e_status, e_uuid, e_len, e_len, e_data);
          /* memcpy would be better for production logging */
        if (production)
          errlog(errstr, (unsigned int)strlen(errstr));
        else
          fprintf(errfh, "%s\en", errstr);
      
        /* send a remote error notification to the event handler */
        /* NLNotify(...); */       /* DME notification mechanism */
      }
      
      /*=================================================================*/
      /* routine:     logtrc()                                           */
      /* description: Logs trace information based on trace flags set.   */
      /*=================================================================*/
      logtrc(
        int        trcbh,   /* trace buffer handle          */
        unsigned16 compid,  /* component/subcomponent id    */
        unsigned16 trctype, /* component trace type         */
        unsigned32 trcword, /* component defined trace word */
        unsigned32 trclen,  /* trace buffer length          */
        char       *trcbuf
      )
      {
        char  dtmstr[MAX_DTM_LTH];   /* date and time string */
      
        if (trcstate & ((compid | trctype) && trcglobalflags))
        {
          if (production)
          {
            if (trclen > 0)
              trcgent(0, ((compid<<16) | trctype), trcword, trclen,
                trcbuf);
            else
              trchkl(((compid<<16) | trctype), trcword);
          }
          else
          {
            _getdtm(dtmstr);
            fprintf(trcfh,"%s: %.8X %.8X %.3d %.*X\en",dtmstr,
              ((compid<<16) | trctype), trcword, trclen, trclen, trcbuf);
          }
        }
      }
      
      /*=================================================================*/
      /* routine:     crttrcbuf()                                        */
      /* description: Creates a buffer to be used for tracing.           */
      /*=================================================================*/
      int crttrcbuf(
        unsigned32 num_entries,
        unsigned32 max_entry_lth,
        int        *trcbh         /* trace buffer handle is returned */
      )
      {
        trcbh = 0;
        return(0);     /* support for multiple trace buffers is optional */
      }
      
      /*=================================================================*/
      /* routine:     dsttrcbuf()                                        */
      /* description: Destroys a trace buffer.                           */
      /*=================================================================*/
      dsttrcbuf(
        int trcbh
      )
      {
        return(0);
      }
      
      /*=================================================================*/
      /* routine:     settrcflags()                                      */
      /* description: Sets trace state (on/off) and trace flags.         */
      /*=================================================================*/
      void settrcflags(
        unsigned32 state,
        unsigned32 flags
      )
      {
        trcstate       = state;
        trcglobalflags = flags;
      }
      
      /*=================================================================*/
      /* routine:     _getdtm()                                          */
      /* description: Returns a formatted string.                        */
      /*=================================================================*/
      static void _getdtm(char *string)
      {
        time_t     temp;
        struct tm  *timeptr;
      
        *string = '\e0';
        temp    = time(NULL);
        timeptr = localtime(&temp);
        strftime(string, MAX_DTM_LTH-1, "%x %X", timeptr);
      }
      

      REFERENCES

      [RFC 11.0]
      M. Hubbard, DCE SIG Serviceability Requirements, August 1992.

      AUTHOR'S ADDRESS

      Mark Hubbard Internet email: hubbard@torolab5.vnet.ibm.com
      Distributed System Services Telephone: +1-416-448-3919
      IBM Canada Laboratory
      Stn. 2G, 1150 Eglinton Avenue East
      Toronto, Ontario
      CANADA