Saturday, 28 December 2013

Troubleshooting SAP startup problems in Windows


There's probably nothing worse than not being able to start your SAP system … Especially the production system! Aside from the operating system and the database server you must pay close attention to certain places in SAP to find out what caused the problem and how to solve it. Here are the two places you will definitely need to check: EventViewer (Application and System logs) and the SAP Management Console (MMC).
EventViewer can provide useful information and it may help you pinpoint where the problem resides. The SAP MMC gives you the ability to visually see the system status (green, yellow or red lights), view the work processes status and view the developer traces, which are stored in the "work" directory. Example: /usr/sap/TST/DVEBMGS00/work.
For a central SAP instance to start successfully, both the message server and the dispatcher need to start. If either one fails to start, users cannot log in to the system. The following scenarios illustrate possible reasons why an SAP instance might not start, and what lies behind the message:
"*** DISPATCHER EMERGENCY SHUTDOWN ***".

Things you need to get familiar with:
  • Developer Traces:
    -- dev_disp Dispatcher developer trace
    -- dev_ms Message Server developer trace
    -- dev_wp0 Work process 0 developer trace
  • The "services" file, which contains TCP and UDP services and their respective port numbers. This plain-text configuration file is located under %SystemRoot%\system32\drivers\etc.
  • Windows Task Manager (TASKMGR.exe).
  • Dispatcher Monitor (DPMON.exe), which is located under /usr/sap/<SID>/sys/exe/run.
  • Database logs.
  • EventViewer (EVENTVWR.exe).
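
Developer traces can get long, so it may help to filter them down to the error markers that show up in the scenarios below. The following Python sketch is only an illustration: the function name and the marker list are assumptions taken from the sample trace lines in this article, not part of any SAP tool.

```python
# Markers copied from the sample trace lines in this article. Real traces
# may contain other markers, so treat this list as an assumption.
ERROR_MARKERS = ("*** ERROR =>", "***LOG", "DP_FATAL_ERROR", "EMERGENCY SHUTDOWN")


def error_lines(trace_text):
    """Return the stripped lines of a trace that contain a known marker."""
    return [
        line.strip()
        for line in trace_text.splitlines()
        if any(marker in line for marker in ERROR_MARKERS)
    ]
```

To use it, read a trace such as dev_disp from the "work" directory into a string and pass that string to error_lines.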

Scenario 1: Dispatcher does not start due to a port conflict


Symptoms
  • No work processes (disp+work.exe) exist in Task Manager.
  • Dispatcher shows status "stopped" in the SAP MMC.
  • Errors found in "dev_disp":
    ***LOG Q0I=> NiPBind: bind (10048: WSAEADDRINUSE: Address already in use) [ninti.c 1488]
    *** ERROR => NiIBind: service sapdp00 in use [nixxi.c 3936]
    *** ERROR => NiIDgBind: NiBind (rc=-4) [nixxi.c 3505]
    *** ERROR => DpCommInit: NiDgBind [dpxxdisp.c 7326]
    *** DP_FATAL_ERROR => DpSapEnvInit: DpCommInit
    *** DISPATCHER EMERGENCY SHUTDOWN ***
Problem Analysis
The keywords in the error messages above are:
  • Address already in use
  • Service sapdp00 in use
The TCP port number assigned in the "services" file is being occupied by another application. Due to the conflict, the dispatcher shuts down.
Solution
If your server has a firewall client, disable it and attempt to start the SAP instance again.
If the instance starts successfully, you can re-enable the firewall client.
If there is no firewall client at all, or if disabling it did not resolve the problem, open the "services" file and check which port the appropriate "sapdp" entry is using.
If the instance number is 00, look for sapdp00. If the instance number is 01, look for sapdp01, and so on. You can use the following OS command to help you resolve port conflicts (adding the -a, -n and -o switches also lists listening ports, numeric addresses and the owning process IDs):
netstat -p TCP
There are also utilities on the Internet that can help you list all the TCP and UDP ports a system is using.
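
If you prefer a quick scripted check, the idea can be sketched in Python by trying to bind to the port yourself. This is a minimal sketch under assumptions: the function name is hypothetical, and it only tests TCP, following the analysis above.

```python
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if a TCP socket cannot be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Bind failed: usually the port is taken, although a
            # permissions problem would also end up here.
            return True
    return False
```

Run it while the SAP instance is stopped, for example port_in_use(3200) if sapdp00 maps to 3200/tcp in your "services" file. If it returns True, something other than SAP is holding the port.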


Scenario 2: Dispatcher dies due to a database connection problem

Symptoms
  • No database connections.
  • No work processes.
  • SAP MMC -> WP Table shows all processes as "ended".
  • Errors found in "dev_disp":
    C setuser 'tst' failed -- connect terminated
    C failed to establish conn. 0
    M ***LOG R19=> tskh_init, db_connect (DB-Connect 000256) [thxxhead.c 1102]
    M in_ThErrHandle: 1
    M *** ERROR => tskh_init: db_connect (step 1, th_errno 13, action 3, level 1) [thxxhead.c 8437]
    *** ERROR => W0 (pid 2460) died [dpxxdisp.c 11651]
    *** ERROR => W1 (pid 2468) died [dpxxdisp.c 11651]
    *** ERROR => W2 (pid 2476) died [dpxxdisp.c 11651]
    . . .
    *** ERROR => W11 (pid 2552) died [dpxxdisp.c 11651]
    *** ERROR => W12 (pid 2592) died [dpxxdisp.c 11651]
    my types changed after wp death/restart 0xbf --> 0x80
    *** DP_FATAL_ERROR => DpEnvCheck: no more work processes
    *** DISPATCHER EMERGENCY SHUTDOWN ***
    DpModState: change server state from STARTING to SHUTDOWN
Problem Analysis
A connection to the database could not be established, either because the SQL login specified in the parameter "dbs/mss/schema" is wrong or because that SQL login was deleted from the database server. This parameter needs to be set in the DEFAULT.pfl system profile (under /usr/sap/<SID>/sys/profile). In the messages above, we see that the SQL login 'tst' is expected but does not exist at the database level.
Solution
Set the entry to the appropriate database owner. If the system is based on Basis <= 4.6, or if it was upgraded from 4.x to 4.7, the database owner should be "dbo". If the system was installed from scratch and is based on Web AS 6.x, the database owner should match the SID in lower case. Example: if the SID is TST, the database owner should be "tst". If the parameter is set correctly in the DEFAULT.pfl profile, check at the database level whether the SQL login exists. If it doesn't, create it and make it the owner of the <SID> database.
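
The rule above can be sketched in Python as a quick sanity check. Both function names are hypothetical, and the parser assumes the profile consists of simple "name = value" lines with "#" comments.

```python
def read_profile_param(profile_text, name):
    """Return the first value found for `name`, or None if it is absent."""
    for raw in profile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            return value.strip()
    return None


def expected_db_owner(sid, legacy_or_upgraded):
    """Apply the rule from the text: "dbo" for Basis <= 4.6 or systems
    upgraded from 4.x, otherwise the SID in lower case."""
    return "dbo" if legacy_or_upgraded else sid.lower()
```

Comparing read_profile_param(text, "dbs/mss/schema") with expected_db_owner("TST", False) tells you whether the profile side is right. It says nothing about whether the login actually exists on the database server, which you still have to check there.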

Scenario 3: SAP does not start at all: no message server and no dispatcher

Symptoms
  • The message server and the dispatcher do not start at all in the SAP MMC.
  • The following error when trying to view the developer traces within the SAP MMC: The network path was not found.
  • No new developer traces are written to disk (under the "work" directory).
Problem Analysis
The network shares "saploc" and "sapmnt" do not exist. That explains the "network path not found" message when attempting to view the developer traces within the SAP MMC.
Solution
Re-create the "saploc" and "sapmnt" network shares. Both need to be created on the /usr/sap directory.
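
To confirm which share is missing before re-creating it, a small Python sketch like the one below can probe the UNC paths. The function name and the host name in the usage note are hypothetical, and the check assumes the account running it is allowed to read the shares.

```python
import os


def missing_shares(host, shares=("saploc", "sapmnt"), exists=os.path.isdir):
    """Return the share names that are not reachable as UNC paths on host."""
    missing = []
    for name in shares:
        unc_path = "\\\\" + host + "\\" + name  # e.g. \\host\saploc
        if not exists(unc_path):
            missing.append(name)
    return missing
```

For example, missing_shares("sapsrv") returns an empty list when both shares can be reached on a server called sapsrv.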


Scenario 4: Users get "No logon possible" messages

Symptoms
  • Work processes start but no logins are possible.
  • Users get the login screen but the system does not log them in. Instead, they get this error: No logon possible (no hw ID received by mssg server).
  • In the SAP MMC, the message server (msg_server.exe) shows status "stopped".
  • The dev_ms file reports these errors:
    [Thr 2548] *** ERROR => MsCommInit: NiBufListen(sapmsTST) (rc=NIESERV_UNKNOWN) [msxxserv.c 8163]
    [Thr 2548] *** ERROR => MsSInit: MsSCommInit [msxxserv.c 1561]
    [Thr 2548] *** ERROR => main: MsSInit [msxxserv.c 5023]
    [Thr 2548] ***LOG Q02=> MsSHalt, MSStop (Msg Server 2900) [msxxserv.c 5078]
Problem Analysis
Work processes were able to start but the message server was not. The reason is that the "services" file is missing the SAP System Message Port entry. Example: SAPmsTST 3600/tcp
Solution
Edit the "services" file and add the entry. Then, re-start the instance. Make sure you specify the appropriate TCP port (e.g. 3600) for the message server.
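
A quick way to confirm whether the entry is present is to look it up in the file's text. The Python sketch below assumes the usual "name port/protocol [aliases] # comment" layout of the "services" file. The function name is hypothetical, and the case-insensitive match is also an assumption, made because the example entry (SAPmsTST) and the trace (sapmsTST) differ in case.

```python
def find_service(services_text, name):
    """Return (port, protocol) for `name`, or None if no entry exists."""
    wanted = name.lower()
    for raw in services_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a "name port/protocol" line
        port, _, proto = fields[1].partition("/")
        names = [fields[0]] + fields[2:]  # service name plus any aliases
        if port.isdigit() and wanted in (n.lower() for n in names):
            return int(port), proto.lower()
    return None
```

If find_service(text, "sapmsTST") returns None, the entry is missing and needs to be added as described above.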


Scenario 5: The message server starts but the dispatcher doesn't

Symptoms
  • The dispatcher shows status "stopped" in the SAP MMC.
  • The "dev_disp" file shows these errors:
    ***LOG Q0A=> NiIServToNo, service_unknown (sapdp00) [nixxi.c 2580]
    *** ERROR => DpCommInit: NiDgBind [dpxxdisp.c 7326]
    *** DP_FATAL_ERROR => DpSapEnvInit: DpCommInit
    *** DISPATCHER EMERGENCY SHUTDOWN ***
Problem Analysis
The keyword in the messages above is "service unknown" followed by the entry name "sapdp00". The dispatcher entry "sapdp00" is missing in the "services" file. Example: sapdp00 3200/tcp
Solution
Add the necessary entry to the "services" file. Example: sapdp00 3200/tcp. Then, re-start the instance.
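
If you need the entry for a different instance number, the line can be generated. The Python sketch below extrapolates from the example above (sapdp00 on 3200) using the common SAP default of port 3200 plus the instance number. Treat that rule and the function name as assumptions and verify the port against your own installation.

```python
def dispatcher_entry(instance_number):
    """Build a "services" line such as "sapdp00 3200/tcp" for an instance."""
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be between 00 and 99")
    # Assumed convention: dispatcher port = 3200 + instance number.
    return f"sapdp{instance_number:02d} {3200 + instance_number}/tcp"
```

For instance number 01 this gives "sapdp01 3201/tcp".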


Scenario 6: Work processes die soon after they start


Symptoms
  • All work processes die right after the instance is started.
  • The SAP MMC shows work processes with status "ended".
  • Only one work process shows status "wait".
  • An ABAP dump saying "PXA_NO_SHARED_MEMORY" is generated as soon as a user logs in.
  • The SAP MMC Syslog shows the following error multiple times: "SAP-Basis System: Shared Memory for PXA buffer not available".
Problem Analysis
The instance profile contains misconfigured memory-related parameters. Most likely the "abap/buffersize" instance profile parameter is set too high.
Solution
Edit the instance profile at the OS level under /usr/sap/<SID>/sys/profile and lower the value assigned to "abap/buffersize". Then, restart the instance. It's also important to find out whether any other memory parameters were changed. If not, the system should start once an adequate value has been assigned to the "abap/buffersize" parameter.



