MA5800 OLT Board H903GPHF reset issue

Issue Description

The GPHF GPON board in slot 0/14 reset unexpectedly and then recovered.

Alarm Information

1- The service board did not respond in time to the heartbeat messages sent by the control board, so the control board reset the service board in an attempt to recover it.

2- A hardware alarm was generated on the service board's memory chip.

3- The issue was not a one-off; it occurred multiple times recently.

Handling Process

The first alarm in the system was raised on 01/06/2023 at 10:50:41, as shown below:

There is also a hardware alarm reported on the board related to its memory chip, as shown below:

 

According to the board reset record, the most recent board reset occurred on 06-01 at 03:14:33 with the reason “The link fault processing timer of the board times out”. This was not an isolated event; the reset recurred many times, as shown below:

 

This reset reason means that the service board did not respond in time to the heartbeat packets sent by the control board, which interrupted communication between the control board and the service board.
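The heartbeat supervision described above can be illustrated with a minimal sketch. This is a generic watchdog model, not Huawei's implementation: the class name, the method names, and the 3-second timeout are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatWatchdog:
    """Hypothetical model of a control board supervising one service board."""

    timeout_s: float                 # assumed timer length, not a vendor value
    last_reply_s: float = 0.0        # time of the most recent heartbeat reply
    resets: list = field(default_factory=list)  # times at which a reset was issued

    def on_reply(self, now_s: float) -> None:
        """Record that the service board answered a heartbeat."""
        self.last_reply_s = now_s

    def poll(self, now_s: float) -> bool:
        """Issue a reset if no reply arrived within the timeout.

        Returns True when a reset was issued on this poll.
        """
        if now_s - self.last_reply_s > self.timeout_s:
            self.resets.append(now_s)
            # The board restarts, so supervision starts again from now.
            self.last_reply_s = now_s
            return True
        return False


wd = HeartbeatWatchdog(timeout_s=3.0)
wd.on_reply(1.0)
print(wd.poll(2.0))  # False: only 1 s since the last reply
print(wd.poll(5.0))  # True: 4 s of silence exceeds the 3 s timeout
```

In this model, a board with a faulty component that intermittently stops answering would accumulate entries in `resets`, which matches the repeated reset records seen in this case.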


Root Cause

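A hardware fault on the service board, reported as an alarm on its memory chip, is the most likely cause: the board failed to answer the control board's heartbeat packets in time, so the control board reset it repeatedly.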

Solution

Because a hardware alarm was reported on the board's memory chip, the service board needs to be replaced with an identical spare through an RMA application.

Once the board is replaced, the issue should be resolved.