1. Checking Whether the Switch Can Ping the Remote Device Address Successfully
On the switch, ping the remote device address to check whether IP connectivity between the two devices is normal. BGP runs over TCP, so the two devices must first be reachable at the IP layer.
<HUAWEI> ping 10.1.1.1
  PING 10.1.1.1: 56  data bytes, press CTRL_C to break
    Reply from 10.1.1.1: bytes=56 Sequence=1 ttl=255 time=1 ms
    Reply from 10.1.1.1: bytes=56 Sequence=2 ttl=255 time=1 ms
    Reply from 10.1.1.1: bytes=56 Sequence=3 ttl=255 time=10 ms
    Reply from 10.1.1.1: bytes=56 Sequence=4 ttl=255 time=1 ms
    Reply from 10.1.1.1: bytes=56 Sequence=5 ttl=255 time=1 ms

  --- 10.1.1.1 ping statistics ---
    5 packet(s) transmitted
    5 packet(s) received
    0.00% packet loss
    round-trip min/avg/max = 1/2/10 ms
If the ping succeeds, IP connectivity is normal. Continue with Checking Whether an ACL Is Configured to Filter TCP Packets Based on Port Number 179 to check whether the configuration is correct.
If the ping fails, run the display ip routing-table x.x.x.x verbose command on the switch to check whether there are BGP routes to the remote device.
If there are BGP routes to the remote device, check why the ping fails using BGP routes. For example, check whether corresponding ARP entries exist. For more details, see A Switch Cannot Be Pinged by a Directly Connected Device.
If there are no BGP routes to the remote device, configure a BGP peer and specify a peer address in the BGP view.
Run the following commands to configure a BGP peer.
<HUAWEI> system-view
[HUAWEI] bgp 100
[HUAWEI-bgp] router-id 1.1.22.16
[HUAWEI-bgp] peer 1.1.22.17 as-number 100
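The ping result in this step can also be evaluated by a script when many peers have to be checked. The following is a minimal sketch, assuming the ping output has already been captured as text (for example, from a terminal log). The function names parse_ping_loss and ping_succeeded are hypothetical and are not part of any Huawei tool.

```python
import re

# Matches the statistics line, for example "0.00% packet loss".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_ping_loss(output):
    """Return the packet-loss percentage reported in captured ping output."""
    match = LOSS_RE.search(output)
    if match is None:
        raise ValueError("no packet-loss line found in ping output")
    return float(match.group(1))


def ping_succeeded(output):
    """Treat anything below 100% loss as 'the remote device is reachable'."""
    return parse_ping_loss(output) < 100.0


# Statistics blocks in the format shown in this document.
SAMPLE_OK = """
  --- 10.1.1.1 ping statistics ---
    5 packet(s) transmitted
    5 packet(s) received
    0.00% packet loss
"""
SAMPLE_FAIL = """
  --- 10.1.1.3 ping statistics ---
    5 packet(s) transmitted
    0 packet(s) received
    100.00% packet loss
"""

if __name__ == "__main__":
    print(ping_succeeded(SAMPLE_OK), ping_succeeded(SAMPLE_FAIL))
```

Partial loss (for example, 40%) still counts as reachable in this sketch; a stricter threshold may be more appropriate on a stable link.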
2. Checking Whether an ACL Is Configured to Filter TCP Packets Based on Port Number 179
Run the display acl all command on the two devices to check whether an ACL is configured to match and deny TCP packets with the source or destination port number 179.
<HUAWEI> display acl all
 Total nonempty ACL number is 1

Advanced ACL 3001, 2 rules
Acl's step is 5
 rule 5 deny tcp source-port eq bgp
 rule 10 deny tcp destination-port eq bgp
If an ACL is configured to deny TCP packets with port number 179, and BGP packet transmission is affected, determine whether to run the undo rule rule-id destination-port and undo rule rule-id source-port commands to delete the corresponding ACL configuration according to service requirements.
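On devices with many ACLs, the rules that match port 179 can be picked out of the display acl all output with a short script. This is a sketch only: it assumes the output has been saved as text in the format shown in this document, and the function name find_bgp_deny_rules is hypothetical. It recognizes only the eq bgp and eq 179 forms and would miss port ranges.

```python
import re

# Header line such as "Advanced ACL 3001, 2 rules".
ACL_RE = re.compile(r"\bACL\s+(\d+),")
# Rule line such as "rule 5 deny tcp source-port eq bgp".
RULE_RE = re.compile(r"rule\s+(\d+)\s+deny\s+tcp\b(.*)")
# Port condition written either by keyword (bgp) or by number (179).
PORT_RE = re.compile(r"(source|destination)-port\s+eq\s+(?:bgp|179)\b")


def find_bgp_deny_rules(acl_output):
    """Return (acl_number, rule_id, direction) for deny rules on TCP port 179."""
    hits = []
    current_acl = None
    for raw_line in acl_output.splitlines():
        line = raw_line.strip()
        acl = ACL_RE.search(line)
        if acl:
            current_acl = int(acl.group(1))
            continue
        rule = RULE_RE.match(line)
        if rule:
            for port in PORT_RE.finditer(rule.group(2)):
                hits.append((current_acl, int(rule.group(1)), port.group(1)))
    return hits


SAMPLE_ACL = """
 Total nonempty ACL number is 1

Advanced ACL 3001, 2 rules
Acl's step is 5
 rule 5 deny tcp source-port eq bgp
 rule 10 deny tcp destination-port eq bgp
"""

if __name__ == "__main__":
    for acl, rule_id, direction in find_bgp_deny_rules(SAMPLE_ACL):
        print(f"ACL {acl} rule {rule_id} denies TCP {direction} port 179")
```

Each reported rule is only a candidate for removal; whether to delete it still depends on service requirements.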
3. Checking Whether the Router IDs of the Two Devices Conflict
View BGP peer information on the two devices to check whether their router IDs conflict. For example, if an IPv4 unicast peer relationship cannot be established, run the display bgp peer command to check whether the router IDs of the two devices conflict and whether their peer addresses and AS numbers are correct.
<HUAWEI> display bgp peer
 BGP local router ID : 223.5.0.109
 Local AS number : 41976
 Total number of peers : 12                Peers in established state : 4

  Peer            V    AS  MsgRcvd  MsgSent  OutQ  Up/Down       State  PrefRcv
  8.9.0.8         4   100     1601     1443     0 23:21:56 Established    10000
  9.10.0.10       4   200     1565     1799     0 23:15:30 Established     9999
If their router IDs conflict, run the router-id command in the BGP view to make the two router IDs different. A loopback interface address is often used as the local router ID.
If their peer addresses and AS numbers are incorrect, send the collected information to technical support personnel.
If the BGP peer relationship needs to be established on two indirectly connected interfaces, continue with Checking Whether the Configurations of Both Devices Are Correct When They Establish a BGP Peer Relationship on Two Indirectly Connected Interfaces.
If the BGP peer relationship needs to be established on two directly connected interfaces, continue with Checking Whether There Are BGP Peer Flapping Logs If the Two Devices Establish a BGP Peer Relationship on Two Directly Connected Interfaces.
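The router ID comparison in this step can be sketched as follows, assuming the display bgp peer output from each device has been saved as text. The function names local_router_id and router_ids_conflict are hypothetical, and the sample values are illustrative.

```python
import re

# Header line such as "BGP local router ID : 223.5.0.109".
ROUTER_ID_RE = re.compile(r"BGP local router ID\s*:\s*(\d+\.\d+\.\d+\.\d+)")


def local_router_id(peer_output):
    """Extract the local router ID from captured 'display bgp peer' output."""
    match = ROUTER_ID_RE.search(peer_output)
    if match is None:
        raise ValueError("router ID line not found")
    return match.group(1)


def router_ids_conflict(output_a, output_b):
    """Two devices conflict when they report the same local router ID."""
    return local_router_id(output_a) == local_router_id(output_b)


DEVICE_A = " BGP local router ID : 223.5.0.109\n Local AS number : 41976\n"
DEVICE_B = " BGP local router ID : 192.168.1.1\n Local AS number : 100\n"

if __name__ == "__main__":
    print(router_ids_conflict(DEVICE_A, DEVICE_B))
```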
4. Checking Whether the Configurations of Both Devices Are Correct When They Establish a BGP Peer Relationship on Two Indirectly Connected Interfaces
If the two devices need to establish a BGP peer relationship on two indirectly connected interfaces, check whether the peer connect-interface command is configured on the two devices.
<HUAWEI> display ip routing-table 1.1.22.17
Route Flags: R - relay, D - download to fib
------------------------------------------------------------------------------
Routing Table : Public
Summary Count : 3
Destination/Mask    Proto   Pre  Cost      Flags NextHop         Interface

      1.1.22.17/32  OSPF    10   100         D   16.17.100.17    GigabitEthernet0/2/5.100

<HUAWEI> display current-configuration configuration bgp
#
bgp 100
 router-id 1.1.22.16
 ……
 peer 1.1.22.17 as-number 100
 peer 1.1.22.17 connect-interface LoopBack0
 #
 ipv4-family unicast
  undo synchronization
  ……
return
If two devices establish a BGP connection using two indirectly connected interfaces, the peer connect-interface command needs to be configured on the two devices to ensure that the connection is established correctly.
If the peer connect-interface command is not configured in the BGP view, configure it in the BGP view, for example:
<HUAWEI> system-view
[HUAWEI] bgp 100
[HUAWEI-bgp] peer 10.16.2.3 as-number 1100
[HUAWEI-bgp] peer 10.16.2.3 connect-interface LoopBack0
Check whether the peer ebgp-max-hop command is configured on the two devices.
<HUAWEI> display bgp peer
 BGP local router ID : 192.168.1.1
 Local AS number : 100
 Total number of peers : 1                 Peers in established state : 1

  Peer            V    AS  MsgRcvd  MsgSent  OutQ  Up/Down       State  PrefRcv
  2.2.2.9         4   200        2        5     0 00:00:35 Established        0

<HUAWEI> display current-configuration configuration bgp
#
bgp 100
 peer 2.2.2.9 as-number 200
 peer 2.2.2.9 ebgp-max-hop 255
 peer 2.2.2.9 connect-interface LoopBack0
 #
 ipv4-family unicast
  undo synchronization
  peer 2.2.2.9 enable
#
return
If the switch establishes an EBGP connection with a peer on the indirectly connected network, the peer ebgp-max-hop command must be configured to allow the two devices to establish a TCP connection over multiple hops. If this command is configured on one end of an EBGP connection, it must also be configured on the other end.
If the peer ebgp-max-hop command is not configured in the BGP view, run the peer as-number command to create a peer and then configure the peer ebgp-max-hop command, for example:
<HUAWEI> system-view
[HUAWEI] bgp 100
[HUAWEI-bgp] peer 10.1.1.2 as-number 200
[HUAWEI-bgp] peer 10.1.1.2 ebgp-max-hop
Check whether the address families configured on the two devices are consistent.
<HUAWEI> display current-configuration configuration bgp
#
bgp 100
 peer 2.2.2.9 as-number 200
 peer 2.2.2.9 ebgp-max-hop 255
 peer 2.2.2.9 connect-interface LoopBack0
 #
 ipv4-family unicast
  undo synchronization
  peer 2.2.2.9 enable
#
return
If the address families configured on the two devices are inconsistent, modify them to be consistent in the BGP view.
<HUAWEI> system-view
[HUAWEI] bgp 100
[HUAWEI-bgp] ipv4-family unicast
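The configuration checks in this step (peer connect-interface on both ends, plus peer ebgp-max-hop for EBGP) can be sketched as a scan over the saved output of display current-configuration configuration bgp. This is an illustration under stated assumptions: the configuration is available as text, and the function names peer_settings and missing_for_indirect_peering are hypothetical.

```python
def peer_settings(config, peer):
    """Return the per-peer keywords configured for the given peer address."""
    settings = set()
    for line in config.splitlines():
        parts = line.split()
        # Lines of interest look like "peer 2.2.2.9 ebgp-max-hop 255".
        if len(parts) >= 3 and parts[0] == "peer" and parts[1] == peer:
            settings.add(parts[2])
    return settings


def missing_for_indirect_peering(config, peer, ebgp):
    """List the commands this step expects but the configuration lacks."""
    required = ["connect-interface"]
    if ebgp:
        required.append("ebgp-max-hop")
    configured = peer_settings(config, peer)
    return [keyword for keyword in required if keyword not in configured]


COMPLETE = """
bgp 100
 peer 2.2.2.9 as-number 200
 peer 2.2.2.9 ebgp-max-hop 255
 peer 2.2.2.9 connect-interface LoopBack0
 #
 ipv4-family unicast
  undo synchronization
  peer 2.2.2.9 enable
"""
INCOMPLETE = """
bgp 100
 peer 2.2.2.9 as-number 200
 peer 2.2.2.9 connect-interface LoopBack0
"""

if __name__ == "__main__":
    print(missing_for_indirect_peering(COMPLETE, "2.2.2.9", ebgp=True))
    print(missing_for_indirect_peering(INCOMPLETE, "2.2.2.9", ebgp=True))
```

Run the same check against the configuration of the remote device, because both commands must be present on both ends.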
5. Checking Whether There Are BGP Peer Flapping Logs If the Two Devices Establish a BGP Peer Relationship on Two Directly Connected Interfaces
Run the display bgp peer x.x.x.x log-info command on the switch to check error codes and error subcodes for the BGP peer Down event.
<HUAWEI> display bgp peer 10.1.1.2 log-info
 Peer : 10.1.1.2
 Date/Time     : 2016-10-31 08:02:04+00:00
 State         : Down
 Error Code    : 6(CEASE)                     //Error code
 Error Subcode : 4(Administrative Reset)      //Error subcode
 Notification  : Receive Notification         //The peer sends or receives a Notification packet
 Date/Time     : 2016-10-31 08:01:53+00:00
 State         : Up
The error code 4 indicates hold timer expired.
Check whether the Notification field displays send notification or receive notification. If this field displays send notification, the local end does not receive the Keepalive message of the remote end. If this field displays receive notification, the remote end does not receive the Keepalive message of the local end.
Collect the IGP routing information used to establish the BGP peer relationship, and check whether the route timestamp (the Age field) has been updated.
<HUAWEI> display ip routing-table 10.1.1.2 verbose
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : _public_
Summary Count : 2

Destination: 10.1.1.0/24
     Protocol: Direct          Process ID: 0
   Preference: 0                     Cost: 0
      NextHop: 10.1.1.1         Neighbour: 0.0.0.0
        State: Active Adv             Age: 12d06h01m19s      //Route lifetime
          Tag: 0                 Priority: critical
        Label: NULL               QoSInfo: 0x0
   IndirectID: 0x240000E5
 RelayNextHop: 0.0.0.0          Interface: Ethernet3/0/1
     TunnelID: 0x0                  Flags: D
If the timestamp is updated, IGP routes may have changed. As a result, protocol packets cannot be sent, and the connection between the two devices is torn down.
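One way to tell whether the route timestamp was updated is to capture the Age field twice and compare the values: an age that has gone down between two samples means the route was reinstalled in between. The sketch below assumes the Age format shown in this document (for example, 12d06h01m19s, with the day part optional); the function names are hypothetical.

```python
import re

# Age values such as "12d06h01m19s" or "00h04m12s"; the day part is optional.
AGE_RE = re.compile(r"(?:(\d+)d)?(\d+)h(\d+)m(\d+)s")


def age_to_seconds(age):
    """Convert a route Age string to a number of seconds."""
    match = AGE_RE.fullmatch(age.strip())
    if match is None:
        raise ValueError(f"unrecognized age format: {age!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def route_age(verbose_output):
    """Read the Age field from captured 'display ip routing-table ... verbose' output."""
    match = re.search(r"\bAge:\s*(\S+)", verbose_output)
    if match is None:
        raise ValueError("Age field not found")
    return age_to_seconds(match.group(1))


def route_was_reinstalled(age_before, age_after):
    """A later sample with a smaller age means the route was updated in between."""
    return age_after < age_before


SAMPLE = "        State: Active Adv             Age: 12d06h01m19s      //Route lifetime"

if __name__ == "__main__":
    first = route_age(SAMPLE)
    second = age_to_seconds("00h04m12s")  # illustrative later sample
    print(first, second, route_was_reinstalled(first, second))
```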
Check whether the switch can ping the remote end successfully.
Ping the remote device address.
<HUAWEI> ping 10.1.1.3
  PING 10.1.1.3: 56  data bytes, press CTRL_C to break
    Request time out
    Request time out
    Request time out
    Request time out
    Request time out

  --- 10.1.1.3 ping statistics ---
    5 packet(s) transmitted
    0 packet(s) received
    100.00% packet loss
If the ping fails, check whether TCP is normal according to Checking Whether an ACL Is Configured to Filter TCP Packets Based on Port Number 179. If the ping succeeds, continue Collecting Information and Seeking Technical Support.
The error code 3 often indicates that the BGP connection is interrupted because received Update messages are incorrect.
A BGP/6/SEND_NOTIFY log is generated, indicating that the switch sent a Notification message to its BGP peer:
Sep 13 2016 05:56:16+10:00 HUAWEI %%01BGP/6/SEND_NOTIFY(l)[4904452]:The router sent a NOTIFICATION message to peer x.x.x.x. (ErrorCode=3, SubErrorCode=9, BgpAddressFamily=Public, ErrorData=900e0371000184045...
The error code 3 contains the following error subcodes:
1: Malformed attribute list
2: Unrecognized well-known attribute
3: Well-known attribute is missing
4: Attribute flags error
5: Attribute length error
6: Invalid Origin attribute
7: AS routing loop
8: Invalid Next_Hop attribute
9: Optional attribute error
10: Invalid network field
11: Abnormal AS_Path
A log is also generated, indicating that the neighbor status changed because of a packet parsing error:
Sep 13 2016 05:56:16+10:00 HUAWEI %%01BGP/3/STATE_CHG_UPDOWN(l)[4904453]:The status of the peer x.x.x.x changed from ESTABLISHED to IDLE. (InstanceName=Public, StateChangeReason=VPN-Target NLRI Parsed Error)
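When many SEND_NOTIFY logs have to be reviewed, the error code and subcode can be extracted and translated with a small script. The sketch below assumes log lines in the format shown above; the subcode table mirrors the list in this section, and the function name describe_notify is hypothetical.

```python
import re

# Subcodes of error code 3 (Update message error), as listed in this section.
UPDATE_SUBCODES = {
    1: "Malformed attribute list",
    2: "Unrecognized well-known attribute",
    3: "Well-known attribute is missing",
    4: "Attribute flags error",
    5: "Attribute length error",
    6: "Invalid Origin attribute",
    7: "AS routing loop",
    8: "Invalid Next_Hop attribute",
    9: "Optional attribute error",
    10: "Invalid network field",
    11: "Abnormal AS_Path",
}

CODE_RE = re.compile(r"ErrorCode=(\d+),\s*SubErrorCode=(\d+)")


def describe_notify(log_line):
    """Return (code, subcode, text) for a SEND_NOTIFY log line, or None."""
    match = CODE_RE.search(log_line)
    if match is None:
        return None
    code, subcode = int(match.group(1)), int(match.group(2))
    if code == 3:
        return code, subcode, UPDATE_SUBCODES.get(subcode, "Unknown subcode")
    return code, subcode, "Not an Update message error"


SAMPLE_LOG = (
    "%%01BGP/6/SEND_NOTIFY(l)[4904452]:The router sent a NOTIFICATION message "
    "to peer x.x.x.x. (ErrorCode=3, SubErrorCode=9, BgpAddressFamily=Public"
)

if __name__ == "__main__":
    print(describe_notify(SAMPLE_LOG))
```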
The error code 5 indicates a finite state machine error.
If the error code is 5 and the error reason is socket read failed, the remote device has closed the TCP connection with the switch. Check why the remote device closed the TCP connection and then rectify the fault accordingly.
The error code 6 contains the following error subcodes:
1: Maximum number of prefixes exceeded
2: Administrative shutdown
3: Peer deleted
4: Administrative reset
5: Connection rejected
6: Other configuration change
7: Connection collision resolution
8: Out of resources
9: BFD session Down
For example:
If the error subcode is 1, the BGP connection is disconnected because the number of route prefixes exceeded the maximum value. You need to check whether the peer route-limit command is configured, check the number of received routes, and then limit received routes.
If the error subcode is 9, the BGP connection is disconnected because the BFD session is Down. You need to rectify the fault according to related BFD logs.
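The subcode lookup for error code 6 can be sketched in the same way against the saved output of display bgp peer x.x.x.x log-info. The table below follows the list in this section, the two hints repeat the examples given above, and the function names parse_down_reason and explain_cease are hypothetical.

```python
import re

# Subcodes of error code 6 (Cease), as listed in this section.
CEASE_SUBCODES = {
    1: "Maximum number of prefixes exceeded",
    2: "Administrative shutdown",
    3: "Peer deleted",
    4: "Administrative reset",
    5: "Connection rejected",
    6: "Other configuration change",
    7: "Connection collision resolution",
    8: "Out of resources",
    9: "BFD session Down",
}

# Follow-up actions taken from the examples in this section.
HINTS = {
    1: "Check the peer route-limit configuration and the number of received routes.",
    9: "Rectify the fault according to the related BFD logs.",
}


def parse_down_reason(log_info):
    """Return (error_code, error_subcode) from log-info output, or None."""
    code = re.search(r"Error Code\s*:\s*(\d+)", log_info)
    subcode = re.search(r"Error Subcode\s*:\s*(\d+)", log_info)
    if code is None or subcode is None:
        return None
    return int(code.group(1)), int(subcode.group(1))


def explain_cease(subcode):
    """Describe a Cease subcode and append a hint when this section gives one."""
    text = CEASE_SUBCODES.get(subcode, "Unknown subcode")
    hint = HINTS.get(subcode)
    return f"{text}. {hint}" if hint else text


SAMPLE_LOG_INFO = """
 State         : Down
 Error Code    : 6(CEASE)
 Error Subcode : 4(Administrative Reset)
 Notification  : Receive Notification
"""

if __name__ == "__main__":
    code, subcode = parse_down_reason(SAMPLE_LOG_INFO)
    if code == 6:
        print(explain_cease(subcode))
```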
6. Collecting Information and Seeking Technical Support
If the fault persists, collect related information and seek technical support.
Collecting Fault Information
Collect operation results of the preceding steps and record the results in a file.
Collect all diagnostic information and export the information to a file.
Run the display diagnostic-information file-name command in the user view to collect diagnostic information and save the information to a file.
<HUAWEI> display diagnostic-information dia-info.txt
Now saving the diagnostic information to the device
 100%
Info: The diagnostic information was saved to the device successfully.
When the diagnostic file is generated, you can export the file from the device using FTP, SFTP, or SCP.
NOTICE:
You can run the dir command in the user view to check whether the file is generated.
You can also run the display diagnostic-information command and save terminal logs in a diagnostic file on a disk.
If this command displays a long output, press Ctrl+C to abort this command.
This command displays diagnostic information, which helps locate faults but may affect system performance. For example, CPU usage may become high. Therefore, do not use this command when the system is running properly.
Do not run the display diagnostic-information command simultaneously on multiple terminals connected to the device. Doing so may significantly increase the CPU usage of the device and degrade its performance.
Collect the log and trap information on the device and export the information to files.
Run the save logfile all command in the user view to save the logs in the user log buffer area and diagnostic log buffer area to the user log file and diagnostic log file, respectively.
<HUAWEI> save logfile all
Info: Save logfile successfully.
Info: Save diagnostic logfile successfully.
When the log files are generated, you can export them from the device using FTP, SFTP, or SCP.
NOTE:
You can also run the display logbuffer and display trapbuffer commands to view the log and trap information on the device, and save the information in diagnostic files on a disk.
NOTE:
Technical support personnel will provide instructions for you to submit all the collected information and files, so that they can locate faults.