Troubleshoot Incidents

Follow the troubleshooting steps for resolving incidents generated in Prisma SD-WAN.
Prisma SD-WAN generates incidents and alerts when the system reaches system-defined or customer-defined thresholds or there is a fault in the system. Use the incident and alert event codes to view the details of the incidents and alerts generated in the system.
Follow the troubleshooting steps for each incident in the order listed below. Each step is intended to resolve the issue. Proceed to the next step only if the previous step did not resolve the problem.
For each incident raised on the web interface, you can select Troubleshoot to follow a step-by-step troubleshooting procedure. If the issue persists, select Go to Support to create a support ticket. A Palo Alto Networks Support executive will contact you. You can also return the device to Palo Alto Networks.
Incident Code and Troubleshooting Steps
APPLICATION_CUSTOM_RULE_CONFLICT
  1. Select Incidents & Alerts > Incidents.
  2. Locate the required application for troubleshooting.
  3. Click the ellipsis under Action and select Troubleshoot.
  4. If the application has an override, you may need to open the override rules to make the change, possibly before clicking the ellipsis.
  5. You can make changes on the App configuration screen.
DEVICEHW_DISKENC_SYSTEM
This event code is raised when a disk partition failed to convert into an encrypted partition during the last device upgrade.
  1. Go to Workflows > Devices > Claimed Devices.
  2. Locate your device and click Upgrade in the Software column.
  3. Select the required software version to upgrade.
    After the upgrade is complete, repeat the same steps and downgrade to the target version. If the target version is the latest and you still see the error, upgrade to the version before the latest and then upgrade to the latest again.
DEVICEHW_DISKUTIL_PARTITIONSPACE
This event code is raised due to high disk capacity utilization. To verify, follow the steps:
  1. SSH to the device and run the command:
    dump disk info
  2. Check the available space for the attached volumes and contact the Palo Alto Networks Support team to clear the utilized volume.
DEVICEHW_INTERFACE_ERRORS
This event code is raised due to a faulty cable, SFP, port, or patch panel connection.
  1. Inspect and replace the cable. Ensure that the correct cable is used.
  2. If the port requires a transceiver, replace the SFP. Ensure that the SFP is correct.
  3. Try a different patch panel port if you are using a patch panel.
  4. Attempt using a different port on the ION device.
  5. Inspect the device and port that the ION device connects to. Sometimes the issue could be due to the other hardware device.
DEVICEHW_INTERFACE_HALFDUPLEX
This event code is raised due to issues with the port configuration or the cable. First, verify the cable connection and swap the cable. Second, check the duplex setting on the port and on the remote end (auto or hard-coded). To change the port configuration:
  1. Go to Workflows > Devices > Claimed Devices.
  2. Locate your device and click it.
  3. Go to Interfaces and select the interface that is running at half duplex.
  4. Select Advanced Options - PHYSICAL and change the setting.
  5. Verify that the connection is up and correctly configured for full duplex.
DEVICEHW_INTERFACE_DOWN
An interface down incident requires an assessment to determine whether it is intentional or real.
  1. SSH to the device and run the commands:
    dump interface status <port number>
    dump interface config <port number>
  2. Check whether the port is administratively up and whether it is connected.
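The assessment above can be written as a small decision table. The following Python sketch is illustrative only: the function name `classify_interface_down` and its two inputs are hypothetical. Read the two states manually from the output of the dump commands; the sketch does not assume or parse their output format.

```python
def classify_interface_down(admin_up: bool, link_connected: bool) -> str:
    """Classify an interface-down incident from two manually observed states."""
    if not admin_up:
        # The port was disabled in configuration, so the incident is expected.
        return "intentional"
    if not link_connected:
        # The port is enabled but has no link: check the cable, SFP, and peer device.
        return "real"
    # The port is enabled and connected: the condition may already have cleared.
    return "cleared"
```

For example, a port that is administratively up but not connected is classified as "real" and should be followed up with physical checks.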
DEVICEHW_MEMUTIL_SWAPSPACE
To verify whether high memory utilization is occurring in real time:
  1. Select Monitor ION Devices > Device Activity.
  2. Select the ION device and confirm the free memory.
  3. SSH to the device and run the following command to verify the memory:
    inspect memory summary
DEVICEHW_POWER_LOST
This event code is raised by an unplugged or a loose power cable.
  1. Try using a new cable or re-seating the existing cable. If this does not help, replace the power supply unit (PSU). Note down which PSU failed for devices that have dual PSUs. Order a replacement PSU from Palo Alto Networks for the particular ION device.
  2. When the new PSU is available on hand at the device's site, pull out and replace the affected PSU.
DEVICEIF_ADDRESS_DUPLICATE
If static IP address configuration is used, confirm that the IP address used is not explicitly assigned to another device or within a range already allocated by a DHCP server.
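The two conditions above can be checked offline with Python's standard `ipaddress` module. This is a minimal sketch, assuming you have collected the DHCP range and the list of explicitly assigned addresses yourself; the function name `static_ip_conflicts` and its parameters are hypothetical and not part of any Prisma SD-WAN tool.

```python
import ipaddress

def static_ip_conflicts(static_ip, dhcp_start, dhcp_end, assigned_ips):
    """Return the reasons a static IP address conflicts, or an empty list."""
    ip = ipaddress.ip_address(static_ip)
    reasons = []
    # Condition 1: the address is explicitly assigned to another device.
    if ip in {ipaddress.ip_address(a) for a in assigned_ips}:
        reasons.append("assigned to another device")
    # Condition 2: the address falls inside the DHCP server's allocation range.
    if ipaddress.ip_address(dhcp_start) <= ip <= ipaddress.ip_address(dhcp_end):
        reasons.append("inside DHCP range")
    return reasons
```

An empty list means neither condition applies to the address being checked.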
DEVICESW_CONCURRENT_FLOWLIMIT_EXCEEDED
To verify the concurrent flows:
  1. Select Monitor > Branch Sites > Activity.
  2. Select the ION device and verify the concurrent flows generated by UDP/TCP packets and time.
  3. In Flow Browser, check the same flows for a source or destination IP address that initiates multiple sessions. Then, based on that IP address, determine whether network scanning is occurring in the environment.
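If you copy flow records out of Flow Browser, the check in step 3 can be approximated with a short script. The sketch below is an assumption-laden illustration: the record keys `src_ip`, `dst_ip`, and `dst_port`, the function name `possible_scanners`, and the default threshold are all hypothetical and do not reflect an actual export format.

```python
from collections import defaultdict

def possible_scanners(flows, min_targets=50):
    """Return {source_ip: distinct_target_count} for sources that reach many targets."""
    targets = defaultdict(set)
    for flow in flows:
        # A target is a distinct (destination address, destination port) pair.
        targets[flow["src_ip"]].add((flow["dst_ip"], flow["dst_port"]))
    suspects = {src: len(seen) for src, seen in targets.items() if len(seen) >= min_targets}
    # Sort so that the source with the most distinct targets comes first.
    return dict(sorted(suspects.items(), key=lambda item: item[1], reverse=True))
```

A source that opens sessions to many distinct targets in a short period is a candidate for further investigation; it is not proof of scanning on its own.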
DEVICESW_DHCPRELAY_RESTART
A stopped process requires further investigation. Contact Palo Alto Networks Support.
DEVICESW_DHCPSERVER_ERRORS
  1. Check interfaces configuration and state.
  2. Verify that at least one device interface is active and configured with static IP configuration.
  3. Check DHCP server configuration.
  4. Verify that the subnet address does not overlap across the site.
  5. If custom options are configured, verify that the custom option definition and option value are compatible with each other.
  6. If the problem still persists, contact Palo Alto Networks Support.
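The overlap check in step 4 can be done offline with Python's standard `ipaddress` module. This is a minimal sketch, assuming you have listed the DHCP subnets configured across the site yourself; the function name `overlapping_subnets` is hypothetical.

```python
import ipaddress
from itertools import combinations

def overlapping_subnets(subnets):
    """Return every pair of subnets in the list that overlap each other."""
    nets = [ipaddress.ip_network(s) for s in subnets]
    # Compare each unordered pair once; overlaps() is True if either contains the other.
    return [(str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)]
```

An empty result means no two subnets in the list overlap.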
DEVICESW_DHCPSERVER_RESTART
A stopped process requires further investigation. Contact Palo Alto Networks Support.
DEVICESW_DISCONNECTED_FROM_CONTROLLER
  1. Check if there is any network connectivity problem at the site. Look for invalid interface configurations, interface incidents or network incidents. If present, clear those faults.
  2. Check if there are any process incidents which indicate that processes are stopped. If present, take action on those faults.
  3. Check if any firewall rules both on the ION device (if used) and external to the ION device prevent communication between the ION device and controller. If present, fix those rules.
  4. Ensure that the controller is not undergoing maintenance. If notification indicates maintenance activity, wait until the activity is completed.
  5. If none of the choices apply, please open a case with Palo Alto Networks Support.
DEVICESW_FPS_LIMIT_EXCEEDED
  1. Check Flow Browser to identify the rogue client, then isolate it.
  2. If the problem still persists, contact Palo Alto Networks Support.
DEVICESW_GENERAL_PROCESSRESTART
Process restart is an alert and does not require immediate action. If several process restart alerts repeat in a given hour or day, contact Palo Alto Networks Support.
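To decide whether restart alerts are repeating within an hour or a day, you can run a sliding-window count over the alert timestamps. The sketch below is illustrative: the function name `restarts_exceed_threshold` is hypothetical, and the default threshold of three is an assumption, because the guidance above does not define "several".

```python
from datetime import datetime, timedelta

def restarts_exceed_threshold(timestamps, window=timedelta(hours=1), threshold=3):
    """Return True if at least `threshold` restarts fall within any single `window`."""
    ordered = sorted(timestamps)
    start = 0
    for end, ts in enumerate(ordered):
        # Advance the left edge until the window covers no more than `window`.
        while ts - ordered[start] > window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False
```

Pass `window=timedelta(days=1)` to apply the same check over a day instead of an hour.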
DEVICESW_GENERAL_PROCESSSTOP
A stopped process requires further investigation. Contact Palo Alto Networks Support.
DEVICESW_IMAGE_INCOMPATIBLE
  1. Check the software version of the device on the Device List screen.
  2. Click Upgrade and check if the device's software version is present in the available software list.
  3. If the software version on the device is not on the available software list, upgrade or downgrade the device to an available software version. After a successful software change, issue the Recheck SW Version command from the device list for that device.
  4. If the software version is not on the available software list but the software version on the device is the desired software version for your network, contact Palo Alto Networks Support for further instructions.
DEVICESW_LICENSE_VERIFICATION_FAILED
  1. Obtain additional licenses or free up unused licenses and then bring up the virtual ION device.
  2. If the problem still persists, contact Palo Alto Networks Support.
DEVICESW_MONITOR_DISABLED
A system monitoring disabled incident requires further investigation.
  1. Attempt a device reboot to clear the incident.
  2. If the system monitoring disabled incident is raised again after a reboot, contact Palo Alto Networks Support.
DEVICESW_NTP_NO_SYNC
The device could not reach the configured NTP server. Contact Palo Alto Networks Support.
DEVICESW_SNMP_AGENT_RESTART
A stopped process requires further investigation. Contact Palo Alto Networks Support.
DEVICESW_SYSTEM_BOOT
Device reboot is an alert and may need further investigation.
  1. If the device rebooted because of an operation such as a forced reboot by an administrator or a software upgrade, the alert is normal and for informational purposes only.
  2. If the device rebooted without any administrator-initiated operation, contact Palo Alto Networks Support.
DEVICESW_TOKEN_VERIFICATION_FAILED
  1. Generate a new token and use that token in the creation of virtual ION device metadata.
  2. If the problem still persists, contact Palo Alto Networks Support.
DEVICESW_CONNTRACK_FLOWLIMIT_EXCEEDED
  1. Use the device toolkit to dump and inspect the entries in the connection tracking table.
  2. Contact Palo Alto Networks Support.
NAT_POLICY_LEGACY_ALG_CONFIG_OVERRIDE
Contact Palo Alto Networks Support to remove the legacy configuration from the device.
NAT_POLICY_STATIC_NATPOOL_OVERRUN
Make sure that the traffic selector has a 1:1 mapping for the converted NATPOOL range to CIDR.
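One way to sanity-check this offline is to convert the NAT pool range to CIDR blocks and compare the address count with the traffic selector prefix. The sketch below uses Python's standard `ipaddress` module. It interprets "1:1 mapping" as equal address counts, which is an assumption, and the function name `natpool_matches_selector` is hypothetical.

```python
import ipaddress

def natpool_matches_selector(pool_start, pool_end, selector_prefix):
    """Return (CIDR blocks covering the pool range, True if the sizes match)."""
    first = ipaddress.ip_address(pool_start)
    last = ipaddress.ip_address(pool_end)
    # Convert the inclusive address range into the minimal list of CIDR blocks.
    pool_cidrs = list(ipaddress.summarize_address_range(first, last))
    pool_size = sum(net.num_addresses for net in pool_cidrs)
    selector = ipaddress.ip_network(selector_prefix)
    # Assumed meaning of 1:1: one pool address per traffic selector address.
    return pool_cidrs, pool_size == selector.num_addresses
```

If the sizes differ, adjust either the pool range or the traffic selector prefix until they match.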
NETWORK_DIRECTINTERNET_DOWN (Branch sites only)
  1. Check if there are any interface down incidents on the interfaces connecting to the internet circuit.
  2. Log in to the device via SSH/Remote access.
  3. Verify internet interface status and reachability on the interface by pinging public IP addresses.
  4. Check the ARP entry of the gateway IP address on the internet interface by running the inspect system arp command.
  5. Capture packets on the internet interface and verify the packet flow.
  6. Check the internet modem, if present, to ensure that it is powered up. Then, as a possible recovery step, power cycle the modem.
  7. If the problem persists, contact Palo Alto Networks Support.
NETWORK_DIRECTPRIVATE_DOWN (Branch sites only)
  1. Check if there are any interface down incidents on the interfaces connecting to private WAN routing devices. Then, follow interface troubleshooting and incident clearance procedures for that interface.
  2. Log in to the device via SSH/Remote access.
  3. Verify interface status and reachability by pinging the gateway IP address.
  4. Check if connectivity between the remote office and the data center exists by pinging a data center’s private WAN interface(s) IP addresses from the affected site.
  5. Verify BFD connectivity between the remote office and the data center private WAN interface(s) IP addresses.
  6. Capture packets on the private WAN interface and verify the packet flow.
  7. If the problem still persists, contact Palo Alto Networks Support.
NETWORK_POLICY_RULE_CONFLICT
Update the two conflicting policy rules identified or remove one of the rules to ensure that there is no conflict.
NETWORK_POLICY_RULE_DROPPED
Update the identified policy rule to remove some applications or remove some source and destination prefixes in the rule.
NETWORK_PRIVATEWAN_DEGRADED (DC Sites only)
  1. Verify that the prefixes configured on the remote site are correct.
  2. Verify that the BGP configuration on the WAN edge router is such that routes sent to the Palo Alto Networks data center device are received from the provider without any summarization.
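The two checks above can be combined offline: compare the prefixes configured on the remote site with the routes received at the data center, and flag any site prefix that arrives only as part of a larger summary route. This is a minimal sketch using Python's standard `ipaddress` module. The function name `find_summarized` is hypothetical, and both lists are assumed to be collected manually.

```python
import ipaddress

def find_summarized(site_prefixes, received_routes):
    """Map each site prefix not received exactly to the summary routes covering it."""
    received = [ipaddress.ip_network(r) for r in received_routes]
    result = {}
    for prefix in site_prefixes:
        net = ipaddress.ip_network(prefix)
        if net in received:
            continue  # Received as configured, with no summarization.
        # subnet_of() requires both networks to be the same IP version.
        result[prefix] = [
            str(r) for r in received
            if r.version == net.version and net.subnet_of(r)
        ]
    return result
```

A prefix mapped to an empty list was not received at all; a prefix mapped to one or more routes was summarized somewhere along the path.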
NETWORK_PRIVATEWAN_UNREACHABLE (DC Sites only)
  1. Check if there are any interface down incidents on the interfaces connecting to private WAN routing devices. Follow interface troubleshooting and incident clearance procedures for that interface.
  2. Check if local network endpoints connected to the affected ION device are reachable by pinging the interface through which the private WAN traffic is supposed to traverse.
  3. For a data center site, check for PEERING_EDGE_DOWN incidents. Follow PEERING_EDGE_DOWN troubleshooting and incident clearance steps.
  4. Check if connectivity between the remote office and the data center exists by pinging the private WAN interface(s) from the affected site. From a data center site, choose one or more remote office sites to ping.
  5. If the problem still persists, contact Palo Alto Networks Support.
PEERING_BGP_DOWN
  1. Check if there are any interface down faults on the interfaces connecting to peer routing devices. Follow interface troubleshooting and fault clearance procedures for that interface.
  2. Check if local network endpoints connected to the affected ION device are reachable by pinging them through the interface that traffic to the peer routing device is supposed to traverse.
  3. Check and validate configuration on the peer routing device and check for interface and routing faults.
  4. If none of the choices apply, please open a case with Palo Alto Networks Support.
PRIORITY_POLICY_RULE_CONFLICT
Update the two conflicting policy rules identified or remove one of the rules to ensure that there is no conflict.
PRIORITY_POLICY_RULE_DROPPED
Update the identified policy rule to remove some applications or remove some source and destination prefixes in the rule.
SITE_CIRCUIT_ABSENT_FOR_POLICY
Assign the labels reported as missing in the incident to the WAN interface at the site.
SPOKEHA_CLUSTER_DEGRADED
Check the spoke cluster switchover event history to identify the device whose effective priority has become zero. For that device, check:
  • If any of the tracked interfaces of the device are down.
  • If any of the system services for the device are down.
SPOKEHA_CLUSTER_DOWN
Check the spoke cluster switchover event history to identify the device whose effective priority has become zero. For that device, check:
  • If any of the tracked interfaces of the device are down.
  • If any of the system services for the device are down.
SPOKEHA_MULTIPLE_ACTIVE_DEVICES
  1. Check the operational state of the interfaces that are specified as the source interface for cluster operation to find out if they are up.
  2. If the interfaces on both devices are up, check the switch configurations to confirm the interfaces are in the same VLAN.
  3. Ping the IP address on the interface on one of the devices from the other device to confirm the connectivity between the devices.
SPOKEHA_STATE_UPDATE
If the device has become a backup device, check the device configuration, incidents, and alerts to find out:
  • If a failure condition caused the device to become a backup.
  • If another device with a higher priority became active in the cluster.
  • If the device configuration was updated to disable the device.