WildFire Cluster Upgrade Validation
Where Can I Use This?
  • WildFire Appliance
What Do I Need?
  • WildFire License
The following steps are recommended for validating the WildFire appliance cluster node(s) after upgrading the software on the appliance.
  1. Perform the following steps on all WildFire appliance cluster controller nodes (active / passive), the worker server, and worker nodes.
    1. View the status of the reboot tasks on the WildFire controller node.
      On the WildFire cluster controller, run the following command and look for the job type Install and Status FIN:
      admin@WF-500(active-controller)> show cluster task pending
    2. Check that the WildFire appliance is ready to resume sample analysis. This confirms that all install jobs are complete and that all services are up and functional.
      1. Verify that the sw-version field shows the upgraded release version:
        admin@WF-500(passive-controller)> show system info | match sw-version
      2. Confirm that all processes are running:
        admin@WF-500(passive-controller)> show system software status
      3. Confirm that the auto-commit (AutoCom) job is complete:
        admin@WF-500(passive-controller)> show jobs all
      4. Confirm that data migration has completed successfully. Run show cluster data-migration-status to view the progress of the database merge. After the data merge is complete, the completion timestamp is displayed:
        100% completed on Mon Sep 9 21:44:48 PDT 2019
        The duration of a data merge depends on the amount of data stored on the WildFire appliance. Be sure to allot at least several hours for recovery as the data merge can be a lengthy process.
    3. Validate cluster membership. Ensure that all nodes are still in their respective roles and that all services have maintained their status (Leader, JoinedCluster, StandbyAsWorker, etc.). Make sure no services are in commit-lock status.
      admin@WF-500(passive-controller)> show cluster membership
      admin@WF-500(passive-controller)> show cluster all-peers
    4. Make sure all the jobs have been completed:
      admin@WF-500(passive-controller)> show jobs pending
      admin@WF-500(passive-controller)> show jobs processed
    5. Verify that all interfaces are up and that the counters do not show any anomalies:
      admin@WF-500(passive-controller)> show system disk-space
      admin@WF-500(passive-controller)> show interface all
      admin@WF-500(passive-controller)> show arp all
      admin@WF-500(passive-controller)> show interface eth1
      admin@WF-500(passive-controller)> show interface eth2
      admin@WF-500(passive-controller)> show interface eth3
      admin@WF-500(passive-controller)> show counter interface management
      admin@WF-500(passive-controller)> show counter interface eth1
      admin@WF-500(passive-controller)> show counter interface eth2
      admin@WF-500(passive-controller)> show counter interface eth3
      
    6. Verify that all Consul-related configuration and the task queue status are operational:
      admin@WF-500(passive-controller)> debug cluster diagnostic
      admin@WF-500(passive-controller)> debug cluster agent connectivity
      admin@WF-500(passive-controller)> debug cluster agent dump-kv
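
The data-migration check in step 1 can be scripted once the CLI output has been captured as text. The following is a minimal sketch, assuming the completed state looks like the sample line shown above ("100% completed on ..."). The function name parse_migration_status is hypothetical, and the format of in-progress output is not documented here, so any line that does not match is simply ignored.

```python
import re

# Assumed line format, taken from the sample in step 1:
#   "100% completed on Mon Sep 9 21:44:48 PDT 2019"
_STATUS_LINE = re.compile(r"(\d{1,3})%\s+completed on\s+(.+)$")


def parse_migration_status(cli_output):
    """Return (percent, timestamp_text) from the first matching line, or None.

    cli_output is the captured text of `show cluster data-migration-status`.
    The timestamp is returned as plain text and is not parsed further.
    """
    for raw_line in cli_output.splitlines():
        match = _STATUS_LINE.search(raw_line.strip())
        if match:
            return int(match.group(1)), match.group(2)
    return None


def migration_complete(cli_output):
    """True only when a matching line reports 100 percent."""
    status = parse_migration_status(cli_output)
    return status is not None and status[0] == 100


if __name__ == "__main__":
    sample = "100% completed on Mon Sep 9 21:44:48 PDT 2019"
    print(parse_migration_status(sample))
    print(migration_complete(sample))
```

Treating unmatched output as "not complete" is a deliberately conservative choice: the data merge can take several hours, so a script should keep waiting rather than assume success.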
      
  2. (WildFire cluster active controller only) The following steps are to be performed after completing Step 1.
    1. While the active controller is being upgraded, validate that the passive controller switches over to become active.
    2. After the active controller is upgraded and accessible, validate that it comes up as the passive controller.
  3. (After all nodes in the WildFire cluster have been upgraded) Verify that a controller node (active / passive) or worker server is Ready/ReadyLeader for Global-db and Global-queue service.
    admin@WF-500(passive-controller)> show cluster membership
    Active controller example:
    Service Summary:  wfpc signature
    Cluster name:     cluster1
    Address:          1.2.3.321
    Host name:        wf101
    Node name:        wfpc-123456789123456-internal
    Serial number:    123456789123456
    Node mode:        controller
    Server role:      True
    HA priority:      primary
    Last changed:     Mon, 10 Mar 2025 02:47:33 -0700
    Services:         infra signature wfcore wfpc
    Monitor status:
                      Serf Health Status: passing
                          Agent alive and reachable
                      Service 'infra' check: passing
    Application status:
                      global-queue-service: ReadyLeader
                      global-db-service: ReadyLeader
                      siggen-db: ReadyMaster
                      wildfire-management-service: Done
                      wildfire-apps-service: Ready
    Work queue status:
                      sample analysis queued: 0
                      sample analysis running: 0
                      sample copy queued: 0
                      sample copy running: 0
    
    Diag report:
                      2.2.2.202: reported leader '2.2.2.204', age 0.
                      2.2.2.204: local node passed sanity check.
    
    Passive controller example:
    Service Summary:  wfpc signature
    Cluster name:     cluster1
    Address:          1.2.3.789
    Host name:        wf102
    Node name:        wfpc-1234567891234-internal
    Serial number:    1234567891234
    Node mode:        controller
    Server role:      True
    HA priority:      secondary
    Last changed:     Mon, 10 Mar 2025 02:38:53 -0700
    Services:         infra signature wfcore wfpc
    Monitor status:
                      Serf Health Status: passing
                          Agent alive and reachable
                      Service 'infra' check: passing
    Application status:
                      global-queue-service: JoinedCluster
                      global-db-service: Ready
                      siggen-db: ReadySlave
                      wildfire-management-service: Done
                      wildfire-apps-service: Ready
    Work queue status:
                      sample analysis queued: 0
                      sample analysis running: 0
                      sample copy queued: 0
                      sample copy running: 0
    
    Diag report:
                      2.2.2.202: reported leader '2.2.2.204', age 0.
                      2.2.2.205: local node passed sanity check.
    
    Worker server node example:
    Service Summary:  wfpc
    Cluster name:     cluster1
    Address:          1.2.3.456
    Host name:        wf103
    Node name:        wfpc-123456789123456-internal
    Serial number:    123456789123456
    Node mode:        worker
    Server role:      True
    HA priority:
    Last changed:     Mon, 10 Mar 2025 02:54:53 -0700
    Services:         infra wfcore wfpc
    Monitor status:
                      Serf Health Status: passing
                          Agent alive and reachable
                      Service 'infra' check: passing
    Application status:
                      global-queue-service: JoinedCluster
                      global-db-service: JoinedCluster
                      siggen-db: Stopped
                      wildfire-management-service: Done
                      wildfire-apps-service: Ready
    Work queue status:
                      sample analysis queued: 0
                      sample analysis running: 0
                      sample copy queued: 0
                      sample copy running: 0
    
    Diag report:
                      2.2.2.202: reported leader '2.2.2.204', age 0.
                      2.2.2.202: local node passed sanity check.
    
    
    
    Worker client node example:
    Service Summary:  wfpc
    Cluster name:     cluster1
    Address:          1.2.3.123
    Host name:        wf206B
    Node name:        wfpc-123456789123456-internal
    Serial number:    123456789123456
    Node mode:        worker
    Server role:      False
    HA priority:
    Last changed:     Tue, 18 Mar 2025 09:08:16 -0700
    Services:         infra wfpc
    Monitor status:
                      Serf Health Status: passing
                          Agent alive and reachable
                      Service 'infra' check: passing
    Application status:
                      global-queue-service: StandbyAsWorker
                      global-db-service: StandbyAsWorker
                      siggen-db: Deregistered
                      wildfire-management-service: Done
                      wildfire-apps-service: Ready
    Work queue status:
                      sample analysis queued: 0
                      sample analysis running: 0
                      sample copy queued: 0
                      sample copy running: 0
    
    Diag report:
                      2.2.2.201: reported leader '2.2.2.205', age 0.
                      2.2.2.206: local node passed sanity check.
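
The Ready/ReadyLeader check in step 3 can also be sketched in code. The example below assumes the text of show cluster membership has been captured from a node and is laid out like the sample output above, with an "Application status:" header followed by indented "service: state" lines. The function names parse_application_status and is_ready are hypothetical helpers, not part of any WildFire tooling.

```python
# States that step 3 treats as acceptable for global-db and global-queue.
READY_STATES = {"Ready", "ReadyLeader"}


def parse_application_status(membership_output):
    """Return {service_name: state} from the 'Application status:' section.

    The section is assumed to end at a blank line or at the next header,
    i.e. a line ending with a colon such as 'Work queue status:'.
    """
    statuses = {}
    in_section = False
    for raw_line in membership_output.splitlines():
        line = raw_line.strip()
        if line == "Application status:":
            in_section = True
            continue
        if in_section:
            if not line or line.endswith(":"):
                break
            name, _, state = line.partition(":")
            statuses[name.strip()] = state.strip()
    return statuses


def is_ready(statuses, service):
    """True when the named service reports Ready or ReadyLeader."""
    return statuses.get(service) in READY_STATES


if __name__ == "__main__":
    # Excerpt modeled on the active controller example above.
    sample = """\
Application status:
                  global-queue-service: ReadyLeader
                  global-db-service: ReadyLeader
                  siggen-db: ReadyMaster
Work queue status:
                  sample analysis queued: 0
"""
    parsed = parse_application_status(sample)
    for service in ("global-db-service", "global-queue-service"):
        print(service, is_ready(parsed, service))
```

Because step 3 only requires that some controller node or worker server reports Ready/ReadyLeader, a script would run this check against the output from each such node and pass if at least one of them is ready for each service.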