Caveats for a Collector Group with Multiple Log Collectors

You can Configure a Collector Group with multiple Log Collectors (up to 16) to ensure log redundancy, increase the log retention period, and accommodate logging rates that exceed the capacity of a single Log Collector (see Panorama Models for capacity information). In any single Collector Group, all the Log Collectors must run on the same Panorama model: all M-700 appliances, all M-600 appliances, all M-500 appliances, all M-300 appliances, all M-200 appliances, or all Panorama virtual appliances. For example, if a single managed firewall generates 48TB of logs, the Collector Group that receives those logs requires at least three Log Collectors that are M-300 appliances, or a single Log Collector that is an M-700 appliance or a similarly resourced Panorama virtual appliance.
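As a rough sketch of that sizing arithmetic, the Python snippet below computes how many Log Collectors of a given model are needed for a target log volume. The per-model usable-capacity figures are illustrative assumptions inferred from the example above, not official specifications; always consult Panorama Models for actual capacities.

    import math

    # Illustrative usable log storage per model (TB). These figures are
    # assumptions for the sake of the example; consult Panorama Models
    # for the real capacity of each appliance.
    USABLE_LOG_STORAGE_TB = {
        "M-300": 16,
        "M-700": 48,
    }

    def log_collectors_needed(total_logs_tb, model):
        """How many Log Collectors of this model hold the stated log volume
        (ignoring redundancy overhead)."""
        return math.ceil(total_logs_tb / USABLE_LOG_STORAGE_TB[model])

    print(log_collectors_needed(48, "M-300"))  # 3
    print(log_collectors_needed(48, "M-700"))  # 1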
A Collector Group with multiple Log Collectors uses the available storage space as one logical unit and uniformly distributes the logs across all its Log Collectors. The log distribution is based on the disk capacity of the Log Collectors (see Panorama Models) and a hash algorithm that dynamically decides which Log Collector owns the logs and writes to disk. Although Panorama uses a preference list to prioritize the list of Log Collectors to which a managed firewall can forward logs, Panorama does not necessarily write the logs to the first Log Collector specified in the preference list. For example, consider the following preference list:
Managed Firewall        Log Forwarding Preference List Defined in a Collector Group
FW1                     L1, L2, L3
FW2                     L4, L5, L6
Using this list, FW1 forwards logs to L1 as long as that primary Log Collector is available. However, based on the hash algorithm, Panorama might choose L2 as the owner that writes the logs to its disks. If L2 becomes inaccessible or suffers a chassis failure, FW1 is unaware of the failure because it can still connect to L1.
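A minimal sketch of this behavior follows. It is not the actual PAN-OS algorithm (which also weights the hash by disk capacity); the Log Collector names and preference lists come from the example, while the log ID and hash choice are assumed purely for illustration. The point is that the Log Collector that receives a log and the Log Collector that owns it on disk are chosen independently.

    import hashlib

    COLLECTOR_GROUP = ["L1", "L2", "L3", "L4", "L5", "L6"]
    PREFERENCE_LIST = {
        "FW1": ["L1", "L2", "L3"],
        "FW2": ["L4", "L5", "L6"],
    }

    def receiving_collector(firewall, available):
        # The firewall forwards to the first reachable entry in its preference list.
        for collector in PREFERENCE_LIST[firewall]:
            if collector in available:
                return collector
        return None

    def owning_collector(log_id):
        # A hash across the whole Collector Group decides who writes the log to disk.
        digest = int(hashlib.sha256(log_id.encode()).hexdigest(), 16)
        return COLLECTOR_GROUP[digest % len(COLLECTOR_GROUP)]

    available = set(COLLECTOR_GROUP)
    print(receiving_collector("FW1", available))  # L1
    print(owning_collector("session-12345"))      # may well not be L1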
Example - Typical Log Collector Group Setup
In the case where a Collector Group has only one Log Collector and that Log Collector fails, the firewall stores the logs on its local HDD/SSD (the available storage space varies by firewall model). As soon as connectivity to the Log Collector is restored, the firewall resumes forwarding logs where it left off before the failure occurred.
In the case of a Collector Group with multiple Log Collectors, the firewall does not buffer logs to its local storage if only one Log Collector is down. Instead, the logs that would be written to the down Log Collector are redistributed to the next available Log Collector in the preference list. In the example scenario where L2 is down, FW1 continues sending logs to L1, and L1 stores the log data that would otherwise go to L2. Once L2 is back up, L1 stops storing log data intended for L2 and distribution resumes as expected.
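Continuing the illustrative sketch above (same assumed preference lists, not PAN-OS code), the fallback can be expressed like this:

    PREFERENCE_LIST = {
        "FW1": ["L1", "L2", "L3"],
        "FW2": ["L4", "L5", "L6"],
    }

    def write_target(firewall, hashed_owner, up):
        # If the hashed owner is up, it writes the log; otherwise the log is
        # redistributed to the next available Log Collector in the firewall's
        # preference list rather than buffered on the firewall itself.
        if hashed_owner in up:
            return hashed_owner
        for collector in PREFERENCE_LIST[firewall]:
            if collector in up:
                return collector
        return None  # entire preference list down: only then does the firewall buffer locally

    up = {"L1", "L3", "L4", "L5", "L6"}      # L2 has failed
    print(write_target("FW1", "L2", up))     # L1 absorbs the logs meant for L2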
Palo Alto Networks recommends adding at least three Log Collectors to a Collector Group to avoid split brain and log ingestion issues should one Log Collector go down. See the changes to default Collector Group behavior for more information.
A Collector Group with two Log Collectors is supported, but the Collector Group becomes non-operational if one of the Log Collectors goes down.
Example - When a Log Collector Fails
Palo Alto Networks recommends the following mitigations if using multiple Log Collectors in a Collector Group:
  • Enable log redundancy when you Configure a Collector Group. This ensures that no logs are lost if any one Log Collector in the Collector Group becomes unavailable. Each log will have two copies and each copy will reside on a different Log Collector. Log redundancy is available only if each Log Collector has the same number of logging disks.
    Because enabling redundancy creates more logs, this configuration requires more storage capacity. When a Collector Group runs out of space, it deletes older logs.
    Enabling redundancy doubles the log processing traffic in a Collector Group, which reduces its maximum logging rate by half, as each Log Collector must distribute a copy of each log it receives (illustrated in the sketch after this list).
  • Obtain an On-Site-Spare (OSS) to enable prompt replacement if a Log Collector failure occurs.
  • In addition to forwarding logs to Panorama, configure forwarding to an external service as backup storage. The external service can be a syslog server, email server, SNMP trap server, or HTTP server.
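The following sketch illustrates the redundancy property described in the first bullet: every log gets two copies, and the copies always land on different Log Collectors. The placement logic is purely illustrative, not the PAN-OS implementation.

    COLLECTORS = ["L1", "L2", "L3"]

    def redundant_placement(primary):
        # Place the second copy on any Log Collector other than the primary owner.
        others = [c for c in COLLECTORS if c != primary]
        return primary, others[0]

    for primary in COLLECTORS:
        first, second = redundant_placement(primary)
        assert first != second
        print(f"log owned by {first}: redundant copy on {second}")

    # Two copies per log means every Log Collector distributes twice as much
    # data, which is why the maximum logging rate of the group is halved.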