How to configure, monitor, and troubleshoot Threat Intelligence Exchange Server database replication

Technical Articles ID: KB85751
Last Modified: 2022-03-28 09:06:32 Etc/GMT

Environment

Threat Intelligence Exchange (TIE) Server 3.x, 2.x

Summary

This article describes how to configure, monitor the health of, and troubleshoot TIE Server PostgreSQL database replication.

NOTE: As of TIE Server version 2.1.0, the naming convention for Master and Slave operations changed to Primary and Secondary:

Master becomes Primary
Slave becomes Secondary

Previous versions of TIE Server retain the original Master/Slave designations.

Look at the current TIE Server primary and secondary PostgreSQL database configuration:

The configuration file for TIE Server PostgreSQL database replication is /data/tieserver_pg/postgresql.conf. This file contains replication configuration parameters such as wal_level, archive_mode, archive_command, max_wal_senders, wal_keep_segments, and hot_standby. For information about these configuration parameters, see the official PostgreSQL documentation.
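To review these parameters without scrolling through the whole file, a small helper can pull them out. The following is a minimal sketch, not part of TIE Server: the function name read_replication_settings and the sample values are hypothetical, and the parser handles only simple "name = value" lines.

```python
import re

# Replication parameters named in this article.
REPLICATION_KEYS = (
    "wal_level", "archive_mode", "archive_command",
    "max_wal_senders", "wal_keep_segments", "hot_standby",
)

def read_replication_settings(conf_text, keys=REPLICATION_KEYS):
    """Return {name: value} for the requested parameters found in conf_text."""
    settings = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        # Matches: name = value   or   name = 'quoted value'   (the "=" is optional)
        match = re.match(r"(\w+)\s*=?\s*('(?:[^']|'')*'|[^#\s]+)", line)
        if match and match.group(1) in keys:
            # A later occurrence overwrites an earlier one, as PostgreSQL does.
            settings[match.group(1)] = match.group(2).strip("'")
    return settings

# Illustrative values only; these are NOT the TIE Server defaults.
SAMPLE = """
# replication section
wal_level = hot_standby      # comment after the value
max_wal_senders = 5
archive_command = 'cp %p /tmp/archive/%f'
shared_buffers = 128MB
"""

print(read_replication_settings(SAMPLE))
```

On a TIE Server, the same function could be given the contents of /data/tieserver_pg/postgresql.conf, read with the permissions of a user who can open that file.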

We recommend that you do not change this configuration. It allows the TIE Server PostgreSQL database to use native streaming replication in a primary/secondary configuration. The primary/secondary configuration might have hot-standby capabilities that enable scalability for read operations. If you need to address unusual replication delays, you can tune the following parameters:

max_wal_senders: To be set in the primary instance. Specifies the maximum number of possible concurrent connections from the secondary server to the primary server. Increasing this value might speed up the synchronization process. It can't be set higher than the overall max_connections parameter configuration.

wal_keep_segments: To be set in the primary instance. Specifies the minimum number of past log file segments kept in the pg_xlog directory. Increasing this value helps to prevent the primary server from removing a WAL segment that the secondary server still needs, in which case the replication connection closes. This situation can occur when the Data Exchange Layer (DXL) architecture has nodes with slow connectivity and the PostgreSQL replication process is slow as a result.

For more information about these configuration parameters, see the official PostgreSQL documentation.
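To get a feel for what a given wal_keep_segments value means, it helps to translate it into bytes. The following is a rough sketch that assumes the PostgreSQL default WAL segment size of 16 MB. The function names and the WAL generation rate in the usage lines are hypothetical; measure your own environment before choosing a value.

```python
import math

# Assumption: PostgreSQL's default WAL segment size (16 MB).
WAL_SEGMENT_BYTES = 16 * 1024 * 1024

def retained_wal_bytes(wal_keep_segments, segment_bytes=WAL_SEGMENT_BYTES):
    """Minimum amount of past WAL the primary keeps for lagging secondaries."""
    return wal_keep_segments * segment_bytes

def segments_for_outage(wal_bytes_per_second, outage_seconds,
                        segment_bytes=WAL_SEGMENT_BYTES):
    """Segments needed so a secondary can be unreachable for outage_seconds."""
    return math.ceil(wal_bytes_per_second * outage_seconds / segment_bytes)

# 32 segments keep at least 512 MB of WAL on the primary.
print(retained_wal_bytes(32))

# Hypothetical rate: 1 MB of WAL per second, secondary unreachable for 10 minutes.
print(segments_for_outage(1024 * 1024, 600))
```

A larger value uses more disk space in the pg_xlog directory on the primary server, so the result is a trade-off rather than a target.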

After changing these values, you must perform further monitoring and testing to assess the overall impact. Besides the link latency in each environment, factors that affect replication include:

Number of brokers
Number of endpoints
Mix of files running on each endpoint image

Monitor TIE Server PostgreSQL database replication:
Log on as a "Super User" and execute the replication-monitoring.sh script that's available in the home directory. The script is intended to help administrators perform the following:

Understand the status of the replication in a secondary server.
Recover the secondary server if the replication process fails by reconfiguring it.

Usage: replication-monitoring.sh <Options>

Options:

-c — Command to execute:
- monitor — Shows information about the replication status (see the Monitor information table for an explanation of the possible output fields).
- reset — Reconfigures the replication secondary server.
- autoreset — Reconfigures the replication secondary server when the gap between the primary server and secondary server reaches a threshold (1-megabyte difference by default).
-t — Threshold in bytes to determine when the replication secondary server is reconfigured. By default, this value is 1048576 bytes (1 megabyte).
--color — Include colors in the output. It makes the output easier to read.
--help — Show help.

Examples:

replication-monitoring.sh -c monitor — Returns the current secondary server replication status.
replication-monitoring.sh -c monitor --color — Returns the current secondary server replication status using colors to make it easier to read.
replication-monitoring.sh -c reset — Reconfigures the secondary server. It creates a fresh database backup from the primary server.
replication-monitoring.sh -c autoreset -t 2097152 — Reconfigures the secondary server when the secondary server replication falls behind by 2 MB compared with the primary server. This validation is made only once when the script is run. (It doesn't set a trigger if the threshold is reached in the future.) So, to validate again, you must run the command again.
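The threshold arithmetic behind the autoreset example can be sketched as follows. This is an illustration of the decision, not the script's actual code: the function names are hypothetical, and treating "reaches a threshold" as greater than or equal to is an assumption.

```python
# Documented default threshold for -t: 1048576 bytes (1 megabyte).
DEFAULT_THRESHOLD_BYTES = 1048576

def megabytes_to_bytes(megabytes):
    """Convert megabytes to the byte value expected by the -t option."""
    return megabytes * 1024 * 1024

def needs_reset(gap_bytes, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """True if the secondary has fallen behind by at least the threshold.

    Assumption: the comparison is >=; the script may use a strict >.
    """
    return gap_bytes >= threshold_bytes

print(megabytes_to_bytes(2))                            # value to pass to -t for 2 MB
print(needs_reset(3000000, megabytes_to_bytes(2)))      # gap above the 2 MB threshold
print(needs_reset(500000))                              # gap below the 1 MB default
```

Because the script evaluates the gap only once per run, ongoing checks would require something external, such as a scheduled job, to run the command repeatedly.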

Monitor information:

Field	Description
Is Replication in progress	Returns True if the server is set as a secondary server and the replication is properly configured.
Is Replication paused	It's possible to pause the replication. It returns True if the replication is paused.
Last xLog receive location	Last replication log received in the secondary server. Logs can be received, but not replayed yet.
Current xLog location in server	Current replication log on the primary server. To get this value, the script executes a remote query to the primary server.
Gap between Server and Receive	The difference in bytes between the primary server and last received log on the secondary server. It helps to determine by how much the replication process is behind.
Last xLog replay location	Last replication log replayed in the secondary server.
Gap between Server and Replay	The difference in bytes between the current log location on the primary server and the last log replayed on the secondary server. It's probably the most important value to determine the health of the replication process.
Last xLog replay time	The time that the last log was replayed in the secondary server. If there's no activity in the primary server, no replication activity occurs in the secondary server and the value remains the same. An unchanged value doesn't mean that replication has failed.
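The two gap fields are differences between xLog locations. Assuming the locations use the standard PostgreSQL format of two hexadecimal numbers separated by a slash (for example, 0/3000060), the gap in bytes can be worked out as in the sketch below. The function names are hypothetical, and the script itself might compute the value differently, for example with a PostgreSQL built-in function.

```python
def xlog_location_to_bytes(location):
    """Convert an xLog location such as '0/3000060' to an absolute byte position."""
    high, low = location.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32 bits.
    return (int(high, 16) << 32) | int(low, 16)

def replication_gap(primary_location, secondary_location):
    """Bytes by which the secondary location trails the primary location."""
    return (xlog_location_to_bytes(primary_location)
            - xlog_location_to_bytes(secondary_location))

# Primary is at 0/3000060 and the secondary has replayed up to 0/3000000.
print(replication_gap("0/3000060", "0/3000000"))

# The gap is still computed correctly when the upper part rolls over.
print(replication_gap("1/0", "0/FFFFFF00"))
```

A result of 0 means the secondary server has caught up with the location that was read from the primary server.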

Also, a new Health Status section is added to reflect the Secondary's database replication status. To view the health status, go to the Menu, Server Settings, TIE Server Topology Management page.

Troubleshooting TIE Server PostgreSQL database replication:

How do I reconfigure replication in a secondary server?
Run the following command: /usr/sbin/replication-monitoring.sh -c reset

How do I check whether the replication is working?
Run the following command: /usr/sbin/replication-monitoring.sh -c monitor. Then, validate whether the value of Gap between Server and Replay is 0 bytes or close to 0.

What do the following log entries mean?

FATAL: Could not connect to the primary server: FATAL: No pg_hba.conf entry for replication connection from host "xx.xx.xx.xx", user "rep", SSL on

The replication user configuration is incorrect. Validate the permissions in the file /data/tieserver_pg/pg_hba.conf on the primary server. There must be an entry similar to the following:

hostssl replication rep xx.xx.xx.xx/xx cert clientcert=1 map=rep

Here, xx.xx.xx.xx/xx must be the IP address and bitmask of the server to which the primary server is replicating.

Also, there must be an entry similar to the following in the file /data/tieserver_pg/pg_ident.conf:

rep xx.xx.xx.xx rep

Here, xx.xx.xx.xx must be the IP address of the server to which the primary server is replicating.

FATAL: Could not receive data from WAL stream:

There are connectivity issues between the secondary server and primary server.

FATAL: Could not receive data from WAL stream: ERROR: Requested WAL segment xxxxxxxxxxxxxxxxxxxxxxxx has already been removed

The primary server might have removed a WAL segment still needed by the secondary server. To fix this issue, run the command /usr/sbin/replication-monitoring.sh -c reset to reconfigure the secondary server and start with a fresh backup.
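For the first log entry above (no pg_hba.conf entry for the replication connection), a quick way to confirm whether an existing entry covers the secondary server is to test its IP address against the entry's address and bitmask. The following is a simplified sketch: the function name and the sample addresses are hypothetical, and it handles only entries in the form shown in this article (not keywords such as all, host names, or a separate netmask column).

```python
import ipaddress

def hba_line_allows_replication(hba_line, secondary_ip, user="rep"):
    """Check whether one pg_hba.conf line covers a replication connection.

    Expects the form: hostssl replication <user> <address>/<bits> <method> ...
    """
    fields = hba_line.split()
    if len(fields) < 5 or fields[0].startswith("#"):
        return False  # too short to be a full entry, or commented out
    conn_type, database, hba_user, address = fields[:4]
    if conn_type != "hostssl" or database != "replication" or hba_user != user:
        return False
    try:
        network = ipaddress.ip_network(address, strict=False)
    except ValueError:
        return False  # address is not in IP/bitmask form
    return ipaddress.ip_address(secondary_ip) in network

# Hypothetical addresses for illustration.
entry = "hostssl replication rep 10.0.0.0/24 cert clientcert=1 map=rep"
print(hba_line_allows_replication(entry, "10.0.0.15"))  # address inside the range
print(hba_line_allows_replication(entry, "10.0.1.15"))  # address outside the range
```

If no line in /data/tieserver_pg/pg_hba.conf on the primary server passes this check for the secondary server's address, the FATAL message about a missing pg_hba.conf entry is the expected result.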

Attachment

replication-monitoring.sh.gz
