Understanding Real Time Indexing
Search indexes in PeopleSoft are deployed once and must then be scheduled on a recurrence so that the indexed data stays up to date. The recurrence schedule can be based on how frequently the underlying data changes.
For example, the PTPORTALREGISTRY index can be scheduled once a day because the navigation menu does not change often. However, HC_HR_JOB_DATA needs to be scheduled frequently because the data keeps changing throughout the day and those changes need to be reflected in the index.
This creates a "data lag window" during which data has been updated in PeopleSoft but is not yet reflected in the search index because the scheduled index build has not run.
Real Time Indexing addresses this problem. It eliminates stale data in Elasticsearch indexes and ensures that discrepancies do not exist between the data in the PeopleSoft database and the indexed data in Elasticsearch.
The real time indexing process ensures that data is updated on the search server as soon as an application transaction is saved. Depending on the volume of data waiting in the real time indexing queue, the update may appear in real time or near real time.
How Does Real Time Indexing Work?
PeopleSoft Search Framework supports real time indexing through SQL triggers and uses set-based processing. Any search definition with a source type of Query or Connected Query can be configured for real time indexing.
Note that Oracle does not deliver search definitions with real time indexing already enabled. You will need to enable search definitions for real time indexing according to your business needs.
Real time indexing uses database triggers as the starting point for communication with the search server. As data is inserted, updated, or deleted, the database trigger associated with the application record inserts a row in the real time indexing staging table.
The staging table acts as an interim data-holder by storing the keys of the transaction. Your Process Scheduler domain has a new process for real time indexing, PSRTISRV, which polls the staging table at regular intervals for any data to process.
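The trigger-and-poll flow described above can be illustrated with a small, self-contained sketch. This is not PeopleSoft code: it uses an in-memory SQLite database, and the table names (job_data, rti_staging), column names, and the poll_staging function are hypothetical stand-ins chosen only to show the pattern of triggers writing transaction keys to a staging table that a poller later drains.

```python
import sqlite3

def build_demo_db():
    """Create a hypothetical application table, a staging table, and triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE job_data (emplid TEXT PRIMARY KEY, deptid TEXT);

        -- Hypothetical staging table: stores only the keys of each transaction.
        CREATE TABLE rti_staging (
            seq        INTEGER PRIMARY KEY AUTOINCREMENT,
            search_def TEXT,
            emplid     TEXT,
            action     TEXT
        );

        CREATE TRIGGER job_data_ins AFTER INSERT ON job_data BEGIN
            INSERT INTO rti_staging (search_def, emplid, action)
            VALUES ('HC_HR_JOB_DATA', NEW.emplid, 'I');
        END;

        CREATE TRIGGER job_data_upd AFTER UPDATE ON job_data BEGIN
            INSERT INTO rti_staging (search_def, emplid, action)
            VALUES ('HC_HR_JOB_DATA', NEW.emplid, 'U');
        END;

        CREATE TRIGGER job_data_del AFTER DELETE ON job_data BEGIN
            INSERT INTO rti_staging (search_def, emplid, action)
            VALUES ('HC_HR_JOB_DATA', OLD.emplid, 'D');
        END;
    """)
    return conn

def poll_staging(conn):
    """One polling pass: read the pending keys in order, then clear them."""
    rows = conn.execute(
        "SELECT seq, search_def, emplid, action FROM rti_staging ORDER BY seq"
    ).fetchall()
    if rows:
        # Remove only the rows that were just read.
        conn.execute("DELETE FROM rti_staging WHERE seq <= ?", (rows[-1][0],))
    return [(search_def, emplid, action) for _, search_def, emplid, action in rows]

conn = build_demo_db()
conn.execute("INSERT INTO job_data VALUES ('E001', 'D10')")
conn.execute("UPDATE job_data SET deptid = 'D20' WHERE emplid = 'E001'")
conn.execute("DELETE FROM job_data WHERE emplid = 'E001'")

pending = poll_staging(conn)
print(pending)
```

In this sketch the application code never writes to the staging table directly; the triggers do it as a side effect of each save, which is why the poller can pick up changes without a scheduled full or incremental index build.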
Enabling Real Time Indexing
By default, real time indexing is enabled. If you would like to disable it, you can do so from the Process Scheduler Configure Domain option.
By default, only one PSRTISRV process is started, but you can change this in the psprcs.cfg file.
Real time indexing processes the transactions stored in the staging table as a set. The size of the set is configured on the Search Options page and can be chosen according to the resources available on the Process Scheduler server. Depending on the data available at any point in time, a set can contain as few as one transaction or as many as the maximum specified in the Real Time Indexing Set Size property.
Real time indexing then processes each set by search definition and begins the data retrieval process for each one. After the data is collected and formatted into a JSON structure, it is transferred to the search server using Direct Transfer.
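The set-based processing described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the take_set and build_payloads functions, the HYPOTHETICAL_SD search definition name, and the JSON layout are all invented for the example and do not represent the actual PeopleSoft implementation or the format sent by Direct Transfer.

```python
import json
from collections import defaultdict
from itertools import islice

def take_set(staged, set_size):
    """Return up to set_size staged transactions; fewer if less are available."""
    return list(islice(staged, set_size))

def build_payloads(transaction_set):
    """Group one set by search definition and format each group as JSON."""
    grouped = defaultdict(list)
    for search_def, key, action in transaction_set:
        grouped[search_def].append({"key": key, "action": action})
    # One JSON document per search definition (hypothetical layout).
    return {
        search_def: json.dumps({"search_definition": search_def, "documents": docs})
        for search_def, docs in grouped.items()
    }

# Hypothetical staged transactions: (search definition, key, action).
staged = [
    ("HC_HR_JOB_DATA", "E001", "U"),
    ("HC_HR_JOB_DATA", "E002", "I"),
    ("HYPOTHETICAL_SD", "K100", "U"),
    ("HC_HR_JOB_DATA", "E003", "D"),
]

current_set = take_set(staged, set_size=3)   # fourth row waits for the next set
payloads = build_payloads(current_set)

for search_def, body in payloads.items():
    print(search_def, body)
```

A larger set size means fewer round trips to the search server but more work per pass, which is why the source ties the setting to the resources available on the Process Scheduler server.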
Note: After running large-volume batch updates, you should periodically run the Application Engine (AE) maintenance program PTRTI_TRUNC so that the real time indexing staging tables are defragmented, which improves performance.