Indexing Details
Index Structure
All data that is delivered by simple REST endpoints is indexed in search indices. Queries and data delivery take place directly out of the search index (not from the Pimcore database).
For each Datahub configuration separate search indices will be created and updated.
Indexing of data takes place asynchronously via an update queue and the process queue command datahub:simple-rest:process-queue. This command needs to be executed on a regular basis, e.g. every 5 minutes.
Index mapping and queue filling take place automatically when creating and updating Datahub configurations. In addition, commands for index management are available (datahub:simple-rest:create-or-update-mapping, datahub:simple-rest:init-index).
Per endpoint, multiple indices are created - one for each dataobject class, one for dataobject folders, one for assets and one for asset folders.
For the asset metadata types exif, xmp and iptc, indexed dynamic objects are used. It might be necessary to turn off indexing for these objects in order to avoid indexing conflicts, or when the limit of maximum data fields is reached.
Indexing can be turned off in the bundle configuration in the indexing_options area. The data is then still stored in the index and delivered via the endpoint, but it is not indexed.
Tree Hierarchy Management
The indexing process tries to keep a valid folder structure in the index. Based on the workspace settings, a combined parent folder is calculated. This combined parent folder might be a sub folder of the parent folder in the Pimcore folder structure, and all element paths are rewritten to it.
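To make the idea of a combined parent folder more concrete, here is a minimal Python sketch. It is not the bundle's code: it assumes that the combined parent is the deepest folder shared by all workspace paths and that "rewritten" means expressing each element path relative to that folder. The function names and the example paths are hypothetical.

```python
import posixpath


def combined_parent(workspace_paths):
    """Return the deepest folder that contains every workspace path (assumed rule)."""
    return posixpath.commonpath(workspace_paths)


def rewrite_path(element_path, parent):
    """Express element_path relative to the combined parent folder (assumed rule)."""
    relative = posixpath.relpath(element_path, parent)
    return "/" if relative == "." else "/" + relative


# Hypothetical workspace configuration with two folders below /Products.
workspaces = ["/Products/Cars", "/Products/Bikes"]
parent = combined_parent(workspaces)

print(parent)                                                 # /Products
print(rewrite_path("/Products/Cars/Sedans/model-a", parent))  # /Cars/Sedans/model-a
```

In this reading, /Products is a sub folder of the Pimcore root and becomes the root of the tree as seen through the endpoint.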
Depending on the workspace and data schema settings, gaps in the folder structure can also occur. In this case, the indexing process creates virtual folders to fill these gaps.
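The gap filling can be pictured with a short Python sketch. It only illustrates the concept and is not the bundle's implementation: it assumes that a virtual folder is needed for every ancestor folder of an indexed element that is not itself part of the index. The function name and the example paths are hypothetical.

```python
import posixpath


def missing_folders(indexed_paths):
    """Return ancestor folders that are referenced by a path but not indexed."""
    known = set(indexed_paths)
    virtual = set()
    for path in indexed_paths:
        parent = posixpath.dirname(path)
        # Walk upwards until the root or an already handled folder is reached.
        while parent not in ("", "/") and parent not in known and parent not in virtual:
            virtual.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(virtual)


# Hypothetical index content: /Cars/Sedans and /Cars/Sedans/2024 are excluded
# by the workspace settings, and /Bikes is not part of the schema.
indexed = ["/Cars", "/Cars/Sedans/2024/model-a", "/Bikes/city-bike"]
print(missing_folders(indexed))
# ['/Bikes', '/Cars/Sedans', '/Cars/Sedans/2024']
```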
To update the whole index structure after changes, multiple runs of datahub:simple-rest:process-queue might be necessary (since additional items might be added to the queue during queue processing).
Indexing Options
Detailed indexing options can be configured via the Symfony configuration.
Global Options
Define some global options like automatic data type detection for dynamic objects.
pimcore_data_hub_simple_rest:
    indexing_options:
        global_options:
            # Enable numeric detection for dynamic objects (like embedded asset meta data, etc.)
            numeric_detection: false
            # Enable date detection for dynamic objects (like embedded asset meta data, etc.)
            date_detection: true
Assets
Define whether embedded metadata of assets should be indexed. If enabled, the metadata is indexed as dynamic objects.
pimcore_data_hub_simple_rest:
    indexing_options:
        assets:
            # Enable indexing for exif data
            enable_exif: true
            # Enable indexing for xmp data
            enable_xmp: true
            # Enable indexing for iptc data
            enable_iptc: true
Number of Shards
By default, the number of shards for indices is set to 1. If needed, the configuration allows changing this for all created indices via the default_number setting and per index via the index_specific setting.
# Configure number of shards for created indices
pimcore_data_hub_simple_rest:
    indexing_options:
        number_of_shards_config:
            # default number is picked if no index specific setting is set
            default_number: 3
            # Define number of shards for certain indices. Define index name (without -odd/-even postfix) as key, and number of shards as value.
            index_specific:
                enterprise_simple_rest__pt_rest__asset: 5
                enterprise_simple_rest__<ENDPOINTNAME>__<asset|DataObjectClass>: 3
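The lookup order described above (index specific value first, then default_number, then the built-in default of 1) can be sketched in Python as follows. This is an illustration of the documented behaviour, not the bundle's code. The function name shards_for and the index name ending in __car are hypothetical, and stripping the -odd/-even postfix before the lookup is an assumption based on the comment in the configuration.

```python
def shards_for(index_name, config):
    """Resolve the number of shards for a concrete index name."""
    # Keys in index_specific are given without the -odd/-even postfix.
    for postfix in ("-odd", "-even"):
        if index_name.endswith(postfix):
            index_name = index_name[: -len(postfix)]
            break
    specific = config.get("index_specific", {})
    # Fall back to default_number, and to 1 if nothing is configured.
    return specific.get(index_name, config.get("default_number", 1))


config = {
    "default_number": 3,
    "index_specific": {"enterprise_simple_rest__pt_rest__asset": 5},
}

print(shards_for("enterprise_simple_rest__pt_rest__asset-odd", config))  # 5
print(shards_for("enterprise_simple_rest__pt_rest__car-even", config))   # 3
print(shards_for("enterprise_simple_rest__pt_rest__car-even", {}))       # 1
```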