Monday, June 17, 2013

What is the Platform Search Application in BI 4.0 and how does Search Application Indexing work?


Environment

SAP BusinessObjects Business Intelligence Platform 4.0

Resolution

Platform Search Application Indexing is a continuous process that involves the following sequential tasks:

1. A crawling mechanism polls the CMS repository and identifies objects that have been published, modified, or deleted. Crawling can run in two modes: continuous and scheduled.
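
For illustration only, here is a minimal sketch of how polling the CMS for recently changed objects could look with the BI platform Java SDK; the credentials, query, and timestamp handling are assumptions and this is not the crawler's internal code.

import com.crystaldecisions.sdk.framework.CrystalEnterprise;
import com.crystaldecisions.sdk.framework.IEnterpriseSession;
import com.crystaldecisions.sdk.occa.infostore.IInfoObject;
import com.crystaldecisions.sdk.occa.infostore.IInfoObjects;
import com.crystaldecisions.sdk.occa.infostore.IInfoStore;

public class CmsCrawlSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical logon details; replace with real CMS name and credentials.
        IEnterpriseSession session = CrystalEnterprise.getSessionMgr()
                .logon("Administrator", "password", "cmsname:6400", "secEnterprise");
        IInfoStore infoStore = (IInfoStore) session.getService("InfoStore");

        // Timestamp of the previous poll, written as a CMS query date literal (illustrative).
        String lastPoll = "2013.06.17.00.00.00";
        IInfoObjects changed = (IInfoObjects) infoStore.query(
                "SELECT SI_ID, SI_CUID, SI_NAME, SI_KIND, SI_UPDATE_TS "
              + "FROM CI_INFOOBJECTS WHERE SI_UPDATE_TS >= '" + lastPoll + "'");

        // Each hit is a candidate for extraction and (re)indexing.
        for (int i = 0; i < changed.size(); i++) {
            IInfoObject obj = (IInfoObject) changed.get(i);
            System.out.println(obj.getID() + " " + obj.getKind() + " " + obj.getTitle());
        }
        session.logoff();
    }
}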

2. An extracting mechanism calls an extractor based on the document type; there is a dedicated extractor for each document type available in the repository (a dispatch sketch follows the list below). The extractors are:

  • Metadata Extractor
  • Crystal Reports Extractor
  • Web Intelligence Extractor
  • Universe Extractor
  • BI Workspace Extractor
  • Agnostic Extractor (Microsoft Word/Excel/PPT, Text, RTF, PDF)
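
Conceptually, extraction is a dispatch on the object's kind (SI_KIND). The Java sketch below only illustrates that idea; the interface and class names are invented for the example and are not the platform's internal API.

import java.util.HashMap;
import java.util.Map;

// Hypothetical extractor interface; not the platform's internal API.
interface Extractor {
    String extractText(Object infoObject);
}

class ExtractorRegistry {
    private final Map<String, Extractor> byKind = new HashMap<>();

    ExtractorRegistry() {
        // One dedicated extractor per document kind (illustrative kinds and content).
        byKind.put("CrystalReport", o -> "crystal report content...");
        byKind.put("Webi", o -> "web intelligence content...");
        byKind.put("Universe", o -> "universe metadata...");
        byKind.put("Pdf", o -> "pdf text...");
    }

    Extractor forKind(String kind) {
        // Fall back to a metadata-only extractor for kinds without a dedicated extractor.
        return byKind.getOrDefault(kind, o -> "metadata only");
    }
}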

3. The extracted content is stored in the local file system (<BI 4.0 Install folder>\Data\PlatformSearchData\workplace\Temporary Surrogate Files) in an XML format; these files are called surrogate files.
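
The exact surrogate file schema is internal to the platform; the snippet below is only a hypothetical illustration of writing extracted content to an XML surrogate file in that folder.

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SurrogateWriterSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical surrogate layout; the real schema is internal to the platform.
        String surrogate =
              "<surrogate>\n"
            + "  <id>12345</id>\n"
            + "  <cuid>SampleCuid123</cuid>\n"
            + "  <kind>Webi</kind>\n"
            + "  <name>Sales Overview</name>\n"
            + "  <content>extracted report text goes here</content>\n"
            + "</surrogate>\n";

        // Replace with <BI 4.0 Install folder>\Data\PlatformSearchData\workplace\Temporary Surrogate Files
        Path dir = Paths.get("PlatformSearchData", "workplace", "Temporary Surrogate Files");
        Files.createDirectories(dir);
        Files.write(dir.resolve("12345.xml"), surrogate.getBytes(StandardCharsets.UTF_8));
    }
}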

4. These surrogate files are uploaded to the Input File Repository Server (FRS) and removed from the local file system.

5. The content of the surrogate files is read and indexed by the index engine into a temporary location called the Delta Indexing Area (<BI 4.0 Install folder>\Data\PlatformSearchData\workplace\DeltaIndexes).
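
The index engine is Lucene (note the "Lucene Index Engine" folder in step 7). The following sketch shows delta indexing in the spirit of this step using the modern Apache Lucene API; the field names and analyzer choice are assumptions, not the platform's internal code.

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class DeltaIndexSketch {
    public static void main(String[] args) throws Exception {
        // Replace with <BI 4.0 Install folder>\Data\PlatformSearchData\workplace\DeltaIndexes
        FSDirectory deltaDir = FSDirectory.open(Paths.get("PlatformSearchData/workplace/DeltaIndexes"));
        IndexWriter writer = new IndexWriter(deltaDir, new IndexWriterConfig(new StandardAnalyzer()));

        // One Lucene document per surrogate file; field names are illustrative.
        Document doc = new Document();
        doc.add(new StringField("cuid", "SampleCuid123", Store.YES));
        doc.add(new StringField("kind", "Webi", Store.YES));
        doc.add(new TextField("name", "Sales Overview", Store.YES));
        doc.add(new TextField("content", "extracted report text goes here", Store.NO));
        writer.addDocument(doc);

        writer.close();
    }
}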

6. The delta index is uploaded to the Input FRS and deleted from the local file system.

7. The delta index is read and merged into the Master Index Area (<BI 4.0 Install folder>\Data\PlatformSearchData\Lucene Index Engine\index), which is the final index area in the local file system.
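
Merging a delta index into the master index corresponds to Lucene's addIndexes operation. The sketch below shows the idea with the modern Lucene API, using the folder layout described above; it is an assumed illustration, not the platform's internal merge code.

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class MergeDeltaSketch {
    public static void main(String[] args) throws Exception {
        // Master index: <BI 4.0 Install folder>\Data\PlatformSearchData\Lucene Index Engine\index
        FSDirectory masterDir = FSDirectory.open(Paths.get("PlatformSearchData/Lucene Index Engine/index"));
        // Delta index downloaded from the Input FRS to a local working folder.
        FSDirectory deltaDir = FSDirectory.open(Paths.get("PlatformSearchData/workplace/DeltaIndexes"));

        IndexWriter master = new IndexWriter(masterDir, new IndexWriterConfig(new StandardAnalyzer()));
        master.addIndexes(deltaDir);   // merge the delta segments into the master index
        master.commit();
        master.close();
    }
}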

8. After the indexing task completes, the following are generated:

• Content store: The content store contains information such as id, cuid, name, kind, and instance, extracted from the master index into a format that can be read quickly. This speeds up search.

Each Adaptive Processing Server creates its own content store, e.g.
<BI 4.0 Install folder>\Data\PlatformSearchData\workplace\<NodeName>.AdaptiveProcessingServer\ContentStores
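
As a rough illustration of pulling the lightweight id/cuid/name/kind fields out of the master index (the real content store format is internal), here is a sketch assuming the Lucene field names used in the earlier examples:

import java.nio.file.Paths;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.FSDirectory;

public class ContentStoreSketch {
    public static void main(String[] args) throws Exception {
        FSDirectory masterDir = FSDirectory.open(Paths.get("PlatformSearchData/Lucene Index Engine/index"));
        try (DirectoryReader reader = DirectoryReader.open(masterDir)) {
            // Dump the lightweight fields that a content store would cache for fast lookups.
            for (int i = 0; i < reader.maxDoc(); i++) {
                Document doc = reader.document(i);
                System.out.println(doc.get("cuid") + "\t" + doc.get("kind") + "\t" + doc.get("name"));
            }
        }
    }
}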

• Speller/Suggestions: Similar words are generated from the master index data and indexed. The speller folder is created under the “Lucene Index Engine” folder (<BI 4.0 Install folder>\Data\PlatformSearchData\Lucene Index Engine\speller).
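
Lucene ships a SpellChecker that builds a suggestion index from the terms of an existing index, which matches the behaviour described here. The sketch below assumes a "content" field and the folder layout above; it is an illustration, not the platform's internal speller code.

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.FSDirectory;

public class SpellerSketch {
    public static void main(String[] args) throws Exception {
        FSDirectory masterDir  = FSDirectory.open(Paths.get("PlatformSearchData/Lucene Index Engine/index"));
        FSDirectory spellerDir = FSDirectory.open(Paths.get("PlatformSearchData/Lucene Index Engine/speller"));

        try (DirectoryReader reader = DirectoryReader.open(masterDir);
             SpellChecker speller = new SpellChecker(spellerDir)) {
            // Build the suggestion index from the terms of the (assumed) "content" field.
            speller.indexDictionary(new LuceneDictionary(reader, "content"),
                                    new IndexWriterConfig(new StandardAnalyzer()), true);
            // Ask for words similar to a misspelled search term.
            String[] suggestions = speller.suggestSimilar("salse", 5);
            for (String s : suggestions) {
                System.out.println(s);
            }
        }
    }
}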

9. The delta index files that were uploaded to the Input FRS are removed from the Input FRS after a few hours.

Remark:

BI 4.0 Platform Search Application supports indexing in a clustered environment.

  • Search and indexing are active on all nodes, but only one node merges the delta index into the master index. The master index is then used by all nodes for search operations.
  • All nodes share the same master index; however, each node builds its own content store.
  • The master index location must be shared by all nodes; it is not necessary to share the persistent data location and the non-persistent data location.
Hope you find this useful.
Cheers,
Umang Patel
+919979084870
SAP BO BI Solution Architect/Consultant
