Parquet Data Sources

Description of the process

This process involves creating a Parquet Data Source based on information sources and other views, and loading the resulting .parquet files into an Azure Data Lake Storage Gen2 folder.
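The load itself is performed by IFS Cloud, but the end result is simply a .parquet file placed in an ADLS Gen2 folder. The following Python snippet is a purely illustrative sketch of that end state, not IFS Cloud's implementation; the storage account, container, folder, and sample columns are all hypothetical:

```python
# Illustrative only: shows the kind of artifact a load produces,
# not IFS Cloud's internal implementation.
import pyarrow as pa
import pyarrow.parquet as pq
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Hypothetical sample data standing in for an information source / view.
table = pa.table({"ORDER_NO": [1001, 1002], "STATE": ["Released", "Closed"]})

# Serialize the table to Parquet in memory.
buf = pa.BufferOutputStream()
pq.write_table(table, buf)

# Hypothetical storage account, container, and folder names.
service = DataLakeServiceClient(
    account_url="https://mydatalake.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
file_client = (
    service.get_file_system_client("datalake")
    .get_file_client("parquet_data_sources/customer_order.parquet")
)
file_client.upload_data(buf.getvalue().to_pybytes(), overwrite=True)
```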

Add a new Parquet Data Source:

Parquet Data Sources can be created and saved through IFS Cloud Web. Depending on the requirement, the Source Origin, the Load Type, and the required columns can be selected.

Load Parquet Data Source:

Once the Parquet Data Source is created, it can be loaded into a target Data Lake destination (Self-Hosted Data Lake or IFS.ai Platform Data Lake). Loading moves data from Oracle into ADLS Gen2 by creating .parquet files in the defined folders. The load action can be triggered via an Analysis Model, a Workload Job Definition, or an explicit trigger to load a Parquet Data Source. The specific destination can be defined when creating a new Data Source; otherwise, it is determined by the trigger that initiated the load.
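Conceptually, a load reads rows from the underlying Oracle view and serializes them to a .parquet file. Below is a minimal sketch of that data movement, assuming hypothetical connection details, view name, and columns; the real load is executed by IFS Cloud, not by user code:

```python
# Illustrative sketch of the conceptual data movement (Oracle -> .parquet);
# the actual load is performed by IFS Cloud, not by user code.
import oracledb
import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical connection details and source view.
conn = oracledb.connect(user="appowner", password="secret", dsn="host/service")
with conn.cursor() as cur:
    cur.execute("SELECT order_no, state FROM customer_order_view")
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()

# Pivot the row tuples into columns and write a single .parquet file.
table = pa.table({c: [r[i] for r in rows] for i, c in enumerate(columns)})
pq.write_table(table, "customer_order.parquet")
```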

During source creation, a Load Type can be selected; during the Parquet Data Source refresh process, .parquet files are created according to the specified Load Type.
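As an illustration of how a Load Type can shape the files that get created, the sketch below assumes two hypothetical types: a full load that rewrites the complete data set, and an incremental load that writes only rows changed since a watermark. These type names and the LAST_MODIFIED column are assumptions for illustration, not confirmed product behavior:

```python
# Illustrative sketch of how a Load Type could shape the created files.
# "Full" and "Incremental" are assumptions for illustration only.
import datetime
import pyarrow as pa
import pyarrow.compute as pc
import pyarrow.parquet as pq

def write_full(table: pa.Table, folder: str) -> None:
    # Full load: rewrite the complete data set as one file.
    pq.write_table(table, f"{folder}/data.parquet")

def write_incremental(table: pa.Table, folder: str,
                      watermark: datetime.datetime) -> None:
    # Incremental load: keep only rows modified after the watermark
    # (hypothetical LAST_MODIFIED column) and write them as a new,
    # timestamped delta file alongside the earlier ones.
    changed = table.filter(
        pc.greater(table["LAST_MODIFIED"], pa.scalar(watermark))
    )
    stamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
    pq.write_table(changed, f"{folder}/delta_{stamp}.parquet")
```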

Edit an existing Parquet Data Source:

Once a Parquet Data Source is loaded, the following properties of it can be edited:

  1. Edit Max Age 
  2. Edit Columns 
  3. Edit Description 
  4. Edit Destination
  5. Edit Incremental Load Details   

Once editing is complete, an explicit load must be performed via IFS Cloud Web for the changes to be reflected in ADLS Gen2.

Import/Export Parquet Data Source: 

Loaded Parquet Data Sources can be exported from one environment and imported into another (for example, from DEV to UAT or PROD).