Class IngestDataSourceJob

  • All Implemented Interfaces:
    java.lang.Runnable

    public class IngestDataSourceJob
    extends java.lang.Object
    implements java.lang.Runnable
    Ingests a DataSource.
    • Field Detail

      • DEFAULT_DATASOURCE_FORMATTER

        public static final java.util.function.Function<org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>,org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>> DEFAULT_DATASOURCE_FORMATTER
      • formatDatasetForDataSource

        public static java.util.function.Function<org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>,org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>> formatDatasetForDataSource
        Transform applied to the DataSource DataFrame that converts Timestamp and Date type columns to String. Timestamp and Date type columns are not supported in Phoenix at this time.
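The schema-level effect of such a formatter can be sketched without Spark. In this minimal sketch the schema is modelled as a map from column name to type name, which is a simplification and not Spark's `StructType`; the class name `FormatterSketch`, the constant `TEMPORAL_TO_STRING`, and the column names are all hypothetical and not part of the documented API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, Spark-free model of a DataSource formatter.
final class FormatterSketch {

    // Assumption: a schema is modelled as column name -> type name.
    // Columns typed "timestamp" or "date" are re-typed as "string";
    // every other column is passed through unchanged.
    static final Function<Map<String, String>, Map<String, String>> TEMPORAL_TO_STRING = schema -> {
        Map<String, String> formatted = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : schema.entrySet()) {
            String type = column.getValue();
            boolean temporal = type.equals("timestamp") || type.equals("date");
            formatted.put(column.getKey(), temporal ? "string" : type);
        }
        return formatted;
    };

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("id", "long");
        schema.put("created_at", "timestamp");
        schema.put("birthday", "date");
        // prints {id=long, created_at=string, birthday=string}
        System.out.println(TEMPORAL_TO_STRING.apply(schema));
    }
}
```

With Spark itself, the same idea is usually expressed by iterating over `Dataset.schema().fields()` and replacing each matching column through `withColumn` with `col(name).cast(DataTypes.StringType)`. Whether the library's formatter is implemented that way is not stated here.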
      • performFeatureCatalogCalculations

        public java.util.function.Supplier<java.lang.Boolean> performFeatureCatalogCalculations
        Whether to perform FeatureCatalog calculations. This is an expensive operation, and the supplier may be set to return false depending on the dataset.
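Because `performFeatureCatalogCalculations` is a public `Supplier<Boolean>` field, the decision can be deferred until the job runs. The following is a minimal sketch using a hypothetical stand-in class (`FeatureCatalogToggleSketch`) rather than the real `IngestDataSourceJob`, so that it runs without Spark or a `CortexContext`. The default of `true` and the way `run()` consults the supplier are assumptions, not documented behavior.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for IngestDataSourceJob: it mirrors only the public
// Supplier<Boolean> field so the toggle can be shown in isolation.
final class FeatureCatalogToggleSketch implements Runnable {

    // Assumed default; the real default is not stated in the documentation.
    public Supplier<Boolean> performFeatureCatalogCalculations = () -> Boolean.TRUE;

    // Records what the last run decided, for demonstration only.
    boolean calculationsRan = false;

    @Override
    public void run() {
        // The supplier is evaluated when the job runs, so the answer can
        // depend on state that is only known at execution time.
        calculationsRan = Boolean.TRUE.equals(performFeatureCatalogCalculations.get());
    }

    // Helper: build a job, override the supplier, run it, report the outcome.
    static boolean runWith(Supplier<Boolean> flag) {
        FeatureCatalogToggleSketch job = new FeatureCatalogToggleSketch();
        job.performFeatureCatalogCalculations = flag;
        job.run();
        return job.calculationsRan;
    }

    public static void main(String[] args) {
        System.out.println(runWith(() -> false)); // prints false
        System.out.println(runWith(() -> true));  // prints true
    }
}
```

On the real class the equivalent assignment would presumably be `job.performFeatureCatalogCalculations = () -> false;` made before `run()` is called.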
    • Constructor Detail

      • IngestDataSourceJob

        public IngestDataSourceJob(java.lang.String project,
                                   java.lang.String sourceName,
                                   CortexContext cortexContext)
        Creates a job that ingests the named DataSource in the given project.
        Parameters:
        project - the project
        sourceName - the DataSource name
        cortexContext - the context
    • Method Detail

      • run

        public void run()
        Specified by:
        run in interface java.lang.Runnable
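Since the class implements `java.lang.Runnable`, an instance can be called directly via `run()` or handed to any standard executor. The sketch below shows the second option using only `java.util.concurrent`. The class `RunnableJobSketch` and its helper `submitAndWait` are hypothetical, and a lambda stands in for a real `IngestDataSourceJob`, whose construction needs a `CortexContext` not available here.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper showing how any Runnable job can be run on an executor.
final class RunnableJobSketch {

    // Submits the job to a single-thread executor and blocks until run() returns.
    static boolean submitAndWait(Runnable job) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> done = executor.submit(job);
            done.get(); // a failure inside run() surfaces as ExecutionException
            return done.isDone();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("Interrupted while waiting for the job", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Job failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        AtomicBoolean ran = new AtomicBoolean(false);
        // Placeholder for: new IngestDataSourceJob(project, sourceName, cortexContext)
        Runnable job = () -> ran.set(true);
        submitAndWait(job);
        System.out.println(ran.get()); // prints true
    }
}
```

Waiting on the `Future` matters here because `run()` returns `void`: an exception on the worker thread is only visible to the caller through `get()`.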