Interface FeatureReportCalculator
-
- All Known Implementing Classes:
DefaultFeatureReportCalculator
public interface FeatureReportCalculator
Interface for computing Feature information (see FeatureReport) from a Cortex DataSource and source data.
-
-
Method Summary
FeatureReport computeDataSourceFeatures(java.lang.String project, java.lang.String sourceName, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sourceDf, java.lang.Boolean performCalculations)
Computes the Features associated with a DataSource from the given Dataset.
FeatureReport computeFeatureReport(java.lang.String project, java.lang.String sourceName, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sampleDf, boolean performCalculations, java.util.List<org.apache.spark.sql.Row> previewCollection, java.lang.String profileGroup)
Computes the Features associated with a given DataSource and ProfileGroup from a sample of the data.
FeatureReport computePreviewFeatures(java.lang.String project, java.lang.String sourceName, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sourceDf)
Computes the Features from an explicit sample of the DataSource.
FeatureReport computeProfileFeatures(java.lang.String project, java.lang.String sourceName, java.lang.String profileGroup, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sourceDf, java.lang.Boolean performCalculations)
Computes the Features associated with a DataSource and a specific ProfileGroup.
-
-
-
Method Detail
-
computeFeatureReport
FeatureReport computeFeatureReport(java.lang.String project, java.lang.String sourceName, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sampleDf, boolean performCalculations, java.util.List<org.apache.spark.sql.Row> previewCollection, java.lang.String profileGroup)
Computes the Features associated with a given DataSource and ProfileGroup from a sample of the data.
Parameters:
project - project the DataSource belongs to
sourceName - Cortex DataSource name
sampleDf - source data
performCalculations - whether additional calculations should be performed based on the source data to fill out feature information. If false, not all properties will be filled
previewCollection - explicit preview of the data
profileGroup - name of the profile group, may be null
Returns:
FeatureReport - feature information with a reference to the sample the features were inferred from
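The effect of performCalculations and a nullable profileGroup can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: SketchReport and SketchCalculator are hypothetical stand-ins, rows are plain maps rather than Spark Rows, the project, sourceName and previewCollection arguments are left out for brevity, and the distinct-value count is only one example of an "additional calculation". It does not describe what DefaultFeatureReportCalculator actually does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for FeatureReport; the real class is not described here.
final class SketchReport {
    final String profileGroup;              // may be null, as the parameter docs allow
    final List<String> featureNames;        // always filled
    final Map<String, Long> distinctCounts; // filled only when calculations are requested

    SketchReport(String profileGroup, List<String> featureNames, Map<String, Long> distinctCounts) {
        this.profileGroup = profileGroup;
        this.featureNames = featureNames;
        this.distinctCounts = distinctCounts;
    }
}

// Hypothetical calculator that mirrors the shape of computeFeatureReport.
final class SketchCalculator {
    // Rows are plain maps here (assumption), not org.apache.spark.sql.Row.
    static SketchReport computeFeatureReport(List<Map<String, Object>> sample,
                                             boolean performCalculations,
                                             String profileGroup) {
        // Feature names come from the columns of the first row; an empty sample yields none.
        List<String> names = new ArrayList<>();
        if (!sample.isEmpty()) {
            names.addAll(sample.get(0).keySet());
        }
        // The extra property is computed only when the caller asks for calculations.
        Map<String, Long> distinct = new LinkedHashMap<>();
        if (performCalculations) {
            for (String name : names) {
                long count = sample.stream().map(row -> row.get(name)).distinct().count();
                distinct.put(name, count);
            }
        }
        return new SketchReport(profileGroup, names, distinct);
    }
}
```

With performCalculations set to false the sketch still returns the feature names but leaves distinctCounts empty, which is the kind of partially filled result the parameter description warns about.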
-
computeDataSourceFeatures
FeatureReport computeDataSourceFeatures(java.lang.String project, java.lang.String sourceName, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sourceDf, java.lang.Boolean performCalculations)
Computes the Features associated with a DataSource from the given Dataset. Features will not be associated with a specific ProfileGroup.
Parameters:
project - project the DataSource belongs to
sourceName - Cortex DataSource name
sourceDf - source data
performCalculations - whether to perform analytic calculations
Returns:
FeatureReport - feature information with a reference to the sample the features were inferred from
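Note that performCalculations is declared here as java.lang.Boolean, whereas computeFeatureReport takes a primitive boolean. A boxed Boolean can be null, and unboxing a null Boolean throws a NullPointerException. How implementations treat a null flag is not documented here; the hypothetical helper below simply assumes that null means "do not perform calculations".

```java
// Hypothetical helper, not part of the FeatureReportCalculator API.
final class PerformCalculationsFlag {
    // Assumption: a null flag is treated the same as Boolean.FALSE.
    static boolean resolve(Boolean performCalculations) {
        // Boolean.TRUE.equals(null) is false, so no unboxing of null ever happens.
        return Boolean.TRUE.equals(performCalculations);
    }
}
```

A caller or implementation could pass the result of PerformCalculationsFlag.resolve(flag) wherever a primitive boolean is required.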
-
computePreviewFeatures
FeatureReport computePreviewFeatures(java.lang.String project, java.lang.String sourceName, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sourceDf)
Computes the Features from an explicit sample of the DataSource. The provided Dataset should already be a sample of the entire dataset, because implementations should use the given Dataset as-is for calculations and should not take a sub-sample of it. Features will not be associated with a specific ProfileGroup.
Parameters:
project - project the DataSource belongs to
sourceName - Cortex DataSource name
sourceDf - source data
Returns:
FeatureReport
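Because the given data is used as-is, sampling is the caller's responsibility. With a Spark Dataset this would typically be done with Dataset.sample or Dataset.limit before the call. The sketch below shows the same idea on a plain java.util.List so that it runs without Spark; PreviewSampling, maxRows and seed are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical caller-side helper; not part of the FeatureReportCalculator API.
final class PreviewSampling {
    // Returns at most maxRows rows chosen uniformly at random, leaving the input untouched.
    // A fixed seed makes the selection repeatable.
    static <T> List<T> sample(List<T> rows, int maxRows, long seed) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must be non-negative");
        }
        List<T> shuffled = new ArrayList<>(rows);          // copy so the caller's list is not reordered
        Collections.shuffle(shuffled, new Random(seed));   // uniform random permutation
        int size = Math.min(maxRows, shuffled.size());
        return new ArrayList<>(shuffled.subList(0, size)); // detach from the shuffled backing list
    }
}
```

The sampled rows would then be handed to the calculator, which is expected to compute features over exactly those rows.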
-
computeProfileFeatures
FeatureReport computeProfileFeatures(java.lang.String project, java.lang.String sourceName, java.lang.String profileGroup, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> sourceDf, java.lang.Boolean performCalculations)
Computes the Features associated with a DataSource and a specific ProfileGroup.
Parameters:
project - project the DataSource belongs to
sourceName - DataSource name
profileGroup - profile group name
sourceDf - source data
performCalculations - whether to perform analytic calculations
Returns:
FeatureReport
-
-