
Connection Types

This page serves as a reference for the connection types available in Cortex. It includes information about the following:

  • The parameters available for connection definitions, including the common parameters available to all connections and the parameters specific to different connection types.
info

Connections can be defined in the Admin Console or in the CLI.

Common connection parameters

All connection definitions must include the following.

| Attribute | Description |
| --- | --- |
| name | The name given to the connection. |
| title | Human-readable connection type. |
| description | Description of the file contents. |
| group | |
| type | The connection type: file, s3, mongo, or hive. |
| tags | An optional list of tags you can enter to differentiate connections. Each tag item should include a label and a value. |
| connectionParams | The parameters required for the selected connection type, described in the following sections. |

Each connectionParams entry has the following fields: name, title, description, type, required (Boolean), validation, and errorMessage.

The file type/delimiter options available to all connection types are: csv/sep, csv/lineSep, csv/encoding, csv/comment, csv/quote, csv/escape, csv/multiline, csv/header, json/style, json/multiline, json/lineSep, and json/encoding.
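Putting the common parameters together, a minimal connection type definition might look like the following sketch. The name, tag value, and choice of parameter are illustrative placeholders, not part of the reference:

```yaml
# Hypothetical connection type definition illustrating the common attributes.
- name: exampleFile                     # placeholder name
  title: Example File Connection
  description: Example definition showing the common connection attributes.
  group: cortex
  type: file
  tags:
    - label: category.connection.type   # each tag item pairs a label with a value
      value: Files
  connectionParams:
    - name: csv/sep                     # one of the file type/delimiter options above
      title: Separator
      description: Character used to delimit fields in the record.
      type: String
      required: false
      validation: "/^.$/g"
      errorMessage: Invalid separator; must be a single character.
```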

S3 Connections

S3 connectionParams

When creating a connection definition for s3, the following parameters are available.

| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| publicKey | String | | An AWS public key with at least read access to the S3 Bucket. (Do NOT use your AWS admin key.) | true |
| secretKey | String | | An AWS secret key with at least read access to the S3 Bucket. (Do NOT use your AWS admin key.) | true |
| s3Endpoint | String | | The S3 HTTP(s):// URL to use. Typically only applicable when using a server like Minio and hosting a private instance. | false |
| pathStyleAccess | Boolean | | Enable/disable path style access for non-AWS S3 connections (Minio/NooBaa). | false |
| sslEnabled | Boolean | | True if the connection uses SSL encryption when connecting to S3. | false |
| contentType | String | | The type of file; valid values are CSV, JSON, and Parquet. | true |
| irsaEnabled | Boolean | false | Set to true when IRSA is enabled in the Cortex Helm chart .yaml; determines whether a connection should provide AWS API credentials or inherit them via IRSA. | false |
| qualifiedBy | Boolean | | A property of publicKey and secretKey when IRSA is enabled. | false |
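For illustration, a connection instance that supplies these parameters might follow the same instance format as the example Mongo connection shown later on this page. The names and values below are placeholders; confirm the exact instance schema against your deployment:

```yaml
# Hypothetical S3 connection instance (all values are placeholders).
name: default/exampleS3Connection
title: Example S3 Connection
description: Example S3 Connection
connectionType: s3
allowWrite: false
params:
  - name: publicKey
    value: AKIA...                  # read-only AWS access key (placeholder)
  - name: secretKey
    value: "#SECURE.s3SecretKey"    # placeholder secret reference
  - name: contentType
    value: CSV
  - name: csv/header
    value: "true"
```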

S3 YAML

```yaml
- name: s3
  title: S3 Connection
  description: File storage with S3.
  group: cortex
  type: s3
  tags:
    - label: category.connection.type
      value: Files
    - label: category.connection.type
      value: Cloud Storage
  connectionParams:
    - name: irsaEnabled
      title: Use IAM Role Service Account (IRSA) Authentication
      description: >
        Allows for credentials to be inherited through IRSA.
      type: Boolean
      default: false
      required: false
      validation: "/^true|false$/g"
    - name: publicKey
      title: Public Access Key
      description: >
        An AWS public key with at least read access to the S3 Bucket. (Do NOT use your AWS admin key.)
      type: String
      required: true
      validation: "/^.+$/g"
      errorMessage: Invalid public access key.
      qualifiedBy: irsaEnabled=false
    - name: secretKey
      title: Secret Access Key
      description: >
        An AWS secret key with at least read access to the S3 Bucket. (Do NOT use your AWS admin key.)
      type: String
      secure: true
      required: true
      validation: "/^.+$/g"
      errorMessage: Invalid secret access key.
      qualifiedBy: irsaEnabled=false
    - name: s3Endpoint
      title: S3 API Endpoint
      description: >
        The S3 HTTP(s):// URL to use. Typically only applicable when using a server like Minio and hosting a private instance.
      type: String
      required: false
      validation: "/^http(s)?:\\/\\/.+$/g"
      errorMessage: Invalid URL.
    - name: pathStyleAccess
      title: Path Style Access (Non-AWS)
      description: >
        Enable/disable path style access for non-AWS s3 connections (minio/noobaa).
      type: Boolean
      required: false
      validation: "/^true|false$/g"
      errorMessage: Must be true or false.
    - name: sslEnabled
      description: >
        True if the connection uses SSL encryption when connecting to S3.
      title: SSL Enabled
      type: Boolean
      required: false
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
    - name: contentType
      type: String
      title: Content Type
      description: Description of the file type.
      required: true
      validValues:
        - CSV
        - JSON
        - Parquet
    - name: csv/sep
      type: String
      title: Separator
      description: Character used to delimit fields in the record.
      defaultValue: ","
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid separator; must be a single character.
    - name: csv/lineSep
      type: String
      title: Line Separator
      description: The line separator that should be used for parsing. Maximum length is 1 character.
      defaultValue: "\n"
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid line separator; must be a single character.
    - name: csv/encoding
      title: Encoding
      description: Decodes the CSV files by the given encoding type.
      type: String
      qualifiedBy: contentType
      defaultValue: "UTF-8"
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect encoding type.
    - name: csv/comment
      type: String
      description: Sets a single character used for skipping lines beginning with this character.
      defaultValue: "\""
      title: Comment Character
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid comment character; must be a single character.
    - name: csv/quote
      type: String
      description: Character used to denote quotation marks (single or double quotes).
      defaultValue: "\""
      title: Quote Character
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid quote character; must be a single character.
    - name: csv/escape
      type: String
      description: Character used to escape values that contain delimiters.
      defaultValue: "\""
      title: Escape Character
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid escape character; must be a single character.
    - name: csv/multiline
      type: Boolean
      title: Multiline
      description: Parse one record, which may span multiple lines.
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
      required: false
      qualifiedBy: contentType
    - name: csv/header
      type: Boolean
      title: First Line is Header Row
      description: True if the first line of the file contains a header row with column names.
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
      required: false
      qualifiedBy: contentType
    - name: json/style
      type: String
      title: JSON Style
      description: Format style of the JSON file (lines, array, or object).
      defaultValue: "lines"
      validValues:
        - lines
        - array
        - object
      required: true
      qualifiedBy: contentType
      errorMessage: Must be lines, array, or object.
    - name: json/multiline
      type: Boolean
      title: Multiline
      description: Parse one record, which may span multiple lines.
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
      required: false
      qualifiedBy: contentType
    - name: json/lineSep
      type: String
      title: Line Separator
      description: The line separator that should be used for parsing. Maximum length is 1 character.
      defaultValue: "\n"
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid delimiter; must be a single character.
    - name: json/encoding
      title: Encoding
      description: >
        Allows forcibly setting one of the standard basic or extended encodings for the JSON files,
        for example UTF-16BE or UTF-32LE.
        If the encoding is not specified and multiLine is set to true, it is detected automatically.
      type: String
      qualifiedBy: contentType
      defaultValue: "UTF-8"
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect encoding type.
```

S3 File Stream Connections

For use with Spark

S3 File Stream ConnectionParams

S3 File Stream has all the same connectionParams as the S3 connection (table above), plus the following additional properties (refer to the YAML below):

  • Bootstrap URI: The S3 File URI to use for fetching base records that are used to infer the schema. (e.g. uri is set to s3a://path/to/file.txt).

  • Stream Read Directory: The S3 directory to stream updates from (e.g. stream_read_dir is set to s3a://path/to/files).

  • Trigger: Allows you to set up an ingestion schedule for Data Sources that pull from this connection. When isTriggered is set to true, Data Sources must be triggered (rebuilt) manually using the API method or the Fabric Console UI.

    When isTriggered is set to false (default), the following parameters are also set to automatically poll the Connection on a schedule:

    • pollInterval: (in seconds - how often the Data Sources poll the Connection and rebuild automatically)
    • maxFilesPerTrigger (integer - number of files ingested each time the Connection is polled)
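As a sketch, the params of a connection instance for the two modes might set these values as follows (the values are illustrative):

```yaml
# Polled (default): Data Sources rebuild automatically on a schedule.
- name: isTriggered
  value: "false"
- name: pollInterval
  value: "300"        # poll every 300 seconds
- name: maxFilesPerTrigger
  value: "1"          # ingest one file per poll

# Manually triggered: ingest only when a Data Source is rebuilt
# via the API or the Fabric Console UI.
# - name: isTriggered
#   value: "true"
```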
| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| uri | String | | The S3 File URI to use for fetching base records that are used to infer the schema, s3a://path/to/file.txt. | false |
| stream_read_dir | String | | The S3 directory to stream updates from, s3a://path/to/files. | true |
| isTriggered | Boolean | false | Stops standard polling of streaming data and instead ingests all available files on trigger. | false |
| maxFilesPerTrigger | Integer | "1" | The number of files to process for each poll interval. | false |
| pollInterval | String | "300" | The period between polls, in seconds. | false |

S3 File Stream YAML

...

```yaml
- name: s3FileStream
  title: S3 File Stream
  description: Stream files with S3.
  group: cortex
  type: s3FileStream
  tags:
    - label: category.connection.type
      value: Files
    - label: category.connection.type
      value: Cloud Storage
    - label: category.connection.type
      value: Streaming
  connectionParams:
    - name: uri
      title: Bootstrap URI
      description: >
        The S3 File URI to use for base records and schema inference, `s3a://path/to/file.txt`.
      type: String
      required: false
      validation: "/^.+$/g"
      errorMessage: Invalid S3 File URI.
    - name: stream_read_dir
      title: Stream Read Directory
      description: >
        The S3 directory to stream updates from, `s3a://path/to/files`.
      type: String
      required: true
      validation: "/^.+$/g"
      errorMessage: Invalid S3 File URI.
    - name: isTriggered
      title: Trigger manually using Data Source ingest
      description: >
        Stops standard polling of streaming data and instead ingests all available files on trigger.
      type: Boolean
      default: false
      required: false
      validation: "/^true|false$/g"
    - name: maxFilesPerTrigger
      type: String
      title: Max Files per Poll Interval
      description: The number of files to process for each poll interval.
      defaultValue: "1"
      required: false
      qualifiedBy: isTriggered=false
      validation: "/^[1-9]\\d{0,7}$/g"
      errorMessage: Invalid number of files; must be an integer of 8 digits or less.
    - name: pollInterval
      type: String
      title: Poll Interval
      description: The poll interval in seconds.
      defaultValue: "300"
      required: false
      qualifiedBy: isTriggered=false
      validation: "/^[1-9]\\d{0,7}$/g"
      errorMessage: Invalid poll interval; must be an integer of 8 digits or less.
```

Google Cloud Storage Connections

For use with Spark

GCS ConnectionParams

| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| uri | String | | The GCS File URI to use, gs://path/to/file.txt. | true |
| serviceAccountKey | String | | Google Service Account JSON credentials to authenticate against GCS (include secure: true). | false |
| storageRoot | String | | The GCS HTTP(s):// URL to use, https://storage.googleapis.com/. Typically only applicable when using a server like Minio and hosting a private instance. | false |
| servicePath | String | storage/v1/ | The GCS Service Path to use. | false |

GCS YAML

```yaml
- name: gcs
  title: GCS Connection
  description: File storage with GCS.
  group: cortex
  type: gcs
  tags:
    - label: category.connection.type
      value: Files
    - label: category.connection.type
      value: Cloud Storage
  connectionParams:
    - name: uri
      title: File URI
      description: >
        The GCS File URI to use, `gs://path/to/file.txt`.
      type: String
      required: true
      validation: "/^.+$/g"
      errorMessage: Invalid GCS File URI.
    - name: serviceAccountKey
      title: Service Account Key Json Secret
      description: >
        Google Service Account Json credentials to authenticate against GCS.
      type: String
      secure: true
      required: false
      validation: "/^#SECURE\\..+$/g"
      errorMessage: Invalid Google Service Account Key.
    - name: storageRoot
      title: GCS API Root
      description: >
        The GCS HTTP(s):// URL to use, `https://storage.googleapis.com/`. Typically only applicable when using a server like Minio and hosting a private instance.
      type: String
      required: false
      validation: "/^http(s)?:\\/\\/.+$/g"
      errorMessage: Invalid URL.
    - name: servicePath
      title: GCS Service Path
      description: >
        The GCS Service Path to use, `storage/v1/`.
      type: String
      required: false
      validation: "/^.+$/g"
      errorMessage: Invalid GCS Service Path.
    - name: csv/sep
      type: String
      title: Separator
      description: Character used to delimit fields in the record.
      defaultValue: ","
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid separator; must be a single character.
    - name: csv/lineSep
      type: String
      title: Line Separator
      description: The line separator that should be used for parsing. Maximum length is 1 character.
      defaultValue: "\n"
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid line separator; must be a single character.
    - name: csv/encoding
      title: Encoding
      description: Decodes the CSV files by the given encoding type.
      type: String
      qualifiedBy: contentType
      defaultValue: "UTF-8"
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect encoding type.
    - name: csv/comment
      type: String
      description: Sets a single character used for skipping lines beginning with this character.
      defaultValue: '"'
      title: Comment Character
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid comment character; must be a single character.
    - name: csv/quote
      type: String
      description: Character used to denote quotation marks (single or double quotes).
      defaultValue: '"'
      title: Quote Character
      required: false
      qualifiedBy: contentType
      validation: '/^(''|")$/'
      errorMessage: Invalid quote character; must be a single character.
    - name: csv/escape
      type: String
      description: Character used to escape values that contain delimiters.
      defaultValue: '"'
      title: Escape Character
      required: false
      qualifiedBy: contentType
      validation: "/^.$/g"
      errorMessage: Invalid escape character; must be a single character.
    - name: csv/multiline
      type: Boolean
      title: Multiline of Line Separator
      description: Parse one record, which may span multiple lines.
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
      required: false
      qualifiedBy: contentType
    - name: csv/header
      type: Boolean
      title: First Line is Header Row
      description: True if the first line of the file contains a header row with column names.
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
      required: false
      qualifiedBy: csv/multiline=true
    - name: json/multiline
      type: Boolean
      title: Multiline of JSON Style
      description: Parse one record, which may span multiple lines.
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
      required: false
      qualifiedBy: contentType
    - name: json/style
      type: String
      title: JSON Style
      description: Format style of the JSON file (lines, array, or object).
      defaultValue: "lines"
      validValues:
        - lines
        - array
        - object
      required: false
      qualifiedBy: json/multiline=true
      errorMessage: Must be lines, array, or object.
    - name: json/lineSep
      type: String
      title: Line Separator
      description: The line separator that should be used for parsing. Maximum length is 1 character.
      defaultValue: "\n"
      required: false
      qualifiedBy: json/multiline=true
      validation: "/^.$/g"
      errorMessage: Invalid delimiter; must be a single character.
    - name: json/encoding
      title: Encoding
      description: >
        Allows forcibly setting one of the standard basic or extended encodings for the JSON files,
        for example UTF-16BE or UTF-32LE.
        If the encoding is not specified and multiLine is set to true, it is detected automatically.
      type: String
      qualifiedBy: json/multiline=true
      defaultValue: "UTF-8"
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect encoding type.
```

GCS File Stream Connections

GCS File Stream ConnectionParams

GCS File Stream has all the same connectionParams as the GCS connection (above), plus the following additional properties (refer to the YAML below):

  • Bootstrap URI: The GCS File URI to use for fetching base records that are used to infer the schema. (e.g. uri is set to gs://path/to/file.txt).

  • Stream Read Directory: The GCS directory to stream updates from (e.g. stream_read_dir is set to gs://path/to/files).

  • Trigger: Allows you to set up an ingestion schedule for Data Sources that pull from this connection. When isTriggered is set to true, Data Sources must be triggered (rebuilt) manually using the API method or the Fabric Console UI.

    When isTriggered is set to false (default), the following parameters are also set to automatically poll the Connection on a schedule:

    • pollInterval: (in seconds - how often the Data Sources poll the Connection and rebuild automatically)
    • maxFilesPerTrigger (integer - number of files ingested each time the Connection is polled)
| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| uri | String | | The GCS File URI to use for fetching base records that are used to infer the schema, gs://path/to/file.txt. | false |
| stream_read_dir | String | | The GCS directory to stream updates from, gs://path/to/files. | true |
| isTriggered | Boolean | false | Stops standard polling of streaming data and instead ingests all available files on trigger. | false |
| maxFilesPerTrigger | Integer | "1" | The number of files to process for each poll interval. | false |
| pollInterval | String | "300" | The period between polls, in seconds. | false |

GCS File Stream YAML

```yaml
- name: gcsFileStream
  title: GCS Filestream
  description: Stream files in Google Cloud Storage.
  group: cortex
  type: gcsFileStream
  tags:
    - label: category.connection.type
      value: Files
    - label: category.connection.type
      value: Cloud Storage
    - label: category.connection.type
      value: Streaming
  connectionParams:
    - name: uri
      title: Bootstrap URI
      description: >
        The GCS File URI to use for base records and schema inference, `gs://path/to/file.txt`.
      type: String
      required: true
      validation: "/^.+$/g"
      errorMessage: Invalid GCS File URI.
    - name: stream_read_dir
      title: Stream Read Directory
      description: >
        The GCS directory to stream updates from, `gs://path/to/files`.
      type: String
      required: true
      validation: "/^.+$/g"
      errorMessage: Invalid GCS File URI.
    - name: isTriggered
      title: Trigger manually using Data Source ingest
      description: >
        Stops standard polling of streaming data and instead ingests all available files on trigger.
      type: Boolean
      default: false
      required: false
      validation: "/^true|false$/g"
    - name: maxFilesPerTrigger
      type: String
      title: Max Files per Poll Interval
      description: The number of files to process for each poll interval.
      defaultValue: "1"
      required: false
      qualifiedBy: isTriggered=false
      validation: "/^[1-9]\\d{0,7}$/g"
      errorMessage: Invalid number of files; must be an integer of 8 digits or less.
    - name: pollInterval
      type: String
      title: Poll Interval
      description: The poll interval in seconds.
      defaultValue: "300"
      required: false
      qualifiedBy: isTriggered=false
      validation: "/^[1-9]\\d{0,7}$/g"
      errorMessage: Invalid poll interval; must be an integer of 8 digits or less.
```

Mongo Connections

Mongo connectionParams

When creating a connection definition for mongo, the following parameters are available.

| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| username | String | | The username for authenticating to the database. | false |
| password | String | | The secret ref containing the password for authenticating to the database. | false |
| uri | String | mongodb://{host:port}/{database} | The URI string including database name, username, and password. NOTE: To set a secret variable, set the parameter secure: true. See https://docs.mongodb.com/manual/reference/connection-string/ for more details. | true |
| collection | String | | The name of the collection to query in the Mongo database. | false |
| database | String | | The name of the Mongo database to connect to. | false |
| sslEnabled | Boolean | false | True if the connection uses SSL encryption when connecting to the database. (Recommended) | false |

Mongo YAML

````yaml
- name: mongo
  title: MongoDB
  description: |
    Query documents stored in MongoDB.
    Below is an example connection.
    https://docs.mongodb.com/spark-connector/master/

    ```
    name: default/exampleMongoConnection
    title: Example Mongo Connection
    description: Example Mongo Connection
    connectionType: mongo
    allowWrite: true
    params:
      - name: mongoUri
        value: mongodb://mongodb:27017/auto_test
    ```
  group: cortex
  type: mongo
  tags:
    - label: category.connection.type
      value: NoSQL
    - label: category.connection.type
      value: Document Store
  connectionParams:
    - name: username
      title: Username
      description: The username for authenticating to the database.
      type: String
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect username format.
    - name: password
      title: Password
      description: The secret ref containing the password for authenticating to the database.
      type: String
      secure: true
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect password format.
    - name: uri
      title: Mongo URI
      description: >
        The URI of the Mongo instance to access.
        The connection string has the form mongodb://host:port/, where host can be a hostname, IP address, or UNIX domain socket. If :port is unspecified, the connection uses the default MongoDB port 27017.
        All options can be specified directly in the URI; if an option is provided both ways, the option in the URI takes precedence.
        See https://docs.mongodb.com/spark-connector/master/configuration#input-configuration for more details.
      type: String
      required: true
      secure: false
      defaultValue: "mongodb://{host:port}/"
      validation: "/^(mongodb.*?):(?:.+)$/g"
      errorMessage: Invalid Mongo URI.
    - name: collection
      description: Enter the name of the collection to query in the Mongo database.
      title: Mongo Collection
      type: String
      required: false
      validation: "/^\\w+$/g"
      errorMessage: Invalid collection name.
    - name: database
      description: Enter the name of the Mongo database to connect to.
      title: Database Name
      type: String
      required: false
      validation: "/^\\w+$/g"
      errorMessage: Invalid database name.
    - name: sslEnabled
      description: >
        True if the connection uses SSL encryption when connecting to the database. (Recommended)
      title: SSL Enabled
      type: Boolean
      required: false
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
    # options not added: batchSize, localThreshold, readPreference.name, and readPreference.tagSets
````

Hive Connections

Hive connectionParams

When creating a connection definition for hive, the following parameters are available.

| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| autoCreateAll | Boolean | "true" | Optional flag that can reduce errors with an empty metastore database as of Hive 2.1. | false |
| schemaVerification | Boolean | "false" | Optional flag that can reduce errors with an empty metastore database as of Hive 2.1. | false |
| metastoreUri | String | | The thrift URL of the Hive Metastore Server. | true |
| connectionUrl | String | "jdbc:hive2://{host:port}/{database}" | The JDBC compliant Hive URI used to connect to the database. The URI format should conform to this pattern: jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;initFile=<file>;sess_var_list?hive_conf_list#hive_var_list. See https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-ConnectionURLFormat and https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html for more details. | true |
| connectionUserName | String | | The username for authenticating to the database. | false |
| connectionPassword | String | | The password for authenticating as an authorized user. NOTE: To set a secret variable, set the parameter secure: true. | false |
| metastoreVersion | String | | Version of the Hive Metastore to connect to. | true |
| metastoreJars | String | | Jars to use when connecting to the Hive Metastore, dependent on the version of Hive. See https://docs.databricks.com/data/metastores/external-hive-metastore.html#spark-configuration-options. | false |
| warehouseDir | String | "spark-warehouse" | The location to use for the spark warehouse dir. | false |
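A hypothetical Hive connection instance supplying the required parameters might look like the following sketch, using the same instance format as the example Mongo connection on this page (host names and the metastore version are placeholders):

```yaml
# Hypothetical Hive connection instance (values are placeholders).
name: default/exampleHiveConnection
title: Example Hive Connection
description: Example Hive Connection
connectionType: hive
params:
  - name: metastoreUri
    value: thrift://hive-metastore:9083             # placeholder thrift URL
  - name: connectionUrl
    value: jdbc:hive2://hive-server:10000/default   # placeholder JDBC URI
  - name: metastoreVersion
    value: "2.3"                                    # placeholder version
```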

Hive YAML

```yaml
- name: hive
  title: Hive
  description: >
    Query data in Hive.
    https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/integrating-hive/content/hive_configure_a_spark_hive_connection.html
  group: cortex
  type: hive
  tags:
    - label: category.connection.type
      value: SQL
  connectionParams:
    # - name: datanucleus.schema.autoCreateAll
    - name: autoCreateAll
      title: AutoCreate Schema
      description: Optional flag that can reduce errors with an empty metastore database as of Hive 2.1.
      type: Boolean
      required: false
      validation: "/^true|false$/g"
      defaultValue: "true"
      errorMessage: Must be true or false.
    # - name: hive.metastore.schema.verification
    - name: schemaVerification
      title: Spark Metastore Schema Verification
      description: Optional flag that can reduce errors with an empty metastore database as of Hive 2.1.
      type: Boolean
      required: false
      validation: "/^true|false$/g"
      defaultValue: "false"
      errorMessage: Must be true or false.
    # - name: spark.hadoop.hive.metastore.uris
    - name: metastoreUri
      title: Spark Metastore Uris
      description: The thrift URL of the Hive Metastore Server.
      type: String
      required: true
      validation: "/^.+$/g"
      errorMessage: Incorrect Metastore URL.
    # - name: spark.hadoop.javax.jdo.option.ConnectionURL
    - name: connectionURL
      title: URI
      description: >
        The JDBC compliant Hive URI used to connect to the database. The URI format should conform to this pattern: `jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;initFile=<file>;sess_var_list?hive_conf_list#hive_var_list`.
        See https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-ConnectionURLFormat for more details.
        https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html
      type: String
      required: true
      defaultValue: "jdbc:hive2://{host:port}/{database}"
      validation: "/^jdbc:hive2:(?:.+)$/g"
      errorMessage: Incorrect JDBC URI format.
    # - name: spark.hadoop.javax.jdo.option.ConnectionUserName
    - name: connectionUserName
      title: Username
      description: The username for authenticating to the database.
      type: String
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect username format.
    # - name: spark.hadoop.javax.jdo.option.ConnectionPassword
    - name: connectionPassword
      title: Password
      description: The password for authenticating as an authorized user.
      type: String
      secure: true
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect password format.
    # - name: spark.sql.hive.metastore.version
    - name: metastoreVersion
      title: Hive Metastore Version
      description: Version of the Hive Metastore to connect to.
      type: String
      required: true
      validation: "/^.+$/g"
      errorMessage: Incorrect metastore version.
    # - name: spark.sql.hive.metastore.jars
    - name: metastoreJars
      title: Metastore Jars
      description: >
        Jars to use when connecting to the Hive Metastore, dependent on the version of Hive.
        https://docs.databricks.com/data/metastores/external-hive-metastore.html#spark-configuration-options
      type: String
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect Metastore Jars.
    # - name: spark.sql.warehouse.dir
    - name: warehouseDir
      title: Spark Warehouse Dir
      description: The location to use for the spark warehouse dir.
      type: String
      required: false
      defaultValue: "spark-warehouse"
      validation: "/^.+$/g"
      errorMessage: Incorrect Spark Warehouse Dir.
    # options not added: spark.hadoop.fs.s3a.credentialsType and spark.hadoop.fs.s3a.stsAssumeRole.arn
```

JDBC Generic Connections

JDBC-generic ConnectionParams

| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| uri | String | jdbc:{protocol}://{host:port}/{database} | A fully qualified JDBC URI containing the dialect, host, port, database, and other options. | true |
| username | String | | The username that is used to gain access to the database. | false |
| password | String | | The password that is used for authenticating as an authorized user. | false |
| classname | String | | The classname of the JDBC driver to be loaded into the cortex runtime. | true |
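As an illustration, a generic JDBC connection instance for a PostgreSQL database might look like this sketch, following the instance format of the example Mongo connection on this page. The host, database, and credentials are placeholders; org.postgresql.Driver is the standard PostgreSQL JDBC driver class:

```yaml
# Hypothetical JDBC generic connection instance (values are placeholders).
name: default/examplePostgresConnection
title: Example JDBC Generic Connection
description: Example JDBC Generic Connection
connectionType: jdbc_generic
params:
  - name: uri
    value: jdbc:postgresql://postgres:5432/testdb   # placeholder host/database
  - name: username
    value: cortex                                   # placeholder
  - name: password
    value: "#SECURE.pgPassword"                     # placeholder secret reference
  - name: classname
    value: org.postgresql.Driver
```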

JDBC-generic.yaml

```yaml
- name: jdbc_generic
  description: Query data using a JDBC Connection
  title: JDBC Generic
  group: cortex
  type: jdbc
  connectionQueryParams:
    - name: query
      title: SQL Query
      description: An example SQL query to run in order to test connectivity
      type: String
      required: true
  connectionParams:
    - name: uri
      title: URI
      description: A fully qualified JDBC URI containing the dialect, host, port, database, and other options.
      type: String
      required: true
      defaultValue: jdbc:{protocol}://{host:port}/{database}
      validation: "/^jdbc:(?:.+)$/g"
      errorMessage: Incorrect JDBC URI format.
    - name: username
      title: Username
      description: The username that is used to gain access to the database.
      type: String
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect username format.
    - name: password
      title: Password
      description: The password that is used for authenticating as an authorized user.
      type: String
      secure: true
      required: false
      validation: "/^.+$/g"
      errorMessage: Incorrect password format.
    - name: classname
      title: Driver Class Name
      description: The classname of the JDBC driver to be loaded into the cortex runtime.
      type: String
      required: true
      validation: "/^([a-zA-Z_$][\\w$]*\\.)*[a-zA-Z_$][\\w$]*$/g"
      errorMessage: Incorrect Java class name format.
  tags:
    - label: category.connection.type
      value: SQL
```

JDBC CData Connections

JDBC CData Connections are built into a Skill template in the cortex-fabric-examples GitHub repo.

CData is a third party provider who abstracts commonly available databases to use JDBC connections (e.g. Salesforce, Twitter). When you select a CData connection type in Fabric Console, the parameters available for that connection type are selectable. The links in this table will take you to documentation provided by CData, so you can better understand how to configure these parameters.

Prerequisites for configuring CData connections are found here.

Instructions for working with CData JDBC connectors are available on the CData website.

  1. Download the driver jar file.

  2. Upload the Driver to Managed Content and make note of the URI.

  3. Go to the CData help website to view the online documentation for your driver.

JDBC-cdata ConnectionParams

| Parameter | Type | Default | Description | Required |
| --- | --- | --- | --- | --- |
| plugin_properties | String | | (secure) The key for the JSON-formatted configuration file stored in Managed Content and passed to the plugin at startup. | false |
| classname | String | | You can find this in the online documentation for your specific driver (under "Getting Started") on the CData help website. | true |
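As a sketch, a CData connection instance might reference the driver class from the driver's "Getting Started" documentation and a properties file uploaded to Managed Content. Both values below are placeholders:

```yaml
# Hypothetical CData connection instance (values are placeholders).
name: default/exampleCdataConnection
title: Example CData Connection
description: Example CData Connection
connectionType: jdbc_cdata
params:
  - name: classname
    value: cdata.jdbc.salesforce.SalesforceDriver   # class name from the driver's docs
  - name: plugin_properties
    value: cdata-salesforce-properties.json         # placeholder Managed Content key
```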

JDBC-cdata.yaml

```yaml
- name: jdbc_cdata
  description: Query data using a CDATA JDBC Connection
  title: JDBC CDATA
  group: cortex
  type: jdbc
  connectionQueryParams:
    - name: query
      title: SQL Query
      description: An example SQL query to run in order to test connectivity
      type: String
      required: true
  connectionParams:
    - name: plugin_properties
      description: >
        The JSON-formatted configuration data provided in this field is passed
        to the plugin at startup.
      title: Plugin Properties
      type: String
      required: false
      secure: true
      validation: "/^.+$/g"
      errorMessage: Must be JSON formatted string.
    - name: classname
      title: Driver Class Name
      description: The classname of the CDATA JDBC driver to be loaded into the cortex runtime.
      type: String
      required: true
      validation: "/^cdata\\.jdbc\\.[A-z0-9\\.]*$/g"
      errorMessage: Incorrect CDATA driver Java class name format.
  tags:
    - label: category.connection.type
      value: SQL
    - label: CDATA
      value: CDATA
connections:
  - name: content
    title: Cortex Managed Content
    description: Built in storage for files managed by the platform.
    connectionType: managedContent
    allowWrite: true
```