Considerations and limitations when using the Spark connector - Amazon EMR

The Spark connector supports a variety of ways to manage credentials, configure security, and connect with other AWS services. Review the recommendations in this list to configure a functional and resilient connection.

  • We recommend that you activate SSL for the JDBC connection from Spark on Amazon EMR to Amazon Redshift.

  • We recommend that you manage the credentials for the Amazon Redshift cluster in AWS Secrets Manager. For an example, see Using AWS Secrets Manager to retrieve credentials for connecting to Amazon Redshift.

  • We recommend that you use the aws_iam_role parameter to pass an IAM role as the Amazon Redshift authentication parameter.

  • The tempformat parameter currently doesn't support the Parquet format.

  • The tempdir URI points to an Amazon S3 location. This temporary directory isn't cleaned up automatically, so files left in it can incur additional cost.

  • Consider the following recommendations for Amazon Redshift:

  • Consider the following recommendations for Amazon S3:
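
The connection-related recommendations in the list above (SSL on the JDBC connection, an IAM role passed through aws_iam_role, and a tempdir in Amazon S3) can be sketched in Python as follows. This is a minimal illustration and not the connector's reference usage. The data source name, the ssl=true URL parameter, the url and dbtable option names, and the helper functions are assumptions that you should confirm against your Amazon EMR release. The host, table, bucket, and role ARN are hypothetical placeholders.

```python
from urllib.parse import urlencode, urlparse

# Assumed data source name of the community spark-redshift connector.
# Confirm it against the connector version in your Amazon EMR release.
REDSHIFT_FORMAT = "io.github.spark_redshift_community.spark.redshift"


def build_jdbc_url(host, port, database, ssl=True):
    """Return a Redshift JDBC URL that requests SSL by default."""
    url = f"jdbc:redshift://{host}:{port}/{database}"
    if ssl:
        # Assumption: the JDBC driver accepts ssl=true as a URL parameter.
        url += "?" + urlencode({"ssl": "true"})
    return url


def build_connector_options(jdbc_url, table, tempdir, iam_role_arn):
    """Collect connector options that follow the recommendations above."""
    # tempdir has to be an Amazon S3 location. The accepted schemes are an
    # assumption made for this sketch.
    if urlparse(tempdir).scheme not in {"s3", "s3a", "s3n"}:
        raise ValueError("tempdir must be an Amazon S3 URI")
    # tempformat is left unset on purpose, because Parquet isn't supported.
    return {
        "url": jdbc_url,
        "dbtable": table,
        "tempdir": tempdir,
        "aws_iam_role": iam_role_arn,
    }


def read_table(spark, options):
    """Load a Redshift table with an existing SparkSession (not run here)."""
    return spark.read.format(REDSHIFT_FORMAT).options(**options).load()


# Hypothetical values, for illustration only.
url = build_jdbc_url(
    "example-cluster.example.us-east-1.redshift.amazonaws.com", 5439, "dev"
)
options = build_connector_options(
    url,
    "public.sales",
    "s3://amzn-s3-demo-bucket/redshift-temp/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(options["url"])
```

On a cluster, you would pass the resulting dictionary to read_table together with your SparkSession. Because the connector doesn't clean up the tempdir location, plan a separate cleanup for that prefix.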

For more information on using the connector and its supported parameters, see the following resources: