AWS Glue – Can not create a Path from an empty string
1. Overview
I was receiving this error while trying to run an AWS Glue job that communicated with a DB2 11 instance:
Exception in thread "main" java.lang.IllegalArgumentException: Can not create a Path from an empty string
As we know, AWS Glue is a fully managed ETL service that is built on an Apache Spark environment. As such, AWS Glue jobs can be written in Scala or Python (pyspark).
My Glue job was written in Python, and this error did not tell me much. Where was there an empty string, I wondered?
I searched online and found a lot of results that did not apply to me, such as this one on apache.org, where the solution is: “I degradate the spark version from 2.2.0 to 2.1.1, is resolved with the error.
So phoenix 4.11 -hbase1.20 with spark2.2.0 is not work, the compatibility is not good.”
My AWS Glue knowledge was limited at that point. However, since Glue is a managed service, downgrading Spark as suggested there did not seem applicable to me.
Thankfully, I did manage to solve this issue.
2. Solution
Here are the steps I took to solve the “Can not create a Path from an empty string” error in my Glue job:
- Ensure you can connect to the database using software like DBeaver, and check the schema.
- Remove all code from the Glue job that does not create the connection to the DB2 database, then run a simple command. In my case, this command was df.printSchema().
- Remove the connection string information for the database and enter it all again. Double-check that it is 100% correct.
- Create a new Glue job and double-check the IAM permissions in AWS.
- Ensure the drivers are available in the “Security configuration, script libraries, and job parameters (optional)” portion of job creation.
- Ensure the connection is listed under Required Connections when creating or editing the job.
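One way to make the "double-check the connection string" step less error-prone is to check the settings in code before handing them to Spark. The following is only a sketch: the helper name find_connection_problems and the list of required keys are my own assumptions, not part of AWS Glue or the DB2 driver, and it only catches blank values and an obviously malformed DB2 URL.

```python
from urllib.parse import urlparse

# Hypothetical list of settings the JDBC read in this article relies on.
REQUIRED_KEYS = ("url", "dbtable", "user", "password", "driver")


def find_connection_problems(settings):
    """Return a list of human-readable problems found in JDBC settings."""
    problems = []

    # Flag any setting that is absent or contains only whitespace.
    for key in REQUIRED_KEYS:
        value = settings.get(key)
        if value is None or not str(value).strip():
            problems.append(f"{key} is missing or empty")

    url = str(settings.get("url") or "").strip()
    if not url:
        return problems

    # A DB2 URL is expected to look like jdbc:db2://host:port/database.
    if not url.startswith("jdbc:db2://"):
        problems.append("url does not start with jdbc:db2://")
        return problems

    # Drop the "jdbc:" prefix so the rest parses like an ordinary URL.
    parsed = urlparse(url[len("jdbc:"):])
    if not parsed.hostname:
        problems.append("url has no host")
    try:
        if parsed.port is None:
            problems.append("url has no port")
    except ValueError:
        # urlparse raises ValueError for a non-numeric or out-of-range port.
        problems.append("url has an invalid port")
    if not parsed.path.strip("/"):
        problems.append("url has no database name")

    return problems
```

Calling it with the placeholder values used later in this article returns an empty list, while a blank url or password shows up as an explicit message instead of a confusing failure deeper in the job.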
3. Simple Code Example for Testing Database
As I said, I chose to test with the smallest amount of code possible to verify database connectivity. If you are seeing this error, it is possible that you have a database connection issue, or perhaps an issue with the query for your table.
Here is the code I ended up using for my test:
```python
import sys
import boto3
import json
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from pyspark.sql.functions import *
from pyspark.sql.functions import col, asc

# Standard Glue job setup
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Placeholder connection details; replace with your own
db_username = "username"
db_password = "password!"
db_url = "jdbc:db2://12.345.67.891:50000/somedatabase"
table_name = "database.sometable"
jdbc_driver_name = "com.ibm.db2.jcc.DB2Driver"

# Read the table over JDBC
df = (
    glueContext.read.format("jdbc")
    .option("driver", jdbc_driver_name)
    .option("url", db_url)
    .option("dbtable", table_name)
    .option("user", db_username)
    .option("password", db_password)
    .load()
)

# printSchema() prints the schema and returns None, so there is nothing to assign
df.printSchema()
```
Some of these imports are unnecessary for this test, of course. Once you substitute your own connection details, though, the code should work.
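If the schema prints correctly and you then want to rule out a problem with the table query itself, one option is to read only a handful of rows. Spark's JDBC source accepts a parenthesized subquery with an alias as the dbtable value, and DB2 supports FETCH FIRST n ROWS ONLY. The helper below is a sketch under those assumptions; the name sample_query is hypothetical, and it does no escaping, so it is only suitable for a table name you typed yourself.

```python
def sample_query(table_name, limit=5, alias="t"):
    """Build a dbtable value that selects only the first few rows of a table."""
    name = (table_name or "").strip()
    if not name:
        raise ValueError("table_name must not be empty")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # The JDBC source expects a subquery to be wrapped in parentheses
    # and given an alias when it is used in place of a table name.
    return f"(SELECT * FROM {name} FETCH FIRST {limit} ROWS ONLY) AS {alias}"
```

In the test job above, this would be passed as .option("dbtable", sample_query(table_name)), followed by df.show() to confirm that rows actually come back.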
4. Conclusion
If you are receiving this mysterious error on AWS Glue:
Exception in thread "main" java.lang.IllegalArgumentException: Can not create a Path from an empty string
then you can take some comfort in knowing that it can likely be narrowed down to a database issue. Work through the steps above methodically, and I believe you will be able to find out exactly what is causing it.
Published on Java Code Geeks with permission by Michael Good, partner at our JCG program. See the original article here: AWS Glue – Can not create a Path from an empty string
Opinions expressed by Java Code Geeks contributors are their own.