Path adjustment logic during deployment

During deployment, dbx can upload local files and reference them correctly in the job definition. Any value in the deployment file that starts with file:// or file:fuse:// is treated as a local file reference, and the file is uploaded to the artifact storage. References are resolved relative to the root of the project.

A file path can be resolved and referenced in the final deployment definition in one of two ways:

  • Standard - A reference of the form file://some/path/in/project/some.file. It is resolved into dbfs://<artifact storage prefix>/some/path/in/project/some.file

  • FUSE - A reference of the form file:fuse://some/path/in/project/some.file. It is resolved into /dbfs/<artifact storage prefix>/some/path/in/project/some.file

The FUSE type of path resolution is useful when the consuming system cannot work with cloud storage protocols and needs a regular file system path instead.
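The two resolution rules can be sketched in a few lines of Python. This is only an illustration of the mapping described above, not the code dbx runs internally: the function name resolve_reference, the constant names, and the example prefix some/artifact/prefix are assumptions made for this sketch, and the upload step itself is left out.

```python
# Hypothetical helper illustrating the documented mapping; dbx's own
# implementation may differ and also performs the actual file upload.
FUSE_PREFIX = "file:fuse://"
STANDARD_PREFIX = "file://"


def resolve_reference(reference: str, artifact_prefix: str) -> str:
    """Map a local file reference to its location in the artifact storage."""
    prefix = artifact_prefix.strip("/")
    if reference.startswith(FUSE_PREFIX):
        # FUSE reference: becomes a local-style path under the /dbfs mount.
        return f"/dbfs/{prefix}/{reference[len(FUSE_PREFIX):]}"
    if reference.startswith(STANDARD_PREFIX):
        # Standard reference: becomes a dbfs:// location.
        return f"dbfs://{prefix}/{reference[len(STANDARD_PREFIX):]}"
    # Anything else is not a file reference and is returned unchanged.
    return reference


if __name__ == "__main__":
    for ref in (
        "file://some/path/in/project/some.file",
        "file:fuse://some/path/in/project/some.file",
        "./some.file",
    ):
        print(ref, "->", resolve_reference(ref, "some/artifact/prefix"))
```

Strings without either prefix pass through untouched, which is why a plain relative path such as ./some.file is not rewritten.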

The following deployment file shows both reference types, together with a plain relative path that carries no file:// or file:fuse:// prefix:

{
    "default": {
        "jobs": [
            {
                "name": "your-job-name",
                "new_cluster": {
                    "spark_version": "7.3.x-cpu-ml-scala2.12",
                    "node_type_id": "some-node-type",
                    "aws_attributes": {
                        "first_on_demand": 0,
                        "availability": "SPOT"
                    },
                    "num_workers": 2
                },
                "libraries": [],
                "max_retries": 0,
                "spark_python_task": {
                    "python_file": "file://placeholder_1.py",
                    "parameters": [
                        "file:fuse://placeholder_1.py",
                        "./placeholder_1.py"
                    ]
                }
            }
        ]
    }
}
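To see what the rules imply for the spark_python_task section above, the sketch below walks a nested structure and rewrites every string that carries one of the two prefixes. It is a rough approximation under stated assumptions: resolve_reference and resolve_all are hypothetical helpers, some/artifact/prefix stands in for the real artifact storage prefix, and no files are actually uploaded.

```python
import json

FUSE_PREFIX = "file:fuse://"
STANDARD_PREFIX = "file://"


def resolve_reference(reference, artifact_prefix):
    """Hypothetical helper: apply the Standard or FUSE rule to one string."""
    if reference.startswith(FUSE_PREFIX):
        return f"/dbfs/{artifact_prefix}/{reference[len(FUSE_PREFIX):]}"
    if reference.startswith(STANDARD_PREFIX):
        return f"dbfs://{artifact_prefix}/{reference[len(STANDARD_PREFIX):]}"
    return reference


def resolve_all(node, artifact_prefix):
    """Recursively resolve file references in dicts, lists, and strings."""
    if isinstance(node, dict):
        return {key: resolve_all(value, artifact_prefix) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_all(item, artifact_prefix) for item in node]
    if isinstance(node, str):
        return resolve_reference(node, artifact_prefix)
    # Numbers, booleans, and None are left as they are.
    return node


# The task section from the deployment file above.
task = {
    "python_file": "file://placeholder_1.py",
    "parameters": ["file:fuse://placeholder_1.py", "./placeholder_1.py"],
}

# "some/artifact/prefix" is a placeholder for the real artifact storage prefix.
resolved = resolve_all(task, "some/artifact/prefix")
print(json.dumps(resolved, indent=4))
```

Under these assumptions, python_file becomes a dbfs:// location, the first parameter becomes a /dbfs/ path, and the second parameter stays as ./placeholder_1.py because it has no recognized prefix.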