Named properties support
Since version 0.2.2 you can also use name-based properties with dbx instead of providing ids.
The following properties are supported:

- `existing_cluster_name` will be automatically replaced with `existing_cluster_id`
- `new_cluster.instance_pool_name` will be automatically replaced with `new_cluster.instance_pool_id`
- `new_cluster.driver_instance_pool_name` will be automatically replaced with `new_cluster.driver_instance_pool_id`
- `new_cluster.aws_attributes.instance_profile_name` will be automatically replaced with `new_cluster.aws_attributes.instance_profile_arn`
Thanks to this simplification, you don't need to look up the id-based properties; you can simply provide the names.
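dbx resolves each provided name to the corresponding id automatically. A minimal sketch of the effect (the resolved pool id below is purely illustrative, not a real pool id):

```yaml
# What you write in the deployment file:
new_cluster:
  instance_pool_name: "some-instance-pool-name"

# What is effectively sent to the Jobs API
# (the id is illustrative):
new_cluster:
  instance_pool_id: "0123-456789-pool0"
```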
Here are some examples in JSON and YAML:
```json
{
    "default": {
        "jobs": [
            {
                "name": "named-props-instance-pool-name",
                "new_cluster": {
                    "instance_pool_name": "some-instance-pool-name",
                    "driver_instance_pool_name": "some-instance-pool-name"
                },
                "libraries": [],
                "max_retries": 0,
                "spark_python_task": {
                    "python_file": "dbx_named_params/jobs/sample/entrypoint.py"
                }
            },
            {
                "name": "named-props-existing-cluster-name",
                "existing_cluster_name": "some-cluster",
                "libraries": [],
                "max_retries": 0,
                "spark_python_task": {
                    "python_file": "dbx_named_params/jobs/sample/entrypoint.py"
                }
            },
            {
                "name": "named-props-instance-profile-name",
                "new_cluster": {
                    "aws_attributes": {
                        "instance_profile_name": "some-instance-profile-name"
                    }
                },
                "libraries": [],
                "max_retries": 0,
                "spark_python_task": {
                    "python_file": "dbx_named_params/jobs/sample/entrypoint.py"
                }
            }
        ]
    }
}
```
```yaml
environments:
  default:
    jobs:
      - name: "named-props-instance-pool-name"
        new_cluster:
          instance_pool_name: "some-instance-pool-name"
          driver_instance_pool_name: "some-instance-pool-name"
      - name: "named-props-existing-cluster-name"
        existing_cluster_name: "some-cluster"
      - name: "named-props-instance-profile-name"
        new_cluster:
          aws_attributes:
            instance_profile_name: "some-instance-profile-name"
```
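With such a deployment file in place, no extra steps are needed: run `dbx deploy` as usual, and the name-based properties are resolved automatically during deployment.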
**Note:** Named properties are also supported for Jobs API 2.1; simply provide them on the `new_cluster` level.
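For Jobs API 2.1 that could look as follows; this is a hypothetical multitask layout, and the job name, task key, and file path are illustrative:

```yaml
environments:
  default:
    jobs:
      - name: "named-props-multitask"    # illustrative job name
        tasks:
          - task_key: "sample-task"      # illustrative task key
            new_cluster:
              # resolved to instance_pool_id automatically
              instance_pool_name: "some-instance-pool-name"
            spark_python_task:
              python_file: "dbx_named_params/jobs/sample/entrypoint.py"
```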