Changelog

0.15.5

New

  • Added documentation and helm chart configuration for threaded sensor evaluations.
  • Added documentation and helm chart configuration for tick retention policies.
  • Added descriptions for default config schema. Fields like execution, loggers, ops, and resources are now documented.
  • UnresolvedAssetJob objects can now be passed to run status sensors.
  • [dagit] A new global asset lineage view, linked from the Asset Catalog and Asset Group pages, allows you to view a graph of assets in all loaded asset groups and filter by query selector and repo.
  • [dagit] A new option on Asset Lineage pages allows you to choose how many layers of the upstream / downstream graph to display.
  • [dagit] Dagit's DAG view now collapses large sets of edges between the same ops for improved readability and rendering performance.

Bugfixes

  • Fixed a bug with materialize that would cause required resources to not be applied correctly.
  • Fixed issue that caused repositories to fail to load when build_schedule_from_partitioned_job and define_asset_job were used together.
  • Fixed a bug that caused auto run retries to always use the FROM_FAILURE strategy.
  • Previously, it was possible to construct Software-Defined Assets from graphs whose leaf ops were not mapped to assets. This is invalid, as these ops are not required for the production of any assets, and would cause confusing behavior or errors on execution. This will now result in an error at definition time, as intended.
  • Fixed issue where the run monitoring daemon could mark completed runs as failed if they transitioned quickly between STARTING and SUCCESS status.
  • Fixed stability issues with the sensor daemon introduced in 0.15.3 that caused the daemon to fail heartbeat checks if the sensor evaluation took too long.
  • Fixed issues with the thread pool implementation of the sensor daemon where race conditions caused the sensor to fire more frequently than the minimum interval.
  • Fixed an issue with storage implementations using MySQL server version 5.6 which caused SQL syntax exceptions to surface when rendering the Instance overview pages in Dagit.
  • Fixed a bug with the default_executor_def argument on repository where asset jobs that defined executor config would result in errors.
  • Fixed a bug where an erroneous exception would be raised if an empty list was returned for a list output of an op.
  • [dagit] Clicking the "Materialize" button for assets with configurable resources will now present the asset launchpad.
  • [dagit] If you have an asset group and no jobs, Dagit will display it by default rather than directing you to the asset catalog.
  • [dagit] DAG renderings of software-defined assets now display only the last component of the asset's key for improved readability.
  • [dagit] Fixes a regression where clicking on a source asset would trigger a GraphQL error.
  • [dagit] Fixed an issue where the “Unloadable” section on the sensors / schedules pages in Dagit was populated erroneously with loadable sensors and schedules.
  • [dagster-dbt] Fixed an issue where an exception would be raised when using the dbt build command with Software-Defined Assets if a test was defined on a source.

Deprecations

  • Removed the deprecated dagster-daemon health-check CLI command

Community Contributions

  • TimeWindow is now exported from the dagster package (Thanks @nvinhphuc!)
  • Added a fix to allow customization of Slack messages (Thanks @solarisa21!)
  • [dagster-databricks] The databricks_pyspark_step_launcher now allows you to configure the following (Thanks @Phazure!):
    • the aws_attributes of the cluster that will be spun up for the step.
    • arbitrary environment variables to be copied over to Databricks from the host machine, rather than requiring these variables to be stored as secrets.
    • job and cluster permissions, allowing users to view the completed runs through the Databricks console, even if they’re kicked off by a service account.

Experimental

  • [dagster-k8s] Added k8s_job_op to launch a Kubernetes Job with an arbitrary image and CLI command. This is in contrast with the k8s_job_executor, which runs each Dagster op in a Dagster job in its own k8s job. This op may be useful when you need to orchestrate a command that isn't a Dagster op (or isn't written in Python). Usage:

    from dagster_k8s import k8s_job_op

    # Run `echo HELLO` in a busybox container as a standalone Kubernetes Job.
    my_k8s_op = k8s_job_op.configured(
        {
            "image": "busybox",
            "command": ["/bin/sh", "-c"],
            "args": ["echo HELLO"],
        },
        name="my_k8s_op",
    )
    
  • [dagster-dbt] The dbt asset-loading functions now support partitions_def and partition_key_to_vars_fn parameters, adding preliminary support for partitioned dbt assets. To learn more, check out the GitHub issue!
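
As a rough illustration of the partition_key_to_vars_fn parameter above, the sketch below shows one possible shape for such a function. It assumes daily partition keys formatted as YYYY-MM-DD and a dbt variable named run_date; both are illustrative choices, not taken from the release notes.

```python
from datetime import datetime
from typing import Any, Dict


def partition_key_to_vars(partition_key: str) -> Dict[str, Any]:
    """Map a partition key to the vars handed to dbt for that partition."""
    # Validate the assumed YYYY-MM-DD format early so a bad key fails loudly.
    datetime.strptime(partition_key, "%Y-%m-%d")
    # "run_date" is a hypothetical dbt variable name chosen for this example.
    return {"run_date": partition_key}


print(partition_key_to_vars("2022-06-01"))
```

A function like this would be passed as partition_key_to_vars_fn together with a matching partitions_def, and the dbt models could then read the value with var('run_date').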

0.15.4

  • Reverted sensor threadpool changes from 0.15.3 to address daemon stability issues.

0.15.3

New

  • When loading an upstream asset or op output as an input, you can now set custom loading behavior using the input_manager_key argument to AssetIn and In
  • The list of objects returned by a repository can now contain nested lists.
  • Added a data retention instance setting in dagster.yaml that enables the automatic removal of sensor/schedule ticks after a certain number of days.
  • Added a sensor daemon setting in dagster.yaml that enables sensor evaluations to happen in a thread pool to increase throughput.
  • materialize_to_memory and materialize now both have the partition_key argument.
  • Output and DynamicOutput objects now work with deep equality checks:

    Output(value=5, name="foo") == Output(value=5, name="foo")  # evaluates to True

  • RunRequests can now be returned from run status sensors
  • Added resource_defs argument to AssetsDefinition.from_graph. Allows for specifying resources required by constituent ops directly on the asset.
  • When adding a tag to the Run search filter in Dagit by clicking the hover menu on the tag, the tag will now be appended to the filter instead of replacing the entire filter state.
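
To illustrate the nested-list change above: helper functions that each return a list of definitions can now be placed directly in the list a repository returns. The sketch below is a standalone model of that flattening, not Dagster's implementation, and the string names are stand-ins for real job, schedule, and asset definitions.

```python
from typing import Any, Iterator, List


def flatten(items: List[Any]) -> Iterator[Any]:
    """Yield every non-list element of an arbitrarily nested list, in order."""
    for item in items:
        if isinstance(item, list):
            # Recurse into nested lists so any depth of nesting is handled.
            yield from flatten(item)
        else:
            yield item


# Hypothetical helpers, each returning a list of definitions (strings here).
def ingestion_definitions() -> List[Any]:
    return ["ingest_job", ["raw_orders_asset", "raw_customers_asset"]]


def reporting_definitions() -> List[Any]:
    return ["report_job", "daily_report_schedule"]


repository_contents = [ingestion_definitions(), reporting_definitions(), "adhoc_job"]
flat = list(flatten(repository_contents))
print(flat)
```

The practical effect is that per-team or per-module helper lists no longer need to be unpacked by hand before being returned.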

Bugfixes

  • [dagster-dbt] An exception is now emitted if you attempt to invoke the library without having dbt-core installed. dbt-core is now also added as a dependency to the library.
  • Asset group names can now contain reserved Python keywords.
  • Fixed a run config parsing bug that was introduced in 0.15.1 that caused Dagit to interpret datetime strings as datetime objects and octal strings as integers.
  • Runs that have failed to start are now represented in the Instance Timeline view on Dagit.
  • Fixed an issue where the partition status was missing for partitioned jobs that had no runs.
  • Fixed a bug where op/resource invocation would error when resources were required, no context was used in the body of the function, and no context was provided when invoking.
  • [dagster-databricks] Fixed an issue where an exception related to the deprecated prior_attempts_count field was raised when using the databricks_pyspark_step_launcher.
  • [dagster-databricks] Polling information logged from the databricks_pyspark_step_launcher is now emitted at the DEBUG level instead of INFO.
  • In the YAML editor in Dagit, the typeahead feature now correctly shows suggestions for nullable schema types.
  • When editing asset configuration in Dagit, the “Scaffold config” button in the Dagit launchpad sometimes showed the scaffold dialog beneath the launchpad. This has been fixed.
  • A recent change added execution timezones to some human-readable cron strings on schedules in Dagit. This was added incorrectly in some cases, and has now been fixed.
  • In the Dagit launchpad, a config state containing only empty newlines could lead to an error that could break the editor. This has been fixed.
  • Fixed issue that could cause partitioned graph-backed assets to attempt to load upstream inputs from the incorrect path when using the fs_io_manager (or other similar io managers).
  • [dagster-dbt] Fixed an issue where errors generated from issuing dbt CLI commands would only show JSON-formatted output, rather than parsed, human-readable output.
  • [dagster-dbt] By default, Dagster will invoke the dbt CLI with a --log-format json flag. In some cases, this may cause dbt to report incorrect or misleading error messages. As a workaround, it is now possible to disable this behavior by setting the json_log_format configuration option on the dbt_cli_resource to False.
  • materialize_to_memory erroneously allowed non-in-memory io managers to be used. Now, providing io managers to materialize_to_memory will result in an error, and mem_io_manager will be provided to all io manager keys.

0.15.2

Bugfixes

  • Fixed an issue where asset dependency resolution would break when two assets in the same group had the same name

0.15.1

New

  • When Dagster loads an event from the event log of a type that it doesn’t recognize (for example, because it was created by a newer version of Dagster) it will now return a placeholder event rather than raising an exception.
  • AssetsDefinition.from_graph() now accepts a group_name parameter. All assets created by from_graph are assigned to this group.
  • You can define an asset from an op via a new utility method AssetsDefinition.from_op. Dagster will infer asset inputs and outputs from the ins/outs defined on the @op in the same way as @graphs.
  • A default executor definition can be defined on a repository using the default_executor_def argument. The default executor definition will be used for all op/asset jobs that don’t explicitly define their own executor.
  • JobDefinition.run_request_for_partition now accepts a tags argument (Thanks @jburnich!)
  • In Dagit, the graph canvas now has a dotted background to help it stand out from the rest of the UI.
  • @multi_asset now accepts a resource_defs argument. The provided resources can be either used on the context, or satisfy the io manager requirements of the outs on the asset.
  • In Dagit, cron strings now show the execution timezone and use 12-hour or 24-hour time format depending on the user’s locale.
  • In Dagit, when viewing a run and selecting a specific step in the Gantt chart, the compute log selection state will now update to that step as well.
  • define_asset_job and to_job can now accept a partitions_def argument and a config argument at the same time, as long as the value for the config argument is a hardcoded config dictionary (not a PartitionedConfig or ConfigMapping).
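
The placeholder behavior for unrecognized event types described in the first item of this list can be pictured with the standalone sketch below. It is a simplified model under stated assumptions, not Dagster's event log code: the Event and UnknownEvent classes, the load_event function, and the JSON record layout are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Union

# Illustrative set of event types that this "older" reader understands.
KNOWN_EVENT_TYPES = {"STEP_START", "STEP_SUCCESS", "STEP_FAILURE"}


@dataclass(frozen=True)
class Event:
    event_type: str
    message: str


@dataclass(frozen=True)
class UnknownEvent:
    """Placeholder returned for event types this reader does not recognize."""

    raw_type: str
    message: str


def load_event(serialized: str) -> Union[Event, UnknownEvent]:
    record = json.loads(serialized)
    event_type = record["event_type"]
    message = record.get("message", "")
    if event_type in KNOWN_EVENT_TYPES:
        return Event(event_type, message)
    # Return a placeholder instead of raising, so the rest of the log still loads.
    return UnknownEvent(event_type, message)


known = load_event('{"event_type": "STEP_START", "message": "starting"}')
future = load_event('{"event_type": "SOME_FUTURE_TYPE", "message": "new in a later release"}')
print(known)
print(future)
```

Keeping the raw type on the placeholder lets a UI show that something happened even when it cannot render the details.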

Bugfixes

  • Fixed an issue where entering a string in the launchpad that is valid YAML but invalid JSON would render incorrectly in Dagit.
  • Fixed an issue where steps using the k8s_job_executor and docker_executor would sometimes return the same event lines twice in the command-line output for the step.
  • Fixed type annotations on the @op decorator (Thanks Milos Tomic!)
  • Fixed an issue where job backfills were not displayed correctly on the Partition view in Dagit.
  • UnresolvedAssetJobDefinition now supports the run_request_for_partition method.
  • Fixed an issue in Dagit where the Instance Overview page would briefly flash a loading state while loading fresh data.

Breaking Changes

  • Runs executed in newer versions of Dagster may produce errors when their event logs are loaded in older versions of Dagit, because new event types were recently added. Going forward, Dagit is more resilient when it encounters new event types.

Deprecations

  • Updated deprecation warnings to clarify that the deprecated metadata APIs will be removed in 0.16.0, not 0.15.0.

Experimental

  • If two assets are in the same group and the upstream asset has a multi-segment asset key, the downstream asset doesn’t need to specify the full asset key when declaring its dependency on the upstream asset - just the last segment.
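
The last-segment shorthand above can be modeled with a short standalone sketch. This is not Dagster's resolution code: the resolve_dependency function, the tuple representation of asset keys, and the policy of leaving ambiguous or unmatched names untouched are assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple

AssetKeyPath = Tuple[str, ...]


def resolve_dependency(
    dep: Sequence[str],
    downstream_group: str,
    groups_by_key: Dict[AssetKeyPath, str],
) -> AssetKeyPath:
    """Resolve a dependency name against the assets in the downstream asset's group."""
    dep_path = tuple(dep)
    # A fully qualified key that already exists is used as-is.
    if dep_path in groups_by_key:
        return dep_path
    # Otherwise, look for same-group assets whose last segment matches.
    candidates = [
        key
        for key, group in groups_by_key.items()
        if group == downstream_group and key[-1:] == dep_path
    ]
    if len(candidates) == 1:
        return candidates[0]
    # Assumed policy for this sketch: leave ambiguous or unmatched names untouched.
    return dep_path


# Hypothetical assets, mapped from multi-segment key to group name.
assets = {
    ("warehouse", "raw", "orders"): "sales",
    ("warehouse", "raw", "customers"): "sales",
    ("lake", "orders"): "finance",
}

resolved = resolve_dependency(["orders"], "sales", assets)
print(resolved)
```

Because matching is limited to the downstream asset's own group, the "orders" asset in the finance group does not interfere with the lookup from the sales group.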

Documentation

  • Added dedicated sections for op, graph, and job Concept docs in the sidenav
  • Moved graph documentation from the jobs docs into its own page
  • Added documentation for assigning asset groups and viewing them in Dagit
  • Added apidoc for AssetOut and AssetIn
  • Fixed a typo on the Run Configuration concept page (Thanks Wenshuai Hou!)
  • Updated screenshots in the software-defined assets tutorial to match the new Dagit UI
  • Fixed a typo in the Defining an asset section of the software-defined assets tutorial (Thanks Daniel Kim!)