- [dagster-k8s] Added an
includeConfigInLaunchedRuns flag to the Helm chart that can be used to automatically include configmaps, secrets, and volumes in any runs launched from code in a user code deployment. See https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm#configure-your-user-deployment for more information. - [dagit] Improved display of configuration yaml throughout Dagit, including better syntax highlighting and the addition of line numbers.
- The GraphQL input argument type
BackfillParams (used for launching backfills), now has an allPartitions boolean flag, which can be used instead of specifying all the individual partition names. - Removed
gevent and gevent-websocket dependencies from dagster-graphql - Memoization is now supported while using step selection
- Cleaned up various warnings across the project
- The default IO Managers now support asset partitions
Fixed sqlite3.OperationalError error when viewing schedules/sensors pages in Dagit. This was affecting dagit instances using the default SQLite schedule storage with a SQLite version < 3.25.0.
Fixed an issues where schedules and sensors would sometimes fail to run when the daemon and dagit were running in different Python environments.
Fixed an exception when the telemetry file is empty
fixed a bug with @graph composition which would cause the wrong input definition to be used for type checks
[dagit] For users running Dagit with --path-prefix, large DAGs failed to render due to a WebWorker error, and the user would see an endless spinner instead. This has been fixed.
[dagit] Fixed a rendering bug in partition set selector dropdown on Launchpad.
[dagit] Fixed the ‘View Assets’ link in Job headers
Fixed an issue where root input managers with resource dependencies would not work with software defined assets
dagster-census is a new library that includes a census_resource for interacting the Census REST API, census_trigger_sync_op for triggering a sync and registering an asset once it has finished, and a CensusOutput type. Thanks @dehume!- Docs fix. Thanks @ascrookes!
- Added a parameter in
dagster.yaml that can be used to increase the time that Dagster waits when spinning up a gRPC server before timing out. For more information, see https://docs.dagster.io/deployment/dagster-instance#code-servers. - Added a new graphQL field
assetMaterializations that can be queried off of a DagsterRun field. You can use this field to fetch the set of asset materialization events generated in a given run within a GraphQL query. - Docstrings on functions decorated with the
@resource decorator will now be used as resource descriptions, if no description is explicitly provided. - You can now point
dagit -m or dagit -f at a module or file that has asset definitions but no jobs or asset groups, and all the asset definitions will be loaded into Dagit. AssetGroup now has a materialize method which executes an in-process run to materialize all the assets in the group.AssetGroups can now contain assets with different partition_defs.- Asset materializations produced by the default asset IO manager,
fs_asset_io_manager, now include the path of the file where the values were saved. - You can now disable the
max_concurrent_runs limit on the QueuedRunCoordinator by setting it to -1. Use this if you only want to limit runs using tag_concurrency_limits. - [dagit] Asset graphs are now rendered asynchronously, which means that Dagit will no longer freeze when rendering a large asset graph.
- [dagit] When viewing an asset graph, you can now double-click on an asset to zoom in, and you can use arrow keys to navigate between selected assets.
- [dagit] The “show whitespace” setting in the Launchpad is now persistent.
- [dagit] A bulk selection checkbox has been added to the repository filter in navigation or Instance Overview.
- [dagit] A “Copy config” button has been added to the run configuration dialog on Run pages.
- [dagit] An “Open in Launchpad” button has been added to the run details page.
- [dagit] The Run page now surfaces more information about start time and elapsed time in the header.
- [dagster-dbt] The dbt_cloud_resource has a new
get_runs() function to get a list of runs matching certain paramters from the dbt Cloud API (thanks @kstennettlull!) - [dagster-snowflake] Added an
authenticator field to the connection arguments for the snowflake_resource (thanks @swotai!). - [celery-docker] The celery docker executor has a new configuration entry
container_kwargs that allows you to specify additional arguments to pass to your docker containers when they are run.
- Fixed an issue where loading a Dagster repository would fail if it included a function to lazily load a job, instead of a JobDefinition.
- Fixed an issue where trying to stop an unloadable schedule or sensor within Dagit would fail with an error.
- Fixed telemetry contention bug on windows when running the daemon.
- [dagit] Fixed a bug where the Dagit homepage would claim that no jobs or pipelines had been loaded, even though jobs appeared in the sidebar.
- [dagit] When filtering runs by tag, tag values that contained the
: character would fail to parse correctly, and filtering would therefore fail. This has been fixed. - [dagster-dbt] When running the “build” command using the dbt_cli_resource, the run_results.json file will no longer be ignored, allowing asset materializations to be produced from the resulting output.
- [dagster-airbyte] Responses from the Airbyte API with a 204 status code (like you would get from /connections/delete) will no longer produce raise an error (thanks @HAMZA310!)
- [dagster-shell] Fixed a bug where shell ops would not inherit environment variables if any environment variables were added for ops (thanks @kbd!)
- [dagster-postgres] usernames are now urlqouted in addition to passwords
- The MySQL storage implementations for Dagster storage is no longer marked as experimental.
run_id can now be provided as an argument to execute_in_process.- The text on
dagit’s empty state no longer mentions the legacy concept “Pipelines”. - Now, within the
IOManager.load_input method, you can add input metadata via InputContext.add_input_metadata. These metadata entries will appear on the LOADED_INPUT event and if the input is an asset, be attached to an AssetObservation. This metadata is viewable in dagit.
- Fixed a set of bugs where schedules and sensors would get out of sync between
dagit and dagster-daemon processes. This would manifest in schedules / sensors getting marked as “Unloadable” in dagit, and ticks not being registered correctly. The fix involves changing how Dagster stores schedule/sensor state and requires a schema change using the CLI command dagster instance migrate. Users who are not running into this class of bugs may consider the migration optional. root_input_manager can now be specified without a context argument.- Fixed a bug that prevented
root_input_manager from being used with VersionStrategy. - Fixed a race condition between daemon and
dagit writing to the same telemetry logs. - [dagit] In
dagit, using the “Open in Launchpad” feature for a run could cause server errors if the run configuration yaml was too long. Runs can now be opened from this feature regardless of config length. - [dagit] On the Instance Overview page in
dagit, runs in the timeline view sometimes showed incorrect end times, especially batches that included in-progress runs. This has been fixed. - [dagit] In the
dagit launchpad, reloading a repository should present the user with an option to refresh config that may have become stale. This feature was broken for jobs without partition sets, and has now been fixed. - Fixed issue where passing a stdlib
typing type as dagster_type to input and output definition was incorrectly being rejected. - [dagster-airbyte] Fixed issue where AssetMaterialization events would not be generated for streams that had no updated records for a given sync.
- [dagster-dbt] Fixed issue where including multiple sets of dbt assets in a single repository could cause a conflict with the names of the underlying ops.
- [helm] Added configuration to explicitly enable or disable telemetry.
- Added a new IO manager for materializing assets to Azure ADLS. You can specify this IO manager for your AssetGroups by using the following config:
`from dagster import AssetGroup
from dagster_azure import adls2_pickle_asset_io_manager, adls2_resource
asset_group = AssetGroup(
[upstream_asset, downstream_asset],
resource_defs={"io_manager": adls2_pickle_asset_io_manager, "adls2": adls2_resource}
)`
- Added ability to set a custom start time for partitions when using
@hourly_partitioned_config , @daily_partitioned_config, @weekly_partitioned_config, and @monthly_partitioned_config - Run configs generated from partitions can be retrieved using the
PartitionedConfig.get_run_config_for_partition_key function. This will allow the use of the validate_run_config function in unit tests. - [dagit] If a run is re-executed from failure, and the run fails again, the default action will be to re-execute from the point of failure, rather than to re-execute the entire job.
PartitionedConfig now takes an argument tags_for_partition_fn which allows for custom run tags for a given partition.
- Fixed a bug in the message for reporting Kubernetes run worker failures
- [dagit] Fixed issue where re-executing a run that materialized a single asset could end up re-executing all steps in the job.
- [dagit] Fixed issue where the health of an asset’s partitions would not always be up to date in certain views.
- [dagit] Fixed issue where the “Materialize All” button would be greyed out if a job had SourceAssets defined.
- Updated resource docs to reference “ops” instead of “solids” (thanks @joe-hdai!)
- Fixed formatting issues in the ECS docs
- Added IO manager for materializing assets to GCS. You can specify the GCS asset IO manager by using the following config for
resource_defs in AssetGroup:
`from dagster import AssetGroup, gcs_pickle_asset_io_manager, gcs_resource
asset_group = AssetGroup(
[upstream_asset, downstream_asset],
resource_defs={"io_manager": gcs_pickle_asset_io_manager, "gcs": gcs_resource}
)`
- Improved the performance of storage queries run by the sensor daemon to enforce the idempotency of run keys. This should reduce the database CPU when evaluating sensors with a large volume of run requests with run keys that repeat across evaluations.
- [dagit] Added information on sensor ticks to show when a sensor has requested runs that did not result in the creation of a new run due to the enforcement of idempotency using run keys.
- [k8s] Run and step workers are now labeled with the Dagster run id that they are currently handling.
- If a step launched with a StepLauncher encounters an exception, that exception / stack trace will now appear in the event log.
- Fixed a race condition where canceled backfills would resume under certain conditions.
- Fixed an issue where exceptions that were raised during sensor and schedule execution didn’t always show a stack trace in Dagit.
- During execution, dependencies will now resolve correctly for certain dynamic graph structures that were previously resolving incorrectly.
- When using the forkserver start_method on the multiprocess executor, preload_modules have been adjusted to prevent libraries that change namedtuple serialization from causing unexpected exceptions.
- Fixed a naming collision between dagster decorators and submodules that sometimes interfered with static type checkers (e.g. pyright).
- [dagit] postgres database connection management has improved when watching actively executing runs
- [dagster-databricks] The databricks_pyspark_step_launcher now supports steps with RetryPolicies defined, as well as
RetryRequested exceptions.
- Docs spelling fixes - thanks @antquinonez!