- The `multi_asset_sensor` (experimental) now accepts an `AssetSelection` of assets to monitor. There are also minor API updates for the multi-asset sensor context.
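  A minimal sketch of the selection-based form, assuming a `monitored_assets` parameter on the decorator; the asset keys and job below are hypothetical:

  ```python
  from dagster import AssetSelection, RunRequest, job, multi_asset_sensor, op

  @op
  def refresh():
      ...

  @job
  def downstream_job():
      refresh()

  # Hypothetical asset keys; monitored_assets is assumed to accept an AssetSelection.
  @multi_asset_sensor(
      monitored_assets=AssetSelection.keys("raw_orders", "raw_users"),
      job=downstream_job,
  )
  def orders_users_sensor(context):
      # Request a run once every monitored asset has a new materialization.
      records = context.latest_materialization_records_by_key()
      if all(records.values()):
          context.advance_all_cursors()
          return RunRequest()
  ```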
- `AssetValueLoader`, the type returned by `RepositoryDefinition.get_asset_value_loader`, is now part of Dagster’s public API.
- `RepositoryDefinition.load_asset_value` and `AssetValueLoader.load_asset_value` now support a `partition_key` argument.
- `RepositoryDefinition.load_asset_value` and `AssetValueLoader.load_asset_value` now work with I/O managers that invoke `context.upstream_output.asset_key`.
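  For example, a sketch with a hypothetical repository and asset keys:

  ```python
  from dagster import AssetKey

  # `my_repo` is a hypothetical RepositoryDefinition containing partitioned assets.
  value = my_repo.load_asset_value(AssetKey("daily_orders"), partition_key="2022-10-01")

  # When loading several values, reuse one AssetValueLoader so I/O manager
  # resources are initialized only once.
  with my_repo.get_asset_value_loader() as loader:
      orders = loader.load_asset_value(AssetKey("daily_orders"), partition_key="2022-10-01")
      users = loader.load_asset_value(AssetKey("daily_users"), partition_key="2022-10-01")
  ```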
- When running Dagster locally, the default amount of time that the system waits when importing user code has been increased from 60 seconds to 180 seconds, to avoid false positives when importing code with heavy dependencies or large numbers of assets. This timeout can be configured in `dagster.yaml` as follows:

  ```yaml
  code_servers:
    local_startup_timeout: 120
  ```
- [dagit] The “Status” section has been renamed to “Deployment”, to better reflect that this section of the app shows deployment-wide information.
- [dagit] When viewing the compute logs for a run and choosing a step to filter on, there is now a search input to make it easier to find the step you’re looking for.
- [dagster-aws] The EcsRunLauncher can now launch runs in ECS clusters using both Fargate and EC2 capacity providers. See the Deploying to ECS docs for more information.
- [dagster-airbyte] Added the `load_assets_from_airbyte_instance` function, which automatically generates asset definitions from an Airbyte instance. For more details, see the new Airbyte integration guide.
- [dagster-airflow] Added the `DagsterCloudOperator` and `DagsterOperator`, Airflow operators that enable orchestrating Dagster jobs, running on either Dagster Cloud or OSS Dagit instances, from Apache Airflow.
- Fixed a bug where if resource initialization failed for a dynamic op, causing other dynamic steps to be skipped, those skipped dynamic steps would be ignored when retrying from failure.
- Previously, some invocations within the Dagster framework would result in warnings about deprecated metadata APIs. Now, users should only see warnings if their code uses deprecated metadata APIs.
- The way the daemon process manages its understanding of user code artifacts has been reworked to reduce memory consumption.
- [dagit] The partition selection UI in the Asset Materialization modal now allows for mouse selection and matches the UI used for partitioned op jobs.
- [dagit] Sidebars in Dagit shrink more gracefully on small screens where headers and labels need to be truncated.
- [dagit] Improved performance for loading runs with >10,000 logs
- [dagster-airbyte] Previously, the `port` configuration in the `airbyte_resource` was marked as not required, but if it was not supplied, an error would occur. It is now marked as required.
- [dagster-dbt] A change made to the manifest.json schema in dbt 1.3 would result in an error when using `load_assets_from_dbt_project` or `load_assets_from_manifest_json`. This has been fixed.
- [dagster-postgres] Connections that fail due to `sqlalchemy.exc.TimeoutError` are now retried.
- [dagster-aws] The `redshift_resource` no longer accepts a `schema` configuration parameter. Previously, this parameter would error whenever used, because Redshift connections do not support it.
- We now reference the correct method in the "loading asset values outside of Dagster runs" example (thank you Peter A. I. Forsyth!)
- We now reference the correct test directory in the “Create a New Project” documentation (thank you Peter A. I. Forsyth!)
- [dagster-pyspark] dagster-pyspark now contains a `LazyPysparkResource` that only initializes a Spark session once it’s accessed (thank you @zyd14!)
- The new `build_asset_reconciliation_sensor` function accepts a set of software-defined assets and returns a sensor that automatically materializes those assets after their parents are materialized.
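  For example, a minimal sketch (the asset group name is hypothetical):

  ```python
  from dagster import AssetSelection, build_asset_reconciliation_sensor

  # Materialize every asset in the "staging" group whenever its parents
  # have newer materializations.
  staging_reconciliation_sensor = build_asset_reconciliation_sensor(
      asset_selection=AssetSelection.groups("staging"),
      name="staging_reconciliation_sensor",
  )
  ```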
- [dagit] A new "groups-only" asset graph feature flag allows you to zoom way out on the global asset graph, collapsing asset groups into smaller nodes you can double-click to expand.
- `RepositoryDefinition` now exposes a `load_asset_value` method, which accepts an asset key and invokes the asset’s I/O manager’s `load_input` function to load the asset as a Python object. This can be used in notebooks to do exploratory data analysis on assets.
- Methods to fetch a list of partition keys from an input/output `PartitionKeyRange` now exist on the op execution context and input/output context.
- [dagit] On the Instance Overview page, batched runs in the run timeline view will now proportionally reflect the status of the runs in the batch instead of reducing all run statuses to a single color.
- [dagster-dbt][dagster-snowflake] You can now use the Snowflake IO manager with dbt assets, which allows them to be loaded from Snowflake into Pandas DataFrames in downstream steps.
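  A rough sketch of the wiring, assuming `build_snowflake_io_manager` with the Pandas type handler; the project path and credentials are hypothetical:

  ```python
  from dagster import with_resources
  from dagster_dbt import dbt_cli_resource, load_assets_from_dbt_project
  from dagster_snowflake import build_snowflake_io_manager
  from dagster_snowflake_pandas import SnowflakePandasTypeHandler

  # An I/O manager that loads Snowflake tables as Pandas DataFrames.
  snowflake_io_manager = build_snowflake_io_manager([SnowflakePandasTypeHandler()])

  dbt_assets = with_resources(
      load_assets_from_dbt_project(project_dir="path/to/dbt_project"),
      {
          "dbt": dbt_cli_resource.configured({"project_dir": "path/to/dbt_project"}),
          "io_manager": snowflake_io_manager.configured(
              {
                  "account": "my_account",
                  "user": "my_user",
                  "password": {"env": "SNOWFLAKE_PASSWORD"},
                  "database": "MY_DB",
                  "warehouse": "MY_WAREHOUSE",
              }
          ),
      },
  )
  ```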
- The dagster package’s pin of the alembic package is now much less restrictive.
- When using threads, the sensor daemon will no longer evaluate the next tick for a sensor if the previous tick is still in flight. This resolves a memory leak in the daemon process.
- The scheduler will no longer remove tracked state for automatically running schedules when they are absent due to a workspace load error.
- The way user code servers manage repository definitions has been changed to more efficiently serve requests.
- The `@multi_asset` decorator now respects its `config_schema` parameter.
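  For example, a sketch with hypothetical asset names and config key:

  ```python
  from dagster import AssetOut, multi_asset

  @multi_asset(
      outs={"scores": AssetOut(), "report": AssetOut()},
      config_schema={"threshold": float},
  )
  def scores_and_report(context):
      # The supplied config_schema is now enforced and readable at runtime.
      threshold = context.op_config["threshold"]
      return [threshold], {"threshold_used": threshold}
  ```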
- [dagit] Config supplied to `define_asset_job` is now prefilled in the modal that pops up when you click the Materialize button on an asset job page, so you can quickly adjust the defaults.
- [dagster-dbt] Previously, `DagsterDbtCliError`s produced from the dagster-dbt library would contain large serialized objects representing the raw unparsed logs from the relevant CLI command. Now, these messages contain only the parsed version of those logs.
- Fixed an issue where the `deploy_ecs` example didn’t work when built and deployed on an M1 Mac.
- [dagster-fivetran] The `resync_parameters` configuration on the `fivetran_resync_op` is now optional, enabling triggering of historical re-syncs for connectors. Thanks @dwallace0723!
- Improved API documentation for the Snowflake resource.
- Run status sensors can now monitor all runs in a Dagster instance, rather than just runs from jobs within a single repository. You can enable this behavior by setting `monitor_all_repositories=True` in the run status sensor decorator.
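  For example, a minimal sketch:

  ```python
  from dagster import DagsterRunStatus, run_status_sensor

  # monitor_all_repositories=True widens the sensor to every run on the
  # instance rather than only runs from this sensor's repository.
  @run_status_sensor(
      run_status=DagsterRunStatus.FAILURE,
      monitor_all_repositories=True,
  )
  def any_run_failure_sensor(context):
      # React to the failed run, e.g. page an on-call rotation (hypothetical).
      failed_run_id = context.dagster_run.run_id
      ...
  ```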
- The `run_key` argument on `RunRequest` and `run_request_for_partition` is now optional.
- [dagster-databricks] A new `verbose_logs` config option on the `databricks_pyspark_step_launcher` makes it possible to silence non-critical logs from your external steps, which can be helpful for long-running or highly parallel operations (thanks @zyd14!)
- [dagit] It is now possible to delete a run in Dagit directly from the run page. The option is available in the dropdown menu on the top right of the page.
- [dagit] The run timeline on the Workspace Overview page in Dagit now includes ad hoc asset materialization runs.
- Fixed a set of bugs in `multi_asset_sensor` where the cursor would fail to update, and materializations would be returned out of order for `latest_materialization_records_by_partition`.
- Fixed a bug that caused failures in runs with time-partitioned asset dependencies when the `PartitionsDefinition` had an offset that wasn’t included in the date format, e.g. a daily-partitioned asset with an hour offset whose date format was `%Y-%m-%d`.
- An issue causing code loaded by file path to import repeatedly has been resolved.
- To align with best practices, singleton comparisons throughout the codebase have been converted from (e.g.) `foo == None` to `foo is None` (thanks @chrisRedwine!).
- [dagit] In backfill jobs, the “Partition Set” column would sometimes show an internal `__ASSET_JOB` name, rather than a comprehensible set of asset keys. This has been fixed.
- [dagit] It is now possible to collapse all Asset Observation rows on the Asset Details page.
- [dagster-dbt] Fixed an issue that would cause an error when loading assets from dbt projects in which a source had a “*” character in its name (e.g. BigQuery sharded tables)
- [dagster-k8s] Fixed an issue where the `k8s_job_op` would sometimes fail if the Kubernetes job that it creates takes a long time to create a pod.
- Fixed an issue where links to the compute logs for a run would sometimes fail to load.
- [dagster-k8s] The `k8s_job_executor` now uses environment variables in place of CLI arguments to avoid limits on argument size with large dynamic jobs.
- Added docs explaining how to subset graph-backed assets. You can use this feature by following the documentation here.
- UI updated to reflect separate version schemes for mature core Dagster packages and less mature integration libraries
- The `multi_asset_sensor` (experimental) now has improved capabilities to monitor asset partitions via a `latest_materialization_records_by_partition` method.
- Performance improvements for the Partitions page in Dagit.
- Fixed a bug that caused the `op_config` argument of `dagstermill.get_context` to be ignored
- Fixed a bug that caused errors when loading the asset details page for assets with time-window partitions definitions
- Fixed a bug where assets sometimes didn’t appear in the Asset Catalog while in Folder view.
- [dagit] Opening the asset lineage tab no longer scrolls the page header off screen in some scenarios
- [dagit] The asset lineage tab no longer attempts to materialize source assets included in the upstream / downstream views.
- [dagit] The Instance page Run Timeline no longer commingles runs with the same job name in different repositories
- [dagit] Emitting materializations with JSON metadata that cannot be parsed as JSON no longer crashes the run details page
- [dagit] Viewing the assets related to a run no longer shows the same assets multiple times in some scenarios
- [dagster-k8s] Fixed a bug with timeouts causing errors in `k8s_job_op`
- [dagster-docker] Fixed a bug with op retries causing errors with the `docker_executor`
- [dagster-aws] Thanks @Vivanov98 for adding the `list_objects` method to `S3FakeSession`!
- [dagster-airbyte] Added an experimental function to automatically generate Airbyte assets from project YAML files. For more information, see the dagster-airbyte docs.
- [dagster-airbyte] Added the `forward_logs` option to `AirbyteResource`, allowing users to disable forwarding of Airbyte logs to the compute log, which can be expensive for long-running syncs.
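  For example (connection details are hypothetical):

  ```python
  from dagster_airbyte import airbyte_resource

  # Disable log forwarding for long-running syncs where streaming Airbyte's
  # logs into Dagster's compute logs is too expensive.
  my_airbyte = airbyte_resource.configured(
      {"host": "localhost", "port": "8000", "forward_logs": False}
  )
  ```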
- [dagster-airbyte] Added the ability to generate Airbyte assets for basic normalization tables generated as part of a sync.
- With the new `cron_schedule` argument to `TimeWindowPartitionsDefinition`, you can now supply arbitrary cron expressions to define time window-based partition sets.
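  For example, a sketch defining one partition per weekday morning (the asset is hypothetical):

  ```python
  from dagster import TimeWindowPartitionsDefinition, asset

  # One partition per weekday at 9:00 AM.
  weekday_9am_partitions = TimeWindowPartitionsDefinition(
      cron_schedule="0 9 * * 1-5",
      start="2022-10-03-09:00",
      fmt="%Y-%m-%d-%H:%M",
  )

  @asset(partitions_def=weekday_9am_partitions)
  def weekday_report(context):
      return context.partition_key
  ```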
- Graph-backed assets can now be subsetted for execution via `AssetsDefinition.from_graph(my_graph, can_subset=True)`.
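  A minimal sketch of a subsettable graph-backed asset pair (op and graph names are hypothetical):

  ```python
  from dagster import AssetsDefinition, GraphOut, graph, op

  @op
  def fetch_raw():
      return [1, 2, 3]

  @op
  def transform(raw):
      return [x + 1 for x in raw]

  @op
  def summarize(raw):
      return sum(raw)

  @graph(out={"transformed": GraphOut(), "summary": GraphOut()})
  def raw_pipeline():
      raw = fetch_raw()
      return {"transformed": transform(raw), "summary": summarize(raw)}

  # can_subset=True allows materializing "transformed" without "summary"
  # (and vice versa), since they are produced by separate ops.
  raw_assets = AssetsDefinition.from_graph(raw_pipeline, can_subset=True)
  ```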
- `RunsFilter` is now exported in the public API.
- [dagster-k8s] The `dagster-user-deployments.deployments[].schedulerName` Helm value for specifying custom Kubernetes schedulers will now also apply to run and step workers launched for the given user deployment. Previously, it would only apply to the gRPC server.
- In some situations, default asset config was ignored when a subset of assets were selected for execution. This has been fixed.
- Added a pin to `grpcio` in dagster to address an issue with the recent 1.48.1 grpcio release that was sometimes causing Dagster code servers to hang.
- Fixed an issue where the “Latest run” column on the Instance Status page sometimes displayed an older run instead of the most recent run.
- In addition to a single cron string, `cron_schedule` now also accepts a sequence of cron strings. If a sequence is provided, the schedule will run for the union of all execution times for the provided cron strings, e.g. `['45 23 * * 6', '30 9 * * 0']` for a schedule that runs at 11:45 PM every Saturday and 9:30 AM every Sunday. Thanks @erinov1!
- Added an optional boolean config `install_default_libraries` to `databricks_pyspark_step_launcher`. It allows running Databricks jobs without installing the default Dagster libraries. Thanks @nvinhphuc!
- [dagster-k8s] Added additional configuration fields (`container_config`, `pod_template_spec_metadata`, `pod_spec_config`, `job_metadata`, and `job_spec_config`) to the experimental `k8s_job_op` that can be used to add additional configuration to the Kubernetes pod that is launched within the op.
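  For example, a sketch using two of the new fields (the image and values are hypothetical):

  ```python
  from dagster_k8s import k8s_job_op

  hello_op = k8s_job_op.configured(
      {
          "image": "busybox",
          "command": ["/bin/sh", "-c"],
          "args": ["echo HELLO"],
          # New in this release: extra Kubernetes configuration for the
          # launched container and pod spec.
          "container_config": {
              "resources": {"limits": {"cpu": "500m", "memory": "1Gi"}}
          },
          "pod_spec_config": {"node_selector": {"disktype": "ssd"}},
      },
      name="hello_op",
  )
  ```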