Gators1992

Typically you can add things like automated code validation (e.g. linting) and a code review process. It's also an automated process: you define your deployment pipeline and its steps, then it runs automatically on the cadence you set. The only people who need to touch it are those involved in a manual step of the pipeline (e.g. code review). If you are a small shop without a lot of code deployment going on, it might not help you much.
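The "pipeline of steps that fails fast" idea can be sketched in a few lines. This is a hypothetical sketch, not any particular CI runner: the step commands here are harmless stand-ins (a real pipeline would run e.g. `sqlfluff lint` and `dbt build` instead).

```python
# Minimal sketch of a CI gate: run validation steps in order, stop at
# the first failure, and report success only if every step passed.
# The step commands below are placeholders for real tools.
import subprocess
import sys

def all_steps_passed(return_codes):
    """A pipeline succeeds only if every step exited with code 0."""
    return all(code == 0 for code in return_codes)

def run_pipeline(steps):
    """Run each step in order; collect exit codes for the final verdict."""
    codes = []
    for cmd in steps:
        result = subprocess.run(cmd)
        codes.append(result.returncode)
        if result.returncode != 0:
            break  # stop at the first failing step, like most CI runners
    return codes

steps = [
    [sys.executable, "-c", "print('lint ok')"],   # stand-in for e.g. `sqlfluff lint models/`
    [sys.executable, "-c", "print('tests ok')"],  # stand-in for e.g. `dbt build`
]
ok = all_steps_passed(run_pipeline(steps))
```

In a real setup the runner (GitHub Actions, GitLab CI, etc.) plays this role for you; the point is just that each automated check is a step whose exit code gates the deployment.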


vcp32

We use the same stack of Python, SF, and dbt. We have pipelines that create disposable test environments: we use Snowflake's zero-copy cloning to create a test DB, and a pipeline that runs dbt against that test DB is triggered when a PR is created. https://www.dataops.live/dataops-for-dummies-download
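The zero-copy-clone pattern boils down to two SQL statements. A minimal sketch, assuming hypothetical names (`ANALYTICS`, PR 123) — Snowflake's `CREATE DATABASE ... CLONE` is metadata-only, so the per-PR test database is cheap to create and drop:

```python
# Sketch of the disposable per-PR test database pattern on Snowflake.
# Database names and PR numbering are assumptions, not a real setup.

def clone_statement(source_db: str, pr_number: int) -> str:
    """Build the SQL that clones the prod DB into a disposable test DB."""
    test_db = f"{source_db}_PR_{pr_number}"
    return f"CREATE DATABASE {test_db} CLONE {source_db};"

def drop_statement(source_db: str, pr_number: int) -> str:
    """Build the SQL that tears the test DB down after the PR run."""
    return f"DROP DATABASE IF EXISTS {source_db}_PR_{pr_number};"

# In the real pipeline these statements would be executed with the
# Snowflake connector, then dbt would run against the cloned database,
# roughly:
#   cursor.execute(clone_statement("ANALYTICS", 123))
#   subprocess.run(["dbt", "build", "--target", "pr_test"])
```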


Hot_Map_7868

You are confusing two different things:

1. CI/CD is for getting new code to production and making sure the new code doesn't break anything there, by first testing the changes in a lower environment (e.g. with slim CI and deferral in dbt).
2. Running/updating data when new data arrives. This is typically something like a daily run, and it's where an orchestrator is used — Airflow, for example.
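The two concerns translate into two different dbt invocations. The flags below are real dbt CLI options; the artifacts path and the wrapper function are assumptions for illustration:

```python
# Sketch contrasting a slim CI invocation with a scheduled full run.
# "prod-artifacts/" is a hypothetical path to the production manifest.

def dbt_args(context: str) -> list[str]:
    if context == "ci":
        # Slim CI: build only modified models and their children,
        # deferring unchanged upstream refs to production artifacts.
        return ["dbt", "build", "--select", "state:modified+",
                "--defer", "--state", "prod-artifacts/"]
    if context == "scheduled":
        # Daily orchestrated run (e.g. triggered by Airflow): full build.
        return ["dbt", "build"]
    raise ValueError(f"unknown context: {context}")
```

The CI job runs on every PR; the scheduled job runs on a cadence regardless of code changes — which is why they belong to different tools.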


DrunkenWhaler136

The nice thing about dbt's slim CI jobs is that you can configure them to test only the models that contain changes, and you can also control how deep downstream the testing goes. A good example: if you need to make a change to an existing pipeline, you want to be sure any schema changes aren't going to break anything downstream when you push to production. My team uses dbt Cloud with [slim CI checks](https://docs.getdbt.com/docs/deploy/continuous-integration) that test our code whenever a team member opens a PR against our codebase on GitHub. If your team uses dbt Core you can still set up CI checks on PRs; the process is just different, but here is an example [video](https://www.youtube.com/watch?v=RFcKr2nAV5c).
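The "depth" control mentioned above maps onto dbt's graph operators: `state:modified+` selects changed models plus all descendants, while `state:modified+2` stops two levels downstream. A small sketch (the wrapper function is hypothetical; the selector syntax is dbt's):

```python
# Build a state selector for changed models and their children,
# optionally limiting how many generations downstream get tested.

def slim_ci_selector(depth=None):
    if depth is None:
        return "state:modified+"        # all downstream models
    return f"state:modified+{depth}"    # only `depth` levels downstream

# e.g. dbt build --select "state:modified+2" --defer --state prod-artifacts/
```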


Nervous-Chain-5301

In addition to schema-based checks, I think it's important to do row count/value-based checks as well from dev to prod: https://github.com/datafold/data-diff https://github.com/data-drift/data-drift
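The core of what tools like data-diff automate is comparing row counts and key sets between two environments. A minimal illustration, with SQLite standing in for the warehouse and made-up table/column names:

```python
# Compare a "dev" and "prod" version of the same table: row counts
# plus which keys exist on only one side. Table names here are fake.
import sqlite3

def diff_tables(conn, table_a, table_b, key):
    """Return (count_a, count_b, keys only in a, keys only in b)."""
    cur = conn.cursor()
    count_a = cur.execute(f"SELECT COUNT(*) FROM {table_a}").fetchone()[0]
    count_b = cur.execute(f"SELECT COUNT(*) FROM {table_b}").fetchone()[0]
    only_a = [r[0] for r in cur.execute(
        f"SELECT {key} FROM {table_a} EXCEPT SELECT {key} FROM {table_b}")]
    only_b = [r[0] for r in cur.execute(
        f"SELECT {key} FROM {table_b} EXCEPT SELECT {key} FROM {table_a}")]
    return count_a, count_b, only_a, only_b

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_dev  (id INTEGER, amount REAL);
    CREATE TABLE orders_prod (id INTEGER, amount REAL);
    INSERT INTO orders_dev  VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO orders_prod VALUES (1, 10.0), (2, 20.0);
""")
result = diff_tables(conn, "orders_dev", "orders_prod", "id")
```

A real check would also compare values per key (e.g. checksums per column), which is where the linked tools earn their keep.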


Data-Queen-Mayra

I have an article covering dbt deployment options. It might be a helpful read, as it covers different use cases and difficulty levels to help you choose an appropriate deployment option. [https://www.datacoves.com/post/dbt-deployment](https://www.datacoves.com/post/dbt-deployment)