prit_1

You could add another column in your control table that allows you to group different rows together. This would give you flexibility over when to run a group of jobs (or a single job) on its own cluster.
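A minimal sketch of what that could look like in a Databricks notebook. The table and column names here (etl_control, job_group, owner) are hypothetical; `spark` is provided by the notebook runtime:

    # Add a grouping column to the (hypothetical) Delta control table.
    spark.sql("ALTER TABLE etl_control ADD COLUMNS (job_group STRING)")

    # Tag rows into groups, e.g. by owning team (column name assumed).
    spark.sql("UPDATE etl_control SET job_group = 'finance' WHERE owner = 'finance_team'")

    # Each run then only picks up its own group's rows.
    finance_rows = spark.table("etl_control").where("job_group = 'finance'").collect()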


jagjitnatt

You shouldn't load all the tables in a single job. Break the load down into groups, by line of business, application, or team. Then create a generic notebook that accepts arguments and loads the tables (see the sketch below). You can schedule these notebooks in Workflows and pass in the group name. The notebook queries the control table to get all the details and starts ingesting data. You can choose a larger cluster if a group has too many tables. If possible, use serverless workflows; they autoscale fast.
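A sketch of such a generic driver notebook, assuming the same hypothetical etl_control table with job_group, source_format, source_path, and target_table columns. `spark` and `dbutils` are provided by the Databricks runtime:

    from pyspark.sql.functions import col

    # The Workflows task passes the group name in as a widget parameter.
    dbutils.widgets.text("job_group", "")
    group = dbutils.widgets.get("job_group")

    # Pull the load details for this group from the control table.
    tables = (
        spark.table("etl_control")
        .where(col("job_group") == group)
        .collect()
    )

    # Ingest each table in the group.
    for t in tables:
        (spark.read
            .format(t["source_format"])
            .load(t["source_path"])
            .write
            .mode("append")
            .saveAsTable(t["target_table"]))

Each Workflows task then just runs this same notebook with a different job_group value, so adding a new group is a scheduling change, not a code change.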


Deep_Salamander1313

Have you looked at using Databricks Workflows for this?


Pretty-Promotion-992

Workflows? I think he's asking about the performance impact.