dbt (data build tool) interview questions

36. How do I run models downstream of a seed?
You can run models downstream of a seed by using the model selection syntax and treating the seed like a model. The trailing + selects everything that depends on the seed:
$ dbt run --models country_codes+

37. How do I run one model at a time?
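To run one model, use the --models flag (or its short form -m; recent dbt versions also accept --select), followed by the name of the model. For example, for a model named customers:
$ dbt run --models customers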

38. How do I run models downstream of one source?
To run models downstream of a source, use the source: selector:
$ dbt run --models source:jaffle_shop+

39. What happens if I add new columns to my snapshot query?
When the columns of your source query change, dbt will attempt to reconcile this change in the destination snapshot table. dbt does this by:
Creating new columns from the source query in the destination table
Expanding the size of string types where necessary (e.g. varchars on Redshift)
dbt will not delete columns in the destination snapshot table if they are removed from the source query. It will also not change the type of a column beyond expanding the size of varchar columns. That is, if a string column is changed to a date column in the snapshot source query, dbt will not attempt to change the type of the column in the destination table.
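The add-only rule described above can be illustrated with a small shell sketch. This is not dbt's implementation; columns_to_add is a hypothetical helper that only models the rule "new source columns are added, removed ones are kept", and it ignores column types entirely.

```shell
# Hypothetical helper (not part of dbt): print the columns that would be
# added to the destination snapshot table under the add-only rule.
# $1: space-separated columns already in the destination snapshot table
# $2: space-separated columns returned by the current source query
columns_to_add() {
  for col in $2; do
    case " $1 " in
      *" $col "*) ;;                # already in the destination: nothing to do
      *) printf '%s\n' "$col" ;;    # new in the source query: would be added
    esac
  done
}

# A column missing from the source query produces no output here,
# which mirrors the fact that dbt leaves it in the snapshot table.
columns_to_add "id status" "id status email"
```

Calling the helper with a source query that has fewer columns than the destination prints nothing, since nothing is ever dropped.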

40. How do I specify column types?
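Simply cast the column to the correct type in your model (the :: cast syntax shown here works on warehouses such as Postgres, Redshift and Snowflake):
select
    created::timestamp as created
from some_other_table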

41. Do model names need to be unique?
Yes. To build dependencies between models, you use the ref function, which identifies a model by its name (i.e. the filename without the .sql extension). As a result, model names within a project need to be unique, even if the files are in distinct folders.
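Because uniqueness is judged on the filename rather than the folder, a quick way to spot clashes is to list model filenames that occur more than once. The following is a minimal sketch, assuming models live as .sql files under a directory called models; find_duplicate_models is a hypothetical helper name, not a dbt command.

```shell
# Hypothetical helper: print model names (file names without .sql) that
# appear more than once under a models directory.
# $1: path to the models directory (defaults to "models", an assumption)
find_duplicate_models() {
  find "${1:-models}" -type f -name '*.sql' \
    | sed 's|.*/||; s|\.sql$||' \
    | sort \
    | uniq -d
}

# Example: run from the project root; no output means no duplicate names.
# find_duplicate_models models
```

If the function prints a name, two or more files share it and one of them needs to be renamed before dbt can resolve references to that model.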


Author: user
