
dbt Artifacts Package: semantic_manifest, manifest, catalog, run_results, sources

dbt lets data professionals transform and model data inside the warehouse. One of its most useful features is the generation of dbt artifacts: structured JSON outputs that describe the dbt project and what happened when its commands ran.

Basics of dbt Artifacts

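dbt artifacts are JSON files that dbt writes as a by-product of its commands. Which files are written depends on the command you run. They include:

  • semantic_manifest.json: Describes the project's semantic layer definitions, such as semantic models and metrics. It does not hold compiled SQL; compiled model SQL is found in manifest.json and under target/compiled/.
  • manifest.json: Offers a comprehensive view of your dbt project (models, tests, sources, macros, and their dependencies) as of the last invocation.
  • catalog.json: Provides details about the relations in the warehouse, including column names and data types. It is produced by dbt docs generate. Descriptions written in your YAML files live in manifest.json.
  • run_results.json: Contains the results of the last command that executed nodes (for example run, test, build, or seed), including each node's status and execution time.
  • sources.json: Contains the results of dbt source freshness for the source tables checked in the project.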

These artifacts are essential for documentation, understanding the state of your dbt project, and visualizing source freshness.
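A quick way to see which of these files exist after a command is to scan the target/ directory. The following is a minimal sketch, not part of dbt itself: the function name inventory_artifacts is hypothetical, and it assumes the artifacts sit in the default target/ directory. Because not every artifact is guaranteed to carry a top-level metadata block, the lookup falls back to an empty mapping.

```python
import json
from pathlib import Path

# File names taken from the list above.
ARTIFACT_NAMES = [
    "semantic_manifest.json",
    "manifest.json",
    "catalog.json",
    "run_results.json",
    "sources.json",
]


def inventory_artifacts(target_dir):
    """Return {artifact name: metadata summary, or None if the file is absent}.

    Hypothetical helper for illustration; not a dbt API.
    """
    inventory = {}
    for name in ARTIFACT_NAMES:
        path = Path(target_dir) / name
        if not path.exists():
            inventory[name] = None
            continue
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        # Assumption: a missing "metadata" block is treated as empty.
        metadata = data.get("metadata", {})
        inventory[name] = {
            "dbt_version": metadata.get("dbt_version"),
            "generated_at": metadata.get("generated_at"),
        }
    return inventory


if __name__ == "__main__":
    for name, info in inventory_artifacts("target").items():
        print(name, "->", info if info is not None else "not generated yet")
```

Run from the project root, this prints one line per artifact, which makes it easy to spot files that the commands you have run so far did not produce.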

Generating and Accessing Artifacts

Most dbt commands write one or more artifacts, although no single command writes all of them. For instance, when you run:

dbt run

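dbt will write manifest.json and run_results.json to the target/ directory of your dbt project. Other artifacts come from other commands: catalog.json from dbt docs generate and sources.json from dbt source freshness. You can open these JSON files directly, or run dbt docs serve to browse the documentation site built from them.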
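Reading a file directly can be as simple as loading the JSON and counting what it contains. The sketch below assumes manifest.json is in the default target/ directory and relies on the manifest's top-level nodes mapping, in which each entry carries a resource_type field. The helper names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def count_nodes_by_type(manifest):
    """Count entries in the manifest's `nodes` mapping by resource_type."""
    nodes = manifest.get("nodes", {})
    return Counter(node.get("resource_type", "unknown") for node in nodes.values())


def load_artifact(path):
    """Load one artifact file into a plain dict."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    manifest_path = Path("target") / "manifest.json"  # assumed default location
    if manifest_path.exists():
        counts = count_nodes_by_type(load_artifact(manifest_path))
        for resource_type, count in counts.most_common():
            print(f"{resource_type}: {count}")
    else:
        print("manifest.json not found; run a dbt command first")
```

The output is a short tally per resource type, which is a quick sanity check that the manifest reflects the project you expect.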

Practical Application: dbt artifacts Package

Brooklyn Data's dbt_artifacts package builds warehouse tables and models that describe a dbt project and its run metadata, so you can query them with SQL. To use it:

Installation

Add the package to your packages.yml, then install it with dbt deps:

packages:
  - package: brooklyn-data/dbt_artifacts
    version: 2.5.0

Configuration

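Adjust your dbt_project.yml to specify the database and schema where the package's models are built. Recent versions of the package also expect an on-run-end hook that calls its dbt_artifacts.upload_results(results) macro; check the package README for the exact line for your version. The model configuration looks like this: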

models:
  dbt_artifacts:
    +database: your_destination_database
    +schema: your_destination_schema

Usage

After setting up, run:

dbt run --select dbt_artifacts

Advanced Usage with Elementary dbt Package

The Elementary dbt package is another option for working with dbt artifacts. It also loads artifact data into warehouse tables that you can query:

Uploading Artifacts

Elementary uses macros to extract fields from artifacts and insert them into tables, such as dbt_run_results and dbt_models.

Model Execution

When you make changes to your dbt project, rebuild the model so the stored metadata stays current (--select replaces the older --models flag):

dbt run --select dbt_models

Practical Examples

Generate Artifacts for a Business Sales dbt Project

  1. Create a dbt project focused on sales data.
  2. Run the project using dbt run.
  3. Explore the generated artifacts in the target/ directory.
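For the exploration step, one hedged starting point is to tally node statuses from run_results.json. The sketch assumes the file is in the default target/ directory and that each entry in its results list has unique_id and status fields. The set of statuses treated as failures here ("error" and "fail") is an assumption for illustration, and the function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path

# Assumption: these status values indicate a failed model or test.
FAILING_STATUSES = {"error", "fail"}


def summarize_statuses(run_results):
    """Count results by their status value."""
    results = run_results.get("results", [])
    return Counter(result.get("status", "unknown") for result in results)


def failing_nodes(run_results):
    """Return unique_ids of results whose status is in FAILING_STATUSES."""
    results = run_results.get("results", [])
    return [r["unique_id"] for r in results if r.get("status") in FAILING_STATUSES]


if __name__ == "__main__":
    path = Path("target") / "run_results.json"  # assumed default location
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            run_results = json.load(fh)
        print(dict(summarize_statuses(run_results)))
        for unique_id in failing_nodes(run_results):
            print("failed:", unique_id)
    else:
        print("run_results.json not found; run dbt run first")
```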

Use the dbt Artifacts Package

  1. Install the package as described above.
  2. Configure it to upload data to a sales_artifacts schema.
  3. Run the package and explore the generated tables.

Best Practices and Tips

  • Consistency: Keep model naming and structure consistent, so that reports and queries built on your artifacts are easy to filter and compare across runs.
  • Optimization: Use the execution_time recorded for each node in run_results.json to identify slow-running models and optimize them.
  • Collaboration: Share your artifacts, or the tables built from them, with team members so that everyone works from the same view of the project.
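The optimization tip can be sketched in a few lines: rank the entries in run_results.json by their execution_time field, which is reported in seconds. This is a minimal illustration that assumes the default target/ location; slowest_nodes is a hypothetical helper name, and entries without a timing are treated as taking zero seconds.

```python
import json
from pathlib import Path


def slowest_nodes(run_results, top_n=5):
    """Return (unique_id, execution_time) pairs for the slowest results."""
    results = run_results.get("results", [])
    ranked = sorted(
        results,
        key=lambda result: result.get("execution_time", 0.0),
        reverse=True,  # slowest first
    )
    return [(r["unique_id"], r.get("execution_time", 0.0)) for r in ranked[:top_n]]


if __name__ == "__main__":
    path = Path("target") / "run_results.json"  # assumed default location
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            run_results = json.load(fh)
        for unique_id, seconds in slowest_nodes(run_results):
            print(f"{seconds:8.2f}s  {unique_id}")
    else:
        print("run_results.json not found; run dbt run first")
```

Because run_results.json only reflects the most recent command, comparing timings across runs requires keeping copies of the file or loading it into warehouse tables, which is what the packages described above do.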

Conclusion and Further Resources

dbt artifacts are a powerful feature that provides deep insights into your dbt projects. By understanding and utilizing them effectively, you can optimize your data transformation processes and ensure data reliability.
