Color code different sources in the lineage graph #26

Closed
PowdyPowPow opened this issue May 8, 2019 · 36 comments

Comments

@PowdyPowPow

Currently all source tables are shown in green in the lineage graph.
It would be helpful if tables from different sources were shown in different colors.

image

@drewbanin
Contributor

@PowdyPowPow cool idea! I think it might look kind of wonky to color every source differently... maybe that's just something that we need to experiment with. It could be interesting to tone down the color (like use gray instead of green, or something) and then use a differently colored border, for instance.

What do you think?

@PowdyPowPow
Author

@drewbanin I guess it could become very colorful and you would need to choose "easy" colors. But since the sources currently get distributed all over the graph, it is already colorful with just the green. Maybe if you could somehow cluster them together, it would be easier to use a more discreet common element like a colored border or just different shades of grey? Or maybe you could group the different source tables together and put them in a (colored) box/container if they were all in the same place?

@yarodmi

yarodmi commented Oct 7, 2019

Hi Team,

I really think this will be a very useful feature.
Though, I would suggest providing the ability to apply custom colour coding (rather than just different sources), as there might be different use cases.

A few examples where I would love to use it:

  • quite often I have multiple database schemas representing different architecture layers. It would be great to see them (the schemas) in different colours so it is clear what's where
  • highlight facts and dimensions in different colors
  • highlight different materializations (what is a view and what is a table, for example). We often start from 'everything as a view' and then decide which models are useful to materialize. Seeing this in a graph would be very useful.

To simplify adoption, it might be reasonable to pre-define colours for some tags (config(tags=["dimension"]), config(tags=["fact"]), or config(tags=["base"])), plus the ability to define a custom mapping.

Thanks in advance!

@MartinGuindon
Contributor

I agree with @yarodmi, it would be really useful to be able to define/override color rules.

I really like the idea of being able to define background colors and border colors for the nodes. I'd certainly use it to differentiate sources, staging models and fact/dimension tables. I could easily see this as a simple parameter in the source or dbt_project YAML files, eg:

schema.yml

sources:
  - name: retently
    docs_background_color: BFBFC0
    docs_border_color: 3346F3
    # ..other source config..

  - name: gsheets
    docs_background_color: BFBFC0
    docs_border_color: 267C0B
    # ..other source config..

dbt_project.yml

models:
  my_project:
    marts:
      docs_background_color: 10C3C3

    staging:
      docs_background_color: 4F0BC4

I think it would require being able to set background color, text color and border color at the minimum.

@bcolbert978

Reposting here from the dbt Slack #suggestions channel per Drew's advice - another extension of the lineage graph color-coding I'd like to see (not related to sources, but rather the search filter). When I filter down to specific --models, it would be helpful if the model I name in my search were lit up (kind of like how things light up purple when you select them). I use the +model+ or @model searches here pretty frequently, and sometimes there are too many models loading in to stay grounded around my search.

@guy-adams

+1 for being able to map colours to tags. One additional thought - many people (e.g. in an ingestion, curation, calculation, consumption architecture) have clearly defined 'layers' or groupings (e.g. staging, intermediate marts, core marts) of models. Being able to force the models into virtual columns (with a title) would be fantastic, e.g.

image

I'm already denoting these layers using model tags.

@diegocamelo92

I also think this could be very valuable, it would help us identify regions of different sources.

@ptc-sgauglitz

Hi -- would like to upvote both of these feature requests (ability to color code, and ability to set columns for designated layers). This would be super useful, including for reviews such as demoed in your own video here: https://www.youtube.com/watch?v=5W6VrnHVkCA&list=PL0QYlrC86xQmPf9QUceFdOarYcv3ETSsz

(also raised in dbt's Slack community)

Thoughts?

@tnightengale

tnightengale commented Jan 5, 2022

@drewbanin Hey Drew! Hope all is well :) this issue seems quite stale but it's something that our team would love to have. Any suggestions on where to start in the code base to make a contribution to this?

Specifically, implementing alignment "zones" based on tags like in @guy-adams drawing above!

@JC-Lightfold

@drewbanin - columns and colours would be close to my HIGHEST priority right now. Why? Because we're teaching every client how to live/breathe/think DBT and those two things help the DAG be the central unifying concept of the whole. Without them, any moderately sized DAG becomes fairly useless for understanding the project as a whole.

@yu-iskw

yu-iskw commented Apr 28, 2022

It would be important to visually distinguish nodes by layers. At the same time, I would like to cluster nodes by schema and database in terms of dbt, because our BigQuery tables exist in different google cloud projects and datasets. That would enable us to easily manage and audit dbt models.

image

@saraleon1

+1 dbt Cloud user

They would like to visually represent the modelling layers as different color nodes. Ex: stg models are represented by blue vs intermediate represented by orange etc. This is to make differentiating the different types of models easier within the DAG visualization.

@jtcohen6
Contributor

jtcohen6 commented Sep 6, 2022

In v1.3, a custom node_color config will be supported for all models, seeds, snapshots, and tests: https://docs.getdbt.com/reference/resource-configs/docs#custom-node-colors

We chose not to extend it to sources, exposures, and metrics for now. Still, that capability should go a way toward enabling a number of the desires in this thread!
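
A minimal sketch of how that config might look in dbt_project.yml, following the pattern on the linked docs page. The project name my_project and the staging/marts folders are illustrative assumptions borrowed from the proposal earlier in this thread, not names dbt requires, and the hex values are reused from that proposal:

```yaml
# dbt_project.yml (sketch; "my_project", "staging", and "marts" are illustrative names)
models:
  my_project:
    staging:
      +docs:
        node_color: "#4F0BC4"  # every model under models/staging/
    marts:
      +docs:
        node_color: "#10C3C3"  # every model under models/marts/
```

Hex values need quotes here, since an unquoted # starts a YAML comment and would leave node_color empty.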

@saraleon1

Great news - thanks Jeremy!!

@belasobral93

+1 I would like to visually represent Databricks' medallion architecture

@serene-capybara

In v1.3, a custom node_color config will be supported for all models, seeds, snapshots, and tests: https://docs.getdbt.com/reference/resource-configs/docs#custom-node-colors

We chose not to extend it to sources, exposures, and metrics for now. Still, that capability should go a way toward enabling a number of the desires in this thread!

How does one use node_color on seed nodes?

@patriklundberg

patriklundberg commented Jun 15, 2023

Would it be possible to change the text color of the node name?

@dbeatty10
Contributor

dbeatty10 commented Aug 7, 2023

How does one use node_color on seed nodes?

@serene-capybara to use node color on seeds, you'd do something like this in your dbt_project.yml:

seeds:
  +docs:
    node_color: "#cd7f32"

Or something like this within a YAML file like models/_seeds.yml:

seeds:
  - name: my_seed
    docs:
      node_color: purple

@dbeatty10
Contributor

Would it be possible to change the text color of the node name?

@patriklundberg changing the text color is not possible, but #408 has some discussion of how to achieve higher contrast between the text and the background color.

@ngouass

ngouass commented Sep 11, 2023

Hello @everyone, I came across this issue one week ago while looking for a solution to improve the lineage view of my project.

After unsuccessful searches, I'm planning to explore some solutions that I could build in an external tool. Before starting any development, I would like to know if some participants are still looking for a solution. If yes, please let me know with A VOTE on this comment.

Below are potential features I was able to extract from the comments in this issue:
-> (7) apply custom color coding
- highlight database schemas
- highlight facts and dimensions in different colors
- highlight different materializations (what is a view and what is a table for example).
- Apply color based on tags
-> (46) Group model by layers and Highlight layers
-> (5) Search functionality with model highlighted when matching search query

Thanks.

@ptc-sgauglitz

ptc-sgauglitz commented Sep 11, 2023

Hi @ngouass

Color coding has been supported since dbt 1.3 (see @jtcohen6 's Sep 6, 2022 post); I believe that can cover items 1-5 on your list (using either tags or model file/pathnames as an intermediary to apply the color coding as needed). I've used it to indicate the different materializations (your item 4).

Personally, I'd still be very interested in options to improve the layouting, in particular the ability to have models assigned to certain columns, to represent the layers -- very nicely visualized by @guy-adams in his Oct 25, 2019 post above (your item 6, I believe). This could be in the form of a strict (column) assignment or as "hints" to the auto-layouting. Whether I'd be interested in using an additional tool for that would depend on the ease of use/ease of integration in the toolchain.
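
A rough sketch of the color-by-materialization approach described above, assuming hypothetical project and folder names. Setting the materialization, the tag, and the color at the same folder level in dbt_project.yml keeps the three in sync, so the color stands in for the tag or materialization. Nothing in this thread points to a built-in tag-to-color mapping, so the pairing here is by convention only:

```yaml
# dbt_project.yml (sketch; "my_project" and the folder names are hypothetical)
models:
  my_project:
    staging:
      +materialized: view
      +tags: ["staging"]
      +docs:
        node_color: "silver"   # a named color, like the purple seed example earlier in the thread
    marts:
      +materialized: table
      +tags: ["mart"]
      +docs:
        node_color: "#cd7f32"  # a quoted hex code, as in the seeds example
```

A model that deviates from its folder's default materialization would need its own node_color override to keep the colors truthful.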

@ngouass

ngouass commented Sep 12, 2023

Hi @ptc-sgauglitz,

Thank you for your answer, I will check the documentation to use that colour code feature.

As for assigning models to columns, I will try to address that in the coming weeks and get back to you to gather some feedback.

@jtcohen6 jtcohen6 removed the triage label Sep 22, 2023
@polmonso

polmonso commented Dec 6, 2023

+1 for being able to map colours to tags. One additional thought - many people (e.g. in an ingestion, curation, calculation, consumption architecture) have clearly defined 'layers' or groupings (e.g. staging, intermediate marts, core marts) of models. Being able to force the models into virtual columns (with a title) would be fantastic, e.g.

image

I'm already denoting these layers using model tags.

Did you open another issue/feature request with the layer positioning idea?

@github-actions github-actions bot added the triage label Dec 6, 2023
@ptc-sgauglitz

Did you open another issue/feature request with the layer positioning idea?

I did not open another issue, nor am I aware of one; it seems both requests were discussed on this one thread.
At least for my purposes, the "color" aspect of it has been addressed since dbt 1.3; I would support opening a separate issue specifically for the layouting/"assign columns" aspect. I think this would still be very helpful. I'd be happy to open that issue, or please copy me on it if somebody else opens it.

@mattyb

mattyb commented Dec 6, 2023

#330 and #438 address layout, but maybe not specifically "columns"

@ptc-sgauglitz

ptc-sgauglitz commented Dec 6, 2023

thanks @mattyb !
#438 looks to be more specific if I see that correctly.
#330 is, to my understanding, quite similar to (at least some of) the asks/suggestions here, but it hasn't seen any traction and has been closed due to inactivity -- which is curious given that the ask here has repeatedly drawn lots of upvotes/interest 🤔

I'd be inclined to open a new issue, citing the tickets above and referencing the interest voiced in this thread too 🤔

Contributor

github-actions bot commented Jun 4, 2024

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 4, 2024
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Jun 11, 2024
@polmonso

we're still interested.

@rafaelamilagres

It would be great to change the green of source nodes

@dbeatty10
Contributor

@rafaelamilagres as you noted, it is not currently possible to use node_color for sources, and they will always appear green.

Our documentation notes the node types that support the node_color property currently:

The docs field can be used to provide documentation-specific configuration to models.
It also supports node_color for models, seeds, snapshots, and analyses. Other node types are not supported.

If you're seeking more node types to be added, could you open a feature request?

@dbeatty10
Contributor

we're still interested.

@polmonso and/or @ejgal -- what are you interested in precisely? Just coloring of nodes? Or their layout?

If it's the latter, could you comment on either #330 or #438, or open a new issue?

@ejgal

ejgal commented Jul 28, 2024

Both. For the color part I would just like to change the color of sources. Will look into the issues you linked for the layout part.

@patriklundberg

patriklundberg commented Jul 28, 2024 via email

@polmonso

polmonso commented Aug 5, 2024

@dbeatty10 same as @ejgal, both. But more so the layout, to distinguish clearly between raw - staging/intermediate - datamart layers of the data pipeline. I'll comment on the mentioned issues.

@katy-sadowski

@dbeatty10 @jtcohen6 upvoting request to be able to specify a color for source nodes. should we hope/assume that this issue will be reopened to add that feature, or should i file a new feature request? thanks!

my use case is simply to be able to choose the color for source nodes instead of being stuck with green, so i can fully customize the appearance of my DAG 😄
