Can causal_forest() be used to explore moderation of an observational relationship with unmeasured confounding? #1474

Open
jonasdora opened this issue Dec 1, 2024 · 2 comments
Labels: question, requires research (an issue that needs additional thought and experimentation before it can be implemented)

Comments

@jonasdora

Hello, thank you for this great package.
I am planning an exploratory observational study on the relationship between W and Y using intensive longitudinal data. Each participant contributes multiple observations over time, with both W and Y measured repeatedly (data are clustered within participants). We also have multiple situational variables (X) measured at each timepoint.
Our research goal is to understand how the W-Y relationship varies across different situations. We do not have concrete hypotheses for individual moderators; we are at a fairly exploratory point in this line of research. Specifically, we want to:

- Robustly identify which situational variables (if any) most strongly moderate the W-Y relationship
- Visualize how the strength of the W-Y relationship varies across these situational variables
- Account for the nested data structure (observations within participants)

The causal_forest() approach is appealing because it:

- Focuses on how one key relationship varies across other variables
- Provides variable importance measures for moderators
- Handles clustering through the clusters parameter
- Offers useful visualization and analysis tools

However, my key concern is that we cannot assume unconfoundedness. There are almost certainly unmeasured variables (e.g., personality traits) that affect both W and Y.

Can causal_forest() be validly used to study patterns of moderation in our case? We do not need to make causal claims; we only want to explore the heterogeneity of the W-Y relationship across X.
If not, are there alternative approaches you would recommend that could provide similar insights about moderation while accounting for our clustered data structure?
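
To make the unconfoundedness concern concrete, here is a minimal simulation sketch in plain Python. It does not use grf or causal_forest(); it only fits a naive least-squares slope of Y on W within each level of one situational variable. Everything in it is an assumption made for illustration: the data-generating process, a single binary X, the true slopes of 0 and 1, and the function names `simulate`, `ols_slope`, and `slope_by_situation` are all hypothetical.

```python
import random


def simulate(n_people=300, n_obs=20, seed=0):
    """Hypothetical data: an unmeasured person-level trait U drives both W and Y."""
    rng = random.Random(seed)
    rows = []
    for _person in range(n_people):
        u = rng.gauss(0, 1)                 # unmeasured trait, constant within a person
        for _obs in range(n_obs):
            x = rng.randint(0, 1)           # binary situational variable
            w = u + rng.gauss(0, 1)         # W depends on the trait
            tau = 1.0 if x == 1 else 0.0    # assumed true W-Y slope in situation x
            y = tau * w + u + rng.gauss(0, 1)  # Y depends on W and on the trait
            rows.append((x, w, y))
    return rows


def ols_slope(ws, ys):
    """Simple least-squares slope of ys on ws."""
    mw = sum(ws) / len(ws)
    my = sum(ys) / len(ys)
    sxy = sum((w - mw) * (y - my) for w, y in zip(ws, ys))
    sxx = sum((w - mw) ** 2 for w in ws)
    return sxy / sxx


def slope_by_situation(rows):
    """Naive slope of Y on W, computed separately for x = 0 and x = 1."""
    slopes = {}
    for x in (0, 1):
        ws = [w for (xi, w, _y) in rows if xi == x]
        ys = [y for (xi, _w, y) in rows if xi == x]
        slopes[x] = ols_slope(ws, ys)
    return slopes


rows = simulate()
naive = slope_by_situation(rows)
print("naive slope when x = 0:", round(naive[0], 2), "(assumed true slope: 0.0)")
print("naive slope when x = 1:", round(naive[1], 2), "(assumed true slope: 1.0)")
print("naive contrast between situations:", round(naive[1] - naive[0], 2))
```

Under these assumptions the naive slope in each situation is shifted upward by cov(U, W) / var(W) = 0.5 in expectation, so neither slope can be read as the effect of W. The contrast between situations stays close to the assumed value of 1.0, but only because the simulated trait is independent of X and biases both situations equally by construction. Nothing here implies that real data behave this way.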

@ricor07

ricor07 commented Dec 28, 2024

Hello, I'd be glad to work on this. Can I get the issue assigned? Thanks

@swager
Member

swager commented Dec 29, 2024

@jonasdora this is an interesting question; unfortunately we don't have any out-of-the-box workflows for this task.

@ricor07 thanks for volunteering. To me, this looks more like an open research question (i.e., a topic for a stand-alone research paper) than a feature to be added. If/when the community adopts a standard solution to this overall problem, we could talk about whether it makes sense to add an implementation to GRF (or instead have the implementation live in a separate package, like we did with policytree).

erikcs added the question and requires research labels on Jan 12, 2025
4 participants