# Causality Question

I have a strong regression trend but would like some additional information related to causality. I believe X is "largely" the independent variable, based on deep domain knowledge and because the trend line is asymmetric (inverse relationship in negative X territory). If Y were the independent variable, the regression trend would be linear throughout.

Causality in both directions is supported by the fact that a polynomial trend seems to have a nice fit.

Question - Is there a way (method) to quantify the nature of the two variables having causality in both directions?

Follow-up Question - Can you quantify the nature of the "dual-direction-causality" by measuring the extent to which the polynomial trend line is pulled away from the original hypothesis line (asymmetric line) towards the null hypothesis (linear line)?

(See attached file #3, which includes edits)
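To make the follow-up question concrete, here is a minimal sketch of one way the "pull" could be measured. Everything in it is an assumption for illustration: the data are synthetic, the asymmetric line is modeled as a hinge at X = 0, the polynomial is a quadratic, and `pull_toward_linear` is a hypothetical helper name. The score describes only the shape of the fitted curves; by itself it says nothing about the direction of causality.

```python
import numpy as np

def fit_predict(design, y):
    """Ordinary least squares: return fitted values for a design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef

def pull_toward_linear(x, y, degree=2):
    """Locate the polynomial fit between the hinge fit (0) and the linear fit (1).

    The polynomial's fitted values are projected onto the line joining the
    hinge fit and the linear fit. The result is not guaranteed to lie in [0, 1].
    """
    ones = np.ones_like(x)
    linear = fit_predict(np.column_stack([ones, x]), y)
    # Assumed asymmetric model: separate slopes for x < 0 and x >= 0.
    hinge = fit_predict(
        np.column_stack([ones, np.minimum(x, 0), np.maximum(x, 0)]), y
    )
    poly = fit_predict(np.vander(x, degree + 1), y)
    direction = linear - hinge
    return float((poly - hinge) @ direction / (direction @ direction))

# Synthetic illustration only: a V-shaped relationship plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 200)
y = np.abs(x) + rng.normal(0, 0.1, 200)

lam = pull_toward_linear(x, y)
print(f"pull toward linear: {lam:.3f}")
```

A value near 0 means the polynomial hugs the asymmetric line, and a value near 1 means it hugs the straight line. The score is unstable when the hinge and linear fits nearly coincide, because the denominator approaches zero.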

• Mathe
+1

In Statistics, there is either independence between variables or dependence. If you are interested in measuring the form and strength of dependency, you could do a regression analysis. You would still need to argue why one variable affects the other (maybe provide a channel of influence) and check for different assumptions so that your final results are not spurious.
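One reason the argument for a channel of influence still has to be made separately: in simple linear regression, the goodness of fit is the same whichever variable is treated as the predictor. A minimal sketch on synthetic data (the data and the helper `r_squared` are assumptions for illustration):

```python
import numpy as np

def r_squared(predictor, response):
    """R^2 of a simple least-squares line of response on predictor."""
    slope, intercept = np.polyfit(predictor, response, 1)
    residuals = response - (slope * predictor + intercept)
    return 1 - residuals.var() / response.var()

# Synthetic data in which y really is generated from x.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 2.0 * x + rng.normal(size=300)

r2_y_on_x = r_squared(x, y)
r2_x_on_y = r_squared(y, x)
print(r2_y_on_x, r2_x_on_y)
```

Both values equal the squared correlation coefficient, so the strength of a linear fit cannot by itself tell you which variable is the independent one.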

• Maybe I am using the wrong term. I guess what I am getting at is causality. I will try to edit to make more clear.

• Savionf
+1

The bounty is too low for the level of the question.

• Ahhhh, that sounds hopeful! I figured the answer would be something simple, basically ending in... "correlation does not equal causality." I will add more money momentarily.

• "proof of causality". I don't think you can establish that with your data. Unless you had done a carefully controlled experiment, or unless you made a strong argument using quasi-experiments, or unless you were controlling for many potential mediators, which doesn't seem to be the case, I don't think you can argue about causality or direction of causality. I'm sorry to say this, but I think you may have inappropriate expectations with your question.

• I have edited my post again in reaction to your comment. I should not have written "proof of causality." This is because I believe I already have very high confidence in the causality, based on the fact that the trend line is asymmetric (changes direction in negative X territory). The nature of the two variables dictates (based on simple common sense) that the trend would be linear if the current Y were the independent variable (instead of the current X).

• Could you elaborate on how you came to this data?

• The data is related to my job, which I would rather not disclose much about. I can assure you it is high-quality data. I can also assure you that I am an expert in the nature of the two variables (20+ years of hands-on experience with how the 2 variables are related in real life). If you would like to talk on the phone, I might be willing.

• Instead, what I am after is... more information on how much the two variables have causality in both directions. It may be the case that this is unanswerable with what I have provided. If that is the case, I am open to that answer. However, the polynomial trend is "somewhere in between" the two competing hypotheses (for which variable is independent), so it seems logical that there is insight to be gained from comparing the polynomial trend line to the other two lines.

• I am working on one paper that will claim: Y variable is dependent and X is independent (which as mentioned, I feel confident about). However, a second paper could dive into the question we are discussing... does the Y variable also have some limited influence on the X? Obviously, I would need help from someone better at statistics, but I thought I would gauge the "promise" of the second paper here before I pursue it much.

• With numbers alone, this is impossible to answer. Check this website to see examples of how, looking at the numbers alone, it would be impossible to establish causation: https://www.tylervigen.com/spurious-correlations
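A small simulation makes the same point. This is a sketch on synthetic data, and the helper names are assumptions for illustration: pairs of completely independent random walks are compared with pairs of independent white-noise series, counting how often the absolute correlation exceeds 0.5.

```python
import numpy as np

def share_strongly_correlated(make_series, trials=500, n=200,
                              threshold=0.5, seed=0):
    """Fraction of independent series pairs with |Pearson r| above threshold."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        a = make_series(rng, n)
        b = make_series(rng, n)  # generated independently of a
        r = np.corrcoef(a, b)[0, 1]
        hits += int(abs(r) > threshold)
    return hits / trials

def white_noise(rng, n):
    return rng.normal(size=n)

def random_walk(rng, n):
    # Cumulative sum of noise: a trending series with no link to any other.
    return np.cumsum(rng.normal(size=n))

walk_share = share_strongly_correlated(random_walk)
noise_share = share_strongly_correlated(white_noise)
print(f"random walks: {walk_share:.2f}, white noise: {noise_share:.2f}")
```

Trending series that have nothing to do with each other frequently show strong correlations, which is why the numbers alone cannot establish causation.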

• Also, check these examples to see how a single statistic (the correlation coefficient) can be misleading: https://en.wikipedia.org/wiki/Correlation#/media/File:Correlation_examples2.svg
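A quick illustration of that, sketched with made-up data: when Y depends perfectly on X but the relationship reverses direction for negative X, the correlation coefficient over the full range can be essentially zero.

```python
import numpy as np

# Made-up data: y is an exact function of x, symmetric around x = 0.
x = np.linspace(-1, 1, 201)
y = x ** 2

r_full = np.corrcoef(x, y)[0, 1]                       # whole range
r_positive = np.corrcoef(x[x >= 0], y[x >= 0])[0, 1]   # x >= 0 only

print(f"full range: {r_full:.3f}, positive x only: {r_positive:.3f}")
```

The dependence is perfect in both cases; only the linear summary changes. A relationship that is inverse in negative X territory is exactly the situation where a single coefficient hides the structure.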

• I also have other variables that I use that confirm that the Y is dependent and the X is independent... but I didn't want to get into that in fear that I would muddy the waters.

• If you believe there are other variables at play that could result in changes of Y, and that were not being held constant as X took on different values, your analysis and conclusions would be useless.

• When I say other variables... I meant other data sets that are proxies for what I claim the Z score of the relationship between the two variables really "means." If my hypothesis is true (X is independent, Y dependent), then the Z score "means something specific." If my hypothesis is not true (Y is independent, and X is dependent), then the Z score "means something very different." I have confirmed correlation between these other "proxies" and the Z score to confirm the hypothesis. Again, afraid to muddy...

• Mathe - Thank you for the link - I will read. However, please note that I fully understand correlation coefficients can be misleading. That is why my causation conclusion is based on (1) a deep understanding of the nature of the two variables, (2) the shape of the regression line, (3) other proxy datasets. I am hopeful that the link will explain why comparing trend lines does not provide more insight into the level of dual-direction-causality. Thank you so much!
