4.2.2 variable that simultaneously correlates with the cause

4.

2.2 Integrating ConfoundersA target pollutant is likely to have severaldifferent causal pathways under different environmental conditions, whichindicate the causal pathways we learn may be biased and may not reflect thereal reactions or propagations of pollutants. To overcome this, it is necessaryto model the environmental factors (humidity, wind, etc.) as extraneousvariables in the causality model, which simultaneously influence the cause andeffect we will elaborate how to integrate the environmental factors into theGBN-based graphical model, to minimize the biases in causality analysis andguarantee the causal pathways are faithful for the government’s decisionmaking. We first introduce the definition of confounder and then elaborate theintegration.  Confounder. A confounder is defined as a thirdvariable that simultaneously correlates with the cause and effect, e.

g. genderK may affect the effect of recovery P given a medicine Q, Ignoring theconfounders will lead to biased causality analysis. To guarantee an unbiasedcausal inference, the cause-and-effect is usually adjusted by averaging all thesub-classification cases of K 11,integrating environmentalfactors as confounders, denoted as Et = {E(1) t ;E(2) t …..}g, into the GBN-based causal pathways, one challenge is there canbe too many sub-classifications of environmental statuses. For example, ifthere are 5 environmental factors and each factor has 4 statuses, there willexist 45 = 1024 causal pathways for each sub-classification case.

Directlyintegrating Et as confounders to the cause and effect will result in unreliablecausality analysis due to very few sample data conditioned on eachsub-classification case. Therefore, we introduce a discrete hidden confoundingvariable K, which determines theprobabilities of different causal pathways from Qt  to Pt, . The environmental factors Et are further integrated into K, where K = 1; 2……K. In this ways, the large number of sub-classification cases ofconfounders will be greatly reduced to a small number K, as K clusters of the environmental factors. Based on Markovequivalence (DAGs which share the same joint probability distribution 10), wecan reverse the arrow Et  K to K Et, as shown in the right part of K determines the distributions of P,Qt;Et, thus enabling us to learn the distribution of the graphicalmodel from a generative process. To help us learn the hidden variable K, the generative process further introduces a hyper-parameter thatdetermines the distribution of K.

Best services for writing your paper according to Trustpilot

Premium Partner
From $18.00 per page
4,8 / 5
4,80
Writers Experience
4,80
Delivery
4,90
Support
4,70
Price
Recommended Service
From $13.90 per page
4,6 / 5
4,70
Writers Experience
4,70
Delivery
4,60
Support
4,60
Price
From $20.00 per page
4,5 / 5
4,80
Writers Experience
4,50
Delivery
4,40
Support
4,10
Price
* All Partners were chosen among 50+ writing services by our Customer Satisfaction Team

Thus the graphical modelcan be understood as a mixture model under K clusters. We learn the parametersof the graphical model by maximizing the new log likelihood: