Introduction to Convergent Cross Mapping

This post is a follow-on to my previous post on Granger causality. Granger causality has well-known limitations.  As previously discussed, the test can only find “predictive causality” and not true causality. Further, a key requirement of Granger causality is separability, meaning the casual variable is independent of the variable that it influences. This separability tends to be characteristic of stochastic and linear systems. For systems in which separability is not satisfied or which there are shared driving variables, Granger causality is not applicable. A new approach has been suggested to try to understand causality in dynamical systems where effects of casual variables cannot be separated or uncoupled from the variables that they influence.

This approach, called Convergent Cross Mapping (CCM), was first tested in 1991 but was later evolved by Sugihara et al., 2012 for identifying causation in ecological time series. CCM relies on the fundamental assumption that dynamics in the world aren’t purely stochastic and that in some applications, there are governing dynamics that can be represented by some underlying manifold, M, shown in Figure 1. If two variables, X and Y are causally linked, they should share an underlying M as a common attractor manifold. This means that the state of the variables can be used to inform each other. Furthermore, if X is an environmental driver of a population variable Y, information about the states of X can be recovered from Y, but not vice versa. The CCM approach intends to test how much Y can be a reliable estimate of states of  X [1].

Figure 1: Figure 2 from Sugihara et al., 2012 demonstrating CCM on shadow manifolds Mx and My

 In essence, the approach seeks to create “shadow” manifolds of M, termed Mx and My which are constructed using lagged information from each time series. At some time, t, the nearest neighbors on each manifold are determined. Weights are assigned to each nearest neighbor and Y is estimated using the weighted nearest neighbors from X. Then, the correlation between Y and Y|Mx is computed. The length of the time series is important; a longer time series results in closer nearest neighbors and better predictability [2].

Sugihara et al., 2012 demonstrates situations in which CCM is applicable:

  1. Unidirectional Causality: Species X influences Y, but the reverse is not true.
  2. Transivity/External Forcing: Two species, X and Y, do not interact, but are driven by a common external variable Z. This type of interaction has been studied extensively in the famous anchovy and sardine example, where the two species tend to have linked dynamics (one peaks when the other is in a trough). Some theories suggest that the species are directly related and influence each other’s dynamics, but research has shown that neither species interacts with the other and rather that the population dynamics are shaped by an external driver (temperature). However, using correlation or applying Granger Causality tends to imply a causal link. Using CCM, Sugihara et al., 2012 confirms that there is no cross-map signal between the anchovy and sardine time series, but there is a clear mapping between sea surface temperature and anchovies and sea temperature and sardines.

An R package called multispatialCCM is available to implement CCM but the author, Adam Clark, adapts the methods to be applicable to ecological time series that are shorter (<30 sequential observations which is usually required for the traditional CCM method) and potentially disjoint [3]. In the documentation of the package, there is a full example demonstrating the application of the method to some synthetic time series, A and B, meant to demonstrate A being causally forced by B, but the opposite not being true. In that spirit, I tried out the package with a very simple example and using the same dataset from the last post. Here I use precipitation and outflow where precipitation should causally force outflow, but the opposite should not be true. The following code demonstrates this:

library(multispatialCCM)

#Import data
setwd("D:/Documents/")
data=read.delim("Minimal_Train.txt")
Accm=as.vector(data$observed_cms)[1:200]
Bccm=as.vector(data$Precip)[1:200]

#Simulate data to use for multispatial CCM test
#See function for details - A is causally forced by B,
#but the reverse is not true.
#Calculate optimal E
maxE<-5 #Maximum E to test
#Matrix for storing output
Emat<-matrix(nrow=maxE-1, ncol=2); colnames(Emat)<-c("A", "B")

#Loop over potential E values and calculate predictive ability
#of each process for its own dynamics
for(E in 2:maxE) {
  #Uses defaults of looking forward one prediction step (predstep)
  #And using time lag intervals of one time step (tau)
  Emat[E-1,"A"]<-SSR_pred_boot(A=Accm, E=E, predstep=1, tau=1)$rho
  Emat[E-1,"B"]<-SSR_pred_boot(A=Bccm, E=E, predstep=1, tau=1)$rho
}

#Look at plots to find E for each process at which
#predictive ability rho is maximized
matplot(2:maxE, Emat, type="l", col=1:2, lty=1:2,
        xlab="E", ylab="rho", lwd=2)
legend("bottomleft", c("A", "B"), lty=1:2, col=1:2, lwd=2, bty="n")
E_A<-2
E_B<-3
#Check data for nonlinear signal that is not dominated by noise
#Checks whether predictive ability of processes declines with

#increasing time distance
#See manuscript and R code for details
signal_A_out<-SSR_check_signal(A=Accm, E=E_A, tau=1,
                               predsteplist=1:10)
signal_B_out<-SSR_check_signal(A=Bccm, E=E_B, tau=1,
                               predsteplist=1:10)
#Run the CCM test
#E_A and E_B are the embedding dimensions for A and B.
#tau is the length of time steps used (default is 1)
#iterations is the number of bootstrap iterations (default 100)
# Does A "cause" B?
#Note - increase iterations to 100 for consistent results

CCM_boot_A<-CCM_boot(Accm, Bccm, E_A, tau=1, iterations=100)
# Does B "cause" A?
CCM_boot_B<-CCM_boot(Bccm, Accm, E_B, tau=1, iterations=100)
#Test for significant causal signal
#See R function for details
(CCM_significance_test<-ccmtest(CCM_boot_A,
                                CCM_boot_B))
#Plot results
plotxlimits<-range(c(CCM_boot_A$Lobs, CCM_boot_B$Lobs))
#Plot "A causes B"
plot(CCM_boot_A$Lobs, CCM_boot_A$rho, type="l", col=1, lwd=2,
     xlim=c(plotxlimits[1], plotxlimits[2]), ylim=c(0,1),
     xlab="L", ylab="rho")
#Add +/- 1 standard error
matlines(CCM_boot_A$Lobs,
         cbind(CCM_boot_A$rho-CCM_boot_A$sdevrho,
               CCM_boot_A$rho+CCM_boot_A$sdevrho),
         lty=3, col=1)
#Plot "B causes A"
lines(CCM_boot_B$Lobs, CCM_boot_B$rho, type="l", col=2, lty=2, lwd=2)
#Add +/- 1 standard error
matlines(CCM_boot_B$Lobs,
         cbind(CCM_boot_B$rho-CCM_boot_B$sdevrho,
               CCM_boot_B$rho+CCM_boot_B$sdevrho),
         lty=3, col=2)
legend("topleft",
       c("A causes B", "B causes A"),
       lty=c(1,2), col=c(1,2), lwd=2, bty="n")


First we read in the appropriate columns. I also decided to use just the first 200 observations from my 5-year long dataset. While longer time series are supposed to lead to better convergence, I found that for my time series (characterized by a lot of natural variability and noise), that it was best to limit the sequence to cleaner signal.

The next steps involve determining an appropriate E matrix for A and B. This matrix is an embedding dimension matrix and is similar to determining the number of time steps that a single process needs to predict its own dynamics.

Figure 2: Predictability (rho) as a function of E

When looking at Figure 2, you can see that the rho (predictability) is maximized for A=3 and B=2 (for the value of E). The CCM_Boot function runs the actual Convergent Cross Mapping and implements a bootstrapping method where a subset of the time series is used to determine E and the predictability in order to make sure that the ordering of the sample is not a defining factor of the outcome and to address uncertainty in the prediction. Finally, the results are plotted in Figure 3.  There are two main takeaways from this figure: The predictability of A causing B (outflow causing precipitation) is a flat line around 0 while the predictability of B causing A (precipitation causing outflow) is much higher. Furthermore, note how the predictability increases with an increasing time series length which allows for better convergence on the system dynamics. Clark et al., 2015 contains many other more well-defined ecological examples to demonstrate the package as well.

Figure 3: Predictability as a function of L for A causes B and B causes A

References:

[1] Sugihara, G., May, R., Ye, H., Hsieh, C. H., Deyle, E., Fogarty, M., & Munch, S. (2012). Detecting causality in complex ecosystems. science338(6106), 496-500.

[2] McCracken, J. M., & Weigel, R. S. (2014). Convergent cross-mapping and pairwise asymmetric inference. Physical Review E90(6), 062903.

[3] Clark, A. T., Ye, H., Isbell, F., Deyle, E. R., Cowles, J., Tilman, G. D., & Sugihara, G. (2015). Spatial convergent cross mapping to detect causal relationships from short time series. Ecology96(5), 1174-1181.

Special thanks to @ecoquant who suggested to look into this topic!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s