Correlation describes the direction and strength of the association between two numerical variables, X and Y. Simple linear regression assumes a linear relationship between X and Y of the form Y = a + bX, which can be used to predict the value of the dependent variable (Y) from a known value of the independent variable (X). Regression models how Y changes as X changes, so the results of the analysis will differ if X and Y are swapped; on its own it does not show that X causes Y. With correlation, the X and Y variables are interchangeable.

Correlation and linear regression analysis is available for both main-level and series-level variables.

Correlation plot

Correlation analysis is a statistical method used to evaluate the relationship between two quantitative, continuous variables. A high correlation means that the two variables have a strong relationship with each other, while a weak correlation means that they are hardly related.

  1. From the analysis window, click "+ New analysis" and select "Correlation and linear regression" from the dropdown menu.

  2. Select the data model in the "Parameters" card on the right side (Main level or Series level). Select the series of interest from the dropdown menu if you choose to analyse series data.

  3. Select the two numerical variables you want to analyse: one X-variable and one Y-variable.

  4. If you want to perform independent analysis for subgroups, you can select a grouping variable under "Grouping". The grouping variable can be categorical or numeric (without decimals). When using "Single series" as the data model, grouping variables can be chosen from the main or series levels and can be categorical, numeric (without decimals), or unique.

  5. Open the "Formatting" card on the right side to choose whether to use category values or labels, or whether to display chart legends on your figure (only an option when grouping is selected).

  6. You can apply filters to the dataset to analyse subgroups (optional).

  7. Export your results (optional).

Correlation coefficients (Pearson’s and Spearman’s Rank)

Correlation does not fit a line through the data points. The correlation coefficient (r) only estimates the extent to which two variables tend to change together. It ranges from -1.0 to 1.0, and the closer r is to -1.0 or 1.0, the stronger the relationship between the variables. If r is 0, there is no linear correlation between them. A positive r-value means that Y tends to increase as X increases; a negative r-value means that Y tends to decrease as X increases.

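Pearson’s correlation coefficient is commonly used to evaluate the linear relationship between two continuous variables. Spearman’s rank correlation coefficient is calculated on the ranks of the values rather than the values themselves, so it measures how consistently one variable increases or decreases with the other (a monotonic relationship) and can also be used for ordinal variables. Consult your local statistician if you are unsure which correlation test is appropriate for your analysis.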

  1. Activate the toggle switch of your choice in the "Parameters" card.

  2. The r- and p-values are shown beneath the plot.
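To illustrate how the two coefficients differ, here is a minimal Python sketch that calculates both by hand. It is not the application's own implementation: the function names (pearson_r, average_ranks, spearman_rho) and the data are made up for this example, and p-values are left out. The Y-values rise steadily with X but not in a straight line, so Spearman's coefficient is exactly 1 while Pearson's is somewhat lower.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y scaled by the spread of each."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the values."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Made-up example data: y = x**3, monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Swapping the arguments, as in pearson_r(y, x), gives the same value, which reflects that X and Y are interchangeable in correlation analysis.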

Linear regression 

In regression analysis, we want to determine the relationship between the dependent variable (Y) and the independent variable (X) and use it for predictions. For example, if we are interested in the effect of age on height, we can predict the height for a given age by fitting a regression line. The line is fitted by the method of least squares, which chooses the intercept (a) and slope (b) that minimise the sum of the squared vertical distances between the data points and the line.

  1. To perform linear regression, activate the "Linear Regression" toggle switch in the "Parameters" card.

  2. A line is drawn in your plot, and the linear regression formula (Y = a + bX) is shown beneath the plot.
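As a rough illustration of what the least-squares fit calculates, the following Python sketch estimates a and b for Y = a + bX. The function name fit_line and the age and height values are invented for this example and do not come from the application. It also fits the line with X and Y swapped to show that, unlike correlation, regression gives a different result when the variables change roles.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b


# Made-up example data: age in years (X) and height in cm (Y).
age = [2, 4, 6, 8, 10]
height = [86, 102, 115, 128, 139]

a, b = fit_line(age, height)
predicted_height = a + b * 7  # predicted height at age 7
print(f"height = {a:.1f} + {b:.1f} * age; at age 7: {predicted_height:.1f} cm")

# Swapping the variables fits a different line: its slope is not simply 1 / b.
a_swapped, b_swapped = fit_line(height, age)
print(f"age = {a_swapped:.2f} + {b_swapped:.4f} * height")
```

With these numbers the fitted line is height = 74.4 + 6.6 * age, so the predicted height at age 7 is 120.6 cm. The slope of the swapped fit is about 0.1509, whereas 1 / 6.6 is about 0.1515.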