Data and variables
To estimate the above-mentioned models, we use a balanced panel dataset of 120 countries from 2002 to 2020. The sample period and the countries are selected on the basis of data availability.
Economic Growth (ECOG)
ECOG serves as the dependent variable. Consistent with prior studies, we measure it as the “log difference of real per capita GDP multiplied by 100.” Per capita GDP is measured in constant 2015 prices.
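As a minimal sketch of this definition, the snippet below computes growth as 100 times the log difference of real per capita GDP for a single country. The function name `growth_rate` and the GDP figures are hypothetical illustrations, not values from our dataset.

```python
import math

def growth_rate(gdp_per_capita):
    """Return 100 * (log y_t - log y_{t-1}) for consecutive years.

    The first year has no predecessor, so the output has one fewer element
    than the input.
    """
    return [
        100.0 * (math.log(curr) - math.log(prev))
        for prev, curr in zip(gdp_per_capita, gdp_per_capita[1:])
    ]

# Hypothetical real per capita GDP (constant prices) for three consecutive years.
series = [10000.0, 10200.0, 10098.0]
ecog = growth_rate(series)
```

For small changes the log difference is close to the percentage growth rate, which is why the result is scaled by 100.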
Shadow Economy (SE)
The models use the SE as the core independent variable. SE is defined here as the hidden production of goods and services that takes place outside official regulation by public authorities for monetary, regulatory, or institutional reasons. Monetary motives include tax evasion and the avoidance of social security contributions; regulatory motives include avoiding bureaucracy or onerous regulations; institutional motives include bribery, which is closely related to weak political institutions and a poor rule of law (Elgin et al. 2021). We use two estimates of SE. 1) The MIMIC Model (Multiple Indicators and Multiple Causes) is based on the premise that the SE is determined by several latent factors that can be inferred from observed indicators, such as tax and unemployment rates. The MIMIC model is widely used in the literature because it accounts for the unobserved nature of the SE while considering multiple causes of informality, such as regulatory burden and institutional quality. 2) The DGE Model (Dynamic General Equilibrium Model) uses a dynamic generalization of the MIMIC model that incorporates both observed and unobserved heterogeneity, offering an alternative and complementary perspective on SE activities. The DGE model captures cross-country variations in the informal sector, which may be influenced by a range of country-specific factors, such as governance and economic structure. We chose the MIMIC and DGE approaches for their robustness and their ability to address several challenges in SE measurement, such as endogeneity, data gaps, and measurement errors. Both have been widely used in similar studies because of their validity and their ability to deal with unobserved factors that affect the SE. Furthermore, extensive datasets based on both models are available for nearly all developing countries, which allows a comprehensive analysis across a diverse range of economies and makes them particularly suitable for global studies involving developing nations.
Financial Inclusion (FINI)
FINI acts as the moderating variable in the models. We use two indicators of FINI: financial market development (FMD) and financial institution development (FID). Both indices range from 0 to 1, with higher values indicating greater development of financial markets and financial institutions, respectively. We multiply both indices by 100 so that their measurement scale is comparable with that of SE.
Control variables
We incorporate several control variables. The “log of the initial GDP per capita” is a proxy for initial income (Initial). Including this variable adds a dynamic element to our growth model and allows us to examine whether the countries in our sample converge or diverge in the growth process. Physical capital (Phy.Cap) is measured by “gross fixed capital formation as % of GDP”. Human capital (H.Cap) is measured by “gross secondary school enrollment in percentage”. Both physical and human capital are essential to all growth models (Barro, 2003). Industrialization (IND) is measured by the “value added in industry, including construction, as % of GDP”. Here, value added captures the additional economic worth that industries create through their production processes. Inflation (INF) is measured by the “GDP deflator as an annual percentage”. The variables are also described in Table 1.
Model specifications
We propose a model based on the above arguments and previous studies. To enhance the accuracy of the models and minimize the risk of omitting crucial variables, we incorporate several control variables, as recommended by Barro (2003). First, we examine the effect of the SE on ECOG, starting from the following model:
$$\log Y_{it}=\alpha \log Y_{it-1}+\beta_{1}\mathit{SE}_{it}+\beta^{\prime}Z_{it}+v_{i}+\omega_{t}+\mu_{it}$$
(2)
Where the subscripts i and t denote the country and time period, respectively. \(v_{i}\) and \(\omega_{t}\) are the country-specific and time-specific effects, respectively. \(\mu_{it}\) is the usual error term, and \(\alpha\) and \(\beta\) are the respective coefficients. \(\log Y\) represents the log of real per capita GDP. SE is the shadow economy. Z is the vector of control variables: physical capital (Phy.Cap), human capital (H.Cap), industrialization (IND), and inflation (INF).
We follow Dollar and Kraay (2003) to formulate the more conventional model, in which growth is the dependent variable and is regressed on initial income, the SE, and a set of control variables. Subtracting lagged income from both sides of Eq. (2), we can express it in an alternative form as follows:
$$\log Y_{it}-\log Y_{it-1}=\alpha \log Y_{it-1}-\log Y_{it-1}+\beta_{1}\mathit{SE}_{it}+\beta^{\prime}Z_{it}+v_{i}+\omega_{t}+\mu_{it}$$
(3)
Or
$$\log Y_{it}-\log Y_{it-1}=\left(\alpha-1\right)\log Y_{it-1}+\beta_{1}\mathit{SE}_{it}+\beta^{\prime}Z_{it}+v_{i}+\omega_{t}+\mu_{it}$$
(4)
Or
$$\mathit{ECOG}_{it}=\beta_{0}\mathit{Initial}_{it}+\beta_{1}\mathit{SE}_{it}+\beta^{\prime}Z_{it}+v_{i}+\omega_{t}+\mu_{it}$$
(5)
Where \(\mathit{ECOG}_{it}=\log Y_{it}-\log Y_{it-1}\) represents ECOG, \(\beta_{0}=\alpha-1\) is the convergence coefficient, and \(\mathit{Initial}_{it}=\log Y_{it-1}\) is the initial income used to check for convergence in the growth process. Equation (5) is the final model to be estimated to examine the impact of the SE on ECOG.
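The equivalence between the level form in Eq. (2) and the growth form in Eq. (5) can be checked numerically. The sketch below is only a sanity check under assumed values: the parameter values, the data point, and the single term `other` (which stands in for the controls, fixed effects, and error term) are all hypothetical.

```python
import math

# Hypothetical parameter values and data, chosen only for illustration.
alpha, beta1 = 0.95, -0.02
log_y_lag = math.log(10000.0)  # log Y_{it-1}, i.e. Initial_{it}
se = 30.0                      # SE_{it}
other = 0.5                    # stands in for beta'Z_{it} + v_i + omega_t + mu_{it}

# Eq. (2): level form.
log_y = alpha * log_y_lag + beta1 * se + other

# Eq. (5): growth form with beta0 = alpha - 1.
beta0 = alpha - 1.0
ecog_from_eq5 = beta0 * log_y_lag + beta1 * se + other

# Growth computed directly from the level equation.
ecog_from_eq2 = log_y - log_y_lag
```

Both routes give the same growth figure up to floating-point rounding, and a value of \(\alpha\) below one corresponds to a negative convergence coefficient \(\beta_{0}\).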
To explore the moderating role of FINI in the nexus between the SE and ECOG, we reformulate Eq. (5) as follows:
$$\mathit{ECOG}_{it}=\beta_{0}\mathit{Initial}_{it}+\beta_{1}\mathit{SE}_{it}+\beta_{2}\mathit{FINI}_{it}+\beta_{3}\left(\mathit{SE}_{it}\times \mathit{FINI}_{it}\right)+\beta^{\prime}Z_{it}+v_{i}+\omega_{t}+\mu_{it}$$
(6)
Where FINI represents financial inclusion and SE × FINI denotes the interaction between SE and FINI. In this study, we use two indicators of FINI: financial market development (FMD) and financial institution development (FID). Equation (6) is therefore rewritten in extended form as follows:
$$\mathit{ECOG}_{it}=\beta_{0}\mathit{Initial}_{it}+\beta_{1}\mathit{SE}_{it}+\beta_{2}\mathit{FMD}_{it}+\beta_{3}\left(\mathit{SE}_{it}\times \mathit{FMD}_{it}\right)+\beta^{\prime}Z_{it}+v_{i}+\omega_{t}+\mu_{it}$$
(7)
$$\mathit{ECOG}_{it}=\beta_{0}\mathit{Initial}_{it}+\beta_{1}\mathit{SE}_{it}+\beta_{2}\mathit{FID}_{it}+\beta_{3}\left(\mathit{SE}_{it}\times \mathit{FID}_{it}\right)+\beta^{\prime}Z_{it}+v_{i}+\omega_{t}+\mu_{it}$$
(8)
To explore the marginal impact of the SE on ECOG at different levels of FINI, we take the partial derivative of Eqs. (7) and (8) with respect to SE as follows:
$$\frac{\partial\,\mathit{ECOG}_{it}}{\partial\,\mathit{SE}_{it}}=\beta_{1}+\beta_{3}\mathit{FMD}_{it}$$
(9)
$$\frac{\partial\,\mathit{ECOG}_{it}}{\partial\,\mathit{SE}_{it}}=\beta_{1}+\beta_{3}\mathit{FID}_{it}$$
(10)
The signs of \(\beta_{1}\) and \(\beta_{3}\) in Eqs. (9) and (10) reveal whether FINI and SE act as complements or substitutes. If both coefficients have the same sign, there is a complementarity effect; if their signs differ, there is a substitutability effect.
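To illustrate how Eqs. (9) and (10) are read, the sketch below evaluates the marginal effect of SE at a few FINI levels and computes the level at which the effect changes sign, which follows from setting \(\beta_{1}+\beta_{3}\mathit{FINI}=0\). The function names and the coefficient values are hypothetical; they are not estimates from this study.

```python
def marginal_effect(beta1, beta3, fini):
    """Marginal effect of SE on ECOG at a given FINI level: beta1 + beta3 * FINI."""
    return beta1 + beta3 * fini

def turning_point(beta1, beta3):
    """FINI level at which the marginal effect changes sign: -beta1 / beta3."""
    if beta3 == 0:
        raise ValueError("no interaction effect: beta3 is zero")
    return -beta1 / beta3

# Hypothetical coefficients with opposite signs (the substitutability case).
b1, b3 = -0.08, 0.002

# FINI is on a 0-100 scale after the rescaling described in the data section.
effects = {level: marginal_effect(b1, b3, level) for level in (10, 40, 70)}
threshold = turning_point(b1, b3)
```

With these assumed values the marginal effect of SE is negative at low FINI, close to zero near the turning point, and positive above it. A turning point is informative only when it falls inside the observed range of the FINI index.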
Methodology
ECOG models are often subject to endogeneity (Baklouti and Boujelbene, 2020). Endogeneity can stem from omitted variables, simultaneity, measurement errors, or reverse causality. The resulting bias can produce inconsistent estimates, potentially leading to inaccurate implications, ambiguous conclusions, and incorrect theoretical interpretations (Ullah et al. 2018). In the current study, the main explanatory variables are likely to be endogenous, so our model may suffer from this problem. For example, Baklouti and Boujelbene (2020) found that the SE and ECOG cause one another. Tax evasion, a typical SE activity, reduces government revenue, which limits the public funds available to finance essential items such as infrastructure, education, and health care. Insufficient government revenue thus impedes ECOG by depriving the economy of resources for public investment and development (Baklouti and Boujelbene, 2019). In the other direction, weak economic growth can enlarge the SE: as the market share of legal businesses shrinks, the incentive to engage in SE activities rises (Baklouti and Boujelbene, 2020). The literature also finds that lagged levels of ECOG may affect its current values (Baklouti and Boujelbene, 2019; Baklouti and Boujelbene, 2020).
As noted above, our models are at risk of endogeneity. Neglecting this bias is a serious problem, as the resulting estimates may be misleading or invalid (Ullah et al. 2018). OLS is problematic when there is unobserved heterogeneity in the data. Fixed-effects (within) estimation is a potential solution. However, it relies on the assumption that the current values of the explanatory variables are entirely unrelated to past values of the dependent variable (Wintoki et al. 2012), that is, that the explanatory variables are strictly exogenous. The fixed-effects estimates are biased if this assumption is violated, and in our setting strict exogeneity cannot be assumed. Hence, instead of relying on static models, we employ the Generalized Method of Moments (GMM) estimator to estimate our models.
The GMM estimator proposed by Arellano and Bond (1991) and Blundell and Bond (1998) is widely used in panel data analysis. It provides reliable results even when several sources of endogeneity are present, such as simultaneity, unobserved heterogeneity, and dynamic endogeneity (Roodman, 2009), and it addresses autocorrelation, heteroskedasticity, and endogeneity in the estimates (Blundell and Bond, 1998). The technique controls for country-specific and period-specific effects and handles dynamic specifications in which one or more lags of the dependent variable enter as explanatory variables. The GMM estimator is commonly categorized into “difference GMM” and “system GMM”. Accordingly, we estimate our models with both the difference GMM and system GMM estimators rather than OLS or fixed effects.
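To make the distinction between the two estimators concrete, the first-difference transformation that underlies difference GMM can be written for Eq. (5) as below. This is a standard textbook sketch rather than an additional specification, and the first-difference operator \(\Delta\) is notation introduced here only for this illustration.

$$\Delta \mathit{ECOG}_{it}=\beta_{0}\Delta \mathit{Initial}_{it}+\beta_{1}\Delta \mathit{SE}_{it}+\beta^{\prime}\Delta Z_{it}+\Delta \omega_{t}+\Delta \mu_{it}$$

Differencing removes the country-specific effect \(v_{i}\). Because \(\Delta \mathit{Initial}_{it}\) is correlated with \(\Delta \mu_{it}\) by construction, difference GMM uses lagged levels of the regressors as instruments for the differenced equation (Arellano and Bond, 1991). System GMM additionally estimates the equation in levels, using lagged differences as instruments (Blundell and Bond, 1998).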
All data processing and statistical analyses for this study are conducted in Stata, a widely used program for managing large datasets and performing econometric analyses. Both estimators are implemented with the “xtabond2” command, and we apply its “robust” option to account for heteroskedasticity and autocorrelation in the standard errors. Additionally, we compute the Variance Inflation Factor (VIF) to check for multicollinearity among the explanatory variables. VIF values below 10 are considered acceptable and indicate that multicollinearity does not pose a significant issue in the model.
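The VIF check can be sketched outside Stata as follows. This is an illustrative Python version, not the routine used in the study: it relies on the identity that the VIF of regressor j, \(1/(1-R_{j}^{2})\), equals the j-th diagonal element of the inverse correlation matrix of the regressors. The function names and the data values are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfect multicollinearity: matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF of each regressor: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical values for three regressors; not data from this study.
phy_cap = [20.0, 22.0, 25.0, 21.0, 27.0, 24.0]
h_cap = [60.0, 75.0, 70.0, 85.0, 80.0, 95.0]
inflation = [3.0, 8.0, 2.0, 6.0, 4.0, 9.0]
vifs = vif([phy_cap, h_cap, inflation])
```

A VIF of 1 means a regressor is uncorrelated with the others, and larger values signal stronger collinearity, which is why the threshold of 10 is applied to each regressor separately.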
Two post-estimation tests are used to validate the GMM results (Arellano and Bond, 1991). First, the Arellano-Bond test for serial correlation examines whether the differenced errors are first-order correlated (AR1) but not second-order correlated (AR2); the absence of second-order correlation supports the use of lagged values as instruments. Examining the correlation structure of the errors in this way is vital for judging whether the model is robust. Second, the Hansen J test assesses the joint validity of the instruments and the instrument specification, which provides a further check on the model’s overall appropriateness.
