Abstract
The Global Repository of Income Dynamics (GRID) is a new open-access, cross-country database that contains a wide range of micro statistics on income inequality, dynamics, and mobility. It has four key characteristics: it is built on micro panel data drawn from administrative records; it fully exploits the longitudinal dimension of the underlying data sets; it offers granular descriptions of income inequality and income dynamics for finely defined subpopulations; and it is designed from the ground up with the goals of harmonization and cross-country comparability. This paper introduces the database and presents a set of global trends in income inequality and income dynamics across the 13 countries that are currently in GRID. Our results are based on the statistics created for GRID by the 13 country teams who also contributed to this special issue with individual articles.
Original language | English (US) |
---|---|
Pages (from-to) | 1321-1360 |
Number of pages | 40 |
Journal | Quantitative Economics |
Volume | 13 |
Issue number | 4 |
State | Published - Nov 2022 |
All Science Journal Classification (ASJC) codes
- Economics and Econometrics
Keywords
- Administrative data
- E24
- J24
- J31
- cross-country
- database
- granular
- harmonized
- inequality
- longitudinal
- mobility
- volatility
In: Quantitative Economics, Vol. 13, No. 4, 11.2022, p. 1321-1360.
Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - Global trends in income inequality and income dynamics
T2 - New insights from GRID
AU - Guvenen, Fatih
AU - Pistaferri, Luigi
AU - Violante, Giovanni L.
N1 - In this section, we discuss the salient properties of the income growth distribution. Whereas the distribution of income levels is informative about the cross-sectional dispersion in income or income inequality, the distribution of income growth tells us about something quite different: how income evolves for the same individuals over time. 
As such, it is closely related to income risk or uncertainty and is often equated to the latter under assumptions commonly made in the literature. Thus, not only do income levels and income growth correspond to different concepts, but their properties can also be vastly different, and indeed that will be the case, as we see in this section.
We begin with the density of annual growth in log income, which succinctly summarizes many of the key features we discuss in the rest of this section and the one that follows. Figure 5 plots the density of annual log income changes for men and women in Canada (the basic features we describe here hold for all the countries in GRID). Throughout the next two sections, income growth refers to the residualized measure defined in Section 2 ($g_{it}^{1}$) unless otherwise noted. Before even looking at the shape, notice that the standard deviation of 1-year log changes is 0.51 for men and 0.59 for women, which indicates a remarkably high level of volatility. If the data were Gaussian, it would imply that the average worker in Canada faces a typical income shock between 50% and 60%. Although large, 0.50 is in fact the average value in the sample of countries, which ranges from 0.38 for Germany to 0.66 for Mexico.
[Figure 5: Density of annual change in log income by gender.]
Earnings growth, however, is not close to being Gaussian. That is the second main takeaway from the figure, which superimposes a Gaussian density with the same standard deviation on top of the empirical density. Relative to the Gaussian density, the data have a very sharp peak in the middle, indicating a much larger probability of small income changes; very thin shoulders, indicating a much lower probability of middling shocks (around ±σ); and longer tails, indicating a higher probability of extreme shocks. The peakedness and long tails are reflected in a very high kurtosis, in excess of 15 for men and 10 for women, relative to 3 for a Gaussian. 
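To make the comparison with a Gaussian benchmark concrete, the following minimal sketch computes the standard deviation and kurtosis of a simulated sample of log income changes. The mixture parameters (90% small shocks with standard deviation 0.1, 10% large shocks with standard deviation 1.5) are hypothetical choices for illustration, not GRID estimates, and the function name `dispersion_and_kurtosis` is likewise an assumption.

```python
import math
import random


def dispersion_and_kurtosis(changes):
    """Return (standard deviation, kurtosis) of a sample of log income changes."""
    n = len(changes)
    mean = sum(changes) / n
    var = sum((x - mean) ** 2 for x in changes) / n
    fourth = sum((x - mean) ** 4 for x in changes) / n
    return math.sqrt(var), fourth / var ** 2


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical mixture (not GRID data): most changes are small, a few are very large.
mixture = [
    rng.gauss(0.0, 0.1) if rng.random() < 0.9 else rng.gauss(0.0, 1.5)
    for _ in range(100_000)
]
sd_mix, kurt_mix = dispersion_and_kurtosis(mixture)

# Gaussian benchmark with the same standard deviation as the mixture.
gaussian = [rng.gauss(0.0, sd_mix) for _ in range(100_000)]
sd_gauss, kurt_gauss = dispersion_and_kurtosis(gaussian)

print(f"mixture:  sd={sd_mix:.2f}, kurtosis={kurt_mix:.1f}")
print(f"gaussian: sd={sd_gauss:.2f}, kurtosis={kurt_gauss:.1f}")
```

The simulated mixture shares the qualitative pattern described above, a sharp peak with long tails, so its kurtosis lies far above the Gaussian value of 3 even though both samples have the same standard deviation.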
The sharp peak compresses the scale and makes it difficult to see some other important features of the density. Therefore, in Figure 6 we plot the log of the density for all GRID countries, which reveals two more key features that are harder to see when plotting the empirical density. All panels have the same x- and y-axis limits for ease of comparison across countries.
First, the tails of the distribution of income growth are very long and close to linear for the vast majority of countries. For comparison, the superimposed Gaussian log density with the same variance has tails that fall very quickly. The near-linear shape of the log density (which can be seen from the good fit of the linear regression line approximately beyond ±3σ) highlights an important feature: income growth has a double-Pareto tail distribution.
Furthermore, the tails are asymmetric, with the left tail thicker than the right tail, and much more so in some countries. To quantify this, we can look at the estimated slopes of the linear regression fit for the left and right tails, reported in Table 2. The left tail is thicker than the right tail in every country, and the gap is sizable, exceeding 0.8 in absolute value for every country except the Latin American countries (Argentina, Brazil, Mexico) and Italy.
[Figure 6: Density of annual change in log income (men).]

Table 2
Country | Left tail | Right tail | Left − \|Right\| | σ(ΔY) |
---|---|---|---|---|
Argentina | 1.88 | −2.41 | −0.53 | 0.61 |
Brazil | 1.99 | −2.49 | −0.50 | 0.63 |
Canada | 1.38 | −2.50 | −0.98 | 0.50 |
Denmark | 1.78 | −2.97 | −1.19 | 0.40 |
France | 1.88 | −2.71 | −0.83 | 0.45 |
Germany | 2.02 | −3.51 | −1.42 | 0.38 |
Italy | 2.19 | −2.27 | −0.09 | 0.45 |
Mexico | 1.97 | −2.22 | −0.26 | 0.66 |
Norway | 1.18 | −1.98 | −0.80 | 0.50 |
Spain | 1.58 | −2.44 | −0.86 | 0.42 |
Sweden | 1.87 | −3.12 | −1.50 | 0.43 |
US | 1.29 | −2.14 | −0.96 | 0.57 |

Note: The numbers in the first two columns report the slope of the linear regression fit to the tails over [−4,−1] and [1,4], respectively, which are the tail indices of the Pareto tails. 
The UK is omitted from this table because information on the tail index is not available.
Comparing across countries, a few remarks are in order. First, there is no one-to-one mapping between the thickness of the tails and the standard deviation of income change. For example, the correlation between the average of the two tail indices for each country and the standard deviation is only −0.42. This is because the tail index measures the slope of the density in the tails only, so a density with a thicker tail may still have a smaller standard deviation if the level of the density at the point where we start to measure is low. In other words, the tail index can be interpreted as measuring the likelihood of a very large shock relative to a middling shock, without reference to the likelihood of the latter. This point is often overlooked in discussions of Pareto tail indices.
Second, the two Anglo-Saxon countries, the US and Canada, are surprisingly similar to each other in terms of the thickness of each tail (1.33 and −2.31 for Canada versus 1.38 and −2.34 for the US) as well as in the overall dispersion (0.50 versus 0.56).
Third, two of the Nordic countries, Denmark and Sweden, together with Germany, rank at the other end of the spectrum, with the lowest overall standard deviation of income growth and the thinnest right tails, indicating a smaller chance of large upward income swings in these countries relative to others. Another interesting exception is Norway, which has the thickest tails, both left and right, in the sample (1.18 and −1.98). This compares to 1.78 and 1.87 for the left tail and −3.37 and −2.97 for the right tail in Denmark and Sweden. Although we are not aware of an explanation for why this is the case for Norway (given the low income inequality and the similarities in labor market institutions to other Scandinavian countries), this fact certainly seems worth further investigation. 
Finally, the Latin American and Southern European countries are between these two extremes.
Our first look at the density reveals four key properties of the income growth distribution. The distribution has: (i) very high dispersion, (ii) very high excess kurtosis, (iii) thick double Pareto tails, and (iv) negative skewness, especially out in the tails, as seen from a significantly thicker left tail relative to the right. These four properties confirm that the earlier results documented by Guvenen, Karahan, Ozkan, and Song (2021) for the US are robust across a broad cross-section of countries.
We begin by investigating whether there are any trends in idiosyncratic income risk. This question is of obvious interest given the importance of idiosyncratic risk for individual decisions and welfare, and consequently, for social insurance and government policy, among other issues. Since the 1990s, the conventional wisdom among economists has been that idiosyncratic risk has increased substantially since the 1970s, a conclusion drawn from empirical analyses of survey-based panel data sets showing rising income volatility. Following the seminal work of Gottschalk and Moffitt (1994) and Moffitt and Gottschalk (1995), a long list of papers that analyze US survey data confirmed their finding and found evidence of a continued rise in volatility all the way to the 2010s. Against this backdrop, several recent papers studied US administrative data from the Social Security Administration on earnings histories and reached the opposite conclusion: income volatility at both the short and long horizon has been either flat (Congressional Budget Office (2007)) or declining (Sabelhaus and Song (2010) and Bloom, Guvenen, Pistaferri, Sabelhaus, Salgado, and Song (2017)) since the early 1980s. GRID provides an ideal opportunity to not only revisit this question for the US but also examine possible trends in a wide cross-section of countries. 
Figure 7 plots the standard deviation of annual income growth for both men (line with squares) and women (line with circles). Most lines are fairly flat, with a few countries (e.g., Argentina and Brazil) showing clear declining trends and a few countries (e.g., Italy, Norway, and Sweden) showing a rising trend, more so for men than for women. Among Anglo-Saxon countries, the trend is flat or slightly declining for Canada and the UK and strongly declining for the US. Recall that the GRID data source for the US is the Longitudinal Employer-Household Dynamics (LEHD) program from the US Census Bureau (see Table 1), not the SSA as in the studies cited above. Hence, this constitutes independent evidence on the flat/declining income volatility trend for the United States. Other countries, such as Denmark, France, and Mexico, show an overall flat pattern, indicating no specific trends in income volatility. Finally, Spain shows a cyclical rise in volatility during the Great Recession and its aftermath, but volatility falls back to its initial level for men and is lower for women by the end of the period.
[Figure 7: Trends in income volatility (men and women). Note: The figure shows the P9010 differential for the UK because the standard deviation is not available.]
To summarize, Figure 7 paints a somewhat mixed picture, with volatility flat for about half of the countries, declining for some countries, and rising for others. It does not provide any evidence of a widespread rise in volatility or income risk around the world. This conclusion echoes our findings above for income inequality, which also showed a mixed picture, with trends in inequality being more idiosyncratic and country specific than reflecting a global rise in inequality. As we will see in a moment, this is not always the case: for a number of the empirical questions we study next, far clearer global trends are observed in the vast majority of countries. 
How does idiosyncratic income risk change over the business cycle? Do any robust patterns hold across this broad set of countries? Or does the answer depend on the labor market and other institutions of each country? The answers to these questions are critical for many macroeconomic and policy design features that account for individual heterogeneity and incomplete insurance.
To answer these questions, we begin by examining how different moments of the income growth distribution vary with the business cycle. Following earlier work documenting the strong procyclicality of the skewness of income changes, in Figure 8 we plot the Kelley skewness measure of the 1-year log income change for men (line with squares, left axis) together with the annual growth rate of GDP per capita (dashed line, right axis) over time. The latter is a natural indicator of the business cycle, so the comovement between the two series gives a direct visual indication of the cyclicality of the statistic plotted. As seen in the figure, the two lines comove to a remarkable extent in almost every country, especially during deep recessions, showing that the skewness of income change is strongly procyclical in all countries in GRID. The pattern for women looks qualitatively very similar, with a somewhat smaller amplitude of fluctuations in skewness, so the analogous figures are included in the Online Appendix (Guvenen, Pistaferri, and Violante (2022)) for brevity.
[Figure 8: (Kelley) skewness fluctuations over the business cycle (men).]
Regarding the magnitudes, are the procyclical fluctuations large? For most countries, the answer is yes. The advantage of the Kelley statistic is that it can easily be mapped into the relative sizes of P9050 and P5010 (or the upper and lower tails) in P9010 of the shock distribution. 
For example, in Argentina, Kelley skewness went from −0.29 in 2001 to +0.32 in 2003, implying that the shares of P9010 of the income growth distribution accounted for by the upper and lower tails flipped from 1/3 and 2/3 to 2/3 and 1/3 in a short span of 2 years. (Notice that this comparison controls for the rise in median income growth.) This is a major reversal of the income shock distribution in a very short period. Of course, for Argentina this period was preceded by a large decline in skewness coinciding with the deep 2001 recession, illustrating the procyclical nature of skewed income risk. Similar or larger swings happened in several other countries (e.g., Spain, Italy, Denmark, and the US) and during different episodes, such as the severe recession in Europe during the early 1990s as well as the Great Recession (with skewness falling from 0.24 to −0.49 in 3 years in Spain and from 0.22 to −0.37 in 2 years in Italy).
To sum up, a major manifestation of changes in income risk between expansions and recessions is in the skewness of the income change or shock distribution. These procyclical swings are large and synchronized with the business cycle.
To more precisely quantify the extent of cyclicality in different moments, we adopt a simple regression framework. In particular, we regress a given moment $m$ on a constant, a linear time trend, and log GDP per capita growth, $\Delta(\log GDP_t) \equiv \log(GDP_{t+1}) - \log(GDP_t)$:
$$m(\Delta y_t) = \alpha + \gamma t + \beta^{m} \times \Delta(\log GDP_t) + u_t, \qquad (1)$$
for each country and separately for men and women. We normalize GDP per capita growth to have unit standard deviation, which makes the estimated $\beta$'s easier to compare across countries. Table 3 reports the estimates of $\beta^{m}$ (multiplied by 100), which measure the cyclical sensitivity of moment $m$. A significant and positive $\beta^{m}$ indicates a procyclical moment, and a significant and negative one a countercyclical moment. 
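The mapping from Kelley skewness to tail shares and the regression in equation (1) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the series in the usage example are synthetic, and the Newey–West t-statistics are not computed here. Kelley skewness is taken to be (P9050 − P5010)/P9010, the standard definition, which is consistent with the shares quoted for Argentina.

```python
import math


def kelley_skewness(p10, p50, p90):
    """Kelley skewness: (P9050 - P5010) / P9010."""
    return ((p90 - p50) - (p50 - p10)) / (p90 - p10)


def upper_tail_share(kelley):
    """Share of P9010 accounted for by the upper tail (P9050)."""
    return (1.0 + kelley) / 2.0


def solve_ols(rows, y):
    """OLS coefficients from the normal equations, via Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]


def cyclical_sensitivity(moment, gdp_growth):
    """Coefficient on GDP growth (scaled to unit standard deviation) in equation (1)."""
    n = len(gdp_growth)
    mean = sum(gdp_growth) / n
    sd = math.sqrt(sum((g - mean) ** 2 for g in gdp_growth) / (n - 1))
    # Regressors: constant, linear time trend, normalized GDP growth.
    rows = [[1.0, float(t), g / sd] for t, g in enumerate(gdp_growth)]
    alpha, gamma, beta = solve_ols(rows, moment)
    return beta


# Argentina example from the text: skewness of -0.29 versus +0.32.
share_2001 = upper_tail_share(-0.29)  # roughly one third
share_2003 = upper_tail_share(0.32)   # roughly two thirds

# Synthetic series (not GRID data) built with a true sensitivity of 0.15.
growth = [0.02, 0.03, -0.01, 0.04, -0.05, 0.01, 0.025, -0.02, 0.015, 0.035]
g_mean = sum(growth) / len(growth)
g_sd = math.sqrt(sum((g - g_mean) ** 2 for g in growth) / (len(growth) - 1))
skew = [0.10 + 0.002 * t + 0.15 * g / g_sd for t, g in enumerate(growth)]
beta_hat = cyclical_sensitivity(skew, growth)  # recovers 0.15 up to rounding
```

Multiplying `beta_hat` by 100 puts the estimate on the scale used for the reported coefficients.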

Table 3
Moment | Sample | ARG | BRA | CAN | DEN | FRA | GER | ITA | MEX | NOR | SPA | SWE | UK | USA |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Kelley skewness | Males | 15.0 | 5.6 | 8.5 | 8.0 | 8.8 | 7.8 | 14.8 | 5.2 | 3.7 | 27.3 | 7.1 | 5.2 | 7.3 |
 | | (6.0) | (4.6) | (9.7) | (12.3) | (9.3) | (5.9) | (10.0) | (15.8) | (4.7) | (16.7) | (4.8) | (6.9) | (7.7) |
 | Females | 8.5 | 4.7 | 3.4 | 2.0 | 3.9 | 3.2 | 6.3 | 4.5 | 2.1 | 12.3 | 1.8 | 2.8 | 3.8 |
 | | (4.5) | (3.8) | (7.7) | (2.0) | (5.2) | (3.5) | (5.0) | (8.1) | (2.6) | (14.0) | (2.6) | (4.2) | (8.2) |
P9050: upper tail | Males | 6.7 | 2.7 | 2.4 | 1.8 | 1.7 | 1.4 | 3.0 | 2.6 | 0.5 | 5.9 | 1.8 | 0.5 | 2.3 |
 | | (4.7) | (4.2) | (8.4) | (5.4) | (12.7) | (8.0) | (6.1) | (14.6) | (1.4) | (8.3) | (5.5) | (3.4) | (8.7) |
 | Females | 4.2 | 1.7 | 0.8 | 1.5 | 0.6 | 0.4 | 2.3 | 1.9 | −0.9 | 4.5 | 1.5 | 0.1 | 1.6 |
 | | (3.0) | (2.8) | (3.7) | (2.6) | (2.9) | (1.9) | (2.2) | (3.5) | (−1.2) | (8.9) | (3.7) | (0.5) | (8.4) |
P5010: lower tail | Males | −9.6 | −4.6 | −4.2 | −2.5 | −2.9 | −1.9 | −5.6 | −4.3 | −1.9 | −13.4 | −2.5 | −1.8 | −5.0 |
 | | (−6.7) | (−3.2) | (−7.9) | (−5.6) | (−7.3) | (−4.6) | (−11.1) | (−15.1) | (−6.4) | (−12.3) | (−3.4) | (−4.4) | (−5.9) |
 | Females | −4.7 | −3.7 | −2.7 | 0.2 | −2.4 | −1.2 | −3.0 | −3.6 | −3.4 | −6.0 | −0.1 | −1.0 | −2.0 |
 | | (−5.3) | (−2.9) | (−8.2) | (0.3) | (−4.1) | (−2.7) | (−4.9) | (−19.1) | (−4.5) | (−16.1) | (−0.2) | (−2.7) | (−5.6) |
P9010: volatility | Males | −2.9 | −1.9 | −1.8 | −0.7 | −1.2 | −0.5 | −2.6 | −1.7 | −1.3 | −7.5 | −0.7 | −1.3 | −2.7 |
 | | (−4.9) | (−1.3) | (−4.2) | (−1.1) | (−2.8) | (−1.2) | (−3.6) | (−9.6) | (−2.9) | (−4.3) | (−1.1) | (−2.5) | (−4.0) |
 | Females | −0.5 | −2.0 | −1.9 | 1.7 | −1.8 | −0.9 | −0.7 | −1.7 | −4.3 | −1.5 | 1.4 | −1.0 | −0.4 |
 | | (−0.4) | (−1.4) | (−6.6) | (1.3) | (−2.6) | (−1.5) | (−0.5) | (−4.3) | (−3.8) | (−2.3) | (1.7) | (−1.9) | (−1.1) |
Crow kurtosis | Males | −0.73 | 0.06 | 0.01 | 0.21 | 0.31 | 0.04 | 0.58 | 0.21 | −0.15 | −1.17 | 0.32 | 0.35 | 0.01 |
 | | (−1.9) | (0.3) | (0.1) | (0.9) | (1.5) | (0.2) | (2.0) | (1.6) | (−1.3) | (−2.5) | (2.5) | (3.4) | (0.1) |
 | Females | −0.85 | 0.09 | 0.08 | 0.49 | 0.31 | 0.10 | 0.54 | −0.02 | 0.30 | −0.70 | 0.15 | 0.48 | −0.20 |
 | | (−2.4) | (0.3) | (0.5) | (1.8) | (3.2) | (1.3) | (2.9) | (−0.1) | (2.4) | (−2.5) | (1.1) | (4.3) | (−2.1) |

Note: Each cell reports the cyclical sensitivity coefficient, $\beta^{m}$, in a regression of statistic $m$ on log annual GDP change plus a constant and a time trend (equation (1)). Except for the Crow kurtosis, the reported coefficient is multiplied by 100 for ease of interpretation. 
The numbers in parentheses are the t-statistics computed using Newey–West standard errors with 3 lags.
The top panel contains the estimates for Kelley skewness, which confirm the visually evident strong procyclicality of skewness for men and show the same for women: $\beta^{m}$ is statistically significant, with t-statistics that range from 4 to 17 for men and 2 to 14 for women. The magnitudes are smaller for women, with the lowest coefficients found in Nordic countries with large public sectors that heavily employ female workers. This finding is consistent with Busch, Domeij, Guvenen, and Madeira (2022), who find that the industry of employment is a key determinant of how cyclical skewed income risk is for a worker. The magnitude of the sensitivity is large, considering that the Kelley statistic is bounded between −1 and 1. For example, a coefficient of 15.0 for Argentina indicates that a two standard deviation swing in GDP growth (which would be typical when going from a normal expansion to a recession) implies a 0.30 drop in the Kelley skewness of income changes.
Clearly, a change in skewness can be driven by a change in the right tail, left tail, or both. To investigate which tail drives the procyclical fluctuations, we run the same cyclicality regressions for P9050 and P5010 separately (next two panels of Table 3). The results suggest that, with one or two exceptions, the right tail is strongly procyclical and the left tail strongly countercyclical for all countries. The magnitudes of the coefficients are comparable, with those on the left tail often slightly larger than those on the right.
Finally, the dispersion (P9010) and kurtosis show more mixed patterns. For men, there is evidence of countercyclical volatility, while for women the pattern is less clear and noisier. Kurtosis does not show a clear cyclical pattern, with coefficients sometimes positive and sometimes negative, and t-statistics indicating significance at the 5% level for only a few countries. 
Overall, this evidence appears too noisy to be economically informative.
The skewness of income changes (a proxy for income risk) varies significantly from expansions to recessions in all countries represented in GRID. In particular, income shocks become more negatively skewed in recessions, with the probability of large negative tail shocks rising and the likelihood of large positive shocks falling. The opposite happens in expansions, which see a rise in the likelihood of large positive shocks and a decline in the likelihood of large negative shocks. As for overall dispersion, while the estimates indicate countercyclical variation, the magnitudes of fluctuations are relatively modest for most countries. Finally, we can identify no robust cyclical patterns in kurtosis.
Table 1 gives an overview of the key characteristics of each of the underlying databases used in GRID. All 13 data sets in our database are, as explained, of an administrative nature and assembled by government agencies. The sample period covers at least 20 years for 10 of the countries, averaging 26 years over the 13 countries. Spain has the shortest sample period (14 years) and the UK the longest (45). 

Table 1
Country | Data set name | Income record frequency | Top coding | Bottom coding | Time span | Number of years | Available data set total size | CS sample† [min–max] |
---|---|---|---|---|---|---|---|---|
Argentina | RELS (Registered Employment Longitudinal Sample) | Monthly | No (pooled) | No | 1996–2015 | 20 | 3% random sample/130K to 230K | 97.2K to 167.6K |
Brazil | RAIS (Relação Anual de Informações Sociais) | Monthly | 120 × MW | No | 1985–2018 | 34 | Population/40M | 15.7M to 45.9M |
Canada | CEEDD (Canadian Employer–Employee Dynamics Database) | Annual | No | No | 1983–2016 | 34 | 89–93% of all ages 25–55 Canadians | 8M to 10.8M |
Denmark | Several registry databases | Annual | No | No | 1987–2016 | 30 | Population | 1.8M to 1.9M |
France | DADS (Déclaration Annuelle des Données Sociales) | Job spell | No | No | 1991–2016 | 26 | 4% random sample | 0.58M to 1.34M (2002 ×2) |
Germany | IAB/TPP (Integrated Employment Biographies/German Taxpayer Panel) | IAB (Job spell)/TPP (annual) | IAB (Yes), TPP (No) | IAB (No)/TPP (nonfilers) | 2001–2016 | 16 | IAB (10% R.S.)/TPP (25% R.S.) | IAB (23.1M to 24.9M)/TPP (16M to 22.4M) |
Italy | INPS (Istituto Nazionale della Previdenza Sociale) LoSal | Job spell | Yes (645 euro daily threshold) | No | 1985–2016 | 32 | 6.6% R.S. | 700K |
Mexico | IMSS (Instituto Mexicano del Seguro Social) | Monthly | 25 × MW | 1 × MW | 2005–2019 | 15 | Population/17M to 26M | 12.4M to 19.6M |
Norway | Inntekts-og formuesregisteret | Annual | No | No | 1993–2017 | 25 | Population | 1.96M to 2.0M |
Spain | MCVL (Muestra Continua de Vidas Laborales) | Job spell | No | No | 2005–2018 | 14 | 4% random sample | 405K to 463K |
Sweden | LOUISE (Longitudinel databas kring utbildning, inkomst och sysselsättning) | Annual | No | No | 1985–2016 | 32 | Population | 2.95M to 3.2M |
UK | ASHE (Annual Survey of Hours and Earnings) | Weekly earnings | No | No | 1975–2020 | 45 | 1% sample/140K to 180K | 93.1K to 121K |
US | LEHD (Longitudinal Employer-Household Dynamics) | Quarterly | No | No | 1998–2019 | 22 | Population (excludes a few states) | 82.1M to 95.6M |

Note: All data are aggregated to an annual level to calculate GRID statistics. 
In each country, the databases are produced by the following agencies: RELS by the Ministry of Labor, Employment, and Social Security; RAIS by Ministério da Economia; CEEDD by Statistics Canada; Danish data registries by Statistics Denmark; DADS by INSEE; IAB by the Institute for Employment Research and TPP by the Research Data Centre of the Statistical Offices of the Federal States; INPS-LoSal by INPS (Istituto Nazionale della Previdenza Sociale); IMSS data by Instituto Mexicano del Seguro Social; Inntekts-og formuesregisteret by Statistics Norway; MCVL by Dirección General de Ordenación de la Seguridad Social; LOUISE by Statistics Sweden; ASHE by ONS (Office for National Statistics); and the LEHD by the US Census Bureau. †CS sample is the cross-sectional sample as defined in the text. MW stands for minimum wage and R.S. for random sample.
Income data are originally recorded at monthly to annual frequencies (with the exception of weekly data in the UK) and aggregated to an annual frequency when needed for calculating all GRID statistics. The data are not top-coded for 10 of the 13 countries. The exceptions are Brazil, which has a very high threshold of 120 times the minimum wage, and Italy and Mexico, which have somewhat lower thresholds that nevertheless bind for less than a few percent of the population. Germany has two data sets that are used jointly in GRID: the IAB from the Social Security administration and the TPP from the tax authorities. The former has been used extensively in past research (e.g., Card, Heining, and Kline (2013), Song, Price, Guvenen, Bloom, and Von Wachter (2019)) but has fairly severe top coding (about 10% of the population), whereas the latter is not top-coded but is bottom-coded as a result of nonfiling. The Germany team synthetically combined these two data sets to obtain statistics that do not suffer from bottom coding or top coding. 
As for sample size, for 7 out of 13 countries (Brazil, Canada, Denmark, Mexico, Norway, Sweden, US), the data sets have nearly complete coverage of the relevant population, and the remaining data sets cover about 3% to 25% of the population, with the exception of the UK, which has 1% coverage. The size of the final cross-sectional sample (defined below) varies from about 100,000 individuals for Argentina and the UK to 2 to 3 million individuals for the middle group of countries (Denmark, Norway, and Sweden) to as high as 25 million for Germany, 45 million for Brazil, and 95 million for the US.
To enhance harmonization and allow meaningful comparisons across countries in the project, we start by imposing three common restrictions. First, we focus on workers between 25 and 55 years old, a range within which most education choices are usually completed and after which workers tend to leave the labor force for retirement. Second, for most of the analysis we drop observations with earnings (defined next) below a threshold (call it y) to avoid using records from workers without a meaningful attachment to the labor force or with very low earnings, which could skew log-based statistics. Specifically, we discard observations with earnings below what workers would earn if they were to work part-time for one quarter at the national minimum wage. For countries without a national minimum wage, we have used the US-specific threshold (in PPP terms). Third, for the countries where labor income is top-coded (Brazil, Italy, and Mexico) we use an imputation procedure.
Each team constructed three separate samples to be used for different parts of the analysis:
1. The cross-sectional (CS) sample is the one used to compute cross-sectional inequality statistics. All individuals who satisfy the three criteria above are in this sample at date t. This sample is the most comprehensive and uses the longest possible time series available. 
2. The longitudinal (LX) sample is used to study the distribution of earnings changes. It includes all individuals in the CS sample who, in addition, have 1-year and 5-year forward earnings changes.
3. The heterogeneity (H) sample is used to study variation across demographic groups defined by observable characteristics (such as age, gender, and permanent income). It includes all individuals in the LX sample for whom, in addition, a permanent earnings measure (see the definition below) can be constructed. For this sample, we only select the last 15–20 years available and always pool observations across years.
Our main variable of interest is annual individual labor earnings (i.e., market income from employment services) comprehensive, whenever possible, of bonuses, overtime pay, tips, commissions, and so on, earned from all jobs held during the calendar year but excluding self-employment income. We asked country teams to construct several measures of earnings for worker $i$ in year $t$:
1. Raw real earnings in levels, $y_{it}$, and logs, $\log(y_{it})$. Real earnings are computed from nominal earnings and a measure of CPI inflation for each country.
2. Residualized log earnings, $\varepsilon_{it}$. This measure is the residual from a regression of log real earnings on a full set of age dummies, separately for each year and gender. It is intended to control for predictable changes in individual earnings (life cycle and business cycle effects).
3. Permanent earnings, $P_{it-1}$. They are defined as average earnings over the previous 3 years, $P_{it}=\sum_{s=t-2}^{t} y_{is}/3$, where $y_{is}$ can include earnings below the threshold y for at most 1 year. The measure is intended to average over transitory income changes and proxy for skill levels.
4. 1-year change in residualized log earnings, $g_{it}^{1}$. It is the 1-year forward change in $\varepsilon_{it}$, defined as $g_{it}^{1}=\Delta\varepsilon_{it}=\varepsilon_{it+1}-\varepsilon_{it}$, where earnings must be above the threshold y for both years.
5. 5-year change in residualized log earnings, $g_{it}^{5}$. It is the 5-year forward change in $\varepsilon_{it}$, defined as $g_{it}^{5}=\Delta\varepsilon_{it}=\varepsilon_{it+5}-\varepsilon_{it}$, where earnings must be above the threshold y for both years.
We now proceed to summarize the stylized facts that emerge from a systematic analysis of the statistics in the common core of the 13 country papers in this special issue.
Publisher Copyright: Copyright © 2022 The Authors.
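The earnings threshold and the earnings measures listed above can be sketched as follows. This is an illustration under stated assumptions, not the country teams' code: all function names are hypothetical, the 20 hours per week and 13 weeks used for the threshold are assumed values for "part-time for one quarter", and the residualization relies on the fact that regressing on a full set of age dummies is equivalent to subtracting age-cell means within a given year and gender.

```python
from statistics import mean


def earnings_threshold(hourly_min_wage, hours_per_week=20, weeks=13):
    """Part-time work for one quarter at the minimum wage (hours and weeks are assumptions)."""
    return hourly_min_wage * hours_per_week * weeks


def residualize(records):
    """records: list of (worker, age, log_earnings) for one year and gender.

    Returns {worker: residual}, where the residual is log earnings minus the
    mean for the worker's age, i.e. the residual from a regression on a full
    set of age dummies.
    """
    by_age = {}
    for _, age, log_y in records:
        by_age.setdefault(age, []).append(log_y)
    age_mean = {age: mean(vals) for age, vals in by_age.items()}
    return {worker: log_y - age_mean[age] for worker, age, log_y in records}


def permanent_earnings(earnings, t, threshold):
    """Average earnings over years t-2..t for one worker ({year: earnings}).

    Returns None if a year is missing or more than one year is below the threshold.
    """
    window = [earnings.get(s) for s in (t - 2, t - 1, t)]
    if any(y is None for y in window):
        return None
    if sum(y < threshold for y in window) > 1:
        return None
    return sum(window) / 3


def forward_change(resid, earnings, t, horizon, threshold):
    """Forward change in one worker's residualized log earnings ({year: value}).

    Returns None unless earnings are at or above the threshold in both years.
    """
    for s in (t, t + horizon):
        if s not in resid or earnings.get(s, 0.0) < threshold:
            return None
    return resid[t + horizon] - resid[t]


# Usage with made-up numbers (the 7.25 hourly minimum wage is hypothetical).
y_min = earnings_threshold(7.25)
resid = residualize([("a", 30, 10.0), ("b", 30, 11.0), ("c", 31, 10.5)])
p = permanent_earnings({2010: 30000.0, 2011: 500.0, 2012: 33000.0}, 2012, y_min)
g1 = forward_change(
    {2012: -0.10, 2013: 0.15}, {2012: 33000.0, 2013: 36000.0}, 2012, 1, y_min
)
```

Passing `horizon=5` instead of `horizon=1` gives the 5-year change under the same restriction on both endpoint years.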
PY - 2022/11
Y1 - 2022/11
AB - The Global Repository of Income Dynamics (GRID) is a new open-access, cross-country database that contains a wide range of micro statistics on income inequality, dynamics, and mobility. It has four key characteristics: it is built on micro panel data drawn from administrative records; it fully exploits the longitudinal dimension of the underlying data sets; it offers granular descriptions of income inequality and income dynamics for finely defined subpopulations; and it is designed from the ground up with the goals of harmonization and cross-country comparability. This paper introduces the database and presents a set of global trends in income inequality and income dynamics across the 13 countries that are currently in GRID. Our results are based on the statistics created for GRID by the 13 country teams who also contributed to this special issue with individual articles.
KW - Administrative data
KW - E24
KW - J24
KW - J31
KW - cross-country
KW - database
KW - granular
KW - harmonized
KW - inequality
KW - longitudinal
KW - mobility
KW - volatility
UR - http://www.scopus.com/inward/record.url?scp=85143358655&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85143358655&partnerID=8YFLogxK
U2 - 10.3982/QE2260
DO - 10.3982/QE2260
M3 - Article
AN - SCOPUS:85143358655
SN - 1759-7323
VL - 13
SP - 1321
EP - 1360
JO - Quantitative Economics
JF - Quantitative Economics
IS - 4
ER -