University of Michigan, bmodene@umich.edu. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1256260. Any opinions, findings, and conclusions or recommendations expressed in (2024)


Jamie Fogel and Bernardo Modenesi

Abstract

Recent advances in the literature on decomposition methods in economics have allowed for the identification and estimation of detailed wage gap decompositions. In this context, building reliable counterfactuals requires tighter controls to ensure that similar workers are correctly identified: important unobserved variables such as skills must be controlled for, and only workers with similar observable characteristics should be compared. This paper contributes to the wage decomposition literature in two main ways: (i) developing an economically principled, network-based approach to control for unobserved heterogeneity in worker skills in the presence of potential discrimination; and (ii) extending existing generic decomposition tools to accommodate a potential lack of overlapping supports in covariates between the groups being compared, which is likely to be the norm in more detailed decompositions. We illustrate the methodology by decomposing the gender wage gap in Brazil.

1 Introduction

Significant attention has been paid to the gap in wages between men and women. Researchers are interested in understanding how much of the gap is due to men and women performing different work using different skills, and how much is due to men and women being paid differently for similar work. A number of methods exist for trying to answer this question. These methods decompose gender wage gaps into a portion explained by differences in characteristics between men and women, and a portion explained by differences in the return to characteristics, or “discrimination”. However, all of these methods rely on three assumptions. First, they assume that unobserved determinants of earnings are independent of gender. To the extent that there exist unobserved worker characteristics that are important for determining wages and are correlated with gender, researchers will obtain biased estimates of the return to observable characteristics. As a result, decompositions of gender wage gaps into a component explained by covariates and a component explained by the return to covariates will be incorrect. Second, they assume a functional form in order to estimate the function that maps observable characteristics into wages and thus serves as the foundation for counterfactuals that ask what men would earn if they had the same characteristics except their gender were switched to female, and vice versa. Third, they assume that the covariates for male workers and female workers share a common support. While this is likely to hold when the number of covariates is small, as more covariates are added (possibly to satisfy the independence assumption) the common support assumption becomes more likely to be violated. [Footnote 2: As more covariates are added, it becomes harder to find another worker who shares the same values of all covariates.]

In this paper, we (i) propose a new method for identifying unobserved determinants of workers’ earnings from the information revealed by detailed data on worker–job matching patterns, (ii) non-parametrically estimate counterfactual wage functions for male and female workers, (iii) allow for a relaxation of the common support assumption, and (iv) apply our methods by decomposing the gender wage gap in Brazil using improved counterfactuals based on (i), (ii) and (iii). We find that the Brazilian gender wage gap is almost entirely explained by male and female workers who possess similar skills and perform similar tasks being paid different wages, not women possessing skills or tasks that pay relatively lower wages.

To understand the problem created by unobserved determinants of productivity, suppose that there are three types of worker characteristics that are relevant for determining wages: gender, other characteristics observable to researchers, and characteristics that are observable to labor market participants, but not to researchers. A naive wage decomposition would simply compare male wages to female wages and attribute all differences to the effect of gender. A more common approach would condition on observable characteristics like age, experience, occupation, education, and union membership and would attribute all differences in wages, conditional on these characteristics, solely to being a woman as opposed to being a man. However, this would miss the fact that even workers with identical observable covariates may perform distinct labor. As Goldin (2014) shows, male lawyers significantly outearn female lawyers largely because males are more likely to work long, inflexible hours, which leads to high wages. Therefore, if we simply compared the wages of male lawyers to the wages of female lawyers, we might mistakenly conclude that male and female lawyers receive differential pay for the same work, when in fact male and female lawyers perform different types of legal work. In other words, male and female lawyers differ in terms of covariates that are observed by labor market participants but not by researchers.

The key to our approach is identifying information about worker characteristics observable to labor market participants, but not to researchers, directly from the behavior of labor market participants. If we can identify groups of workers and groups of jobs who are similar from the perspective of labor market participants, then we can be confident that any gender wage differentials within these groups are due to differential returns to labor market activities by gender, rather than differences in the work done by male and female workers.

We employ a revealed preference approach that relies on workers’ and jobs’ choices, rather than observable variables or expert judgments, to classify workers and jobs into groups. Our key insight is that linked employer-employee data contain a previously underutilized source of information: millions of worker–job matches, each of which reflects workers’ and jobs’ perceptions of the workers’ skills and the jobs’ tasks. Intuitively, if two workers are employed by the same job, they probably have similar skills, and if two jobs employ the same worker those jobs probably require workers to perform similar tasks. However, since discrimination may lead men and women with similar skills to sort into different jobs, our method includes a correction for gender-based sorting into jobs that normalizes workers’ job match probabilities by the match probabilities for their gender.

We formalize this intuition and apply it to large-scale data using a Roy (1951) model in which workers supply labor to jobs according to comparative advantage. Workers belong to a discrete set of latent worker types defined by having the same “skills” and jobs belong to a discrete set of latent markets defined by requiring employees to perform the same “tasks.” [Footnote 3: “Skills” and “tasks” should be interpreted broadly as any worker and job characteristics that determine which workers match with which jobs.] Workers match with jobs according to comparative advantage, which is determined by complementarities between skills and tasks at the worker type–market level. Workers who have similar vectors of match probabilities over markets are therefore revealed to have similar skills and belong to the same worker type, and jobs that have similar vectors of match probabilities over worker types are revealed to have similar tasks and belong to the same market. Our model extends the model in Fogel and Modenesi (2023) to allow firms to have labor market power, thereby rationalizing pay heterogeneity among workers with the same skills in jobs requiring the same tasks and microfounding the correction for gender-based sorting.

Once we have clustered workers with similar skills into worker types and jobs requiring similar tasks into markets, we turn to estimating counterfactual wage functions. Traditional decomposition methods estimate counterfactual female earnings by fitting wage regressions using observations for male workers only, but generating predicted values by multiplying average female covariate values by the male regression coefficients. This approach suffers from three main issues: (i) it requires the researcher to impose a restrictive regression functional form; (ii) it does not necessarily allow for heterogeneous returns to covariates in predictions; and (iii) it does not have embedded tools to handle when workers do not share similar covariate support. Taken together, these issues can potentially bias the counterfactual estimation exercise, which is the foundation of gender wage gap decompositions. In order to circumvent these issues, we make use of a flexible matching estimator for counterfactual earnings.

We implement a matching estimator in which we match male and female workers who belong to the same worker type and are employed by jobs in the same market. In doing so, we implicitly assume that worker types and markets fully account for all factors, other than gender, that affect workers’ wages, although we also estimate specifications in which we include other observable characteristics in addition to worker types and markets. Within these matched groups, we use the male workers’ mean wages as counterfactuals for what the female workers would have earned if they were male, and vice versa. We compare our matching estimator to a standard estimator and find similar results, although in some specifications the matching estimator is clearly preferable. However, there may be some worker type–market cells with no male workers or no female workers so we introduce a correction to account for this lack of common support.

We address the issue of a lack of common covariate support between male and female workers by decomposing the gender wage gap into four components: (i) differences due to different covariate distributions between groups, i.e. the composition factor, for observations that share the same support; (ii) differences related to differential returns to covariates between groups over a common support of the covariates, i.e. the structural factor, often associated with labor market discrimination; (iii) a part due to observations from male workers being out of the female workers’ support of the covariates; and (iv) the last portion related to observations of female workers being out of the male workers’ support of the covariates. This decomposition allows us to perform counterfactuals similar to existing methods for the part of the distribution of the covariates for which male and female workers have common support, yet it still allows us to quantify how much of the gender wage gap occurs outside the region of common support and would therefore be ignored by standard decomposition methods.
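To make the four components concrete, the following is a minimal Python sketch of a decomposition in the style of Ñopo (2008) over discrete covariate cells. It is an illustration under stated assumptions, not the estimator used in this paper: the function name `four_way_decomposition`, the record layout, and the toy wages are all hypothetical, and the out-of-support terms are written in the standard form in which each equals the share of out-of-support workers times the difference in mean wages between out-of-support and in-support workers of the same gender.

```python
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def four_way_decomposition(records):
    """Split the male-female mean wage gap into four additive parts.

    records: iterable of (cell, is_male, wage). `cell` is any hashable label
    for a combination of covariates (hypothetically, a worker type-market pair).
    """
    by_gender = {True: defaultdict(list), False: defaultdict(list)}
    for cell, is_male, wage in records:
        by_gender[bool(is_male)][cell].append(wage)
    male, female = by_gender[True], by_gender[False]

    # Common support: cells that contain at least one man and one woman.
    common = set(male) & set(female)
    if not common:
        raise ValueError("no cell contains both male and female workers")

    def pool(group, cells):
        return [w for c in cells for w in group[c]]

    m_all, f_all = pool(male, male), pool(female, female)
    m_in, f_in = pool(male, common), pool(female, common)
    m_out = pool(male, set(male) - common)
    f_out = pool(female, set(female) - common)

    # Counterfactual: male cell means weighted by the female cell distribution,
    # computed over the common support only.
    counterfactual = sum(
        len(female[c]) / len(f_in) * mean(male[c]) for c in common
    )

    return {
        "gap": mean(m_all) - mean(f_all),
        "composition": mean(m_in) - counterfactual,
        "structural": counterfactual - mean(f_in),
        "male_out_of_support": (
            len(m_out) / len(m_all) * (mean(m_out) - mean(m_in)) if m_out else 0.0
        ),
        "female_out_of_support": (
            len(f_out) / len(f_all) * (mean(f_in) - mean(f_out)) if f_out else 0.0
        ),
    }


# Toy data (invented): cell C has only men, cell D has only women.
toy = [
    ("A", True, 10.0), ("A", True, 12.0), ("A", False, 8.0),
    ("B", True, 20.0), ("B", False, 16.0), ("B", False, 18.0),
    ("C", True, 30.0),
    ("D", False, 5.0),
]
parts = four_way_decomposition(toy)
```

By construction the four components sum to the overall gap, so the two out-of-support terms measure exactly the part of the gap that a decomposition restricted to the common support would leave out.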

We estimate our model and conduct empirical analyses using Brazilian administrative records from the Annual Social Information Survey (RAIS) that is managed by the Brazilian labor ministry. The RAIS data contain detailed information about every formal sector employment contract, including worker demographic information, occupation, sector, and earnings. Critically, these data represent a network of worker–job matches in which workers are connected to every job they have ever held, allowing us to identify job histories of workers, their coworkers, their coworkers’ coworkers, and so on. We restrict our analysis to the Rio de Janeiro metropolitan area both for computational reasons and because restricting to a single metropolitan area enables us to focus on skills and tasks dimensions of worker and job heterogeneity rather than geographic heterogeneity.

In our data, the average male worker earns a wage 16.7% higher than the average female worker. Our primary result is that almost the entire gender wage gap is attributable to male and female workers who possess similar skills and perform similar tasks being paid differently, or what is often referred to as “discrimination.” This is true at the aggregate level, and remains true when we perform wage decompositions within each worker type–market cell, indicating that this is a widespread phenomenon, not one driven by large wage differentials in small subsets of the labor market. We find that wage decompositions based on standard observable variables suffer from omitted variable bias, emphasizing the need for detailed worker and job characteristics in the form of worker types and markets. We also find that wage decompositions based on linear regressions yield similar findings to those based on matching when a lack of common support is not an issue. However, when male and female workers’ characteristics do not share a common support, the matching estimator with corrections for a lack of common support outperforms alternatives.

Literature: The literature on decomposition methods in economics can be classified into two main branches. The first decomposes average differences in a variable of interest $Y$ (often wages) between two groups of workers. The most widespread method in this class was developed by Oaxaca (1973) and Blinder (1973). The second branch decomposes functionals of the variable of interest $Y$, e.g. its distribution or quantile function. Given that functionals of a variable often provide more information than its average, the second group of decompositions is referred to as “detailed decompositions” (Fortin et al. 2011). A seminal paper in this group is DiNardo et al. (1996), [Footnote 4: Barsky et al. (2002) develop a methodology similar to DiNardo et al. (1996), focusing on issues that arise from lack of common covariate support between the groups in the decomposition. Modenesi (2022) discusses their approach in light of alternatives to handle the lack of common support.] and their methodology and inference were further generalized and improved by Chernozhukov et al. (2013). [Footnote 5: Later in this literature, Firpo et al. (2018) use influence functions to propose a detailed decomposition that is invariant to the order of the decomposition.] We follow the first branch of the literature in focusing on average differences, largely because our rich set of controls introduces a curse of dimensionality that renders detailed decompositions infeasible.

Our method for handling a lack of common covariate support follows Ñopo (2008) and Garcia et al. (2009). [Footnote 6: Garcia et al. (2009) and Morello and Anjolim (2021) both study the evolution of the Brazilian gender gap. Garcia et al. (2009) use the same approach we use to handle the problem of lack of overlapping supports, and Morello and Anjolim (2021) use a similar matching methodology to decompose the gender gap. In addition to using similar methods for the decomposition, we add the skills and tasks controls derived from the labor market network, and we derive a distribution of gender gaps for different clusters of similar workers performing similar tasks.] In concurrent work we extend Ñopo (2008) to generic “detailed decompositions” (Modenesi 2022).

Our model of labor market power builds on Card et al. (2015), Card et al. (2018) and Gerard et al. (2018) but allows for significantly more granular worker and job heterogeneity. The way we model multidimensional worker–job heterogeneity relates to papers that use a skills-tasks framework in the worker-job matching literature (Autor et al. 2003; Acemoglu and Autor 2011; Autor 2013; Lindenlaub 2017; Tan 2018; Kantenga 2018). Our method for clustering workers and jobs fits into the relatively recent literature in labor economics that extracts latent information from the network structure of the labor market (Sorkin 2018; Nimczik 2018; Jarosch et al. 2019) and directly extends Fogel and Modenesi (2023) by allowing for labor market power. Methodologically, we draw from the community detection branch of network theory (Larremore et al. 2014; Peixoto 2018; 2019). [Footnote 7: More precisely, we employ a variant of the stochastic block model (SBM) that makes use of network edge weights (Peixoto 2018), which are key for us to model the presence of potential discrimination in the labor market.] Our paper connects to this literature by formalizing a theoretical link between monopsonistic labor market models and the stochastic block model, providing microfoundations and economic interpretability for unsupervised learning tools from network theory when they are applied to economic problems.

By controlling for skills and tasks, our paper shares common ground with Goldin (2014) and Hurst et al. (2021). Goldin (2014) indicates that the potential residual discrimination in the gender wage gap is due to the nature of the tasks in some occupations, using a linear regression approach with occupation dummies interacted with a gender dummy. We add to her approach by proposing an economic model for discrimination, which provides us with both worker and job heterogeneity controls, in addition to performing the gender gap decomposition while taking into account potential violations of conventional decomposition assumptions. Hurst et al. (2021), on the other hand, assess the black-white wage gap over time as a function of changes in taste-based versus statistical discrimination, as well as of worker sorting in response to these changes.

Roadmap: The paper proceeds as follows. Section 2 introduces a simple framework for decomposition methods. Section 3 presents our model of worker–job matching and derives from it our algorithm for clustering workers into worker types and jobs into markets. Section 4 provides greater detail on the wage gap decomposition methods we employ. Section 5 describes our data. Section 6 presents results. Finally, Section 7 concludes.

2 A framework for decomposition methods

We introduce a simple framework for decomposition methods to guide the analysis in this paper. Define the actual wage of worker $i$ employed by job $j$ as $Y_{ij}$, and let $G_i$ be a dummy denoting whether worker $i$ is male. The difference between the average wage for male workers and the average wage for female workers, which we call the “overall wage gap,” can be expressed as:

$$\Delta := E[Y_{ij} \mid G_i = 1] - E[Y_{ij} \mid G_i = 0] \tag{1}$$

The overall wage gap above can be decomposed into two factors: differences in productivity between male and female workers, usually referred to as the composition factor; and differences in pay between equally productive male and female workers, known as the structural factor. We use the potential outcomes framework in order to formally decompose the overall wage gap into these two factors. Denote by $Y_{0ij}$ the potential wage of worker $i$ employed by job $j$ when the worker is female, and by $Y_{1ij}$ the potential wage of worker $i$ employed by job $j$ when the worker is male. Let $x$ be the vector of all variables that determine workers’ productivity. We assume that the worker’s gender may affect their pay, but does not directly affect their productivity. We represent the potential outcomes as functions of $x$ as follows: $Y_{gij} := Y_g(x_{ij})$, $g \in \{0, 1\}$. Notice that $x$ has both $i$ and $j$ subscripts, as the marginal product of worker $i$ at their current job $j$ depends on both the worker’s skills and the job’s tasks. The fact that there is a different earnings function for men and women reflects the possibility that male and female workers with identical productivities may be paid differently. Furthermore, it is possible to use the dummy for gender to represent observed wages as a function of potential outcomes using a switching regression model: $Y_{ij} := G_i Y_1(x_{ij}) + (1 - G_i) Y_0(x_{ij})$.

At this point we are able to decompose the overall wage gap, $\Delta$, into the composition and structural components mentioned above by adding and subtracting the quantity

$$E[Y_1(x_{ij}) \mid G_i = 0] := \int Y_1(x) \, dF_{G=0}(x)$$

in the expression for $\Delta$, where $F_{G=0}(x)$ is the productivity distribution for female workers. [Footnote 8: Analogously, the overall decomposition can be performed by adding and subtracting the male counterfactual quantity $E[Y_0(x_{ij}) \mid G_i = 1]$ to $\Delta$. The main results in this paper use the female counterfactual approach.] Intuitively, $E[Y_1(x_{ij}) \mid G_i = 0]$ is the mean earnings for a counterfactual set of workers possessing the female productivity distribution, but who are paid like men. [Footnote 9: Alternatively, this counterfactual term can be interpreted as the mean earnings of male workers whose productivity distribution was adjusted to match the female productivity distribution.]

$$\Delta := \underbrace{E[Y_{ij} \mid G_i = 1] - E[Y_1(x_{ij}) \mid G_i = 0]}_{\Delta_X := \text{Composition}} + \underbrace{E[Y_1(x_{ij}) \mid G_i = 0] - E[Y_{ij} \mid G_i = 0]}_{\Delta_0 := \text{Structural}} \tag{2}$$

The composition portion can be rewritten as $E[Y_1(x_{ij}) \mid G_i = 1] - E[Y_1(x_{ij}) \mid G_i = 0]$. [Footnote 10: We use the representation of the observed $Y$ in terms of potential outcomes to write $E[Y_{ij} \mid G_i = 1] = E[G_i Y_1(x_{ij}) + (1 - G_i) Y_0(x_{ij}) \mid G_i = 1] = E[Y_1(x_{ij}) \mid G_i = 1]$ and substitute it in $\Delta_X$.] It represents the difference between what male workers actually earn and what male workers would have earned in a counterfactual scenario in which their productivity distribution was equivalent to the female productivity distribution. This quantity captures the portion of the overall wage gap attributable to differences in the composition, or distribution of productivity, between male and female workers. The structural portion is equivalent to $E[Y_1(x_{ij}) - Y_0(x_{ij}) \mid G_i = 0]$. [Footnote 11: Analogously to the previous term, using the map from the potential outcomes to the observed $Y$, we can write $E[Y_{ij} \mid G_i = 0] = E[G_i Y_1(x_{ij}) + (1 - G_i) Y_0(x_{ij}) \mid G_i = 0] = E[Y_0(x_{ij}) \mid G_i = 0]$ and substitute it in $\Delta_0$.] This is the difference between female earnings in a counterfactual state in which females were paid equivalently to what equally productive male workers are paid and actual average female earnings. This portion of the overall wage gap is due to structural differences in how the two genders are paid, holding productivity constant, which is why this term is often associated with a form of discrimination.
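A small worked example may help fix ideas. The numbers below are invented for illustration and are not taken from the paper. Suppose productivity takes only two values, $x \in \{a, b\}$, with male pay $Y_1(a) = 10$, $Y_1(b) = 20$ and female pay $Y_0(a) = 8$, $Y_0(b) = 16$. Assume half of male workers have $x = a$, while three quarters of female workers do.

```latex
\begin{align*}
E[Y_{ij} \mid G_i = 1] &= 0.5 \cdot 10 + 0.5 \cdot 20 = 15, \\
E[Y_{ij} \mid G_i = 0] &= 0.75 \cdot 8 + 0.25 \cdot 16 = 10, \\
\Delta &= 15 - 10 = 5, \\
E[Y_1(x_{ij}) \mid G_i = 0] &= 0.75 \cdot 10 + 0.25 \cdot 20 = 12.5, \\
\Delta_X &= 15 - 12.5 = 2.5, \\
\Delta_0 &= 12.5 - 10 = 2.5, \\
\Delta_X + \Delta_0 &= 5 = \Delta .
\end{align*}
```

In this hypothetical case, half of the gap arises because women are concentrated in the lower-paying cell, and half because women are paid less than men within each cell.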

What we define as the structural component might reasonably be thought of as discrimination, where labor market discrimination is defined as workers with similar productivity, performing similar tasks, being paid differently based on observables that do not influence productivity. Other forms of discrimination may exist (including mistreatment or harassment, differential pre-job human capital accumulation opportunities, or discriminatory hiring practices), but we do not consider those in this paper. In our setup, individual discrimination occurs when the wage for worker $i$ at job $j$ is different if the individual’s gender changes, ceteris paribus, i.e. $Y_1(x_{ij}) - Y_0(x_{ij}) \neq 0$. The problem is that, in order to measure this quantity, we run into the fundamental problem of causal inference: it is impossible to observe the potential wages in both states for the same individual. Therefore we must make assumptions in order to construct counterfactual values, i.e. the value of $Y_1$ for a female worker, or the value of $Y_0$ for a male worker. In this paper, we break the assumptions needed for the counterfactual estimation into two parts and show how our approach helps address the limitations of each.

The first assumption is that workers with the same values of $x$ are equally productive and would be paid equal wages if gender played no role in wage determination, conditional on productivity. This is equivalent to assuming that $x$ contains all factors that affect productivity and are correlated with gender. This “conditional independence/ignorability” assumption is the basis of all decomposition methods in economics (Fortin et al. 2011), as it is required for consistent estimation of the components of the gap decomposition. However, not all factors that theoretically should be included in $x$ are observable.

A problem would arise if certain factors that contribute to worker $i$’s productivity in job $j$ are both unobserved by the econometrician and correlated with gender. If such factors exist, our counterfactuals would be invalid. Specifically, wage differentials due to unobserved differences in skills and tasks between male and female workers would be attributed to the effect of gender itself. For example, if women tend to have better social skills but we do not observe social skills, then we would interpret women outearning men in social skill-intensive jobs as discrimination against men, when in fact it is simply the result of differences in unobserved skills. Therefore, it is critical to come as close as possible to identifying groups of male and female workers who have exactly the same skills and perform exactly the same tasks. If we do so, then any gender wage differentials within this group are attributable to the effect of gender per se. In Section 3 we address this issue by identifying latent worker and job characteristics relevant to productivity and wage determination using the network of worker–job matches.

The second set of assumptions required to build the counterfactual $Y_1(x)$ for females in $\Delta$ is related to the choice of an estimation strategy for the function $Y_1(\cdot)$. [Footnote 12: Another approach decomposes the wage distributions, as opposed to actual wages, which would be equivalent to switching $Y$ for its distribution $F_Y$, but would still require estimating the counterfactual $F_{Y_1}(y|x)$ for females (e.g. DiNardo et al. 1996 and Chernozhukov et al. 2013). We choose not to employ these decompositions in this paper as our setup does not satisfy basic conditions for decomposing distributions, such as having a low-dimensional vector of observable characteristics $x$ (given the curse of dimensionality) and satisfying the overlapping supports assumption.] A common estimation strategy requires fitting a linear wage regression for males and using its estimated coefficients to predict wages, but inputting female workers’ covariates (Oaxaca 1973 and Blinder 1973). This approach is highly tractable; however, the assumption of a linear functional form is to some extent arbitrary, and using the same regression coefficients to predict counterfactual earnings for distinct female workers (i.e. allowing no heterogeneous returns to observable characteristics) could lead to biased estimates of counterfactual earnings. An alternative approach relies on matching males to each female worker based on similar observable characteristics, and uses the wages of matched male workers to inform each female’s counterfactual wage. This less-parametric approach has the advantage of not imposing any functional form assumption for $Y_1(\cdot)$; however, it requires us to observe a set of observable variables rich enough that male and female workers with the same observables may be assumed to have similar productivity. Moreover, matching methods are unreliable when we are unable to find a female worker with the same observables as a male worker, or vice versa. In Section 3 we describe a new method to enhance the set of observable characteristics available to the researcher, reducing the scope for unobserved determinants of productivity to cause biased estimates. In Section 4 we compare and contrast different methods to decompose the gender wage gap given a set of observable characteristics, circumventing issues present in counterfactual earnings estimation.
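The linear (Oaxaca 1973 and Blinder 1973) strategy described above can be illustrated with a minimal sketch. The data below are synthetic, and the function names (`fit_ols`, `oaxaca_blinder`), the two covariates, and all coefficient values are hypothetical choices made only for illustration. They are not the specification or data used in this paper.

```python
import numpy as np

def fit_ols(X, y):
    """OLS coefficients (intercept first) for a linear wage regression."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def oaxaca_blinder(X_male, y_male, X_female, y_female):
    """Split the mean wage gap using the male wage equation as reference."""
    beta_male = fit_ols(X_male, y_male)
    xbar_female = np.concatenate([[1.0], X_female.mean(axis=0)])
    # Counterfactual: mean wage women would earn under the male coefficients.
    counterfactual_female = xbar_female @ beta_male
    gap = y_male.mean() - y_female.mean()
    composition = y_male.mean() - counterfactual_female   # differences in covariates
    structural = counterfactual_female - y_female.mean()  # differences in returns
    return gap, composition, structural

# Synthetic log wages (assumption): identical slopes, lower intercept for women.
rng = np.random.default_rng(0)
n = 500
X_male = rng.normal(loc=[12.0, 10.0], scale=[2.0, 5.0], size=(n, 2))
X_female = rng.normal(loc=[12.5, 8.0], scale=[2.0, 5.0], size=(n, 2))
y_male = 1.0 + 0.08 * X_male[:, 0] + 0.02 * X_male[:, 1] + rng.normal(0, 0.1, n)
y_female = 0.9 + 0.08 * X_female[:, 0] + 0.02 * X_female[:, 1] + rng.normal(0, 0.1, n)

gap, composition, structural = oaxaca_blinder(X_male, y_male, X_female, y_female)
print(f"gap={gap:.3f}, composition={composition:.3f}, structural={structural:.3f}")
```

By construction the two components sum to the raw gap. The sketch applies a single coefficient vector to every female worker, which is exactly the homogeneous-returns restriction discussed above.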

3 Revealing latent worker and job heterogeneity using network theory

In this section we present an economic model of monopsonistic wage setting, which rationalizes a wage gap between two groups of workers who have different demographic characteristics, but have the same skills and perform the same tasks. Intuitively, otherwise identical male and female workers may supply labor to individual jobs with different elasticities, and jobs respond by offering them wages with different markdowns. If one group of workers supplies labor to jobs more inelastically, then they will be paid less, holding productivity constant. Moreover, the model microfounds our network-based clustering algorithm, which identifies groups of male and female workers with similar skills who perform similar tasks, and therefore can serve as good counterfactuals for each other. The model builds on the model of the labor market developed in Fogel and Modenesi (2023), with two important differences: (i) in this paper workers have idiosyncratic preferences over individual jobs, not just markets, causing jobs to face upward-sloping labor supply curves, and (ii) firms may offer different wages to men and women, even if they have identical skills and perform identical tasks. The model defines a probability distribution that governs how workers match with jobs, forming the network of worker-job matches observed in linked employer-employee data. We use this probability distribution to assign similar workers to worker types and similar jobs to markets, using a Bayesian method based on generative network theory models, which we present after the economic model.

3.1 Economic model

We propose a model with two primary components: heterogeneous workers who supply labor and firms that produce goods by employing labor to perform tasks. Workers supply their skills to jobs, which are bundles of tasks embedded within firms. Jobs’ tasks are combined by the firms’ production functions to produce output. We assume that firms face an exogenously-determined demand for their goods. [Footnote 13: For an alternative version of the model with endogenous product demand, see Fogel and Modenesi (2023).] Our model of the labor market has the following components:

  • Each worker is endowed with a “worker type,” and all workers of the same type have the same skills.

  • A job is a bundle of tasks within a firm. As we discuss in Section 5, we define a job in our data as an occupation–establishment pair.

  • Each job belongs to a “market,” and all jobs in the same market are composed of the same bundle of tasks.

  • There are $I$ worker types, indexed by $\iota$, and $\Gamma$ markets, indexed by $\gamma$.

  • The key parameter governing worker–job match propensity is an $I \times \Gamma$ productivity matrix, $\Psi$, where the $(\iota, \gamma)$ cell, $\psi_{\iota\gamma}$, denotes the number of efficiency units of labor a type $\iota$ worker can supply to a job in market $\gamma$. [Footnote 14: We can think of $\psi_{\iota\gamma}$ as $\psi_{\iota\gamma} = f(X_\iota, Y_\gamma)$, where $X_\iota$ is an arbitrarily high dimensional vector of skills for type $\iota$ workers, $Y_\gamma$ is an arbitrarily high dimensional vector of tasks for jobs in market $\gamma$, and $f()$ is a function mapping skills and tasks into productivity. This framework is consistent with the skill- and task-based model of Acemoglu and Autor (2011), and is equivalent to the frameworks of Lindenlaub (2017) and Tan (2018). A key difference is that Lindenlaub and Tan observe $X$ and $Y$ directly and assume a functional form for $f()$, whereas we assume that $X$, $Y$, and $f()$ exist but are latent. We do not identify $X$, $Y$, and $f()$ directly because in our framework $\psi_{\iota\gamma}$ is a sufficient statistic for all of them.]

Time is discrete, with time periods indexed by $t \in \{1, \dots, T\}$, and workers make idiosyncratic moves between jobs over time. Neither workers, households, nor firms make dynamic decisions, meaning that the model may be considered one period at a time. We do not consider capital as an input to production.

3.1.1 Firm’s problem

Each firm, indexed by $f$, has a production function $Y_f(\cdot)$ which aggregates tasks from different labor markets, indexed by $\gamma$. Firm $f$ faces exogenously-determined demand for its output, $\bar{Y}_f$. The firm’s only cost is labor. As we discuss in the next subsection, firms face upward-sloping labor supply curves and therefore have wage-setting power. Firms demand labor in each market, $\gamma \in \{1, \dots, \Gamma\}$, and offer a different wage per efficiency unit of labor for each market. Firms also may offer different wages to workers in different demographic groups $g \in \{A, B\}$ (e.g. male and female workers), although group $A$ and group $B$ workers belonging to the same worker type $\iota$ are equally productive in all jobs. We define a job $j$ as a firm $f$ – market $\gamma$ pair. We define the wage per efficiency unit of labor for demographic group $g$ workers employed in job $j$ as $w_j^g$. Define $L_j^g$ as the quantity of efficiency units of labor supplied by demographic group $g$ workers to job $j$.

The firm’s problem is to choose the quantity of labor inputs in each job for each demographic group in order to minimize costs subject to the constraint that production is greater than or equal to the firm’s exogenous product demand, $\bar{Y}_f$:

$$\min_{\{w_j^A,\, w_j^B\}_{j=1}^{\Gamma}} \; \sum_{j=1}^{\Gamma} w_j^A L_j^A + w_j^B L_j^B \quad \text{s.t.} \quad Y_f\left(L_1, \ldots, L_\Gamma\right) \geq \bar{Y}_f$$

where $L_j = L_j^A + L_j^B$ is the total amount of efficiency units of labor employed by job $j$ and $Y_f$ is a concave and differentiable production function.

Taking the first order condition with respect to $w_j^g$ allows us to solve for the wage paid by job $j$ to workers in demographic group $g$ as a markdown relative to the marginal revenue product of labor:

$$w_j^g = \underbrace{\frac{e_j^g}{1 + e_j^g}}_{\text{Markdown}} \times \underbrace{\mu_f \frac{\partial Y_f}{\partial L_j}}_{\text{Marg. revenue product of labor}} \tag{3}$$

where $\mu_f$ is the shadow revenue associated with one more unit of output and $e_j^g := \frac{\partial L_j^g}{\partial w_j^g} \frac{w_j^g}{L_j^g}$ is the labor supply elasticity of workers from group $g$ to job $j$.
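The step from the firm’s problem to equation (3) can be spelled out as follows. This is a sketch under two assumptions not stated explicitly in the text: labor supplied to a job, $L_j^g$, depends on the firm’s own posted wage $w_j^g$ only (no cross-wage effects), and $\partial L_j^g / \partial w_j^g > 0$. The symbol $\mathcal{L}$ for the Lagrangian is introduced here for illustration.

```latex
\begin{align*}
\mathcal{L} &= \sum_{j=1}^{\Gamma} \left( w_j^A L_j^A + w_j^B L_j^B \right)
  - \mu_f \left[ Y_f(L_1, \ldots, L_\Gamma) - \bar{Y}_f \right],
  \qquad L_j = L_j^A + L_j^B, \\[4pt]
\frac{\partial \mathcal{L}}{\partial w_j^g} &=
  L_j^g + w_j^g \frac{\partial L_j^g}{\partial w_j^g}
  - \mu_f \frac{\partial Y_f}{\partial L_j} \frac{\partial L_j^g}{\partial w_j^g} = 0, \\[4pt]
\Rightarrow \quad
  w_j^g \left( 1 + \frac{1}{e_j^g} \right) &= \mu_f \frac{\partial Y_f}{\partial L_j}
  \qquad \text{(dividing by } \partial L_j^g / \partial w_j^g \text{ and using the definition of } e_j^g \text{)}, \\[4pt]
\Rightarrow \quad
  w_j^g &= \frac{e_j^g}{1 + e_j^g} \, \mu_f \frac{\partial Y_f}{\partial L_j}.
\end{align*}
```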

Equation (3) shows that the wage paid to demographic group $g$ workers employed in job $j$ (equivalently, employed in market $\gamma$ by firm $f$) is the product of a markdown and the marginal revenue product of labor in job $j$. The markdown depends on demographic group $g$’s elasticity of labor supply to job $j$. As labor supply becomes more elastic, the markdown converges to 1 and the wage converges to the marginal revenue product of labor. Conversely, as labor supply becomes less elastic, the wage declines further below the marginal revenue product of labor. This equation rationalizes different demographic groups being paid different wages for the same labor: if one demographic group supplies labor more inelastically, they will be paid less. [Footnote 15: We are referring to the elasticity of labor supply to a specific job $j$, which may differ from a group’s labor supply elasticity to the overall labor market. For example, it could be the case that men supply labor more inelastically at the extensive margin, but women have stronger idiosyncratic preferences for specific jobs, making them less likely to change jobs in response to a wage differential. In this case, women would supply labor less elastically to a specific job $j$ and thus receive lower wages.] The firm employs workers in both demographic groups despite paying them different wages because in order to attract the marginal worker from the lower-paid demographic group, it must raise wages for all inframarginal workers in that group. At some point the marginal cost (inclusive of the required raises for inframarginal workers) of hiring workers from the lower-paid demographic group exceeds the marginal cost of hiring workers from the higher-paid demographic group, and the firm will switch to hiring the higher-paid workers.
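To make the markdown in equation (3) concrete, the following minimal sketch evaluates it for two groups with the same marginal revenue product but different job-level elasticities. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def markdown(elasticity):
    """Markdown e / (1 + e) from equation (3); requires a positive elasticity."""
    if elasticity <= 0:
        raise ValueError("labor supply elasticity must be positive")
    return elasticity / (1.0 + elasticity)

def wage(elasticity, mrpl):
    """Wage per efficiency unit: markdown times marginal revenue product of labor."""
    return markdown(elasticity) * mrpl

# Hypothetical values: both groups are equally productive in job j.
mrpl = 20.0
wage_A = wage(elasticity=4.0, mrpl=mrpl)  # more elastic supply to the job
wage_B = wage(elasticity=1.0, mrpl=mrpl)  # less elastic supply to the job

print(wage_A, wage_B)
```

With these numbers group A receives four fifths of the marginal revenue product and group B one half, so the less elastic group is paid less even though productivity is identical. As the elasticity grows without bound, the markdown approaches 1.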

3.1.2 Worker’s problem

A worker belonging to worker type $\iota$ and demographic group $g \in \{A, B\}$ faces a two-step decision. First, she chooses a market $\gamma$ in which to look for a job, and second, she chooses a firm $f$ (and by extension a job $j$). The worker’s type defines their skills. Type $\iota$ workers can supply $\psi_{\iota\gamma}$ efficiency units of labor to any job in market $\gamma$. The parameter $\psi_{\iota\gamma}$ is a reduced-form representation of the skill level of a type $\iota$ worker in the various tasks required by a job in market $\gamma$. Units of human capital are perfectly substitutable, meaning that if type 1 workers are twice as productive as type 2 workers in a particular market $\gamma$ (i.e. $\psi_{1\gamma} = 2\psi_{2\gamma}$), firms would be indifferent between hiring one type 1 worker and two type 2 workers at a given wage per efficiency unit of labor, $w_j$. Therefore, the law of one price holds within each demographic group for each job, and a type $\iota$ worker belonging to demographic group $g$ employed in a job in market $\gamma$ is paid $\psi_{\iota\gamma} w_j^g$. Because workers’ time is indivisible, each worker may supply labor to only one job in each period and we do not consider the hours margin.

Workers choose job $j$, equivalent to $\gamma f$, in order to maximize utility, which is the sum of log earnings $\log(\psi_{\iota\gamma} w_j^g)$ and an idiosyncratic preference for job $j$, $\varepsilon_{ij}^g$:

$$j^* = \arg\max_j \quad \log(\psi_{\iota\gamma} w_j^g) + \varepsilon_{ij}^g.$$

We assume that $\varepsilon_{ij}^g$ follows a nested logit distribution with parameter $\nu_\gamma^g$, with the $\gamma$ subscript indicating that nests are defined by $\gamma$:

$$\varepsilon_{ij}^g \sim \mathrm{NestedLogit}(\nu_\gamma^g)$$

It follows from this assumption about the distribution of $\varepsilon_{ij}^g$ that the probability that worker $i$ belonging to worker type $\iota$ and demographic group $g$ matches with job $j$ in market $\gamma$ is as follows [Footnote 16: Details of the derivation of the choice probability are in Appendix A.]:

$$P(j = j^* \mid j \in \gamma, i \in \iota, g) = \underbrace{\frac{\exp(I_{\iota\gamma}^g)^{\nu_\gamma^g}}{\sum_{\gamma} \exp(I_{\iota\gamma}^g)^{\nu_\gamma^g}}}_{\substack{P(\gamma = \gamma^* \mid i \in \iota, j \in \gamma, g) \\ \text{1st step: market choice}}} \; \underbrace{\frac{(\psi_{\iota\gamma} w_j^g)^{\frac{1}{\nu_\gamma^g}}}{\sum_{j \in \gamma} (\psi_{\iota\gamma} w_j^g)^{\frac{1}{\nu_\gamma^g}}}}_{\substack{P(j = j^* \mid i \in \iota, j \in \gamma, \gamma = \gamma^*, g) \\ \text{2nd step: job choice}}} \tag{4}$$

where $I^g_{\iota\gamma} := \sum_{j \in \gamma} (\psi_{\iota\gamma} w_j^g)^{\frac{1}{\nu_\gamma^g}}$, also referred to as the inclusive value, is the expected utility a type $\iota$ worker faces when choosing market $\gamma$. Intuitively, the nested logit assumption decomposes the job choice probability into a first stage in which the worker chooses a market and then a second stage in which the worker chooses a job conditional on their choice of a market.
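The two-stage structure of the choice probability can be sketched numerically for a single worker type and demographic group. The sketch assumes the standard nested-logit convention in which a market’s weight in the first stage is the within-market sum $\sum_{j \in \gamma} (\psi_{\iota\gamma} w_j^g)^{1/\nu_\gamma^g}$ raised to the power $\nu_\gamma^g$. That reading, the function name `nested_logit_probs`, and all numbers below are assumptions made for illustration, not the implementation used in this paper.

```python
import numpy as np

def nested_logit_probs(psi, wages, market_of_job, nu):
    """Two-stage choice probabilities for one worker type and demographic group.

    psi[m]           efficiency units the worker type supplies in market m
    wages[j]         wage per efficiency unit offered by job j to this group
    market_of_job[j] integer index of the market that job j belongs to
    nu[m]            nested-logit parameter of market m
    """
    psi = np.asarray(psi, dtype=float)
    wages = np.asarray(wages, dtype=float)
    market_of_job = np.asarray(market_of_job)
    nu = np.asarray(nu, dtype=float)

    # Job-level term (psi * w)^(1 / nu) and its sum within each market.
    u = (psi[market_of_job] * wages) ** (1.0 / nu[market_of_job])
    within_market_sum = np.bincount(market_of_job, weights=u, minlength=len(psi))

    # First stage: market choice, weight = (within-market sum)^nu (assumed form).
    market_weight = within_market_sum ** nu
    p_market = market_weight / market_weight.sum()

    # Second stage: job choice conditional on the chosen market.
    p_job_given_market = u / within_market_sum[market_of_job]
    return p_market, p_market[market_of_job] * p_job_given_market

# Hypothetical example: two markets, three jobs (jobs 0 and 1 are in market 0).
p_market, p_job = nested_logit_probs(
    psi=[2.0, 1.0], wages=[10.0, 12.0, 8.0], market_of_job=[0, 0, 1], nu=[0.5, 0.5]
)
print(p_market, p_job)
```

When every $\nu_\gamma^g$ equals 1 the nesting is irrelevant and the job probabilities reduce to a plain multinomial logit in log earnings, which is a convenient sanity check on the sketch.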

3.2 Identifying worker types and markets

3.2.1 Deriving the likelihood

Now that we have derived the probability of worker i𝑖iitalic_i matching with job j𝑗jitalic_j from the primitives of our model, the next step is using this probability as the basis for a maximum likelihood procedure that assigns workers to worker types and jobs to markets based on the observed set of worker–job matches. This procedure builds on Fogel and Modenesi (2023), by allowing workers in the same worker type but different demographic groups to have different vectors of match probabilities over jobs.

We decompose the choice probability in equation (4) into a component that depends only on variation at the $(\iota, \gamma, g)$ level and a component that depends on wages at individual jobs:

$$P(j = j^* \mid j \in \gamma, i \in \iota, g) = \underbrace{\frac{\exp(I_{\iota\gamma}^g)^{\nu_\gamma^g - 1}}{\sum_{\gamma} \exp(I_{\iota\gamma}^g)^{\nu_\gamma^g}} \, \psi_{\iota\gamma}^{\frac{1}{\nu_\gamma^g}}}_{\substack{=: \, \Omega_{\iota\gamma}^g \\ \iota\text{-}\gamma\text{-}g \text{ component}}} \; \underbrace{(w_j^g)^{\frac{1}{\nu_\gamma^g}}}_{\substack{=: \, d_j^g \\ j\text{-}g \text{ component}}}. \tag{5}$$

The first term reflects workers choosing markets according to comparative advantage, while the second captures the fact that some jobs in market $\gamma$ require more workers than others (due to exogenous product demand differences), and since jobs face upward-sloping labor supply curves, they must pay higher wages to attract greater numbers of workers. Isolating the group-level $(\iota, \gamma, g)$ variation from the idiosyncratic job-level variation allows us to cluster workers into worker types and jobs into markets on the basis of having the same group-level match probabilities, as we discuss below.

The choice probabilities we have discussed thus far refer to a single job search for worker $i$. In reality, we may observe workers searching for jobs multiple times, and each of these searches is informative about the latent worker skills and job tasks that define worker types $\iota$ and markets $\gamma$. We incorporate repeated searches by assuming that workers periodically receive exogenous separation shocks which arrive following a Poisson process. Upon receiving a separation shock, the worker draws a new $\varepsilon_{ij}^g$ shock and repeats the job choice process described above. Assuming that Poisson-distributed exogenous separations happen at a rate $d_i^g$ for the individual worker $i$, the expected number of times she will match with job $j$ throughout our sample period is given by

$$d_i^g \cdot P(j = j^* \mid j \in \gamma, i \in \iota, g) = \Omega_{\iota\gamma}^g \, d_i^g \, d_j^g. \tag{6}$$

Equation (6) forms the basis of our algorithm for clustering workers into worker types and jobs into markets, but before proceeding we must define some notation. Let $N_W$ and $N_J$ denote the number of workers and jobs, respectively, in our data. Define $A_{ij}$ as the number of times that worker $i$ is observed to match with job $j$. Further, define $\bm{A}$ as the matrix with typical element $A_{ij}$. $\bm{A}$ is an $N_W \times N_J$ matrix and represents the full set of worker–job matches observed in our data. As discussed previously, each individual worker belongs to a latent worker type denoted by $\iota$ and each job belongs to a latent market denoted by $\gamma$. The list of all latent worker type and market assignments is stored in the $(N_W + N_J) \times 1$ vector denoted by $\bm{b}$, known as the node membership vector. We define $\bm{g}$ as the $N_W \times 1$ vector containing each worker’s demographic group affiliation. The matrix of worker–job matches $\bm{A}$ and workers’ demographic groups $\bm{g}$ are the data we use to cluster workers and jobs, while the node membership vector $\bm{b}$ is the latent object identified by the maximum likelihood procedure we discuss below.

Following equation (6), the expected number of matches between a worker–job pair, $A_{ij}$, can be written as

$$E[A_{ij} \mid \bm{b}, g] = \Omega_{\iota\gamma}^{g} d_i^g d_j^g. \tag{7}$$

(Footnote 17: It is worth mentioning that: (i) the information $i \in \iota, j \in \gamma$ is contained in $\bm{b}$; and (ii) $A_{ij}$ is the number of matches between worker $i$ and job $j$, which makes the event that $j = j^* \mid i$ equivalent to the event that $A_{ij} = 1$. These two facts allow us to use more succinct notation that directly links theoretical objects in our model to data: $P(j = j^* \mid j \in \gamma, i \in \iota, g) = P(A_{ij} = 1 \mid \bm{b}, g)$, whose distributional form we know. This connects the notation of the economic model to that of the network model, but it still lacks the precise definition of the likelihood of interest, $P(\bm{A}, \bm{g} \mid \bm{b})$, where $A_{ij}$ can assume values other than just $1$.)

We prove in Appendix C that our assumption of Poisson-distributed exogenous separation shocks implies that $A_{ij}$ follows a Poisson distribution:

$$A_{ij} \mid \bm{b}, g \sim \mathrm{Poisson}(\Omega_{\iota\gamma}^{g} d_i^g d_j^g) \tag{8}$$

Finally, we incorporate equation (8) to fully characterize the likelihood of our data as a function of the unknown parameters by applying Bayes’ rule:

$$P(A_{ij}, g \mid \bm{b}) = \underbrace{P(A_{ij} \mid \bm{b}, g)}_{\mathrm{Poisson}(\Omega_{\iota\gamma}^{g} d_i^g d_j^g)} \; \underbrace{P(g \mid \bm{b})}_{\alpha_{\iota\gamma}^{g}}, \tag{9}$$

where $\alpha_{\iota\gamma}^{g} \equiv P(g \mid \bm{b})$ is the fraction of type $\iota$ workers employed in market $\gamma$ jobs who belong to the demographic group $g$. Equation (9) corresponds to a commonly used method from network theory known as the bipartite degree-corrected stochastic block model with edge weights (SBM). The SBM clusters nodes in a network (workers and jobs) into groups (worker types and markets) based on patterns of connections between nodes. (Footnote 18: Larremore et al. (2014) lay out the advantages of using bipartite models over one-sided network projections to fit SBMs; Karrer and Newman (2011) present the methodology for degree correction, which significantly enhances the ability of the SBM to fit large-scale real-world networks; and Peixoto (2018) deals with weighted SBM inference, which is how we accommodate discrimination influencing matches within the SBM.) The main parameter of interest is the set of assignments of workers to worker types and jobs to markets contained in $\bm{b}$, while all of the other parameters are nuisance parameters that can be straightforwardly determined after $\bm{b}$ is defined (Karrer and Newman 2011). The next step is to maximize the likelihood defined in equation (9), which we address in the next subsection.
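To make equation (9) concrete, the following minimal Python sketch evaluates its logarithm for a single worker–job pair and then sums it over all pairs. Treating pairs as independent conditional on $\bm{b}$ is an assumption made here for illustration, as are all function and argument names; the parameters are taken as given rather than estimated.

```python
import math


def pair_log_likelihood(a_ij, omega, d_i, d_j, alpha):
    """Log of equation (9) for one worker-job pair.

    a_ij  : observed number of matches between worker i and job j
    omega : Omega for the pair's (worker type, market, group)
    d_i   : worker-level rate, d_j : job-level rate
    alpha : P(g | b) for the pair's (worker type, market, group)
    """
    rate = omega * d_i * d_j  # Poisson mean, must be positive
    log_poisson = a_ij * math.log(rate) - rate - math.lgamma(a_ij + 1)
    return log_poisson + math.log(alpha)


def log_likelihood(A, worker_type, job_market, group,
                   omega, d_worker, d_job, alpha):
    """Sum of pairwise log-likelihoods (illustrative independence assumption).

    omega and alpha are dicts keyed by (worker type, market, group).
    """
    total = 0.0
    for i, row in enumerate(A):
        for j, a_ij in enumerate(row):
            key = (worker_type[i], job_market[j], group[i])
            total += pair_log_likelihood(
                a_ij, omega[key], d_worker[i], d_job[j], alpha[key]
            )
    return total
```

For a fixed membership vector, comparing this quantity across candidate assignments is what the clustering procedure ultimately does, although the actual estimation also has to recover the nuisance parameters.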

3.2.2 A Bayesian approach to recovering worker types and markets

To make the estimation of worker types and markets feasible, and to choose the number of clusters in a principled way, we employ Bayesian methods from the network literature (Peixoto 2017). We can rewrite equation (9) as

$$
\begin{aligned}
P(\bm{b} \mid A_{ij}, g) \quad &\propto \quad P(A_{ij}, g \mid \bm{b}) \, P(\bm{b}) \\
&= \quad \underbrace{P(A_{ij} \mid \bm{b}, g)}_{\mathrm{Poisson}(\Omega_{\iota\gamma}^{g} d_i^g d_j^g)} \; \underbrace{P(g \mid \bm{b})}_{\alpha_{\iota\gamma}^{g}} \; \underbrace{P(\bm{b})}_{\text{Prior}}
\end{aligned}
\tag{10}
$$

Maximizing the posterior distribution means assigning individual workers to worker types $\iota$ and jobs to markets $\gamma$. The basic intuition follows Fogel and Modenesi (2023), where it is described in greater detail: workers belong to the same worker type if they have approximately the same vector of match probabilities over jobs, while jobs belong to the same market if they have approximately the same vector of match probabilities over workers. The key difference in this paper is that workers in the same worker type $\iota$ may belong to different demographic groups $g$ and each worker type–demographic group pair may face its own wage and therefore have its own match probability. Equation (10) accommodates this by allowing the match probabilities $P(A_{ij}, g \mid \bm{b})$ to depend on the workers’ demographic group $g$ in addition to the worker types and markets stored in $\bm{b}$.

If worker types are defined by having common vectors of match probabilities over jobs, but match probabilities are allowed to vary by demographic group within a worker type, how do we know that type $\iota$ workers in group $A$ belong to the same worker type as type $\iota$ workers in group $B$? The answer is embedded in equation (10). The $\alpha_{\iota\gamma}^{g}$ term in equation (10) adjusts workers’ match probabilities so that they are measured relative to their own gender. Suppose women are significantly underrepresented in construction jobs and overrepresented in nursing jobs, and vice versa for men. Once we incorporate this adjustment, we would assign workers to a construction-intensive worker type if they are disproportionately likely to match with construction jobs, relative to other workers of their gender. Once we adjust the raw match probabilities to account for this selection, we obtain identical adjusted match probability vectors for this group of men and this group of women, causing us to assign them to the same worker type, $\iota$.

Equation (10) assumes that we know the number of worker types and markets a priori; however, this is rarely the case in real-world applications. Therefore we must choose the number of worker types and markets, $I$ and $\Gamma$ respectively. We do so using the principle of minimum description length (MDL), an information-theoretic approach that is commonly used in the network theory literature. MDL chooses the number of worker types and markets to minimize the total amount of information necessary to describe the data, where the total includes both the complexity of the model conditional on the parameters and the complexity of the parameter space itself. MDL will penalize a model that fits the data very well but overfits by using a large number of parameters (corresponding to a large number of worker types and markets), and therefore requires a large amount of information to encode it. MDL effectively adds a penalty term to our objective function, such that our algorithm finds a parsimonious model. See Fogel and Modenesi (2023) for greater detail.
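One way to see how MDL relates to the posterior in equation (10) is to write the description length as a negative log probability. The expression below is a sketch of that standard correspondence rather than the exact objective used in estimation; the symbol $\Sigma$ is notation introduced here for illustration, and the form of the prior $P(\bm{b})$ is left unspecified.

$$
\Sigma(\bm{b}) = \underbrace{-\log P(\bm{A}, \bm{g} \mid \bm{b})}_{\text{information to describe the data given } \bm{b}} \; \underbrace{- \log P(\bm{b})}_{\text{information to describe } \bm{b}},
\qquad
\arg\min_{\bm{b}} \Sigma(\bm{b}) = \arg\max_{\bm{b}} P(\bm{b} \mid \bm{A}, \bm{g}).
$$

Under this reading, adding worker types and markets can only lower the first term, but it raises the second, which is the penalty that keeps the chosen $I$ and $\Gamma$ parsimonious.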

Equation (10) defines a combinatorial optimization problem. If we had infinite computing resources, we would test all possible assignments of workers to worker types and jobs to markets and choose the one that maximizes the likelihood in equation (10); however, this is not computationally feasible for large networks like ours. Therefore, we use a Markov chain Monte Carlo (MCMC) approach in which we modify the assignment of each worker to a worker type and each job to a market in a random fashion and accept or reject each modification with a probability given as a function of the change in the likelihood. We repeat the procedure for multiple different starting values to reduce the chance of stopping at a local maximum. We implement the procedure using a Python package called graph-tool (https://graph-tool.skewed.de/; see Peixoto (2014) for details). Now that we have dealt with the issue of important worker and job characteristics being unobserved, we turn our attention to estimating counterfactuals for wage gap decompositions.
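The accept/reject logic of such an MCMC search can be illustrated with a toy sketch in plain Python. This is not the graph-tool implementation: it assumes a fixed number of worker types and markets, omits degree correction, the demographic-group term, and the MDL penalty, and uses a simple Metropolis rule on a Poisson block likelihood with block rates set to their maximum likelihood values. All names are hypothetical.

```python
import math
import random


def profile_loglik(A, w_type, j_market, n_types, n_markets):
    """Poisson block log-likelihood with block rates at their MLEs.

    Constant terms are dropped; no degree correction or group term.
    """
    m = [[0.0] * n_markets for _ in range(n_types)]  # matches per block
    n_w = [0] * n_types                              # workers per type
    n_j = [0] * n_markets                            # jobs per market
    for t in w_type:
        n_w[t] += 1
    for k in j_market:
        n_j[k] += 1
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            m[w_type[i]][j_market[j]] += a
    ll = 0.0
    for r in range(n_types):
        for s in range(n_markets):
            pairs = n_w[r] * n_j[s]
            if m[r][s] > 0 and pairs > 0:
                ll += m[r][s] * math.log(m[r][s] / pairs) - m[r][s]
    return ll


def metropolis_cluster(A, n_types, n_markets, n_steps=5000, seed=0):
    """Random single-node reassignments with Metropolis acceptance."""
    rng = random.Random(seed)
    n_workers, n_jobs = len(A), len(A[0])
    w_type = [rng.randrange(n_types) for _ in range(n_workers)]
    j_market = [rng.randrange(n_markets) for _ in range(n_jobs)]
    current = profile_loglik(A, w_type, j_market, n_types, n_markets)
    initial = current
    best = (current, list(w_type), list(j_market))
    for _ in range(n_steps):
        # Propose moving one random worker or job to a random group.
        if rng.random() < 0.5:
            labels, idx = w_type, rng.randrange(n_workers)
            new = rng.randrange(n_types)
        else:
            labels, idx = j_market, rng.randrange(n_jobs)
            new = rng.randrange(n_markets)
        old = labels[idx]
        labels[idx] = new
        proposed = profile_loglik(A, w_type, j_market, n_types, n_markets)
        # Always accept improvements; accept declines with probability
        # exp(change in log-likelihood).
        if proposed >= current or rng.random() < math.exp(proposed - current):
            current = proposed
            if current > best[0]:
                best = (current, list(w_type), list(j_market))
        else:
            labels[idx] = old  # reject: undo the move
    return initial, best
```

Calling `metropolis_cluster` with several different `seed` values and keeping the highest-scoring result mirrors the use of multiple starting values described above.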

4 Wage gap decomposition

This section lays out the estimation strategies we use to decompose the Brazilian gender wage gap, while circumventing some of the issues associated with conventional decomposition methods. We decompose the gender wage gap into the quantities listed in equation (2): the composition component $E[Y_1(x_{ij}) \mid G_i = 1] - E[Y_1(x_{ij}) \mid G_i = 0]$ and the structural component $E[Y_1(x_{ij}) - Y_0(x_{ij}) \mid G_i = 0]$.
The quantity $E[Y_g(x_{ij}) \mid G_i = g] = E[Y_{ij} \mid G_i = g]$, $g \in \{0, 1\}$, can be consistently and straightforwardly estimated since it is directly observable. The challenge is estimating the counterfactual wage function $E[Y_1(x_{ij}) \mid G_i = 0]$, given that the potential outcome $Y_1(x_{ij})$ is not observed for female workers. Estimating $E[Y_1(x_{ij}) \mid G_i = 0]$ requires us to use data on male workers to estimate a relationship between observable characteristics $x_{ij}$ and male earnings $Y_1$ and then extrapolate this relationship to female workers.

In this paper, we consider two approaches to estimating counterfactual wage functions. The first is the commonly used Oaxaca-Blinder decomposition, which we henceforth refer to as OB (Oaxaca 1973; Blinder 1973). For the OB decomposition, we estimate two linear regressions, one for the set of male workers and another for the set of female workers, to estimate the functionals $Y_1(\cdot)$ and $Y_0(\cdot)$, respectively, as denoted in equation (11). Values for $E[Y_g(x_{ij}) \mid G_i = g]$ are obtained by averaging the fitted values of the respective linear regressions. Estimates for the counterfactual $E[Y_1(x_{ij}) \mid G_i = 0]$ are obtained by using the coefficients from the linear regression fitted for males, $\hat{\beta}_{G=1}$, and multiplying them by the average female covariates, $\bar{x}_{G=0}$, as defined in equation (11). This is equivalent to averaging the fitted values of the males’ regression evaluated at females’ covariates.

$$\text{OB regressions:} \quad Y_g(x_{ij}) = x_{ij}^T \beta_{G=g} + \epsilon_{gij}, \qquad g \in \{0, 1\} \tag{11}$$

$$\text{OB counterfactual estimate:} \quad \widehat{E[Y_1(x_{ij}) \mid G_i = 0]} := \bar{x}_{G=0}^T \hat{\beta}_{G=1}, \quad \bar{x}_{G=0} := \sum_{i \mid G_i = 0} \frac{x_{ij}}{n}$$
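The OB counterfactual can be illustrated with a minimal Python sketch. It assumes a single covariate plus an intercept so that the regression has a closed form; the function names and the toy data in the usage note are hypothetical. Because the regressions include an intercept, the average fitted value for each gender equals that gender's observed mean wage, so only the male regression needs to be fitted explicitly.

```python
def ols(x, y):
    """Intercept and slope of a one-covariate least-squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def oaxaca_blinder(x_male, y_male, x_female, y_female):
    """Two-fold OB decomposition using male coefficients as reference."""
    intercept_m, slope_m = ols(x_male, y_male)       # beta-hat for G = 1
    xbar_f = sum(x_female) / len(x_female)           # average female covariate
    ybar_m = sum(y_male) / len(y_male)
    ybar_f = sum(y_female) / len(y_female)
    # Male coefficients evaluated at average female covariates.
    counterfactual = intercept_m + slope_m * xbar_f
    return {
        "gap": ybar_m - ybar_f,
        "composition": ybar_m - counterfactual,   # differences in covariates
        "structural": counterfactual - ybar_f,    # differences in returns
        "counterfactual": counterfactual,
    }
```

For example, with `x_male = [10, 12, 14, 16]`, `y_male = [2.0, 2.2, 2.4, 2.6]`, `x_female = [10, 12]`, and `y_female = [1.8, 2.0]`, the male regression has intercept 1.0 and slope 0.1, the counterfactual is 2.1, and the gap of 0.4 splits into a composition component of 0.2 and a structural component of 0.2.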

As discussed in section 2, the OB decomposition has several important limitations. Although highly tractable, OB imposes potentially restrictive assumptions on $Y_1(x_{ij})$. First, it assumes that its expectation is linear in $x_{ij}$. Although linear regressions allow for flexible transformations of their covariates, the functional form is still a somewhat arbitrary researcher choice. Second, by using a linear regression to estimate the potential outcome function, $Y_1(\cdot)$, as in equation (11), it uses the same functional form to compute counterfactuals for all male workers. In other words, it imposes the same average returns to covariates for all workers, which would create biases in the counterfactual estimation if returns to worker characteristics are heterogeneous. The third limitation of OB is related to the overlapping supports assumption, also referred to as the common support assumption. This assumption requires that the support of $x$ for one of the genders fully overlap with the support of $x$ for the other gender, and is imposed by almost all decomposition methods in economics (Fortin et al. 2011). The overlapping supports assumption is imposed to ensure that the counterfactual function $Y_1(x)$ estimated using male data, $x_{G_i=1}$, is only used to predict counterfactual earnings for females whose values of $x$ lie within the male support of $x$.
When this condition is not satisfied in the data, observations that are outside of the common support are typically trimmed or given virtually zero weight in the estimation process, potentially eliminating significant numbers of workers from the analysis and making the analysis representative of only a subset of the population (Modenesi 2022). This is particularly salient when $x$ lies in a high-dimensional space, as is the case in our application with high-dimensional worker types and markets.

Our preferred decomposition strategy relies on matching male and female workers with similar observable characteristics and using matched workers of different genders as counterfactuals for each other. This approach was initially proposed by Ñopo (2008) and was further extended by Modenesi (2022). Not only does this approach avoid the strong functional form assumptions made by OB, it also includes a framework for handling a lack of common support. In this paper, we choose to use the original estimation strategy laid out by Ñopo (2008), given its tractability, especially for a high-dimensional set of covariates like ours, and we refer to it as the matching decomposition henceforth.

The matching decomposition has two main components: (i) matching observations and (ii) relaxing the overlapping supports assumption. First, counterfactual female earnings $Y_1(x_{ij}) \mid G_i = 0$ (what female workers would have earned if their gender were changed to male but nothing else about them changed) are obtained by exactly matching each female to one or more male workers with similar observable characteristics and then taking a sample average of the matched males’ earnings. (Footnote 19: In this paper we coarsen a few variables, such as years of education and age, and use the coarsened versions of these variables to perform the exact matching. This serves the purpose of matching more individuals, giving more statistical power to the method, since workers who differ by, e.g., just one year of age are, ceteris paribus, roughly the same in terms of productivity.) This method for building counterfactuals is non-parametric: it assumes no functional form for $Y_1(\cdot)$, performs no extrapolation outside the support of $x$, and avoids using data from all workers to build counterfactuals for a specific worker. Second, the matching decomposition handles the lack of common support by allowing unmatched workers, i.e. those outside of the common support of $x$, to contribute to the overall observed gap. In the matching decomposition, we add two terms, $\Delta_M$ and $\Delta_F$, to the expression for the overall wage gap $\Delta$ in equation (1), which capture the contributions of unmatched male and female workers, respectively.
The resulting expression is

$$\Delta = E[Y_{ij} \mid G_i = 1] - E[Y_{ij} \mid G_i = 0] =: \Delta_X + \Delta_0 + \Delta_M + \Delta_F, \tag{12}$$

where

$$
\begin{aligned}
\Delta_X &:= E\left[Y_{ij} \mid \text{Matched}, G_i = 1\right] - E\left[Y_1(x_{ij}) \mid \text{Matched}, G_i = 0\right] \\
\Delta_0 &:= E\left[Y_1(x_{ij}) \mid \text{Matched}, G_i = 0\right] - E\left[Y_{ij} \mid \text{Matched}, G_i = 0\right] \\
\Delta_M &:= \left\{ E\left[Y_{ij} \mid \text{Unmatched}, G_i = 1\right] - E\left[Y_{ij} \mid \text{Matched}, G_i = 1\right] \right\} P\left(\text{Unmatched} \mid G_i = 1\right) \\
\Delta_F &:= \left\{ E\left[Y_{ij} \mid \text{Matched}, G_i = 0\right] - E\left[Y_{ij} \mid \text{Unmatched}, G_i = 0\right] \right\} P\left(\text{Unmatched} \mid G_i = 0\right)
\end{aligned}
$$
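The four terms can be computed with a short Python sketch under exact matching on discrete cells. This is an illustration, not the authors' code: a "cell" stands for any hashable combination of covariates (for instance a worker type, a market, and coarsened age), a cell counts as matched when it contains at least one man and one woman, and $\Delta_0$ is taken as the counterfactual minus the observed mean wage of matched women, which is the definition under which the four terms sum to $\Delta$. The sketch also assumes at least one matched cell exists.

```python
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def matching_decomposition(records):
    """records: iterable of (cell, g, y), with g = 1 for men, 0 for women."""
    by_cell = defaultdict(lambda: {0: [], 1: []})
    for cell, g, y in records:
        by_cell[cell][g].append(y)
    # A cell is in the common support if both genders appear in it.
    matched = {c for c, d in by_cell.items() if d[0] and d[1]}

    def pool(g, keep_matched):
        return [y for c, d in by_cell.items()
                if (c in matched) == keep_matched for y in d[g]]

    m_in, m_out = pool(1, True), pool(1, False)
    f_in, f_out = pool(0, True), pool(0, False)

    # Counterfactual: male cell means weighted by matched women per cell.
    counterfactual = sum(
        mean(by_cell[c][1]) * len(by_cell[c][0]) for c in matched
    ) / len(f_in)

    share_m_out = len(m_out) / (len(m_in) + len(m_out))
    share_f_out = len(f_out) / (len(f_in) + len(f_out))
    return {
        "delta": mean(m_in + m_out) - mean(f_in + f_out),
        "delta_x": mean(m_in) - counterfactual,
        "delta_0": counterfactual - mean(f_in),
        "delta_m": (mean(m_out) - mean(m_in)) * share_m_out if m_out else 0.0,
        "delta_f": (mean(f_in) - mean(f_out)) * share_f_out if f_out else 0.0,
    }
```

The zero fallbacks for `delta_m` and `delta_f` cover the case in which every worker of a given gender is matched, so that the corresponding term drops out of the sum.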

Notice that if all observations are matched, the $\Delta_M$ and $\Delta_F$ terms vanish and this method collapses back to the original decomposition we have in equation (2). The terms $\Delta_X$ and $\Delta_0$ still have the same interpretation as discussed in Section 2 (composition and structural, respectively), but now only similar workers of one gender are used to build counterfactuals for the other gender, using an agnostic functional form for the counterfactual function. The extra terms $\Delta_M$ and $\Delta_F$ measure the contribution of unmatched male and female workers to the overall observed gender gap. Each of them measures the difference in average wages between matched and unmatched workers of a given gender, weighted by the proportion of unmatched workers within that gender. (Footnote 20: Precise definitions of each of the terms in the NP decomposition can be found in Appendix B.) For example, if unmatched male workers have an average log wage that is 0.2 higher than the average log wage for matched male workers and 10% of male workers are unmatched, then $\Delta_M = 0.2 \times 0.1 = 0.02$.

To understand how the matching decomposition handles a lack of common support, consider male workers employed as professional football players. These workers will not be matched to female workers and therefore would be omitted from the analysis if we simply restrict it to the region of common support. However, the male workers do contribute meaningfully to the overall gender wage gap because they earn significantly more than the average female worker. The matching decomposition would handle this by including these workers in the $\Delta_{M}$ term. Intuitively, it would say that some of the gender wage gap can be decomposed within the region of common support, while some of it is explained by male workers outside the region of common support earning more than male workers within the region of common support, and similarly for female workers.
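The logic of the matching decomposition can be sketched on toy data. The block below is an illustration, not the authors' implementation: it assumes exact matching on a discrete cell label, and it builds the counterfactual by weighting matched male cell means by the matched female cell shares, which is one common convention and may differ from the definitions in Appendix B. The function name, the record layout, and the toy wages are all hypothetical.

```python
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def matching_decomposition(records):
    """Decompose the mean male-female log wage gap with exact matching on cells.

    records: iterable of (group, cell, log_wage), with group "M" or "F".
    Assumes both groups are non-empty and share at least one cell.
    """
    by_cell = {"M": defaultdict(list), "F": defaultdict(list)}
    for group, cell, log_wage in records:
        by_cell[group][cell].append(log_wage)

    # Cells observed for both genders form the region of common support.
    common = set(by_cell["M"]) & set(by_cell["F"])

    def pool(group, matched):
        # Wages of `group` inside (matched=True) or outside the common support.
        return [y for cell, ys in by_cell[group].items()
                if (cell in common) == matched for y in ys]

    m_in, m_out = pool("M", True), pool("M", False)
    f_in, f_out = pool("F", True), pool("F", False)

    gap = mean(m_in + m_out) - mean(f_in + f_out)
    # Unmatched-minus-matched for males, matched-minus-unmatched for females,
    # each weighted by the unmatched share within that gender.
    delta_m = (mean(m_out) - mean(m_in)) * len(m_out) / (len(m_in) + len(m_out)) if m_out else 0.0
    delta_f = (mean(f_in) - mean(f_out)) * len(f_out) / (len(f_in) + len(f_out)) if f_out else 0.0

    # Assumed counterfactual: male cell means weighted by matched female cell shares.
    counterfactual = sum(mean(by_cell["M"][c]) * len(by_cell["F"][c]) for c in common) / len(f_in)
    delta_x = mean(m_in) - counterfactual   # composition
    delta_0 = counterfactual - mean(f_in)   # structural
    return {"gap": gap, "delta_X": delta_x, "delta_0": delta_0,
            "delta_M": delta_m, "delta_F": delta_f}


# Toy data: cell "c" contains only men (think of the football players) and
# cell "d" contains only women, so both fall outside the common support.
toy = [
    ("M", "a", 2.0), ("M", "a", 2.2), ("M", "b", 3.0), ("M", "c", 4.0),
    ("F", "a", 1.8), ("F", "a", 2.0), ("F", "b", 2.6), ("F", "d", 1.0),
]
result = matching_decomposition(toy)
```

In this toy example the male-only cell is not dropped: its workers enter through `delta_M`, and the four components add up to the raw gap.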

Our preferred specifications in this paper use the matching decomposition in conjunction with the latent skill and task clusters revealed by the network methodology developed in Section 3. Since we define labor market gender discrimination as workers with similar skills performing similar tasks with similar productivity but being paid differently based on gender, our worker type–market clusters serve as natural cells within which workers are considered equivalent in terms of productivity. The matching decomposition ensures that only similar workers are used when estimating counterfactual earnings, which mitigates bias in the counterfactuals, and it avoids dropping unmatched workers from the estimation procedure, as discussed above. Although the original matching decomposition is not considered a “detailed decomposition” in the economics literature on decomposition methods, combining it with our network clusters makes it possible to compute an economically principled distribution of the gender gap (and its components) across a large number of worker cells, mapping how discrimination is spread across different parts of the labor market.

5 Data

5.1 Administrative Brazilian data

We use RAIS, the Brazilian linked employer-employee data set. The data contain detailed information on all employment contracts in the Brazilian formal sector, going back to the 1980s. Our sample includes all workers between the ages of 25 and 55 who were employed in the formal sector in the Rio de Janeiro metro area at least once between 2009 and 2018. In years in which we do not observe these workers, we treat them as matched with unemployment (or the informal sector). We exclude the military, and we also exclude the public sector because institutional barriers make flows between the Brazilian public and private sectors rare. Finally, we exclude the small number of jobs that do not pay workers on a monthly basis.

Our wage variable is the real hourly log wage in December, defined as total December earnings divided by hours worked. We deflate wages using the national inflation index. We exclude workers who were not employed for the entire month of December because we do not have accurate hours worked information for such workers. We define a job as an occupation-establishment pair. This implicitly assumes that all workers employed in the same occupation at the same establishment are performing approximately the same tasks.
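The wage definition above can be written as a one-line formula. The sketch below is only illustrative: the function name, the argument names, and the base value of the price index are hypothetical, and it assumes that hours are measured as total hours worked in December and that deflation divides nominal earnings by the price index relative to its base.

```python
import math


def log_real_hourly_wage(december_earnings, december_hours, price_index, base_index=100.0):
    """Log of real December earnings per hour worked (illustrative sketch)."""
    # Deflate nominal earnings to base-period prices (assumed convention).
    real_earnings = december_earnings * base_index / price_index
    return math.log(real_earnings / december_hours)


# Hypothetical worker: 3,200 in December earnings over 160 hours at base-period prices.
example_wage = log_real_hourly_wage(3200.0, 160.0, 100.0)
```

Because the outcome is in logs, a uniform change in the price index shifts every wage in a given year by the same constant.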

Our data contain 4,578,210 unique workers, 289,836 unique jobs, and 7,940,483 unique worker–job matches. The average worker matches with 1.73 jobs and the average job matches with 27.4 workers. 42% of workers match with more than one job during our sample. Figure 1(b) presents histograms of the number of matches for workers and jobs, respectively. In network theory parlance, these are known as degree distributions.
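The averages reported above are the mean degrees of the two sides of the bipartite worker–job network, so each equals the number of unique matches divided by the number of nodes on that side. The sketch below shows how degree distributions could be computed from a list of worker–job pairs; the function names and the toy identifiers are hypothetical, and only the three counts are taken from the text.

```python
from collections import Counter


def degree_distributions(matches):
    """Return (worker_degree, job_degree) counters from (worker, job) pairs."""
    unique_matches = set(matches)  # count each worker-job match once
    worker_degree = Counter(worker for worker, _ in unique_matches)
    job_degree = Counter(job for _, job in unique_matches)
    return worker_degree, job_degree


def mean_degree(degree):
    return sum(degree.values()) / len(degree)


# Toy example with a repeated match, which is counted once.
toy_matches = [("w1", "j1"), ("w1", "j2"), ("w2", "j1"), ("w3", "j1"), ("w3", "j1")]
worker_degree, job_degree = degree_distributions(toy_matches)

# Counts reported in the text.
n_workers, n_jobs, n_matches = 4_578_210, 289_836, 7_940_483
mean_jobs_per_worker = n_matches / n_workers
mean_workers_per_job = n_matches / n_jobs
```

A histogram of `worker_degree.values()` or `job_degree.values()` gives the corresponding degree distribution.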

University of Michigan, bmodene@umich.edu. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1256260. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. This research is also supported by the Alfred P. Sloan Foundation through the CenHRS project at the University of Michigan. This work is done in partnership with the Brazilian Institute of Applied Economic Research (IPEA). We thank John Bound, Abigail Jacobs, Matthew Shapiro, Mel Stephens, and Sebastian Sotelo for advice and guidance throughout this project. We also thank Charlie Brown, Zach Brown, Raj Chetty, Ying Fan, John Friedman, Florian Gunsilius, Nathan Hendren, Dhiren Patki, Rafael Pereira, Matthew Staiger, Dyanne Vaught, and Jean-Gabriel Young for helpful comments and discussions. We also received helpful feedback from seminar participants at the University of Michigan, Labo(u)r Day, the Urban Economics Association, Networks 2021, Yale University, Duke University, the Federal Reserve Bank of Boston, Opportunity Insights, and JAM. (1)

Our network-based classification algorithm identifies 187 worker types ($\iota$) and 341 markets ($\gamma$). Figure 2(b) presents histograms of the number of workers per worker type and jobs per market. The average worker belongs to a worker type with 20,896 workers and the median worker belongs to a worker type with 14,211 workers. The average job belongs to a market with 1,156 jobs and the median job belongs to a market with 1,127 jobs.


6 Results

6.1 Aggregate wage gap decomposition

Table 1 presents the results of gender wage decompositions using each of our two methods: OB and matching. For each method, we have three specifications. The first, presented in columns (1) and (4), estimates counterfactual earnings distributions using a standard set of observable characteristics: experience, education, race, industry, and union status. The second, presented in columns (2) and (5), estimates counterfactual earnings distributions using the worker types and markets identified by the SBM. The third, presented in columns (3) and (6), uses both standard observable characteristics and worker types and markets. The first row of each column presents the overall wage gap: the average male worker earns 16.7 percent more than the average female worker in our sample. The second row presents the composition component: the wage gap that would exist if male and female workers with the same productivity were paid equivalently but the observed differences between the distributions of male and female productivity (as proxied by observable characteristics and/or worker types and markets) remained. The third row presents the structural component: the wage gap that would exist if male and female workers had identical productivity distributions but the observed earnings differences conditional on productivity remained. The fourth and fifth rows present the wage gap explained by male and female workers outside the region of common support, respectively. For the OB method, the composition and structural components add up to the overall wage gap; for the matching method, the overall wage gap equals the sum of the composition and structural components and the two components due to a lack of common support.

The qualitative stories told by the OB method and the matching method are similar. When we define counterfactual earnings using observable characteristics (columns 1 and 4), we find that if male and female workers with the same productivity were paid similarly, then female workers would significantly outearn male workers (the composition component): by 12.7% using the OB method and 8.8% using the matching method. By contrast, female workers would be paid significantly less than male workers if they had the male productivity distribution but the observed pay differences conditional on productivity remained (the structural component): 29.4% less using the OB method and 25.6% less using the matching method. When we define counterfactuals using worker types and markets instead of observable characteristics (columns 2 and 5), we find that the wage gap would nearly disappear if male and female workers with the same productivity were paid similarly. By contrast, the wage gap that would exist if male and female workers had the same productivity distribution (17.9% according to OB and 17.8% according to matching) is almost equal to the overall wage gap of 16.7%. In other words, when we compute counterfactuals using worker types and markets, we find that differential pay for similar productivity explains roughly the entire gender wage gap. This tells us that the results of gender wage gap decompositions are highly sensitive to the way in which we define counterfactuals. If, as we argue, worker types and markets do a better job of capturing the latent productivity of worker–job matches than do standard observable characteristics, then these results imply that gender wage gaps are almost entirely due to similarly productive male and female workers being paid differently, not to male and female workers having different productivity distributions.

Columns (3) and (6) of Table 1 use both observable characteristics and worker types and markets to form counterfactuals for the gender wage gap decompositions. The OB method finds that female workers have covariates that would imply that they would outearn male workers if equally productive workers were paid equivalently, similar to the findings when we included only observable characteristics, not worker types and markets, in column (1). By contrast, the matching method finds that male workers’ covariates imply 3.4% higher earnings than female workers’ covariates and that male workers are paid 18.5% more than similarly productive female workers. Why do we observe a discrepancy between the OB and matching methods once we include observable characteristics and worker types and markets? The answer lies in the final two rows of Table 1, which present the fraction of male and female workers, respectively, for whom we are able to find a match: in column (6), only 57% of male workers and 74% of female workers are matched. Once we try to match workers on such a large set of variables, many workers cannot be matched, and a significant part of the gender wage gap occurs among such workers. The matching method allows us to take this into account, while the OB method simply makes a linear extrapolation. However, a linear extrapolation outside the region of common support is likely to lead to incorrect inferences. Furthermore, the fact that the matching estimator yields similar results when we use worker types and markets as it does when we use worker types, markets, and other observable characteristics, but not when we use other observables alone, implies that worker types and markets capture significant determinants of productivity, and omitting them leads to incorrect inferences. This highlights the importance of using a sufficiently rich set of worker characteristics when estimating counterfactuals, and our method for identifying previously unobserved heterogeneity enhances our ability to do so.

All of the results presented in this section correspond to the aggregate gender wage gap. In the next section, we consider heterogeneity in wage gaps within different subsets of the labor market.

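Table 1

\begin{tabular}{lcccccc}
 & \multicolumn{3}{c}{Oaxaca-Blinder} & \multicolumn{3}{c}{Matching} \\
 & Observables & $\iota\times\gamma$ & Full model & Observables & $\iota\times\gamma$ & Full model \\
 & (1) & (2) & (3) & (4) & (5) & (6) \\
\hline
Gap & 0.167 & 0.167 & 0.167 & 0.167 & 0.167 & 0.167 \\
Composition & -0.127 & -0.011 & -0.084 & -0.088 & -0.006 & 0.034 \\
Structural & 0.294 & 0.179 & 0.250 & 0.256 & 0.178 & 0.185 \\
Males unmatched & - & - & - & 0.000 & -0.005 & -0.076 \\
Females unmatched & - & - & - & 0.000 & 0.000 & 0.024 \\
Fraction of males matched & - & - & - & 1.00 & 0.98 & 0.57 \\
Fraction of females matched & - & - & - & 1.00 & 0.99 & 0.74 \\
\end{tabular}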
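As a quick consistency check on Table 1, the components in each column should add up to the overall gap, up to rounding of the reported three-decimal values. The sketch below transcribes the table; the dictionary layout and the rounding tolerance are choices made here for illustration.

```python
# Columns of Table 1: composition, structural, and (for matching only)
# the male and female unmatched components.
GAP = 0.167
TABLE_1 = {
    "(1) OB, observables":        [-0.127, 0.294],
    "(2) OB, iota x gamma":       [-0.011, 0.179],
    "(3) OB, full model":         [-0.084, 0.250],
    "(4) Matching, observables":  [-0.088, 0.256, 0.000, 0.000],
    "(5) Matching, iota x gamma": [-0.006, 0.178, -0.005, 0.000],
    "(6) Matching, full model":   [0.034, 0.185, -0.076, 0.024],
}

# Each reported number is rounded to three decimals, so allow a small residual.
TOLERANCE = 0.0025

residuals = {column: GAP - sum(components) for column, components in TABLE_1.items()}
all_columns_add_up = all(abs(r) <= TOLERANCE for r in residuals.values())
```

In column (6) the unmatched components are comparable in size to the composition component, which is why ignoring workers outside the common support changes the conclusions.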

6.2 Wage gaps within worker type–market cells

An appealing feature of our worker types and markets is that they allow us to further decompose gender wage gaps and identify heterogeneity in gender wage gaps across the labor market. We do so by computing overall wage gaps, $\Delta$, and then decomposing them following the matching decomposition, within each worker type–market cell.

For each worker type–market cell we decompose the overall wage gap (Row 1 of Table 1) into its four components: composition, structural, males unmatched, and females unmatched (Rows 2–5 of Table 1). Figure 3 presents kernel density plots of the resulting distributions of overall wage gaps and their four components. Several clear patterns emerge. First, the overall wage gaps $\Delta$ are almost universally positive, meaning that male workers outearning their female counterparts is a widespread phenomenon. Specifically, 91% of workers are in clusters where males outearn females. Second, the distribution of the structural component, $\Delta_{0}$, is similar to the distribution of the overall wage gap. This suggests that the result from the aggregate decomposition in Section 6.1 that almost the entire overall gender wage gap is explained by the structural component holds within worker type–market cells as well. The fact that the structural component roughly coincides with the overall wage gap implies that the other three components (composition, males outside the common support, and females outside the common support) must contribute relatively little to the overall gender wage gap, which is confirmed by the fact that the distributions for these three components are centered close to zero and have low variances. We present the same results quantitatively in Table 2. Together, these results tell us that while there is significant variability in gender wage gaps across different worker type–market pairs, the overall qualitative pattern of male workers outearning their female counterparts, and almost all of this gap being explained by differential returns to the same skills rather than different skills, is true in the disaggregated results as well as the aggregated results.

Table 2

\begin{tabular}{lrrrrr}
 & mean & sd & min & max & count \\
\hline
$\Delta$ & 0.215 & 0.240 & -1.183 & 6.228 & 4791014 \\
$\Delta_{0}$ & 0.196 & 0.172 & -2.506 & 9.384 & 4791014 \\
$\Delta_{M}$ & 0.016 & 0.134 & -3.577 & 3.448 & 4783255 \\
$\Delta_{F}$ & -0.011 & 0.116 & -2.632 & 4.684 & 4724863 \\
$\Delta_{X}$ & 0.013 & 0.153 & -1.150 & 2.418 & 4791014 \\
Frac. Male Workers Matched & 0.766 & 0.238 & 0.004 & 1.000 & 4791014 \\
Frac. Female Workers Matched & 0.875 & 0.199 & 0.008 & 1.000 & 4791014 \\
Frac. Workers that Are Male & 0.617 & 0.162 & 0.037 & 0.999 & 4791014 \\
\end{tabular}

7 Conclusion

In this paper we reconsider the wage gap decomposition literature and make three key contributions. First, we propose a new method for identifying unobserved determinants of workers’ earnings from the information revealed by detailed data on worker–job matching patterns. The method builds on Fogel and Modenesi (2023) and provides a blueprint for incorporating observable variables into the clustering algorithm, while also relaxing the assumption of perfect competition in labor markets. Second, we non-parametrically estimate counterfactual wage functions for male and female workers and use them to decompose gender wage gaps into a composition component, in which male and female workers earn different wages because they possess different skills and perform different tasks, and a structural component, in which male and female workers who possess similar skills and perform similar tasks nonetheless earn different wages. Third, we address the issue of male workers’ observable characteristics falling outside the support of female workers’ observable characteristics, and vice versa, by augmenting the wage decomposition with components attributable to male and female workers, respectively, outside the region of common support.

We apply these methods to Brazilian administrative data and find that almost the entire gender wage gap is attributable to male and female workers who possess similar skills and perform similar tasks being paid differently. This is true at the aggregate level, and remains true when we perform wage decompositions within each worker type–market cell, indicating that this is a widespread phenomenon, not one driven by large wage differentials in small subsets of the labor market. We find that wage decompositions based on standard observable variables suffer from omitted variable bias, emphasizing the need for detailed worker and job characteristics in the form of worker types and markets. We also find that wage decompositions based on linear regressions yield similar findings to those based on matching when a lack of common support is not an issue. However, when male and female workers’ characteristics do not share a common support, the matching estimator with corrections for a lack of common support outperforms the alternatives.

While this paper focuses on gender wage gaps, the methods are applicable to other wage gaps, for instance racial wage gaps. Moreover, our strategy for using worker–job matching patterns to control for previously unobserved, but potentially confounding, covariates may be applied in a wide variety of contexts.

References

  • Acemoglu and Autor (2011) Acemoglu, Daron and David Autor, “Skills, tasks and technologies: Implications for employment and earnings,” 2011, 4, 1043–1171.
  • Autor (2013) Autor, David H., “The ‘task approach’ to labor markets: an overview,” 2013.
  • Autor et al. (2003) Autor, David H., Frank Levy, and Richard J. Murnane, “The Skill Content of Recent Technological Change: An Empirical Exploration,” The Quarterly Journal of Economics, 2003, 118 (4), 1279–1333.
  • Barsky et al. (2002) Barsky, Robert, John Bound, Kerwin Kofi Charles, and Joseph P. Lupton, “Accounting for the Black-White Wealth Gap: A Nonparametric Approach,” Journal of the American Statistical Association, 2002, 97 (459), 663–673.
  • Blinder (1973) Blinder, Alan S., “Wage Discrimination: Reduced Form and Structural Estimates,” The Journal of Human Resources, 1973, 8 (4), 436–455.
  • Card et al. (2015) Card, David, Ana Rute Cardoso, and Patrick Kline, “Bargaining, Sorting, and the Gender Wage Gap: Quantifying the Impact of Firms on the Relative Pay of Women,” The Quarterly Journal of Economics, October 2015, 131 (2), 633–686.
  • Card et al. (2018) Card, David, Ana Rute Cardoso, Joerg Heining, and Patrick Kline, “Firms and Labor Market Inequality: Evidence and Some Theory,” Journal of Labor Economics, 2018, 36 (S1), S13–S70.
  • Chernozhukov et al. (2013) Chernozhukov, Victor, Iván Fernández-Val, and Blaise Melly, “Inference on Counterfactual Distributions,” Econometrica, 2013, 81 (6), 2205–2268.
  • DiNardo et al. (1996) DiNardo, John, Nicole M. Fortin, and Thomas Lemieux, “Labor Market Institutions and the Distribution of Wages, 1973-1992: A Semiparametric Approach,” Econometrica, 1996, 64 (5), 1001–1044.
  • Firpo et al. (2018) Firpo, Sergio P., Nicole M. Fortin, and Thomas Lemieux, “Decomposing Wage Distributions Using Recentered Influence Function Regressions,” Econometrics, May 2018, 6 (2), 1–40.
  • Fogel and Modenesi (2023) Fogel, Jamie and Bernardo Modenesi, “What is a Labor Market? Classifying Workers and Jobs Using Network Theory,” 2023.
  • Fortin et al. (2011) Fortin, Nicole, Thomas Lemieux, and Sergio Firpo, “Chapter 1 - Decomposition Methods in Economics,” in Orley Ashenfelter and David Card, eds., Vol. 4 of Handbook of Labor Economics, Elsevier, 2011, pp. 1–102.
  • Garcia et al. (2009) Garcia, Luana Marquez, Hugo Nopo, and Paola Salardi, “Gender and Racial Wage Gaps in Brazil 1996-2006: Evidence Using a Matching Comparisons Approach,” Research Department Publications 4626, Inter-American Development Bank, Research Department, May 2009.
  • Gerard et al. (2018) Gerard, François, Lorenzo Lagos, Edson Severnini, and David Card, “Assortative Matching or Exclusionary Hiring? The Impact of Firm Policies on Racial Wage Differences in Brazil,” Working Paper 25176, National Bureau of Economic Research, October 2018.
  • Goldin (2014) Goldin, Claudia, “A Grand Gender Convergence: Its Last Chapter,” American Economic Review, April 2014, 104 (4), 1091–1119.
  • Hurst et al. (2021) Hurst, Erik, Yona Rubinstein, and Kazuatsu Shimizu, “Task-Based Discrimination,” Working Paper 29022, National Bureau of Economic Research, July 2021.
  • Jarosch et al. (2019) Jarosch, Gregor, Jan Sebastian Nimczik, and Isaac Sorkin, “Granular search, market structure, and wages,” Technical Report, National Bureau of Economic Research, 2019.
  • Kantenga (2018) Kantenga, Kory, “The effect of job-polarizing skill demands on the US wage structure,” 2018.
  • Karrer and Newman (2011) Karrer, Brian and Mark E. J. Newman, “Stochastic blockmodels and community structure in networks,” Physical Review E, 2011, 83 (1), 016107.
  • Larremore et al. (2014) Larremore, Daniel B., Aaron Clauset, and Abigail Z. Jacobs, “Efficiently inferring community structure in bipartite networks,” Physical Review E, 2014, 90 (1), 012805.
  • Lindenlaub (2017) Lindenlaub, Ilse, “Sorting multidimensional types: Theory and application,” The Review of Economic Studies, 2017, 84 (2), 718–789.
  • McFadden (1978) McFadden, Daniel, “Modeling the choice of residential location,” Transportation Research Record, 1978, (673).
  • Modenesi (2022) Modenesi, Bernardo, “Advancing Distribution Decomposition Methods Beyond Common Supports: Applications to Racial Wealth Disparities,” 2022.
  • Morello and Anjolim (2021) Morello, Thiago and Jacqueline Anjolim, “Gender wage discrimination in Brazil from 1996 to 2015: A matching analysis,” EconomiA, 2021.
  • Nimczik (2018) Nimczik, Jan Sebastian, “Job Mobility Networks and Endogenous Labor Markets,” 2018.
  • Ñopo (2008) Ñopo, Hugo, “Matching as a Tool to Decompose Wage Gaps,” The Review of Economics and Statistics, 2008, 90 (2), 290–299.
  • Oaxaca (1973) Oaxaca, Ronald, “Male-Female Wage Differentials in Urban Labor Markets,” International Economic Review, 1973, 14 (3), 693–709.
  • Peixoto (2014) Peixoto, Tiago P., “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models,” Physical Review E, 2014, 89 (1), 012804.
  • Peixoto (2017) Peixoto, Tiago P., “Nonparametric Bayesian inference of the microcanonical stochastic block model,” Physical Review E, 2017, 95 (1), 012317.
  • Peixoto (2018) Peixoto, Tiago P., “Nonparametric weighted stochastic block models,” Physical Review E, January 2018, 97, 012306.
  • Peixoto (2019) Peixoto, Tiago P., “Bayesian stochastic blockmodeling,” Advances in network clustering and blockmodeling, 2019, pp. 289–332.
  • Roy (1951) Roy, Andrew Donald, “Some thoughts on the distribution of earnings,” Oxford Economic Papers, 1951, 3 (2), 135–146.
  • Sorkin (2018) Sorkin, Isaac, “Ranking firms using revealed preference,” The Quarterly Journal of Economics, 2018, 133 (3), 1331–1393.
  • Tan (2018) Tan, Joanne, “Multidimensional heterogeneity and matching in a frictional labor market - An application to polarization,” 2018.
  • Train (2010) Train, Kenneth E., Discrete choice methods with simulation, Cambridge University Press, 2010.


Appendix A Nested Logit Choice Probability

Following Train (2010), and as originally developed by McFadden (1978), consider an agent who chooses the alternative $j$, nested within a group $\gamma$, that maximizes utility:

\[
j^{*} := \arg\max_{j} \quad W_{\gamma} + Y_{j} + \varepsilon_{j} \tag{13}
\]

With $\varepsilon_{j} \sim NestedLogit(\nu_{\gamma})$, this results in the following choice probability:

\begin{align*}
P(j=j^{*}) &= P(\text{Choose }\gamma)\, P(j=j^{*} \mid \gamma) \\
&= \frac{\exp(W_{\gamma}+\nu_{\gamma}I_{\gamma})}{\sum_{\gamma}\exp(W_{\gamma}+\nu_{\gamma}I_{\gamma})} \, \frac{\exp(Y_{j})^{\frac{1}{\nu_{\gamma}}}}{\sum_{j\in\gamma}\exp(Y_{j})^{\frac{1}{\nu_{\gamma}}}}
\end{align*}

where $I_{\gamma}=\log\left(\sum_{j\in\gamma}\exp(Y_{j})^{\frac{1}{\nu_{\gamma}}}\right)$.
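The choice probability above can be evaluated numerically. The following sketch is illustrative only: the function name, the nested-dictionary layout, and the example values of $Y_j$, $W_\gamma$, and $\nu_\gamma$ are hypothetical. It uses $\exp(Y_j)^{1/\nu_\gamma} = \exp(Y_j/\nu_\gamma)$.

```python
import math


def nested_logit_probs(Y, W, nu):
    """Nested logit choice probabilities.

    Y:  {nest: {alternative: Y_j}}
    W:  {nest: W_gamma}
    nu: {nest: nu_gamma}
    Returns {(nest, alternative): probability}.
    """
    # Inclusive value I_gamma = log(sum_j exp(Y_j / nu_gamma)).
    inclusive = {g: math.log(sum(math.exp(y / nu[g]) for y in alts.values()))
                 for g, alts in Y.items()}
    # Numerator of the nest-choice probability, exp(W_gamma + nu_gamma * I_gamma).
    nest_weight = {g: math.exp(W[g] + nu[g] * inclusive[g]) for g in Y}
    total = sum(nest_weight.values())

    probs = {}
    for g, alts in Y.items():
        p_nest = nest_weight[g] / total
        within_denominator = math.exp(inclusive[g])
        for j, y in alts.items():
            probs[(g, j)] = p_nest * math.exp(y / nu[g]) / within_denominator
    return probs


# Hypothetical example: two nests, three alternatives.
Y_example = {"a": {"j1": 0.5, "j2": 1.0}, "b": {"j3": 0.2}}
W_example = {"a": 0.0, "b": 0.3}
nu_example = {"a": 0.5, "b": 0.8}
probs_example = nested_logit_probs(Y_example, W_example, nu_example)
```

Setting every $\nu_\gamma = 1$ and $W_\gamma = 0$ collapses the expression to the standard multinomial logit, which is a convenient sanity check.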

Our problem is similar, with workers choosing job $j$ within a market $\gamma$ in order to maximize the sum of log earnings $\log(\psi_{\iota\gamma}w_{j}^{g})$ and an idiosyncratic preference for job $j$, $\varepsilon_{ij}^{g}$:

\[
j^{*} = \arg\max_{j} \quad \log(\psi_{\iota\gamma}w_{j}^{g}) + \varepsilon_{ij}^{g}. \tag{14}
\]

We also assume that $\varepsilon_{ij}^{g}\sim \text{NestedLogit}(\nu_{\gamma}^{g})$. One difference between our setup and the one covered by Train (2010) is that we add two worker indices, $\iota$ for the worker's skills and $g$ for the worker's gender, and we condition our probabilities on $\iota$ and $g$. Comparing equations 13 and 14, we have $W_{\gamma}=0$ and $Y_{j}=\log(\psi_{\iota\gamma}w_{j}^{g})$, which results in the following choice probabilities:

$$
\begin{aligned}
P(j=j^{*}\mid j\in\gamma,i\in\iota,g)
&=P(\gamma=\gamma^{*}\mid i\in\iota,j\in\gamma,g)\,P(j=j^{*}\mid i\in\iota,j\in\gamma,\gamma=\gamma^{*},g)\\
&=\frac{\exp(\nu_{\gamma}^{g}I_{\iota\gamma}^{g})}{\sum_{\gamma}\exp(\nu_{\gamma}^{g}I_{\iota\gamma}^{g})}\,\frac{\exp(\log(\psi_{\iota\gamma}w_{j}^{g}))^{\frac{1}{\nu_{\gamma}^{g}}}}{\sum_{j\in\gamma}\exp(\log(\psi_{\iota\gamma}w_{j}^{g}))^{\frac{1}{\nu_{\gamma}^{g}}}}\quad\text{(plugging objects in)}\\
&=\frac{\exp(I_{\iota\gamma}^{g})^{\nu_{\gamma}^{g}}}{\sum_{\gamma}\exp(I_{\iota\gamma}^{g})^{\nu_{\gamma}^{g}}}\,\frac{(\psi_{\iota\gamma}w_{j}^{g})^{\frac{1}{\nu_{\gamma}^{g}}}}{\sum_{j\in\gamma}(\psi_{\iota\gamma}w_{j}^{g})^{\frac{1}{\nu_{\gamma}^{g}}}}\quad\text{(similar to equation 4)}\\
&=\frac{\exp(I_{\iota\gamma}^{g})^{\nu_{\gamma}^{g}}}{\sum_{\gamma}\exp(I_{\iota\gamma}^{g})^{\nu_{\gamma}^{g}}}\,\frac{(\psi_{\iota\gamma}w_{j}^{g})^{\frac{1}{\nu_{\gamma}^{g}}}}{\exp(I_{\iota\gamma}^{g})}\quad\text{(by definition of $I_{\iota\gamma}^{g}$)}\\
&=\underbrace{\frac{\exp(I_{\iota\gamma}^{g})^{\nu_{\gamma}^{g}-1}}{\sum_{\gamma}\exp(I_{\iota\gamma}^{g})^{\nu_{\gamma}^{g}}}\,\psi_{\iota\gamma}^{\frac{1}{\nu_{\gamma}^{g}}}}_{=:\,\Omega_{\iota\gamma}^{g}\ (\iota\text{-}\gamma\text{-}g\text{ component})}\;\underbrace{(w_{j}^{g})^{\frac{1}{\nu_{\gamma}^{g}}}}_{=:\,d_{j}^{g}\ (j\text{-}g\text{ component})}\quad\text{(similar to equation 5)}
\end{aligned}
$$

where $I_{\iota\gamma}^{g}=\log\left[\sum_{j\in\gamma}\exp(\log(\psi_{\iota\gamma}w_{j}^{g}))^{\frac{1}{\nu_{\gamma}^{g}}}\right]=\log\left[\sum_{j\in\gamma}(\psi_{\iota\gamma}w_{j}^{g})^{\frac{1}{\nu_{\gamma}^{g}}}\right]$.
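The factorization above can be checked numerically. The following is a minimal sketch for a single worker type $\iota$ and gender $g$, not the estimation code used in the paper: the function name `nested_logit` and all parameter values are hypothetical and chosen only for illustration. It computes the choice probabilities once as the product of the market-level and within-market terms, and once as $\Omega_{\iota\gamma}^{g}\,d_{j}^{g}$.

```python
import math

def nested_logit(psi, wages, nu):
    """Choice probabilities for one worker type and gender (illustrative only).

    psi[m]   : skill productivity psi_{iota,gamma} in market m
    wages[m] : list of wages w_j^g for the jobs in market m
    nu[m]    : nest parameter nu_gamma^g of market m
    """
    # Inclusive value I_m = log sum_j (psi_m * w_j)^(1/nu_m).
    incl = [math.log(sum((p * w) ** (1 / v) for w in ws))
            for p, ws, v in zip(psi, wages, nu)]
    # Denominator of the market-choice probability.
    denom = sum(math.exp(i) ** v for i, v in zip(incl, nu))

    direct, factored = {}, {}
    for m, (p, ws, v, i) in enumerate(zip(psi, wages, nu, incl)):
        p_market = math.exp(i) ** v / denom                   # P(market m)
        omega = math.exp(i) ** (v - 1) / denom * p ** (1 / v)  # Omega term
        for j, w in enumerate(ws):
            # Market probability times within-market probability.
            direct[(m, j)] = p_market * (p * w) ** (1 / v) / math.exp(i)
            # Omega (type-market-gender) times d_j (job-gender).
            factored[(m, j)] = omega * w ** (1 / v)
    return direct, factored

# Hypothetical inputs: two markets with three and two jobs.
psi = [1.2, 0.8]
wages = [[10.0, 12.0, 9.0], [15.0, 11.0]]
nu = [0.5, 0.7]
direct, factored = nested_logit(psi, wages, nu)
print(sum(direct.values()))
print(max(abs(direct[k] - factored[k]) for k in direct))
```

The two dictionaries should agree up to floating-point error, and the probabilities should sum to one across all jobs in all markets.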

Appendix B Terms in the NP decomposition

The terms in the NP decomposition from equation 12 can be more formally defined as follows:

$$
\begin{aligned}
\Delta_{M}&:=\left[\int_{\bar{S}_{F}}Y_{1}(x)\frac{dF_{M}(x)}{\mu_{M}(\bar{S}_{F})}-\int_{S_{F}}Y_{1}(x)\frac{dF_{M}(x)}{\mu_{M}(S_{F})}\right]\mu_{M}(\bar{S}_{F})\\
\Delta_{X}&:=\int_{S_{M}\cap S_{F}}Y_{1}(x)\left[\frac{dF_{M}(x)}{\mu_{M}(S_{F})}-\frac{dF_{F}(x)}{\mu_{F}(S_{M})}\right]\\
\Delta_{0}&:=\int_{S_{M}\cap S_{F}}\left[Y_{1}(x)-Y_{0}(x)\right]\frac{dF_{F}(x)}{\mu_{F}(S_{M})}\\
\Delta_{F}&:=\left[\int_{S_{M}}Y_{0}(x)\frac{dF_{F}(x)}{\mu_{F}(S_{M})}-\int_{\bar{S}_{M}}Y_{0}(x)\frac{dF_{F}(x)}{\mu_{F}(\bar{S}_{M})}\right]\mu_{F}(\bar{S}_{M})
\end{aligned}
\tag{15}
$$

where: $F_{M}(x)$ and $F_{F}(x)$ denote the distributions of $x$ for males and females, respectively; $\mu_{M}$ and $\mu_{F}$ measure the proportions of males and females over regions of the supports of $x$; and the support of $x$ for a gender $g$, $\text{supp}(X_{g})$, is partitioned as $\text{supp}(X_{g}):=S_{g}\cup\bar{S}_{g}$, with $S_{g}\cap\bar{S}_{g}=\emptyset$, for $g\in\{F,M\}$.
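For intuition, the four terms can be computed by hand when $x$ takes finitely many values, so that the integrals become sums over covariate cells. The sketch below is illustrative only and rests on assumptions not stated in this appendix: $Y_{1}(x)$ and $Y_{0}(x)$ are taken to be the mean log wages of males and females in cell $x$, the region of common support is the set of cells observed for both genders, and the four terms are assumed to add up to the raw gap. The function name `np_decomposition` and the data are hypothetical.

```python
def np_decomposition(cells_m, cells_f):
    """Discrete version of the four terms (illustrative only).

    cells_g maps a covariate cell x to (share of group g in x,
    mean log wage of group g in x).
    """
    common = set(cells_m) & set(cells_f)   # cells observed for both genders
    only_m = set(cells_m) - common         # male cells with no female match
    only_f = set(cells_f) - common         # female cells with no male match

    def mass(cells, keys):
        return sum(cells[x][0] for x in keys)

    def mean(cells, keys):
        total = mass(cells, keys)
        if total == 0:
            return 0.0
        return sum(cells[x][0] * cells[x][1] for x in keys) / total

    mu_m_in, mu_m_out = mass(cells_m, common), mass(cells_m, only_m)
    mu_f_in, mu_f_out = mass(cells_f, common), mass(cells_f, only_f)

    delta_m = (mean(cells_m, only_m) - mean(cells_m, common)) * mu_m_out
    delta_f = (mean(cells_f, common) - mean(cells_f, only_f)) * mu_f_out
    delta_x = sum(cells_m[x][1] * (cells_m[x][0] / mu_m_in
                                   - cells_f[x][0] / mu_f_in)
                  for x in common)
    delta_0 = sum((cells_m[x][1] - cells_f[x][1]) * cells_f[x][0] / mu_f_in
                  for x in common)
    return delta_m, delta_x, delta_0, delta_f

# Hypothetical data: cell "C" has only males, cell "D" only females.
males = {"A": (0.5, 3.0), "B": (0.3, 3.4), "C": (0.2, 4.0)}
females = {"A": (0.6, 2.8), "B": (0.3, 3.1), "D": (0.1, 2.5)}

deltas = np_decomposition(males, females)
raw_gap = (sum(s * y for s, y in males.values())
           - sum(s * y for s, y in females.values()))
print(deltas, sum(deltas), raw_gap)
```

In this example the sum of the four terms matches the difference in mean log wages between the two groups, which is the sense in which the decomposition is exact even without overlapping supports.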

Appendix C Proof that $A_{ij}$ follows a Poisson distribution

If an individual worker $i$ only searched for a job once, then the probability of worker $i$ matching with job $j$ would be equal to $\mathbb{P}_{ij}=\mathcal{P}_{\iota\gamma}d_{j}$ and $A_{ij}$ would follow a Bernoulli distribution:

$$A_{ij}\sim \text{Bernoulli}(\mathcal{P}_{\iota\gamma}d_{j}).$$

However, since worker $i$ searches for jobs $c_{i}\equiv\sum_{t=1}^{T}c_{it}$ times, $A_{ij}$ is actually the sum of $c_{i}$ Bernoulli random variables, and is therefore a Binomial random variable. Conditional on knowing $c_{i}$,

$$A_{ij}\mid c_{i}\sim \text{Binomial}(c_{i},\mathcal{P}_{\iota\gamma}d_{j}).$$

However, we still need to take into account the fact that $c_{i}$ is a Poisson-distributed random variable with arrival rate $d_{i}$. Consequently, the unconditional distribution of $A_{ij}$ is Poisson as well:

$$A_{ij}\sim \text{Poisson}(d_{i}d_{j}\mathcal{P}_{\iota\gamma}).$$

We prove this fact by multiplying the conditional probability mass function of $A_{ij}\mid c_{i}$ by the marginal probability mass function of $c_{i}$ to get the joint probability mass function of $A_{ij}$ and $c_{i}$, and then summing over $c_{i}$.

$$P(A_{ij},c_{i})=\underbrace{P(A_{ij}\mid c_{i})}_{\text{Bin}(c_{i},\,d_{j}P_{\iota\gamma})}\quad\times\quad\underbrace{P(c_{i})}_{\text{Poisson}(d_{i})}$$

Deriving the joint distribution:

$$P(A_{ij},c_{i})=\binom{c_{i}}{A_{ij}}(d_{j}P_{\iota\gamma})^{A_{ij}}(1-d_{j}P_{\iota\gamma})^{c_{i}-A_{ij}}\times\frac{d_{i}^{c_{i}}\exp(-d_{i})}{c_{i}!}$$

We want to find the marginal distribution of $A_{ij}$:

$$
\begin{aligned}
P(A_{ij})&=\sum_{c_{i}=0}^{\infty}P(A_{ij},c_{i})\\
&=\sum_{c_{i}=0}^{\infty}\binom{c_{i}}{A_{ij}}(d_{j}P_{\iota\gamma})^{A_{ij}}(1-d_{j}P_{\iota\gamma})^{c_{i}-A_{ij}}\times\frac{d_{i}^{c_{i}}\exp(-d_{i})}{c_{i}!}\\
&=\sum_{c_{i}=0}^{\infty}\frac{c_{i}!}{A_{ij}!\,(c_{i}-A_{ij})!}(d_{j}P_{\iota\gamma})^{A_{ij}}(1-d_{j}P_{\iota\gamma})^{c_{i}-A_{ij}}\times\frac{d_{i}^{c_{i}}\exp(-d_{i})}{c_{i}!}\\
&=\frac{(d_{j}P_{\iota\gamma})^{A_{ij}}\exp(-d_{i})}{A_{ij}!}\sum_{c_{i}=0}^{\infty}\frac{1}{(c_{i}-A_{ij})!}(1-d_{j}P_{\iota\gamma})^{c_{i}-A_{ij}}d_{i}^{c_{i}}
\end{aligned}
$$

Terms with $c_{i}<A_{ij}$ contribute zero to these sums, since the binomial coefficient vanishes there.

If the summation term is equal to

$$\sum_{c_{i}=0}^{\infty}\frac{1}{(c_{i}-A_{ij})!}(1-d_{j}P_{\iota\gamma})^{c_{i}-A_{ij}}d_{i}^{c_{i}}=d_{i}^{A_{ij}}\exp\left(d_{i}(1-d_{j}P_{\iota\gamma})\right) \tag{16}$$

then P(Aij)=(didjPιγ)Aijexp(didjPιγ)Aij!𝑃subscript𝐴𝑖𝑗superscriptsubscript𝑑𝑖subscript𝑑𝑗subscript𝑃𝜄𝛾𝐴𝑖𝑗subscript𝑑𝑖subscript𝑑𝑗subscript𝑃𝜄𝛾subscript𝐴𝑖𝑗P(A_{ij})=\frac{(d_{i}d_{j}P_{\iota\gamma})^{A{ij}}\exp{(-d_{i}d_{j}P_{\iota%\gamma}})}{A_{ij}!}italic_P ( italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) = divide start_ARG ( italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_P start_POSTSUBSCRIPT italic_ι italic_γ end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_A italic_i italic_j end_POSTSUPERSCRIPT roman_exp ( - italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_P start_POSTSUBSCRIPT italic_ι italic_γ end_POSTSUBSCRIPT ) end_ARG start_ARG italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ! end_ARG, i.e. Aijsubscript𝐴𝑖𝑗A_{ij}italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT would be Poisson distributed:

$$A_{ij}\sim \mathrm{Poisson}(d_id_jP_{\iota\gamma})$$

Proving (16) is equivalent to proving the following equality:

$$1=\frac{1}{d_i^{A_{ij}}\exp\left(d_i(1-d_jP_{\iota\gamma})\right)}\sum_{c_i=0}^{\infty}\frac{1}{(c_i-A_{ij})!}(1-d_jP_{\iota\gamma})^{c_i-A_{ij}}d_i^{c_i}$$

Proof:

$$\begin{aligned}
&d_i^{-A_{ij}}\exp\left(-d_i(1-d_jP_{\iota\gamma})\right)\sum_{c_i=0}^{\infty}\frac{1}{(c_i-A_{ij})!}(1-d_jP_{\iota\gamma})^{c_i-A_{ij}}d_i^{c_i}\\
&\quad=\sum_{c_i=0}^{\infty}\frac{\exp\left(-d_i(1-d_jP_{\iota\gamma})\right)}{(c_i-A_{ij})!}(1-d_jP_{\iota\gamma})^{c_i-A_{ij}}d_i^{c_i-A_{ij}}\\
&\quad=\sum_{c_i=0}^{\infty}\frac{\exp\left(-d_i(1-d_jP_{\iota\gamma})\right)}{(c_i-A_{ij})!}\left(d_i(1-d_jP_{\iota\gamma})\right)^{c_i-A_{ij}}
\end{aligned}$$

Let $\lambda=d_i(1-d_jP_{\iota\gamma})$ for simplicity and apply the change of variables $z=c_i-A_{ij}$. In our problem $c_i\geq A_{ij}$, i.e. $z\geq 0$, so the last expression becomes

$$\sum_{z=0}^{\infty}\frac{\exp(-\lambda)}{z!}\lambda^{z}=1,$$

since the summand is the p.m.f. of a Poisson random variable, i.e. $z\sim \mathrm{Poisson}(\lambda)$. $\square$

Therefore, we have

$$A_{ij}\sim \mathrm{Poisson}(d_id_jP_{\iota\gamma}) \qquad \blacksquare$$
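The series identity used in this proof can also be checked numerically. The snippet below is a minimal sketch, not part of the original derivation: it evaluates the left-hand side of (16) as a truncated sum starting at $c_i=A_{ij}$ (the only terms that are defined, since $c_i\geq A_{ij}$) and compares it with the closed form on the right-hand side. The function names and the parameter values are illustrative assumptions.

```python
import math

def lhs_series(a_ij, d_i, d_j, p, terms=60):
    """Truncated left-hand side of (16), summing over c_i = A_ij, A_ij + 1, ..."""
    total = 0.0
    for c_i in range(a_ij, a_ij + terms):
        z = c_i - a_ij  # change of variables used in the proof
        total += (1.0 - d_j * p) ** z * d_i ** c_i / math.factorial(z)
    return total

def rhs_closed_form(a_ij, d_i, d_j, p):
    """Right-hand side of (16): d_i^{A_ij} * exp(d_i * (1 - d_j * P))."""
    return d_i ** a_ij * math.exp(d_i * (1.0 - d_j * p))

# Arbitrary illustrative values (assumptions, not estimates from the paper).
a_ij, d_i, d_j, p = 3, 2.5, 1.2, 0.4
lhs = lhs_series(a_ij, d_i, d_j, p)
rhs = rhs_closed_form(a_ij, d_i, d_j, p)
print(lhs, rhs)
```

With 60 terms the truncation error is negligible for moderate values of $d_i(1-d_jP_{\iota\gamma})$; larger values would need more terms.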

Appendix D Soft assignment of workers and jobs to worker types and markets

In Section 3, at the maximum of our posterior in equation 3.2.2, each worker is assigned to exactly one skill cluster, i.e. a hard assignment. However, given the pattern of worker matches, a particular worker could be revealed to possess skills $\iota_1$ in most of her matches and skills $\iota_2$ in a few others. Creating a separate worker skill group to accommodate her hybrid skills might not improve model fit if only a few workers exhibit similar matches. Instead, allowing her to have mixed skills $\iota_1$ and $\iota_2$, i.e. a soft assignment, with weights according to her matching history, provides more nuanced information to the researcher. We propose using the Bayesian setup to recover these weights.

The posterior $P(\bm{b}|\bm{A},\bm{g})$ carries the measure of workers' skill profiles needed to control for workers' unobserved skills in the wage gap estimation. Given a total of $I$ clusters of workers competing for the same jobs in the labor market network, i.e. workers with similar skills, the posterior distribution gives the probability that each worker belongs to each skill cluster, given the worker's demographic group $g$ and the entire network $\bm{A}$. More formally, the skills profile of worker $i$ is defined as:

$$\vec{P}_i:=\left[P(i\in\iota_1|\bm{A},\bm{g})\qquad P(i\in\iota_2|\bm{A},\bm{g})\qquad\cdots\qquad P(i\in\iota_I|\bm{A},\bm{g})\right]^T \tag{17}$$
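To make the skills profile in (17) concrete, the following is a rough sketch of how such weights could be computed for a single worker. It is not the estimation procedure of the paper. It assumes that the job-to-market assignments, the degree terms, and the matrix $P_{\iota\gamma}$ are already known and held fixed, and that the dependence on the demographic group $g$ is absorbed into a given prior over types. Under those assumptions, it applies Bayes' rule with the Poisson likelihood $A_{ij}\sim \mathrm{Poisson}(d_id_jP_{\iota\gamma})$. All function and variable names are hypothetical.

```python
import math

def poisson_logpmf(k, rate):
    """Log of the Poisson p.m.f. at count k with mean `rate` (rate > 0)."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def skill_profile(matches, d_i, d_jobs, job_market, P, log_prior):
    """Posterior weights over worker types for one worker (illustrative only).

    matches[j]     : observed match count A_ij between the worker and job j
    d_i, d_jobs[j] : degree terms for the worker and for job j
    job_market[j]  : market index gamma of job j (assumed fixed and known)
    P[iota][gamma] : type-by-market matching propensities (assumed known)
    log_prior[iota]: log prior weight of each type (assumed given; may depend on g)
    """
    log_scores = []
    for iota, row in enumerate(P):
        score = log_prior[iota]
        for j, a_ij in enumerate(matches):
            score += poisson_logpmf(a_ij, d_i * d_jobs[j] * row[job_market[j]])
        log_scores.append(score)
    # Normalize with the log-sum-exp trick for numerical stability.
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Toy example: two types, two markets, four jobs (all values are made up).
P = [[0.9, 0.1],
     [0.1, 0.9]]
matches = [1, 1, 1, 1]       # one match with each of four jobs
job_market = [0, 0, 0, 1]    # three jobs in market 0, one in market 1
profile = skill_profile(matches, 1.0, [1.0] * 4, job_market, P,
                        [math.log(0.5), math.log(0.5)])
print(profile)
```

In this toy example most of the worker's matches are in market 0, so the first type receives most of the weight while the second keeps a small positive share, which is the kind of mixed profile a soft assignment is meant to capture.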
