CATHERINE C. BLAKE, District Judge.
Performance Food Group, Inc. ("PFG") delivers food and food-related products through its foodservice distributors. One distributor, its Broadline Division, is at issue in this case. In that division, PFG employed "operatives," defined by the EEOC as "[w]orkers who operate machine or processing equipment or perform other factory-type duties of intermediate skill level which can be mastered in a few weeks and require only limited training."
The EEOC's expert, Elvira Sisolak, has submitted an expert report, a rebuttal report, and a supplemental rebuttal report. PFG's rebuttal expert, Stephen G. Bronars, has submitted a rebuttal expert report and a supplemental rebuttal report. All reports concern Sisolak's statistical tests showing a statistically significant gender disparity adverse to women in job offer rates in the five operative positions.
Sisolak is a senior economist who works at the Office of General Counsel, EEOC. (ECF 256-1, Sisolak Report at 4). Her first report is dated August 7, 2017. Sisolak divided the applicant and hiring data provided by PFG into two time periods (2004-June 30, 2009, and July 1, 2009-2013), because the 2009-2013 period has more complete data. (Id. at 2-3). She also divided the "numerous Operative titles shown in the data into five job groups: Drivers, Forklift Operators, Selectors, Supervisors, and Other Warehouse positions." (Id. at 2). PFG received 76,589 applications during the 2004-2009 period, (id. at 19), and 101,769 applications during the 2009-2013 period, (id. at 8).
Sisolak compared the percentage of women applicants to the percentage of women selected. (Id. at 11 (2009-2013 period), 20-21 (2004-2009 period)). For each job group, she subtracted the actual female hires from the expected female hires, and calculated the "missed opportunities." (Id. at 11 (2009-2013), 21 (2004-2009)). She also compared the selection rates for men and women for each of the job groups. (Id. at 12 (2009-2013), 21-22 (2004-2009)).
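The "missed opportunities" figure described above reduces to a simple shortfall calculation: the number of female hires expected if offers mirrored the gender makeup of the applicant pool, minus the actual number. A minimal sketch, using hypothetical counts rather than PFG's data:

```python
def missed_opportunities(female_apps, total_apps, total_offers, female_offers):
    """Shortfall of actual female offers against the number expected
    if offers mirrored the applicant pool's gender makeup."""
    expected = total_offers * female_apps / total_apps
    return expected - female_offers

# Hypothetical pool: 200 of 1,000 applicants are women; 50 offers, 4 to women.
print(missed_opportunities(200, 1000, 50, 4))  # 6.0 missed opportunities
```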
Sisolak then controlled for certain variables by conducting the analysis of selection rates by job group, location, and/or year. She did this by organizing the data into strata (e.g., one stratum would consist of applicants for a selector position at a certain OpCo), and then aggregating the results to determine whether the differences in selection rates were statistically significant. (Id. at 13-14 (2009-2013), 23 (2004-2009)). She used the Cochran-Mantel-Haenszel Procedure ("CMH") to aggregate the statistical results; the test is "designed to test the null hypothesis that the true selection rates for women and men do not differ when the data is disaggregated by strata (level) and then recombined."
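The CMH aggregation described above can be sketched in pure Python. Each stratum is a 2x2 table of offers by gender; the test pools the observed-minus-expected female offers and their variances across strata. The stratum counts below are illustrative, not PFG's data:

```python
import math

def cmh_statistic(strata):
    """Cochran-Mantel-Haenszel chi-square across 2x2 strata.

    Each stratum is a tuple (a, b, c, d):
      a = women offered, b = women not offered,
      c = men offered,   d = men not offered.
    Tests the null hypothesis that true selection rates for women
    and men do not differ once the data are disaggregated by
    stratum and then recombined.
    """
    diff = 0.0   # sum over strata of (observed - expected) female offers
    var = 0.0    # sum of hypergeometric variances
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        women, offers = a + b, a + c
        diff += a - women * offers / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi2 = (abs(diff) - 0.5) ** 2 / var  # with continuity correction
    # survival function of chi-square with 1 df: erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# One stratum: 100 women (10 offered), 100 men (30 offered).
chi2, p = cmh_statistic([(10, 90, 30, 70)])  # chi2 ≈ 11.2, p ≈ 0.0008
```

Library implementations (e.g., `statsmodels.stats.contingency_tables.StratifiedTable`) compute the same statistic; the sketch just makes the pooling across strata explicit.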
Sisolak then controlled for Class A license (for driver applicants), experience, and whether the applicant completed an online application. Sisolak conducted a separate analysis of selection rates for driver applicants who did and did not have a Class A license. She applied a Fisher's Exact test.
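A Fisher's Exact test on a single 2x2 table of offers by gender can be computed exactly with stdlib combinatorics. This is a minimal sketch of the standard two-sided test, with made-up counts:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact p-value for the 2x2 table
    [[a, b], [c, d]], holding the row and column totals fixed
    (hypergeometric model): sum the probabilities of every table
    no more likely than the one observed."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    denom = comb(n, c1)

    def prob(x):  # P(a table with x in the top-left cell)
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)  # feasible top-left cell values
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

fisher_exact_two_sided(8, 2, 1, 5)  # ≈ 0.035
```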
PFG provides the report of Stephen G. Bronars, dated December 18, 2017, to rebut Sisolak's report. Bronars does not provide a conclusion of his own regarding whether selection disparities based on gender occurred, but points out several flaws in Sisolak's report. Bronars states that Sisolak's conclusions are unreliable because she did not "model the hiring process at PFG." (See ECF 233-3, Bronars Report at 4). This is because, according to Bronars, Sisolak placed applicants into a single stratum regardless of whether they were actually competing against each other for a position, and ignored that each applicant did not have an equal chance of being selected, but might be more or less likely to be chosen based on his or her qualifications and competition. (See, e.g., id. at 9 & n.13 (falsely assumes equal chance), 16 ("artificially constructed candidate pools")). The CMH tests "artificially combine[] candidates and offers into composite annual pools" that do not account for whether those candidates were actually competing against each other for the same position or whether the candidates were eligible to receive offers for that position.
An example of Bronars' criticism of the "artificial pools" is as follows: in her test controlling for online applications, Sisolak divided applicants based on whether they completed an online application. For some requisitions, however, only applicants who completed online applications were offered jobs. (Id. at 13). Therefore, for such a requisition ("requisition 1"), there would be no offers among the applicants who did not complete online applications. These "requisition 1" applicants would then be combined with applicants from other requisitions who also did not complete online applications (the "non-online application group"), and the disparity between the gender makeup of the non-online application group and the gender makeup of its offers would be calculated. Bronars' criticism is that none of the "requisition 1" applicants were eligible for any of the offers in the non-online application group, because the only requisition they applied to was filled by someone who completed an online application, so that offer would instead be counted in the "online application group." (Id.). According to Bronars, this way of analyzing the data "caused Ms. Sisolak to incorrectly measure expected job offer rates for the candidates she intended to study." (Id.). Because a slightly higher percentage of women than men failed to submit an online application, the failure to complete an online application could be a legitimate explanation for "missed opportunities" and, according to Bronars, was not adequately controlled for. (Id. at 14).
Bronars also criticizes Sisolak for aggregating her test results across all OpCos and job groups. He argues that such aggregation is inappropriate because of the difference in job requirements and hiring criteria across job groups and locations. (Id. at 15).
Bronars' other criticisms of Sisolak include that she did not take into account the timing of applications in her CMH test based on job tracking number, (id. at 24), and did not properly categorize applicants' experience or use all available information.
In her rebuttal report, dated March 6, 2018, Sisolak responds to Bronars' criticism of her aggregation of results, noting that such aggregation is necessary to prove a "pattern" of gender disparity. (ECF 256-9, Sisolak Rebuttal at 2). According to Sisolak, Bronars' methodological criticism is that Sisolak "should not have used CMH tests which took as their strata any population of candidates other than those candidates applying under the same job tracking number." (Id. at 19). She argues that this position, that "one cannot aggregate data across any potential control variable," is contrary to "routine analytical practice" and would be impossible to apply to the 2004-2009 period, during which PFG did not use job tracking numbers and it is not clear which applicant applied to which specific position. (Id. at 20). Sisolak explains why she did not use the CareerBuilder resumes and describes the issues with matching applicants to the PDF applications. (Id. at 11). She also notes that Bronars' criticism of her report is speculative, as he does not know whether the outcome would have changed, since he did not perform tests of his own. (Id. at 3).
Sisolak also conducted additional tests to respond to Bronars. First, she conducted a CMH test that controlled for online applications, prior work experience, and Class A licenses (for driver applicants) at the same time, finding the results also statistically significant. (Id. at 3, 6-7). Second, she conducted CMH tests using hires instead of offers, and found that the results were essentially the same. (Id. at 25).
In his supplemental report, dated April 16, 2018, Bronars conducted additional tests "to assess how Ms. Sisolak's tests of gender disparities in job offer rates would have changed had she reported test results separately by job group and" OpCo. (ECF 233-5, Bronars Rebuttal Report at 1). He first conducted separate CMH tests "for each of 81 combinations of job groups and OpCos that had at least one 'competitive' requisition [male and female candidates and at least one offer] in 2009-2013 using job requisitions as strata." (Id. at 3). He found no significant disparity in job offer rates adverse to females at any OpCo for the positions of forklift operator, warehouse supervisor, and other warehouse position; and found no significant disparities in offers for drivers in 15 of 22 OpCos, and no significant disparities in offers for selectors in 8 of 23 OpCos. (Id. at 3-4).
Second, Bronars "used a multiple regression methodology and Ms. Sisolak's data to estimate job offer models for each job group at each OpCo while accounting for job requisition and her simple controls for prior experience." (Id. at 6). Bronars found no significant differences in job offer rates for the driver, forklift operator, warehouse supervisor, or other warehouse positions at any OpCo. (Id. at 6-7). He found no significant disparities in offer rates for female selector applicants in 9 out of the 23 OpCos. (Id. at 7).
In response to Bronars' rebuttal, Sisolak submitted a supplemental rebuttal, dated September 28, 2018. She argues that Bronars divided the data into groups so small that the results could never be statistically significant. (ECF 233-6, Sisolak Supplemental Rebuttal at 5-7). For example, Bronars calculated no statistically significant disparities in 44 location/job groups that had female applicants but zero female hires. (Id. at 7). Sisolak argues that Bronars relied on an inappropriate form of regression, Ordinary Least Squares ("OLS"), which is not appropriate when the outcome takes only two values (hired/not hired), as shown by the fact that his analysis produces predicted probabilities of getting the job that are less than zero. (Id. at 16). Even using the OLS regression test that Bronars used, however, relying on the job tracking numbers to take location into account and controlling for experience, Sisolak found that gender disparities in selection rates were statistically significant in all jobs combined, and for the driver, selector, and other warehouse job groups. (Id. at 17). Based on the OLS regression analysis, selection rates for women applying to the forklift and supervisor positions were lower than for men, but the difference was not statistically significant. (Id. at 17). Sisolak also reached the same results using another regression test, logistic regression. (Id. at 16-17).
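Sisolak's point about OLS on a binary outcome can be seen in a small sketch: an OLS fit (a "linear probability model") can return fitted "probabilities" outside [0, 1], which logistic regression, because it models log-odds, avoids by construction. The data here are invented for illustration, not drawn from the record:

```python
# Hypothetical data: y = offer (1) / no offer (0), x = a single score.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
# OLS slope and intercept for the linear probability model y ≈ a + b*x
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))
a = my - b * mx
fitted = [a + b * xi for xi in x]
print(round(min(fitted), 3))  # -0.2: an impossible negative "probability"
```

A logistic fit of the same data would map every fitted value through 1 / (1 + e^(-t)), keeping all predicted probabilities strictly between 0 and 1.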
Rule 702 of the Federal Rules of Evidence, which governs the admissibility of expert testimony, states:

A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if: (a) the expert's scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue; (b) the testimony is based on sufficient facts or data; (c) the testimony is the product of reliable principles and methods; and (d) the expert has reliably applied the principles and methods to the facts of the case.
The party seeking to introduce expert testimony has the burden of establishing its admissibility by a preponderance of the evidence. Daubert v. Merrell Dow Pharm., 509 U.S. 579, 592 n.10 (1993). A district court is afforded "great deference . . . to admit or exclude expert testimony under Daubert." TFWS, Inc. v. Schaefer, 325 F.3d 234, 240 (4th Cir. 2003) (citations and internal quotation marks omitted); see also Daubert, 509 U.S. at 594 ("The inquiry envisioned by Rule 702 is . . . a flexible one . . . ."). "In applying Daubert, a court evaluates the methodology or reasoning that the proffered scientific or technical expert uses to reach his conclusion; the court does not evaluate the conclusion itself," Schaefer, 325 F.3d at 240, although "conclusions and methodology are not entirely distinct from one another." General Elec. Co. v. Joiner, 522 U.S. 136, 146 (1997). In essence, the court acts as gatekeeper, only admitting expert testimony where the underlying methodology satisfies a two-pronged test for (1) reliability and (2) relevance. See Daubert, 509 U.S. at 589. To be admissible, however, "the expert testimony need not be irrefutable or certainly correct." Young v. Swiney, 23 F.Supp.3d 596, 611 (D. Md. 2014) (internal citation and quotation omitted). "In other words, the Supreme Court did not intend the gatekeeper role to supplant the adversary system or the role of the jury: [v]igorous cross-examination, presentation of contrary evidence, and careful instruction on the burden of proof are the traditional and appropriate means of attacking shaky but admissible evidence." Id. (internal citations and quotations omitted).
Bronars' rebuttal of Sisolak's expert report mostly involves criticisms of the placement of applicants into "artificial" statistical pools and the aggregation of selection rates across positions, OpCos, and years. Additionally, some criticisms relate to the failure of Sisolak to control for certain factors (or to control properly for those factors). Bronars also criticizes the inclusion in some tests of noncompetitive requisitions or requisitions in which only male candidates were included.
Bronars' criticism, even if valid, does not require exclusion of Sisolak's expert testimony and reports. First, the use of CMH tests has been accepted in other cases.
Sisolak attempted to control for factors such as experience, whether the applicant completed an online application, and whether the applicant had a Class A license (for drivers) in subsequent tests, to show that they did not account for the findings of gender disparity. To the extent PFG argues that Sisolak did not properly control for experience, such as by not including additional information from sites like CareerBuilder, it may address this on cross-examination. See Texas Roadhouse, Inc., 215 F. Supp. 3d at 155 ("failing to use a perfect set of variables that incorporates all relevant factors or excludes all potentially irrelevant variables is not a means for rejecting an expert's analysis."). Importantly, it appears that many of Bronars' criticisms, for example failing to properly measure experience or control for the timing of the application, could affect men and women equally, and may not alter the findings of gender disparity. Therefore, even if Bronars' criticisms are valid, it is not clear how they reflect on the ultimate question.
Similarly, PFG can question Sisolak as to her treatment of non-competitive and male-only requisitions. In particular, there appears to be a reasonable dispute as to how to treat non-competitive requisitions, which tended to be male only. That PFG appears to have created certain requisitions to hire a preselected (and usually male) candidate may be data that should be considered in the statistical analysis. (See Sisolak Supplemental Rebuttal at 11).
Therefore, the court finds Sisolak's reports and testimony to be relevant and reliable, and will deny the motion to exclude them.
The EEOC argues that Bronars' criticisms are not relevant or reliable because he merely speculates that certain factors might affect the outcome of Sisolak's tests, without showing that they would. Specifically, for many of the criticisms, he does not state how they would affect the ultimate finding of gender disparity. As to his second report, the EEOC argues that Bronars' findings are misleading, as many of his statistical pools did not have sufficient statistical power. The EEOC also argues that Bronars misstates the opinion of the EEOC's expert in another case, Dr. David Neumark, regarding the benefits of multiple regression over CMH.
Although "[t]he court's function is more limited when evaluating rebuttal expert testimony offered by the defendant," it must still "determine threshold admissibility." Samuel v. Ford Motor Co., 112 F.Supp.2d 460, 469 (D. Md. 2000). Therefore, "a rebuttal expert is still subject to the scrutiny of Daubert and must offer both relevant and reliable opinions." Funderburk v. S.C. Elec. & Gas Co., 395 F.Supp.3d 695, 716 (D.S.C. 2019).
Here, Bronars' opinions meet the standards for relevance and reliability. First, Bronars' rebuttal would be helpful to the factfinder because it demonstrates some limitations of Sisolak's methodology: for example, his testimony might show that multiple regression analysis, rather than CMH, is more appropriate in these circumstances. His testimony also explains problems with aggregating across position and location, as the data might show a statistically significant selection disparity as to all operatives in the Broadline division when, in reality, such disparity only existed in a few positions or locations.
Second, Bronars has sufficiently explained his criticisms so that they are reliable. The cases cited by the EEOC are distinguishable. In Rembrandt Social Media, LP v. Facebook, Inc., No. 1:13-CV-158, 2013 WL 12097624, at *2 (E.D. Va. Dec. 3, 2013), the court excluded rebuttal testimony when the expert gave "no sources for this opinion, and provide[d] no reason to believe her opinion is based on a reasoned explanation." Here, by contrast, Bronars explains why he believes Sisolak failed to consider certain relevant factors, inappropriately aggregated the data, and inappropriately used CMH. His criticisms of Sisolak's data are testable, as evidenced by the fact that Sisolak performed additional tests to respond to some of them.
The EEOC also points to In re Ethicon, Inc., MDL No. 2327, 2016 WL 4493501, at *3 (S.D. W.Va. Aug. 25, 2016), where the court excluded rebuttal testimony when it was unclear what material the expert reviewed to reach her opinions, and reviewed only five samples of tissue explants without explaining her methodology in selecting and reviewing the samples. Although Bronars did rely on samples of applications to form his opinion that Sisolak mismatched some applicants with their applications or did not obtain all available experience data, Bronars used a larger number of random samples and explained how he reached his opinion. (ECF 233-3, Bronars Report, Appendix C and Appendix D).
In Bronars' second report, he analyzed 81 separate statistical pools, based on job position and OpCo, and found no significant disparities with respect to certain positions and certain locations. The EEOC argues that Bronars' calculations are misleading because many of the pools did not have sufficient statistical power (so that even with zero female hires the disparity was not statistically significant), and because Bronars gives equal weight to the pools even though some had hundreds more applicants than others.
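The statistical-power point can be illustrated with a stdlib hypergeometric calculation: in a small enough pool, the most extreme possible outcome, zero offers to women, still cannot reach conventional significance. The pool sizes below are hypothetical:

```python
from math import comb

def p_zero_female_offers(women, men, offers):
    """Probability that 0 of `offers` go to women if offers were
    gender-blind draws from the pool (hypergeometric null).
    This is also the smallest one-sided p-value the pool can produce."""
    return comb(men, offers) / comb(women + men, offers)

# A pool of 10 applicants (2 women) with 3 offers, none to a woman:
p_zero_female_offers(2, 8, 3)  # ≈ 0.47, nowhere near the 0.05 threshold
```

A pool like this can never yield a statistically significant disparity, no matter the outcome, which is the sense in which it lacks power.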
Here, Bronars' methodology is clearly laid out in his report, so it is testable (and in fact, Sisolak did test his methodology, including by calculating the statistical power of his data). (Sisolak Supplemental Rebuttal at 9). To the extent the EEOC contends that Bronars' findings are misleading, the EEOC may question him about that issue on cross-examination.
Finally, with respect to David Neumark's testimony, Bronars and Sisolak disagree as to the meaning of his statements regarding CMH in his testimony as an expert in a separate case. Sisolak and Bronars may dispute the general acceptance of CMH and multiple regression analysis in the scientific community, since that is relevant to whether CMH and multiple regression are reliable tests. The court notes, though, that Neumark's opinions on the use of CMH and multiple regression analysis in an unrelated case are likely of little (if any) relevance to this case, and any reference to Neumark is relevant only in the context of general scientific acceptance.
Therefore, the court finds Bronars' testimony and reports to be relevant and reliable, and will deny the motion to exclude them.
For the reasons stated above, the court will deny the motions to exclude as to Elvira Sisolak and Stephen G. Bronars. A separate order follows.