The majority of South Africans have been exposed to Mycobacterium tuberculosis (M.tb), with a latent tuberculosis infection (LTBI) prevalence of 26% at 5–8 years, 53% at 14–17 years, and >75% at 25 years. Population-wide LTBI, high TB incidence, and a low rate of HIV infection allow us to capture a large sample size and an ideal case-control ratio (48% cases, 90% HIV-negative, and n=1100 currently collected). Each participant in our study has matched DNA and epidemiological data (smoking behavior, alcohol consumption, comorbidities [HIV, diabetes, asthma], education and socioeconomic status [SES], and anthropometrics).

 

 

Figure. Increased power to detect an association between a given variant and a phenotype in admixed populations. An admixed population (ADX) has greater power than either of its ancestral populations (POP1, POP2). Modeled here for a binary (dichotomous) or quantitative trait. We parameterized our simulation using the Balding-Nichols model for 2-way admixture between parental populations representing African and European ancestry, drawing 100 causal SNPs from a range of Fst values and assuming a trait heritability of 50%. Association testing was performed while varying each group's sample size. "Power" is the proportion of causal sites identified at p≤0.05.
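
As a minimal sketch of the kind of sampling step such a simulation involves (not the code used to produce the figure), the Balding-Nichols model draws each population's allele frequency from a Beta distribution centered on an ancestral frequency, with spread controlled by Fst. The function names, the ancestral frequency of 0.3, the Fst of 0.1, and the 50/50 ancestry proportion below are illustrative assumptions, and the sketch stops before phenotype simulation and association testing.

```python
import random


def balding_nichols_freq(p_anc, fst, rng):
    """Draw one population's allele frequency under the Balding-Nichols model.

    The frequency is Beta(p(1-F)/F, (1-p)(1-F)/F), where p is the ancestral
    allele frequency and F is Fst (must satisfy 0 < F < 1).
    """
    a = p_anc * (1.0 - fst) / fst
    b = (1.0 - p_anc) * (1.0 - fst) / fst
    return rng.betavariate(a, b)


def admixed_genotype(p1, p2, ancestry1, rng):
    """Simulate a genotype (0, 1, or 2 copies of the allele) for one individual.

    Each of the two allele copies is inherited from POP1 with probability
    `ancestry1`, otherwise from POP2, and then drawn at that population's
    allele frequency.
    """
    genotype = 0
    for _ in range(2):
        p = p1 if rng.random() < ancestry1 else p2
        if rng.random() < p:
            genotype += 1
    return genotype


rng = random.Random(42)          # fixed seed so the sketch is repeatable
p_anc, fst = 0.3, 0.1            # assumed values, for illustration only
p1 = balding_nichols_freq(p_anc, fst, rng)   # POP1 allele frequency
p2 = balding_nichols_freq(p_anc, fst, rng)   # POP2 allele frequency

# 1000 admixed individuals with an assumed 50% POP1 ancestry
genotypes = [admixed_genotype(p1, p2, 0.5, rng) for _ in range(1000)]
```

Repeating the frequency draw for each of the causal SNPs over a range of Fst values, and then simulating a trait from the resulting genotypes, would be the natural next steps toward a power estimate.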
 
Fourteen genome-wide association studies (GWASs) investigating various clinical phenotypes of TB susceptibility have been conducted in the past decade, with little replication of variants owing to small sample sizes, study populations with low TB endemicity, and relatively low genetic diversity. The vast majority of GWASs continue to be performed in European populations, and variants discovered first in European-descent study populations have shown mixed portability to other populations. Genetic discoveries made in recently admixed populations, such as ours, have a higher degree of transference across populations. Admixed individuals carry a greater diversity of variants and linkage disequilibrium (LD) patterns, which facilitates accurate identification of GWAS hits relevant to multiple ancestries.
 
Polygenic risk scores: The aggregation of genetic findings can be used to determine the overall strength of association with TB progression and related outcomes. This is done by summing the effects of variants across the entire genome into a polygenic risk score (PRS). Recently, PRS has been shown to be a possible clinical resource. Here, a PRS could be used to determine effect sizes across quantiles of risk, increase the accuracy of prediction models that use environmental data, and contrast genetic architecture between our study and that of the International TB Host Genetics Consortium (ITHGC). Importantly, we have demonstrated that PRS methods are highly susceptible to population structure and, without appropriate modelling, can be subject to unpredictable biases and can exacerbate disparities. Understanding this is critical to analyses in under-represented groups. We will use the framework co-developed by Co-I Gignoux, MPI and Methods Working Group Chair in the international Population Architecture using Genomics and Epidemiology (PAGE) Study's polygenic scoring initiative, to evaluate and apply scores in a manner appropriate to TB progression in the Northern Cape population.
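
To make the summation concrete, a PRS in its simplest form is a weighted sum of allele dosages, with each variant weighted by its estimated effect size. The sketch below is illustrative only: the function name, the three effect sizes, and the three individuals are hypothetical, and it deliberately omits the ancestry and population-structure modelling that the paragraph above identifies as essential.

```python
def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of allele dosages for one individual.

    `dosages` holds the count of effect alleles (0, 1, or 2) at each variant;
    `weights` holds the per-allele effect size (e.g. log odds ratio) for the
    same variants, in the same order.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(w * g for w, g in zip(weights, dosages))


# Hypothetical per-allele effect sizes for three variants
weights = [0.12, -0.05, 0.30]

# Hypothetical genotype dosages for three individuals
cohort = {
    "ind_A": [0, 1, 2],
    "ind_B": [2, 0, 0],
    "ind_C": [1, 1, 1],
}

scores = {name: polygenic_risk_score(g, weights) for name, g in cohort.items()}

# Rank individuals from highest to lowest score, as a first step toward
# grouping them into quantiles of risk
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a real analysis the weights would come from association summary statistics, and the scores would be evaluated within an ancestry-aware framework rather than compared directly across individuals.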