Prof. Jun Zhu, PhD

Director, Institute of Bioinformatics, Zhejiang University, China
http://mypage.zju.edu.cn/en/Jun_Zhu/0.html

Title of Talk: Mixed Linear Model Approaches of Association Mapping for Complex Traits Based on Omics Variants

Abstract: 

Complex traits are controlled by four types of omics variants: SNPs, transcripts, proteins, and metabolites. Most association studies for complex traits have ignored dominance, epistasis, and environment interactions. We proposed mixed linear model approaches for association mapping of SNPs (QTSs), transcripts (QTTs), proteins (QTPs), and metabolites (QTMs) to complex traits. Precise prediction of the genetic architecture of complex traits has been impeded by the limited understanding of their genetic effects, especially locus-by-locus interaction (GxG) and locus-by-environment interaction (GxE). The analysis of large omics datasets, especially two-locus interaction analysis, is computationally intensive. A GPU-based mapping software package (QTXNetwork) has been developed for detecting multiple loci in large-scale omics data and for estimating variance components of genetic effects. By analyzing SNP and transcript datasets for mouse and Drosophila, we demonstrated that unbiased estimates could be obtained for the genetic effects of causal loci. Transcript association can efficiently detect causal transcript loci for complex traits (QTTs) and for other transcripts (tQTTs). Complicated genetic networks of transcripts controlled by other omics variables can also be revealed for SNPs (tQTSs), proteins (tQTPs), and metabolites (tQTMs).
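
To make the difference between an additive model and a full model concrete, the sketch below builds the explanatory variables for a pair of SNP loci. It is only an illustration: the coding convention (additive code -1, 0, 1; dominance code 1 for heterozygotes) and the names `additive_code`, `dominance_code`, and `design_row` are assumptions made here, not the coding used inside QTXNetwork.

```python
from typing import Dict


def additive_code(genotype: int) -> int:
    """Map a minor-allele count (0, 1, 2) to an additive code (-1, 0, 1)."""
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2")
    return genotype - 1


def dominance_code(genotype: int) -> int:
    """Return 1 for a heterozygote (one minor allele), otherwise 0."""
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2")
    return 1 if genotype == 1 else 0


def design_row(g1: int, g2: int) -> Dict[str, int]:
    """Build full-model terms for two loci.

    An additive model would keep only a1 and a2; the full model also
    carries dominance (d1, d2) and the four epistasis products.
    """
    a1, d1 = additive_code(g1), dominance_code(g1)
    a2, d2 = additive_code(g2), dominance_code(g2)
    return {
        "a1": a1, "d1": d1,
        "a2": a2, "d2": d2,
        "aa": a1 * a2,  # additive x additive epistasis
        "ad": a1 * d2,  # additive x dominance epistasis
        "da": d1 * a2,  # dominance x additive epistasis
        "dd": d1 * d2,  # dominance x dominance epistasis
    }
```

For example, `design_row(2, 1)` describes an individual that is homozygous at the first locus and heterozygous at the second; under this coding the second locus enters only through `d2` and the `ad` product, which is exactly the information an additive-only model discards. Environment interaction terms would be formed analogously by pairing each column with an environment indicator.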

Association mapping for startle in Drosophila revealed high heritability for 85 QTTs (0.996) and 48 QTSs (0.935). The QTTs were in turn controlled by 86 other tQTTs (0.804 ~ 0.998) and 25 tQTSs (0.115 ~ 0.423). Both real data analyses and Monte Carlo simulations demonstrated that genetic effects and environment interaction effects could be estimated without bias and with high statistical power by the proposed approaches. We conducted comparative GWASs for total cholesterol using the full model and additive models. QTS analysis identified 13 individual loci and 3 pairs of epistasis loci with the full model, and detected 14 loci with the additive model. PLINK analysis identified two loci, and GCTA analysis detected only one locus with genome-wide significance. The full model identified three previously reported genes as well as several new genes. Analyses of the cholesterol data and simulation studies showed that the full model outperformed the additive model in detection power and in unbiased estimation for genetic variants of complex traits. Using the full genetic model, analysis of alcohol dependence symptom count (ADSC) detected 20 highly significant QTSs, including four in previously reported genes (ADH1B, PKNOX2, CPE, and KCNB2), four in novel genes (RGS6, FMN1, NRM, and BPTF), two in noncoding RNAs, and two epistasis loci. The detected QTSs contributed about 20% of total heritability, of which dominance and epistasis effects accounted for over 50%.

WGAS was conducted for yield traits of cotton cultivars. Seventy-five SNP loci were detected with high heritability (61.73% ~ 98.71%), largely due to environmental interaction for lint yield (19.22%) and boll number (24.66%), and to epistasis for boll weight (31.70%) and lint percentage (88.63%). Cotton is an often cross-pollinated crop, and only a small proportion of genotypes (~7.0%) in the mapping cultivars were heterozygous, yet the dominance-related heritabilities were the major components (0.54 ~ 0.95) of total heritability for the four yield traits. This revealed, at the molecular level, the importance of heterozygote advantage for yield traits of cotton cultivars and the contribution of dominance effects of heterozygotes to genotypic variation of yield traits. These results could help expedite cotton yield improvement by guiding breeding strategies suited to the specific genetic basis of each trait.

Leaf traits (leaf length, leaf width, and upper leaf angle) of maize were analyzed for 5000 lines of the NAM population developed by USDA. Analyses with the full model identified 38 ~ 47 loci, and the multi-locus additive model identified 39 ~ 50 loci. Estimated total heritability ranged from 64.32% ~ 79.06% for the full model, but only 19.36% ~ 48.86% for the multi-locus additive model. Estimated heritability due to dominance and dominance-related epistasis effects was 16.00% ~ 56.91% for the full model. Phenotypic variation of upper leaf angle was mostly controlled by additive (a) and additive × additive (aa) epistasis effects, whereas phenotypic variation of leaf length and leaf width was mostly controlled by dominance-related epistasis interactions. The optimal genotype combinations were predicted for each trait under 4 different environments, based on the estimated genotypic effects, to facilitate marker-assisted selection for leaf traits. Dominance and epistasis effects made large contributions to heritability, whereas environmental interactions were relatively unimportant for the leaf traits.
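
The heritability figures quoted above are shares of phenotypic variance attributed to each class of genetic effect. A minimal sketch of that bookkeeping follows, assuming that variance components have already been estimated by some mixed-model fit; the function names and the input values in the usage note are hypothetical and are not taken from the studies described.

```python
from typing import Dict


def heritability_components(variances: Dict[str, float],
                            residual: float) -> Dict[str, float]:
    """Express each genetic variance component as a share of phenotypic variance.

    `variances` maps an effect label (for example "a", "d", "aa", "ae")
    to its estimated variance; `residual` is the residual variance.
    """
    if residual < 0 or any(v < 0 for v in variances.values()):
        raise ValueError("variance components must be non-negative")
    phenotypic = sum(variances.values()) + residual
    if phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    return {label: v / phenotypic for label, v in variances.items()}


def total_heritability(components: Dict[str, float]) -> float:
    """Sum the component heritabilities into a total heritability."""
    return sum(components.values())
```

With made-up inputs `{"a": 2.0, "d": 1.0, "aa": 1.0}` and a residual variance of `4.0`, the additive share is 0.25, the dominance and additive × additive shares are 0.125 each, and the total heritability is 0.5. An additive-only model fitted to the same trait has no column to absorb the `d` and `aa` variance, which is one way to read the lower total heritability reported for the additive model.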