Targeted imputation of sequence variants and gene expression profiling identifies twelve candidate genes associated with lactation volume, composition and calving interval in dairy cattle


  • Dairy cattle are an interesting model for gaining insights into the genes responsible for the large variation between and within mammalian species in the protein and fat content of their milk and their milk volume. Large numbers of phenotypes for these traits are available, as well as full genome sequences of key founders of modern dairy cattle populations. In twenty target QTL regions affecting milk production traits, we imputed full genome sequence variant genotypes into a population of 16,721 Holstein and Jersey cattle with excellent phenotypes. Association testing was used to identify variants within each target region, and gene expression data were used to identify possible gene candidates. There was statistical support for imputed sequence variants in or close to BTRC, MGST1, SLC37A1, STAT5A, STAT5B, PAEP, VDR, CSF2RB, MUC1, NCF4, and GHDC associated with milk production, and EPGN for calving interval. Of these candidates, analysis of RNA-Seq data demonstrated that PAEP, VDR, SLC37A1, GHDC, MUC1, CSF2RB, and STAT5A were highly differentially expressed in mammary gland compared to 15 other tissues. For nine of the other target regions, the most significant variants were in non-coding DNA. Genomic predictions in a third dairy breed (Australian Reds) using sequence variants in only these candidate genes were, for some traits, more accurate than genomic predictions from 632,003 common SNP on the Bovine HD array. The genes identified in this study are interesting candidates for improving milk production in cattle and could be investigated for novel biological mechanisms driving lactation traits in other mammals.


  • Raven, L-A
  • Cocks, BG
  • Kemper, KE
  • Chamberlain, AJ
  • Vander Jagt, CJ
  • Goddard, ME
  • Hayes, BJ

publication date

  • 2016