FlashPCA2: principal component analysis of biobank-scale genotype datasets
Academic Article


  • Motivation: Principal component analysis (PCA) is a crucial step in quality control of genomic data and a common approach for understanding population genetic structure. With the advent of large genotyping studies involving hundreds of thousands of individuals, standard approaches are no longer computationally feasible. We present FlashPCA2, a tool that can perform PCA on 1 million individuals faster than competing approaches, while requiring substantially less memory. Availability: https://github.com/gabraham/flashpca Contact: gad.abraham@unimelb.edu.au

publication date

  • 2017