QC filtering procedure | Number of variants removed |
---|---|
Not passing SAMtools filters (“mpileup -S -D -q 30 -Q 20”, “vcfutils.pl varFilter -w 10 -d 3 -D 12740 -e 0 -2 0”) | 209,826 |
Cumulative coverage outside of twofold range of global median coverage | 20,843 |
MAF in 723 monkeys <10 % | 10,766 |
Missing >50 % of data | 105 |
Too few (<3) loci in a 3 Mb region for TrioCaller to work | 1,360 |
Loci unmapped or not mapped uniquely during LiftOver | 32,419 |
Filtered out by GATK’s FilterLiftedVariants | 4,094 |
Contigs with >1 chromosome-switching event per 100 loci (whole contig removed) | 6,208 |
LiftOver MapScore <0.5 | 61,721 |
Loci mapped to the same coordinate in the new reference genome | 4 |
Poor alignment: regions of poor alignment (mapping quality <2- or coverage >2-fold range of global median depth) were identified and their genotypes masked as missing; sites with >50 % missing data among monkeys sequenced at 4X or above were then removed | 438,423 |
Sex chromosome SNPs | 65,271 |
≥5 Mendel errors in parent–child comparisons | 8,563 |
Heterozygous calls >60 % | 6,201 |
Total | 865,772 |
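
As an illustration of how three of the per-site thresholds in the table (MAF <10 %, missing >50 %, heterozygous calls >60 %) might be combined, the following is a minimal sketch and not the pipeline actually used. The function name `passes_site_qc` and the genotype encoding (alternate-allele counts 0/1/2, with `None` for a missing call) are assumptions made for this example.

```python
def passes_site_qc(calls, maf_min=0.10, max_missing=0.50, max_het=0.60):
    """Return True if a variant site survives the three per-site filters.

    calls: one entry per animal, holding the alternate-allele count
           (0, 1 or 2) or None when the genotype is missing.
           This encoding is an assumption of the sketch.
    """
    called = [c for c in calls if c is not None]
    if not calls or not called:
        return False  # nothing genotyped at this site

    # Fraction of animals without a genotype call
    missing_rate = 1 - len(called) / len(calls)

    # Minor allele frequency among called diploid genotypes
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)

    # Fraction of called genotypes that are heterozygous
    het_rate = sum(1 for c in called if c == 1) / len(called)

    return maf >= maf_min and missing_rate <= max_missing and het_rate <= max_het


# Toy sites with six animals each (hypothetical data)
sites = {
    "common":      [0, 1, 2, 1, 0, 0],
    "monomorphic": [0, 0, 0, 0, 0, 0],
    "sparse":      [1, None, None, None, None, 0],
    "excess_het":  [1, 1, 1, 1, 0, 2],
}
kept = [name for name, calls in sites.items() if passes_site_qc(calls)]
```

Here only the `common` site is kept: `monomorphic` fails the MAF threshold, `sparse` is missing in four of six animals, and `excess_het` has four heterozygous calls out of six. The other filters in the table (LiftOver, Mendel errors, coverage range) need pedigree or alignment information and are not covered by this sketch.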