In allele-specific analysis, one important step is removing alignment bias. As we all know, humans are diploid, so each position on a chromosome carries a pair of alleles. The pair is usually identical (homozygous); when one of the alleles carries a variant (think of a SNP, single nucleotide polymorphism), the position becomes heterozygous.
Suppose a person is heterozygous at a certain locus, say AG, where A matches the reference genome and G is the variant allele. When reads are aligned with software such as bowtie2 or bwa, reads carrying A align more easily, while reads carrying G are harder to align because they do not fully match the reference genome and incur a mismatch penalty. The end result is a difference in the number of aligned reads between the two alleles that comes from the aligner rather than from biology, and this causes errors downstream.
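To get a feel for the size of this effect, here is a small Python sketch of a toy model. It is not how bowtie2 or bwa actually score alignments: I assume a hard cutoff (`max_mismatches`, a made-up parameter) on the number of mismatches a read may have, independent sequencing errors at rate `error_rate`, and a truly balanced 50/50 heterozygous site. Under those assumptions a G read has already spent one mismatch on the SNP itself, so it tolerates one fewer sequencing error than an A read.

```python
from math import comb


def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0.0 when k is negative."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def expected_ref_fraction(read_len, error_rate, max_mismatches):
    """Expected fraction of aligned reads carrying the reference allele.

    Toy model (assumption, not a real aligner): a read aligns only if it
    has at most `max_mismatches` mismatches against the reference.
    """
    # Reference-allele read: the whole mismatch budget is left for errors.
    p_ref = prob_at_most(max_mismatches, read_len, error_rate)
    # Variant-allele read: the SNP uses one mismatch, so only
    # max_mismatches - 1 errors are tolerated at the other positions.
    p_alt = prob_at_most(max_mismatches - 1, read_len - 1, error_rate)
    return p_ref / (p_ref + p_alt)


if __name__ == "__main__":
    for k in (1, 2, 3):
        frac = expected_ref_fraction(read_len=100, error_rate=0.01, max_mismatches=k)
        print(f"max_mismatches={k}: expected reference fraction = {frac:.3f}")
```

With no sequencing errors the model gives exactly 0.5, and the reference fraction drifts above 0.5 as the error rate rises or the mismatch budget shrinks, which is the bias described above.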
Below are several methods I have come across for removing alignment bias:
A 2015 article in Genome Biology, "Tools and best practices for data processing in allelic expression analysis", compared how well several of the methods available at the time performed:
WASP performs well and loses the fewest reads.