Hi. I have a question about processed pseudogenes (not standard pseudogenes). Please could someone tell me how many processed pseudogenes (also called retropseudogenes) are in the human genome?
I think I read somewhere that very few have been found in the human genome. Humans have one functional NANOG gene and 10 copies scattered across the genome. We share 9/10 with chimps, gorillas and orangutans, and 7/10 with macaques (which have another 3 NANOG processed pseudogenes that are unique to their lineage). If it’s true that we don’t have many (if any) other processed pseudogenes/retropseudogenes in the human genome, how come only the NANOG gene has been repeatedly copied into a processed pseudogene, on at least 13 different occasions during primate evolution, but no other genes? (Sorry, I’m still pretty new to this topic.)
I just checked my local copy of the Gencode v36 annotation of the human reference genome (this is a recent version, but it’s one release behind the current version) and found 10,167 distinct annotated processed pseudogenes.
I imagine there are also plenty of other segregating processed pseudogenes that are not found in the reference assembly.
Edit: an easier way to get the answer is to check the statistics page of the current Gencode release. According to that page, there are 10,669 processed pseudogene annotations in the most recent version.
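For anyone who wants to repeat the count on a local annotation file, here is a minimal sketch. It assumes a Gencode-style GTF in which gene-level records have "gene" in the third column and carry a gene_type "processed_pseudogene" attribute; the function names and the file path in the usage note are only illustrative, and the exact number you get will depend on the release and on whether you count genes or transcripts.

```python
import gzip
import re
from collections import Counter

# Matches the gene_type attribute in the 9th GTF column (Gencode-style attributes assumed).
GENE_TYPE_RE = re.compile(r'gene_type "([^"]+)"')


def count_gene_types(lines):
    """Count gene-level GTF records per gene_type."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        # Only count gene-level records, so each pseudogene is counted once.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        match = GENE_TYPE_RE.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return counts


def count_from_file(path):
    """Open a plain or gzipped GTF file and count gene types (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_gene_types(handle)
```

With a local file you would call something like count_from_file("gencode.v36.annotation.gtf.gz")["processed_pseudogene"], where the path is whatever your own copy is called.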
I’m pretty new to this subject and still trying to figure out all the terminology. I know that a processed pseudogene is a copy of a gene that is inserted elsewhere in the genome, but what is the difference between annotated processed pseudogenes and segregating processed pseudogenes? Which category are the 9 NANOG processed pseudogenes?
Yes, specifically copies that were transcribed from DNA to mRNA and then reverse transcribed back into DNA. Usually, they are inserted by retrotransposon-encoded machinery, after most or all introns have been spliced out.
what is the difference between annotated processed pseudogenes and segregating processed pseudogenes?
An annotation is simply the identification of a “feature” (could be a gene, pseudogene, transposon, etc.) that is found or predicted to be present in a genome assembly. So in this case, an annotated processed pseudogene is one that is present in the human reference genome.
A segregating processed pseudogene is present at a frequency below 1.0 (usually quite a bit below) in a population or species; that is, some individuals have that particular copy of the pseudogene, while others do not. If it’s segregating in the population, it may or may not be present in the reference genome.
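To make the frequency idea concrete, here is a small sketch. It assumes diploid individuals, each genotyped as carrying 0, 1 or 2 copies of one particular insertion; the function names and the genotype lists are made up for illustration and are not real data.

```python
def insertion_frequency(genotypes):
    """Frequency of an insertion given per-individual copy counts (0, 1 or 2)."""
    if not genotypes:
        raise ValueError("need at least one genotyped individual")
    # Each diploid individual contributes two chromosomes to the total.
    return sum(genotypes) / (2 * len(genotypes))


def classify(freq):
    """Label an insertion as absent, fixed, or segregating in the sample."""
    if freq == 0.0:
        return "absent"
    if freq == 1.0:
        return "fixed"
    return "segregating"


# Hypothetical samples: one where everyone is homozygous for the insertion,
# one where individuals differ in how many copies they carry.
everyone_has_it = [2, 2, 2, 2]
mixed_sample = [0, 1, 2, 1]

print(classify(insertion_frequency(everyone_has_it)))
print(classify(insertion_frequency(mixed_sample)))
```

In a real study the frequency is estimated from a sample, so an insertion that looks fixed in a small sample could still be segregating at a very high frequency in the wider population.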
Which category are the 9 NANOG processed pseudogenes?
I don’t know anything about NANOG specifically, but based on the names in Table 1 of this paper, they are annotated. Given that most of these are shared with chimps, they are most likely not segregating within humans, or if they are, they’re probably segregating at very high frequencies. It’s possible that there are other segregating non-reference copies of NANOG pseudogenes as well.