Predicting the number of protein-coding genes in a genome is complicated by the presence of pseudogenes, by misannotation of non-coding sequence, by the incompleteness of assemblies ...