ORFanID: An Online Search Engine for Identifying Orfan Genes

Taxonomically restricted genes are coding sequences that are restricted to a specific taxonomic group and lack recognizable homologs in other groups, including closely related species. The existence of such genes is surprising given the hypothesis of universal common descent — i.e., that all genes have evolved by numerous successive modifications from other genes, tracing all the way back to the last universal common ancestor. Some genes, called ORFan genes, are unique and found only in a single species.
Intelligent design theorists have long been interested in ORFan genes and taxonomically restricted genes (TRGs). If no gene similar to a particular gene exists anywhere else, from what source gene did it evolve? In October 2023, a Discovery Institute-funded research paper was published in the journal PLOS One, introducing a “graphical web-based search engine,” called ORFanID, that “facilitates the efficient identification of both orphan genes and TRGs at all taxonomic levels, from DNA or amino acid sequences in the NCBI database cluster and other large bioinformatics repositories.”1 The software enables users to “identify genes that are unique to any taxonomic rank, from species to domain, using NCBI systematic classifiers.”
Limitations of Other Programs
The authors note the limitations of other available genomics analysis programs that one might use for studying orphan genes:
Currently, the tool options researchers can find today for studying orphan genes are limited, as most software solutions focus on identifying orthologs or inferencing ortho groups and are generally limited to proteins. For instance, ORFanFinder functionality is limited to plants, bacteria, and fungi, and the URL in the original publication (DOI: 10.1093/bioinformatics/btw122) is not active. However, a web search has revealed that ORFanFinder is available at http://bcb.unl.edu/orfanfinder/ although this software has not been updated since 2016. SequenceServer performs BLAST without classifying the proteins/DNA sequences into taxonomic levels. The software Geneious can perform alignment and build a phylogenetic tree but identifying orphans using this software is challenging. Similarly, OrthoFinder provides the option to use DIAMOND or its recommended MMseq2 for sequence alignment. OMA orthology or the series of analytical resources developed by the Bioinformatics Resource Centers (BRCs) for Infectious Diseases program [bioinformatics tools, workspaces, and services for bioinformatics data analysis like AmoebaDB, FungiDB, OrthoMCL] only show orthologous genes/proteins and do not identify orphan genes. A newer gene classification platform, www.shoot.bio, may also be helpful to align and compare gene origins but it only uses protein (amino acid) sequences. In short, these tools perform alignment or identify conserved genes from genomes, but they are not suited to identify orphans as ORFanID is designed to do. [Internal citations omitted.]
They further highlight what distinguishes ORFanID from other available genomics analysis programs:
ORFanID’s distinctiveness is in three aspects: (1) It processes not just protein/amino acid sequences but also DNA/nucleotide sequences. (2) With its built-in homology interpreter and classifier, this search engine provides the taxonomic rank of a gene either as an orphan gene or as a gene restricted to a taxonomic level in the tree of life; (3) As ORFans and TRGs are identified, ORFanID builds its own database with the results of the analysis and provides the researcher with the possibility to further explore the data.
Testing the Program
The authors tested the utility of their program by using it to analyze DNA and protein sequences of organisms including C. elegans, S. cerevisiae, D. melanogaster, and H. sapiens, using genes that had been previously predicted to be taxonomically restricted. Their results confirmed “that the ORFanID algorithm is accurate in classifying proteins to their respective taxonomic group based on the functionality of BLAST and the choice of parameter settings.” They next tested the program on species-specific genes (i.e., orphans), using genes that had been previously identified as orphans in A. thaliana, D. melanogaster, C. elegans, and S. cerevisiae. The result was that “ORFanID precisely identified these published genes as species-specific orphans.” The authors concluded that “Taken together, these results suggest that ORFanID works accurately and reliably in classifying and identifying species-specific orphan genes.”
The authors also compared ORFanID's sensitivity and accuracy in classifying taxonomically restricted genes, using five organisms (C. elegans, E. coli, H. sapiens, O. sativa, and Z. mays). The results revealed that “ORFanID is more sensitive and gives more accurate results in classifying orphan and strict-orphan genes.”
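For readers unfamiliar with the terms, sensitivity and accuracy have standard definitions that can be sketched in a few lines. The example below uses the usual formulas (sensitivity is true positives over all actual positives, and accuracy is correct calls over all calls). The function name and the toy labels are invented for illustration. They are not data from the paper, and the paper's exact evaluation procedure may differ.

```python
# Illustrative sketch only; the labels below are made-up toy values,
# not results from the ORFanID paper.
def sensitivity_and_accuracy(predicted, actual):
    """Compute (sensitivity, accuracy) for boolean orphan/non-orphan calls.

    predicted[i] is True if the tool called gene i an orphan;
    actual[i] is True if gene i is an orphan in the reference set.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no genes to evaluate")

    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)

    if true_pos + false_neg == 0:
        raise ValueError("reference set contains no orphans")

    sensitivity = true_pos / (true_pos + false_neg)
    accuracy = correct / len(actual)
    return sensitivity, accuracy


# Toy reference labels and toy tool calls for five hypothetical genes.
actual = [True, True, True, False, False]
predicted = [True, True, False, False, True]

sens, acc = sensitivity_and_accuracy(predicted, actual)
print(f"sensitivity = {sens:.2f}, accuracy = {acc:.2f}")
```

A tool that misses real orphans scores lower on sensitivity, while a tool that mislabels genes in either direction scores lower on accuracy.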
Though it was once thought that ORFan genes were rare, research has revealed that taxonomically restricted genes (and even strictly ORFan genes, which are limited to a single species) are in fact quite common. ORFanID is a tool that can be used in further research projects to explore and catalog taxonomically restricted genes and improve our understanding of their prevalence. The origin and nature of ORFan genes are research questions that everyone is interested in — whether they are doing science from an evolutionary paradigm or an intelligent design one. This project shows that ID is helping to inspire research that is of interest to science generally and bearing good fruit for the scientific community.
Notes
- Gunasekera RS, Raja KKB, Hewapathirana S, Tundrea E, Gunasekera V, Galbadage T, Nelson PA. ORFanID: A web-based search engine for the discovery and identification of orphan and taxonomically restricted genes. PLoS One. 2023 Oct 25;18(10):e0291260. doi: 10.1371/journal.pone.0291260. Erratum in: PLoS One. 2024 Aug 8;19(8):e0308834. doi: 10.1371/journal.pone.0308834. PMID: 37879070; PMCID: PMC10599687.