The maize genome and annotation nomenclature


As the number of genome assemblies and annotations increases, unambiguous identification of each of these entities is necessary for clarity and precision. Identifiers must be unique and persistent, be both machine and human readable, and be consistent across genomes. We realize that tradeoffs between these goals and simplicity are inevitable. To ensure that identifiers are unique, MaizeGDB will be the single naming authority for the assignment of identifiers. In addition to unique, persistent identifiers, metadata is required to properly identify and describe genome assembly datasets.


Because of the large number of existing and expected genome assemblies, assemblies are identified by a unique number rather than a name. While this is not easily human-readable, it handles cases in which the same cultivar is sequenced by multiple projects (e.g., Mo17) and cases where the cultivar name is a seed bank accession number. Additionally, chromosomes are not indicated in the gene model names, as doing so could cause confusion given the existence of large chromosomal rearrangements in maize. For example, in the NAM founder Oh7B, a large portion of chromosome 10 broke off and attached to chromosome 9. Gene models are, however, numbered in sequential order along the chromosomes.
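As an illustration of how such identifiers can be handled by software, the following Python sketch splits a gene model identifier into its parts. The pattern used here is an assumption, not something stated in the text above: a "Zm" species prefix, a 5-digit assembly number, one or two lowercase version letters, and a 6-digit gene model number, as in "Zm00001eb000010". The names `GeneModelId` and `parse_gene_model_id` are hypothetical and exist only for this example. Consult the nomenclature document for the authoritative format.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern (illustrative, not taken from the text above):
# "Zm" + 5-digit assembly number + 1-2 lowercase version letters
# + 6-digit gene model number, e.g. "Zm00001eb000010".
GENE_MODEL_RE = re.compile(r"(Zm)(\d{5})([a-z]{1,2})(\d{6})")


class GeneModelId(NamedTuple):
    """Hypothetical container for the parts of a gene model identifier."""
    species: str
    assembly_number: int
    version: str
    gene_number: int


def parse_gene_model_id(identifier: str) -> Optional[GeneModelId]:
    """Return the parsed parts, or None if the string does not fit the assumed pattern."""
    match = GENE_MODEL_RE.fullmatch(identifier)
    if match is None:
        return None
    species, assembly, version, gene = match.groups()
    return GeneModelId(species, int(assembly), version, int(gene))


if __name__ == "__main__":
    print(parse_gene_model_id("Zm00001eb000010"))
    print(parse_gene_model_id("not_an_identifier"))
```

Note that the parsed result has no chromosome field: under the scheme described above, the assembly is recovered from its number and the chromosome is deliberately left out of the name.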


Genome assembly and annotation project personnel should work with MaizeGDB personnel to acquire genome and annotation names that comply with the guidelines herein, and to provide the required metadata, as outlined here.


To learn more, see the maize genome and annotation nomenclature document here and the complete maize nomenclature page here.

Gene Model and Transcript Nomenclature


Assembly Name


Assembly Identifier