MaizeGDB Genome Center

Zm-Dan340-REFERENCE-BAAFS-1.0 genome assembly

Information about assembly Zm-Dan340-REFERENCE-BAAFS-1.0

Assembly identifier: Zm00104a

Genome Sequencing Project Information

	Project name	Dan340 Genome Assembly
	GenBank BioProject	PRJNA795201
	Project PI	Fengge Wang
	Project start date	2020-01-06
	Release date	2022
	Contributors	Yikun Zhao, Yuancong Wang, De Ma, Guang Feng, Yongxue Huo, Zhihao Liu, Ling Zhou, Yunlong Zhang, Liwen Xu, Liang Wang, Han Zhao, Jiuran Zhao, Fengge Wang
	Funding	This research was supported by grants from the special project for the construction of scientific and technological innovation capacity of Beijing Academy of Agriculture and Forestry Sciences (No. KJCX20200305).
	Publication status	Published
	Project reference	A chromosome-level genome assembly and annotation of the maize elite breeding line Dan340. Yikun Zhao, Yuancong Wang, De Ma, Guang Feng, Yongxue Huo, Zhihao Liu, Ling Zhou, Yunlong Zhang, Liwen Xu, Liang Wang, Han Zhao, Jiuran Zhao, Fengge Wang DOI

Stock and Biosample Information

Stock information
	Stock name	Dan340
	Stock provided by	Beijing Academy of Agriculture and Forestry Sciences (BAAFS)

Biosample information
	Species	Zea mays ssp. mays (maize)
	Sample name	Dan340
	Sample description	14-day-old seedling
	GenBank BioSample	SAMN20821243
	Collection date	2021
	Collected by	Beijing Academy of Agriculture and Forestry Sciences (BAAFS)
	Location	China
	Plant structure	14-day-old seedling

Sequencing and Assembly Information

Assembly name

Zm-Dan340-REFERENCE-BAAFS-1.0

Assembly date

2022

Assembly accession

GCA_024505845.1

WGS accession

JAKJKK000000000.1

Assembly provider

Maize Research Center, Beijing Academy of Agriculture and Forestry Sciences (BAAFS)/Beijing Key Laboratory of Maize DNA Fingerprinting and Molecular Breeding, Beijing 100097, China

Sequencing description

Sequencing technologies: Illumina; PacBio RSII
Sequencing method: PacBio, Illumina
Genome coverage: 193x

Assembly description

Assembly methods: Hifiasm v. 0.16.0; SMRTLink v. 8.0; pbmarkdup v. 0.2.0; ALLHiC v. 0.8.12
Construction of pseudomolecules: yes

Browse Genome

Genome browser at MaizeGDB

Data download

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/024/505/845/GCA_024505845.1_Zm-Dan340-REFERENCE-BAAFS-1.0
https://download.maizegdb.org/Zm-Dan340-REFERENCE-BAAFS-1.0
http://gigadb.org/dataset/102221
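
The NCBI link above appears to follow a regular layout: the nine digits of the assembly accession are split into three groups of three to form the directory path, and the final directory joins the accession and the assembly name with an underscore. The sketch below rebuilds that path under this assumption. The function name `ncbi_genome_dir` is hypothetical, and the layout is inferred from the URL on this page, so check the result against the listed link before relying on it.

```python
def ncbi_genome_dir(accession, assembly_name, scheme="ftp"):
    """Build the NCBI genomes directory URL for an assembly.

    Assumes the layout seen in the link on this page:
    <scheme>://ftp.ncbi.nlm.nih.gov/genomes/all/<GCA|GCF>/ddd/ddd/ddd/<accession>_<name>
    """
    # "GCA_024505845.1" -> prefix "GCA", remainder "024505845.1"
    prefix, remainder = accession.split("_", 1)
    # Drop the version suffix to keep only the nine-digit identifier.
    digits = remainder.split(".", 1)[0]
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError(f"unexpected accession format: {accession!r}")
    # Split the identifier into three directory levels of three digits each.
    chunks = [digits[i:i + 3] for i in range(0, 9, 3)]
    return (
        f"{scheme}://ftp.ncbi.nlm.nih.gov/genomes/all/"
        f"{prefix}/{'/'.join(chunks)}/{accession}_{assembly_name}"
    )


url = ncbi_genome_dir("GCA_024505845.1", "Zm-Dan340-REFERENCE-BAAFS-1.0")
print(url)
```

For this assembly the printed URL should match the first download link listed above.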

Release date

2022

Assembly statistics

	Number of scaffolds	2,223	Total number of scaffolds in the assembly.
	Longest scaffold	153,467,472 bp	Length of the longest scaffold in the assembly.
	Shortest scaffold	13,966 bp	Length of the shortest scaffold in the assembly.
	Scaffold N50 length	222,765,871 bp	Length of the scaffold at which the cumulative length, summing from the longest to the shortest scaffold, passes 50% of the total assembly size.
	Total contig length	2,348,678,871 bp	Total sequence length represented by contigs.
	Longest contig	148,865,517 bp	Length of the longest contig.
	Shortest contig	6,840 bp	Length of the shortest contig.
	Contig N50 length	45,109,016 bp	Length of the contig at which the cumulative length, summing from the longest to the shortest contig, passes 50% of the total assembly size.

A contig is a contiguous consensus sequence that is derived from a collection of overlapping reads.
A scaffold is a set of ordered and oriented contigs that are linked to one another by mate pairs of sequencing reads.
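
The N50 definition used for the scaffold and contig statistics can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the figures on this page: the function name `n50` and the toy lengths are hypothetical, and the choice to stop when the running sum reaches exactly half of the total (rather than strictly exceeding it) is an assumption about tie handling.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Assumption: reaching exactly 50% of the total counts as passing it.
        if 2 * running >= total:
            return length


# Toy example: total = 290, half = 145; 80 + 70 = 150 is the first sum >= 145.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # prints 70
```

Applied to the full list of scaffold or contig lengths from an assembly FASTA, the same function would give the corresponding N50 value.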

Annotation

	Annotation Identifier	Zm00104aa.1
	Annotation Date	2022-05-07
	Is current	yes
	Annotation Description	Repeat sequences of the Dan340 genome were annotated using both ab initio and homolog-based search methods. For the ab initio prediction, RepeatModeler (Version 1.0.8), RepeatScout (Version 1.0.5), and LTR_Finder were used to discover transposable elements (TEs) and to build a TE library. The integrated TE library and a known repeat library (Repbase Version 15.02, homolog-based) were subjected to RepeatMasker (Version 3.3.0) to predict the TEs. For the homolog-based predictions, RepeatProteinMask was run to detect the TEs in the genome by comparing it against a TE protein database. Tandem repeats were identified in the genome using Tandem Repeats Finder (Version 4.07b). As a result, 1723.99 Mb of repeat sequences were identified, accounting for 73.40% of the genome size. Among these repeat sequences, 1555.57 Mb were predicted to be long-terminal repeat (LTR) retrotransposons, and 44.53 Mb were predicted to be DNA transposons, accounting for 66.23% and 1.60% of the genome, respectively. Furthermore, among the LTR retrotransposons, the Gypsy and Copia superfamilies comprised 23.81% and 12.75% of the genome, respectively. Thus, retrotransposons accounted for a large proportion of the Dan340 genome, which was consistent with the genomic characteristics of other maize inbred lines. All repetitive regions except the tandem repeats were soft-masked for protein-coding gene annotations. Five ab initio gene prediction programs, Augustus (Version 3.0.2), GENSCAN (Version 1.0), GeneID, GlimmerHMM (Version 3.0.2), and SNAP (Version 2013-02-16), were used to predict genes. In addition, the protein sequences of five homologous species (Sorghum bicolor, Setaria italica, Hordeum vulgare, Triticum aestivum, and Oryza sativa) were downloaded from Ensembl and NCBI. Homologous sequences were aligned against the genome using TBLASTN (E-value 1 × 10^-5). GeneWise was employed to predict gene models based on the sequence alignment results.
