We report genome sequences absent from the rice reference genome, Oryza sativa L. ssp. japonica cv. Nipponbare, obtained through a metagenome-like de novo assembly of low-coverage population sequencing data from 1483 cultivated rice varieties. Reads that did not map to the reference genome were used for the assembly. The source data used in this study are available at ricevarmap.ncpgr.cn. Two datasets, the indica dispensable genome and the japonica dispensable genome, were obtained by assembling the sequencing data of the indica and japonica rice varieties, respectively. The indica dispensable genome contains 52976 contigs, while the japonica dispensable genome comprises 30349 contigs.
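The step of collecting reads that do not map to the reference can be illustrated with a minimal sketch. It assumes the alignments are available as SAM text, which the description above does not state, and it relies only on the SAM convention that FLAG bit 0x4 marks an unmapped segment. The function name and the example records are hypothetical and are not part of the original pipeline.

```python
# Minimal sketch: pick out reads flagged as unmapped in SAM-formatted text.
# Assumption: alignments are in SAM format; the original pipeline may differ.

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_read_names(sam_lines):
    """Return the names of reads whose FLAG has the unmapped bit set."""
    names = []
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the FLAG
        if flag & UNMAPPED_FLAG:
            names.append(fields[0])  # column 1 is the read name
    return names


# Hypothetical records for illustration only.
example = [
    "@HD\tVN:1.6",
    "read1\t0\tChr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGCA\tIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGCAT\tIIIII",
]

print(unmapped_read_names(example))
```

In this toy input, read2 and read3 carry the unmapped bit and would be passed on to an assembler, while read1 aligns to the reference and is left out.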
RiceVarMap provides comprehensive information on 6,551,358 single nucleotide polymorphisms (SNPs) and 1,214,627 insertions/deletions (INDELs) identified from sequencing data of 1,479 rice accessions. The SNP genotypes of all accessions were imputed and evaluated, resulting in an overall missing data rate of 0.42% and an estimated accuracy greater than 99%. The SNP/INDEL genotypes of all accessions are available for online query and download. Users can search SNPs/INDELs by variant identifier, genomic region, gene identifier or gene annotation keyword. For each SNP/INDEL, the database also lists allele frequencies within various sub-populations and the effects of variants that may alter the protein sequence of a gene. The database provides a tool to compare any two accessions and identify the polymorphisms between them, as well as geographical details and phenotype images for various rice accessions. In particular, the database provides tools to construct haplotype networks and to design PCR primers that take surrounding known genomic variations into account.
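The two-accession comparison and the per-sub-population allele frequencies described above can be sketched on a small genotype table. This is an illustrative sketch only, not the RiceVarMap implementation: the data layout, function names, variant identifiers and accession names are all hypothetical, and each accession is assumed to carry a single allele per site, with None marking a missing call.

```python
# Illustrative sketch on toy data; not the RiceVarMap implementation.
# Assumption: one allele per accession per site, None for a missing call.


def polymorphic_sites(genotypes, acc_a, acc_b):
    """Return variant IDs where both accessions are called and differ."""
    diffs = []
    for variant_id, calls in genotypes.items():
        a = calls.get(acc_a)
        b = calls.get(acc_b)
        if a is None or b is None:
            continue  # a missing call cannot support a polymorphism
        if a != b:
            diffs.append(variant_id)
    return diffs


def allele_frequency(genotypes, variant_id, accessions, allele):
    """Frequency of `allele` among called accessions, or None if none called."""
    calls = [genotypes[variant_id].get(acc) for acc in accessions]
    called = [c for c in calls if c is not None]
    if not called:
        return None
    return called.count(allele) / len(called)


# Hypothetical variants and accessions for illustration only.
genotypes = {
    "snp_a": {"acc1": "A", "acc2": "G", "acc3": "A"},
    "snp_b": {"acc1": "C", "acc2": "C", "acc3": None},
    "snp_c": {"acc1": "T", "acc2": None, "acc3": "G"},
}

print(polymorphic_sites(genotypes, "acc1", "acc2"))
print(allele_frequency(genotypes, "snp_a", ["acc1", "acc3"], "A"))
```

Skipping sites with a missing call in either accession is a design choice of this sketch; a real tool might instead report such sites separately.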
CREP (Collection of Rice Expression Profiles) is a database of gene expression profiles generated using the commercial Affymetrix Rice GeneChip microarray.