UMD3.1 Downloads

ASSEMBLY

UMD3.1_chromosomes.fa.gz - This file contains FASTA-formatted sequences for chromosomes 1-29 (accessions GK000001.2 to GK000029.2), X (GK000030.2), MT (AY526085.1), and 3286 unassigned scaffolds (accessions that begin with "GJ").
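As a quick sanity check after download, the number of records per accession prefix can be compared with the description above. The following is a minimal Python sketch, assuming each FASTA header line contains the accession somewhere in its text (the exact header layout is not stated here). The function names are illustrative only.

```python
import gzip
import re
from collections import Counter

# Accession prefixes taken from the file description: GK for chromosomes,
# GJ for unassigned scaffolds, AY for the mitochondrial sequence.
# Assumption: the accession appears verbatim somewhere in the header line.
ACCESSION = re.compile(r"\b(GK|GJ|AY)\d+\.\d+")


def classify_headers(lines):
    """Count FASTA records by accession prefix; unmatched headers go to 'other'."""
    counts = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        match = ACCESSION.search(line)
        counts[match.group(1) if match else "other"] += 1
    return counts


def classify_fasta(path):
    """Apply classify_headers to a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return classify_headers(handle)
```

If the description holds, classify_fasta("UMD3.1_chromosomes.fa.gz") should report 30 records under GK (chromosomes 1-29 plus X), 3286 under GJ, and 1 under AY.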

 

ANNOTATION

Ensembl:

Ensembl72_Btau_UMD3.1_genes.gff3.gz - This file contains coordinates for Ensembl release 72 bovine protein-coding and non-protein-coding genes. We downloaded the Bos taurus Ensembl GTF file and reformatted it as GFF3 for our JBrowse/Apollo database.
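One part of any GTF-to-GFF3 conversion is rewriting the attribute column, because GTF uses key "value"; pairs while GFF3 uses key=value;. The sketch below illustrates only that step and is not the procedure used to build the file above. It assumes that quoted values contain no semicolons, and a complete conversion would also need to assign ID and Parent attributes. The function name is hypothetical.

```python
from urllib.parse import quote


def gtf_attributes_to_gff3(field):
    """Rewrite a GTF attribute column as a GFF3 attribute column.

    Assumption: values do not contain ';' inside the quotes.
    Repeated keys are merged into one comma-separated GFF3 value.
    """
    collected = {}
    for chunk in field.strip().split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        parts = chunk.split(None, 1)  # key, then the (usually quoted) value
        if len(parts) != 2:
            continue
        key, value = parts
        # Percent-encode characters that are reserved in GFF3 column 9.
        escaped = quote(value.strip('"'), safe=" :/()+")
        collected.setdefault(key, []).append(escaped)
    return ";".join(f"{key}={','.join(values)}" for key, values in collected.items())
```

For example, the GTF attributes gene_id "G1"; transcript_id "T1"; become gene_id=G1;transcript_id=T1.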

NCBI:

The following files contain data from NCBI Bos taurus Annotation Release 103. We downloaded the GFF file for the Bos taurus genome from the NCBI FTP site and split it into separate files by feature type for our JBrowse/Apollo database.

RefSeq_UMD_3.1_protein_coding.gff3.gz - This file contains protein-coding genes.

RefSeq_UMD_3.1_frameshift.gff3.gz - This file contains protein-coding genes that are supported by cDNA alignments to the genome assembly but have translational discrepancies caused by assembly errors such as indels.

RefSeq_UMD_3.1_microRNA.gff3.gz - This file contains microRNA genes.

RefSeq_UMD_3.1_noncoding.gff3.gz - This file contains other non-protein-coding genes.

RefSeq_UMD_3.1_pseudogene.gff3.gz - This file contains pseudogenes.
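To see what a given annotation file contains before loading it elsewhere, the feature types in column 3 can be tallied. This is a minimal Python sketch that relies only on the standard nine-column GFF3 layout; the function names are illustrative and the feature types present in each file are not assumed.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Tally column 3 (feature type) over GFF3 feature lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section, if present, ends the features
        if line.startswith("#") or not line.strip():
            continue  # directive, comment, or blank line
        columns = line.rstrip("\n").split("\t")
        if len(columns) == 9:
            counts[columns[2]] += 1
    return counts


def count_feature_types_in_file(path):
    """Apply count_feature_types to a gzip-compressed GFF3 file."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)
```

Calling count_feature_types_in_file("RefSeq_UMD_3.1_pseudogene.gff3.gz"), for instance, returns a Counter keyed by feature type.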

 

Bovine Official Gene Set version 2

Bovine_OGSv2_liftOver_UMD3.1.gff3.gz - This file contains the bovine Official Gene Set version 2, which includes manual annotations submitted by bovine researchers as part of the Bovine Genome Sequencing Consortium project. Genes predicted on the Btau_4.0 assembly were mapped to UMD3.1 using the UCSC liftOver Tool. We are in the process of generating a new Official Gene Set on UMD3.1 using RNA-Seq data from Dominette (the individual whose genome was sequenced).

 

BOVINE HAPMAP SNP

BovineHapMapSNP50_UMD3.1.gff3.gz - This file contains single nucleotide polymorphisms from the Bovine HapMap Consortium project.

 

PROTEIN HOMOLOG ALIGNMENTS

The following files contain alignments of protein homologs to the UMD3.1 assembly. Protein sequences were aligned to the genome using Exonerate (protein2genome) via MAKER.

Proteins from Ensembl:

Ensembl_Canis_familiaris.BROADD2.67.pep.all_vs_UMD3.1.gff3.gz

Ensembl_Equus_caballus.EquCab2.67.pep.all_vs_UMD3.1.gff3.gz

Ensembl_Homo_sapiens.GRCh37.67.pep.all_vs_UMD3.1.gff3.gz

Ensembl_Mus_musculus.NCBIM37.67.pep.all_vs_UMD3.1.gff3.gz

Ensembl_Sus_scrofa.Sscrofa10.2.67.pep.all_vs_UMD3.1.gff3.gz

Proteins from RefSeq:

RefSeq_Canis_lupus_familiaris_protein_vs_UMD3.1.gff3.gz

RefSeq_Equus_caballus_protein_vs_UMD3.1.gff3.gz

RefSeq_Homo_sapiens.protein_vs_UMD3.1.gff3.gz

RefSeq_Mus_musculus.protein_vs_UMD3.1.gff3.gz

RefSeq_Ovis_aries_protein_vs_UMD3.1.gff3.gz

RefSeq_Sus_scrofa_protein_vs_UMD3.1.gff3.gz
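The alignment files listed above can be summarized per aligned protein. The sketch below is a minimal Python example that assumes each alignment feature carries a Target attribute in column 9, as the GFF3 specification defines for alignments. Whether these particular files use Target has not been verified here, so the attribute name is a parameter. The function names are illustrative.

```python
import gzip
from collections import defaultdict
from urllib.parse import unquote


def parse_attributes(field):
    """Split GFF3 column 9 into a dict of raw (still percent-encoded) values."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def hits_by_protein(lines, attribute="Target"):
    """Group alignment features by the protein ID named in `attribute`.

    Assumption: the value has the GFF3 Target form 'id start end [strand]'.
    Returns {protein_id: [(seqid, start, end, strand), ...]}.
    """
    hits = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        raw = parse_attributes(cols[8]).get(attribute)
        if raw is None:
            continue
        protein_id = unquote(raw.split(" ")[0])  # first token is the target ID
        hits[protein_id].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return hits


def hits_by_protein_in_file(path, attribute="Target"):
    """Apply hits_by_protein to a gzip-compressed GFF3 file."""
    with gzip.open(path, "rt") as handle:
        return hits_by_protein(handle, attribute)
```

If a file identifies the protein through a different attribute (for example Name), pass that attribute name instead.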