UMD3.1 downloads

This page contains the old datasets. For the latest UMD3.1 datasets, click here.
 
 
ASSEMBLY
 
UMD3.1_chromosomes.fa.gz - This file contains FASTA-formatted sequences for chromosomes 1-29 (accessions GK000001.2 to GK000029.2), X (GK000030.2), MT (AY526085.1) and 3286 unassigned scaffolds (accessions that begin with "GJ").
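
To check a download of this file, the sequences can be listed and separated into chromosomes and unassigned scaffolds. The following Python sketch assumes that each FASTA header starts with the accession as its first whitespace-delimited token; the actual header layout of the file is not stated on this page, so the ID extraction may need adjusting. The function names are illustrative only.

```python
import gzip


def fasta_lengths(path):
    """Return a mapping of sequence ID -> sequence length, in file order."""
    lengths = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Assumption: the ID is the first whitespace-delimited token
                # after ">". Adjust if the headers use another layout.
                tokens = line[1:].split()
                current = tokens[0] if tokens else ""
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


def is_unassigned(seq_id):
    """Unassigned scaffolds have accessions that begin with "GJ"."""
    return seq_id.startswith("GJ")
```

Calling fasta_lengths("UMD3.1_chromosomes.fa.gz") and filtering the keys with is_unassigned should give a scaffold count that can be compared against the 3286 stated above, provided the header assumption holds.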
 
 
ANNOTATION
 
Ensembl:
Ensembl75_UMD3.1_genes.gff3.gz - This file contains coordinates for Ensembl (release 75) bovine protein-coding genes and non-protein-coding genes. We downloaded the Bos taurus Ensembl GTF and reformatted it for our JBrowse/Apollo database.
 
NCBI:
The following files contain data from NCBI Bos taurus Annotation Release 103. We downloaded the GFF file for the Bos taurus genome from the NCBI FTP site and reformatted it into separate files, one per feature type, for our JBrowse/Apollo database.
 
RefSeq_UMD3.1_protein_coding.gff3.gz - This file contains protein-coding genes.
 
RefSeq_UMD3.1_frameshift.gff3.gz - This file contains protein-coding genes that are supported by cDNA alignments to the genome assembly, but have translational discrepancies due to assembly errors such as indels.
 
RefSeq_UMD3.1_microRNA.gff3.gz - This file contains microRNA genes.
 
RefSeq_UMD3.1_noncoding.gff3.gz - This file contains other non-protein-coding genes.
 
RefSeq_UMD3.1_pseudogene.gff3.gz - This file contains pseudogenes.
 
RefSeq_UMD3.1_multitype_protein_coding.gff3.gz - This file contains protein-coding genes that have coding as well as non-coding transcripts.
 
RefSeq_UMD3.1_multitype_noncoding.gff3.gz - This file contains non-protein-coding genes that have coding as well as non-coding transcripts.
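
The annotation files listed above are gzip-compressed GFF3. As a quick sanity check after downloading, the feature types in a file can be tallied from column 3. The Python sketch below relies only on the general GFF3 layout (nine tab-separated columns, comment lines starting with "#", optional sequence data after a "##FASTA" directive); which feature types actually occur in each file is not documented here, and the function name is illustrative.

```python
import gzip
from collections import Counter


def count_feature_types(path):
    """Count the values in column 3 (feature type) of a gzipped GFF3 file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                # Anything after this directive is sequence, not features.
                break
            if line.startswith("#") or not line.strip():
                continue
            columns = line.rstrip("\n").split("\t")
            if len(columns) != 9:
                # Skip malformed lines rather than guess at their meaning.
                continue
            counts[columns[2]] += 1
    return counts
```

For example, count_feature_types("RefSeq_UMD3.1_pseudogene.gff3.gz").most_common() lists the feature types in that file from most to least frequent.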
 
 
Bovine Official Gene Set version 2 
Bovine_OGSv2_liftOver_UMD3.1_genes.gff3.gz - This file contains the Bovine Official Gene Set version 2, which includes manual annotations submitted by bovine researchers as part of the Bovine Genome Sequencing Consortium project. Genes predicted on the Btau_4.0 assembly were mapped to UMD3.1 using the UCSC liftOver Tool. We are in the process of generating a new Official Gene Set on UMD3.1 using RNA-Seq data from Dominette (the individual whose genome was sequenced).
 
Bovine_OGSv2_liftOver_UMD3.1_partial_genes.gff3.gz - This file contains genes that did not lift over completely from the Btau_4.0 assembly to the UMD3.1 assembly.
 
 
BOVINE HAPMAP SNP
BovineHapMapSNP50_UMD3.1.gff3.gz - This file contains Single Nucleotide Polymorphisms from the Bovine HapMap Consortium project.
 
 
QTLs 
Bovineqtl_liftOver_UMD3.1_QTL.gff3.gz - This file contains bovine QTL predicted on Btau_4.0, which were mapped to UMD3.1 using the UCSC liftOver Tool.
 
Animalgenome_UMD3.1_QTL.gff3.gz - This file contains QTL for the UMD3.1 assembly as provided by Animalgenome.org. We downloaded the GFF3 file and reformatted it for our JBrowse/GBrowse database.
 
 
PROTEIN HOMOLOG ALIGNMENTS
The following files contain alignments of protein homologs to the UMD3.1 assembly. Protein sequences were aligned to the genome using Exonerate (protein2genome) via Maker.
 
Proteins from Ensembl:
Ensembl_Canis_familiaris.BROADD2.67.pep.all_vs_UMD3.1.gff3.gz
 
Ensembl_Equus_caballus.EquCab2.67.pep.all_vs_UMD3.1.gff3.gz
 
Ensembl_Homo_sapiens.GRCh37.67.pep.all_vs_UMD3.1.gff3.gz
 
Ensembl_Mus_musculus.NCBIM37.67.pep.all_vs_UMD3.1.gff3.gz
 
Ensembl_Sus_scrofa.Sscrofa10.2.67.pep.all_vs_UMD3.1.gff3.gz
 
Proteins from RefSeq: 
RefSeq_Canis_lupus_familiaris_protein_vs_UMD3.1.gff3.gz
 
RefSeq_Equus_caballus_protein_vs_UMD3.1.gff3.gz
 
RefSeq_Homo_sapiens.protein_vs_UMD3.1.gff3.gz
 
RefSeq_Mus_musculus.protein_vs_UMD3.1.gff3.gz
 
RefSeq_Ovis_aries_protein_vs_UMD3.1.gff3.gz
 
RefSeq_Sus_scrofa_protein_vs_UMD3.1.gff3.gz
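
When working with the alignment files, it is often useful to know which protein, and which part of it, each feature aligns to. In GFF3, this information is conventionally carried by the reserved Target attribute in column 9, in the form "target_id start end [strand]". The Python sketch below parses a column 9 string and extracts Target if it is present. Whether these particular files include a Target attribute is an assumption, and the identifiers in the usage note are made up for illustration.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Parse a GFF3 attributes column into a dict of tag -> list of values."""
    attributes = {}
    for field in column9.strip().split(";"):
        if not field:
            continue
        tag, _, value = field.partition("=")
        # GFF3 percent-encodes reserved characters; commas separate values.
        attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attributes


def parse_target(attributes):
    """Return (target_id, start, end, strand) from Target, or None if absent."""
    values = attributes.get("Target")
    if not values:
        return None
    parts = values[0].split()
    # The strand is optional, so read the fields from the right-hand side.
    strand = parts.pop() if parts and parts[-1] in ("+", "-") else None
    if len(parts) < 3:
        return None
    end = int(parts.pop())
    start = int(parts.pop())
    return " ".join(parts), start, end, strand
```

With a hypothetical attribute string such as "ID=match001;Target=P12345 1 250 +", parse_target(parse_attributes(...)) yields the protein identifier, the aligned protein coordinates and the strand.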