Downloading Data using MariaDB (MySQL)

The UCSC Genome Browser uses MariaDB as the backend database server. MariaDB is a community-developed, commercially supported fork of the MySQL relational database management system, intended to remain free and open-source software under the GNU General Public License.

We have two MariaDB servers for public access: genome-mysql.soe.ucsc.edu (US) and genome-euro-mysql.soe.ucsc.edu (Europe).

These servers allow MySQL access to the same set of data currently available on our public Genome Browser site. The data are synchronized weekly with the main databases on our public site. During synchronization, the MariaDB servers can be briefly out of sync with the main website. The weekly synchronization takes place on Monday mornings from 4:00 am to 9:00 am Pacific Time (GMT -7:00 in summer, GMT -8:00 in winter).
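If a script depends on up-to-date data, it can check whether the current time falls inside the synchronization window before querying. The following is a minimal sketch, not part of our utilities: the function name in_sync_window is made up for illustration, and the time zone name America/Los_Angeles is assumed to exist in your system's time zone database.

```shell
# Return success (0) if the given Pacific Time weekday and hour fall inside
# the weekly synchronization window (Monday, 4:00 am up to 9:00 am).
# $1 = weekday as printed by `date +%u` (1 = Monday), $2 = hour (0-23).
in_sync_window() {
    weekday=$1
    hour=$2
    [ "$weekday" -eq 1 ] && [ "$hour" -ge 4 ] && [ "$hour" -lt 9 ]
}

# Read the current weekday and hour in Pacific Time.
# America/Los_Angeles is an assumed time zone name (see above).
now_weekday=$(TZ=America/Los_Angeles date +%u)
now_hour=$(TZ=America/Los_Angeles date +%H)

if in_sync_window "$now_weekday" "$now_hour"; then
    echo "Inside the weekly sync window; results may be briefly out of date."
else
    echo "Outside the weekly sync window."
fi
```

Because the window is defined in Pacific Time, reading the clock with an explicit TZ setting avoids converting the summer and winter GMT offsets by hand.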

Connecting

You must have MariaDB (MySQL) client libraries installed on your computer. You can read more about MariaDB on the MariaDB site.

You can connect to the US MariaDB server using the command:

mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -P 3306

Or the European MariaDB server with this command:

mysql --user=genome --host=genome-euro-mysql.soe.ucsc.edu -A -P 3306

The -A flag (--no-auto-rehash) is optional but recommended for speed: it skips loading table and column names for tab completion when the client starts.

Once connected to the database, you may use a wide range of SQL commands to query the database.
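Queries can also be run without an interactive session by passing a statement with the client's -e option. The sketch below wraps the US connection command in a small function; the function name ucsc_query is made up for illustration, and the database hg38 and table refGene in the example calls are assumptions that do not appear elsewhere on this page.

```shell
# Run one SQL statement on the US public server and print the result.
# $1 = SQL statement, $2 = optional database name.
ucsc_query() {
    sql=$1
    db=${2:-}
    # ${db:+"$db"} expands to nothing when no database name is given.
    mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -P 3306 \
        -e "$sql" ${db:+"$db"}
}

# Example calls (they need network access, so they are left commented out).
# hg38 and refGene are assumed names, used only for illustration.
# ucsc_query "SHOW DATABASES"
# ucsc_query "SELECT name, chrom, txStart, txEnd FROM refGene LIMIT 5" hg38
```

To use the European server instead, replace the host name with genome-euro-mysql.soe.ucsc.edu.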

Conditions of use

For details, see the Genome Browser Conditions of Use page.

Using the MariaDB server with our utilities

The MariaDB database can also be used by the numerous utilities in the Genome Browser source tree. Some of these utilities require a password, so you will need to add the following specifications to your $HOME/.hg.conf file (remember to chmod your .hg.conf file to 600 permissions) if you would like to access the US public MariaDB server:

#US MariaDB server
db.host=genome-mysql.soe.ucsc.edu
db.user=genomep
db.password=password
central.db=hgcentral
central.host=genome-mysql.soe.ucsc.edu
central.user=genomep
central.password=password
gbdbLoc1=http://hgdownload.soe.ucsc.edu/gbdb/
forceTwoBit=on

Or these lines if you'd like to access the European MariaDB server:

#European MariaDB server
db.host=genome-euro-mysql.soe.ucsc.edu
db.user=genomep
db.password=password
central.db=hgcentral
central.host=genome-euro-mysql.soe.ucsc.edu
central.user=genomep
central.password=password
gbdbLoc1=http://hgdownload.soe.ucsc.edu/gbdb/
forceTwoBit=on

The db.* and central.* settings tell our utilities how to connect to the public MariaDB server. The gbdbLoc1 setting tells our utilities where to find data files. The forceTwoBit setting is necessary for utilities that retrieve genomic sequence.
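One possible way to create the file and restrict its permissions in a single step is sketched below. The function name write_hg_conf is made up for illustration, and the demonstration writes to a scratch file rather than to $HOME/.hg.conf so that an existing configuration is not overwritten.

```shell
# Write the US-server settings to the given path and make the file
# readable and writable by its owner only (mode 600).
# $1 = target path; pass "$HOME/.hg.conf" for real use.
write_hg_conf() {
    conf_path=$1
    cat > "$conf_path" <<'EOF'
#US MariaDB server
db.host=genome-mysql.soe.ucsc.edu
db.user=genomep
db.password=password
central.db=hgcentral
central.host=genome-mysql.soe.ucsc.edu
central.user=genomep
central.password=password
gbdbLoc1=http://hgdownload.soe.ucsc.edu/gbdb/
forceTwoBit=on
EOF
    chmod 600 "$conf_path"
}

# Demonstrate on a scratch file and show the resulting permissions.
demo_conf=$(mktemp)
write_hg_conf "$demo_conf"
ls -l "$demo_conf"
```

The permission string at the start of the ls -l output should read -rw-------, meaning that only the file's owner can read or write it.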

If you have set up your .hg.conf file as above, you can use the hgsql utility, available from our downloads server in the Utilities section, to access the public MariaDB server. The benefit of using the hgsql command is that you don't have to include the username or password as part of your command. You only need to specify the host:

hgsql -h genome-mysql.soe.ucsc.edu
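Since hgsql hands its remaining arguments to the MySQL client, it can in principle be used to save a query result to a file. The following sketch assumes that the client options -N (omit the column-name header) and -e (run one statement and exit) are passed through unchanged; the function name hgsql_to_tsv, the database hg38, and the table refGene are illustrative assumptions.

```shell
# Save the result of one SQL statement as tab-separated text.
# $1 = database name, $2 = SQL statement, $3 = output file.
hgsql_to_tsv() {
    # When output is redirected, the client prints tab-separated rows.
    hgsql -h genome-mysql.soe.ucsc.edu -N -e "$2" "$1" > "$3"
}

# Example call (needs network access and a configured .hg.conf, so it is
# left commented out). hg38 and refGene are assumed names.
# hgsql_to_tsv hg38 "SELECT name, chrom, txStart, txEnd FROM refGene LIMIT 100" refGene.tsv
```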

If you prefer a graphical user interface (GUI) to the UCSC database tables, use the Table Browser.

If you would like to learn more about .hg.conf file setup and specifics for using our command-line utilities, see this example minimal.hg.conf file.

System problems should be reported to genome-www@soe.ucsc.edu. Send questions regarding the database contents or queries to genome@soe.ucsc.edu. Messages sent to genome@soe.ucsc.edu will be posted to the moderated genome mailing list, which is archived on a SEARCHABLE, PUBLIC Google Groups forum.