The UCSC Genome Browser Coordinate Counting Systems

If you think dogs can’t count, try putting three dog biscuits in your pocket and then giving Fido only two of them.  

~Phil Pastoret

“Counting is easy. Right?”

I say this with my hand out, my thumb and 4 fingers spread out. With my other hand’s pointer finger, I simply count each digit, “one, two, three, four, five.” Easy.

But what happens when you start counting at 0 instead of 1? You can see that you have 5 digits (4 fingers and a thumb), but how do you calculate the size of your range?

With your hand in mind as an example, let’s look at counting conventions as they relate to bioinformatics and the UCSC Genome Browser genomic coordinate systems.

The UCSC Genome Browser uses two different systems:

  • “1-start, fully-closed” = coordinates positioned within the web-based UCSC Genome Browser.
  • “0-start, half-open” = coordinates stored in database tables.

Table 1. UCSC Genome Browser coordinate systems summary

0-start, half-open (0-based)
  • “BED” format (Browser Extensible Data): chr1 127140000 127140001
  • Note: Spaces, not punctuation.
  • When using BED format, browser & utilities assume coords are 0-start, half-open.
  • Stored in UCSC Genome Browser tables.
  • To convert to 1-start, fully-closed: add 1 to start, end = same.

1-start, fully-closed (1-based)
  • “Position” format: chr1:127140001-127140001
  • Note: Punctuation used, no spaces.
  • When using “position” format, browser & utilities assume coords are 1-start, fully-closed.
  • Positioned in UCSC Genome Browser web interface.
  • To convert to 0-start, half-open: subtract 1 from start, end = same.
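
The two conversion rules in Table 1 can be sketched in a few lines of Python. This is only an illustration; the function names are made up for this post and are not part of any UCSC utility.

```python
def bed_to_position(start, end):
    """0-start, half-open -> 1-start, fully-closed: add 1 to start, end = same."""
    return start + 1, end


def position_to_bed(start, end):
    """1-start, fully-closed -> 0-start, half-open: subtract 1 from start, end = same."""
    return start - 1, end


# The single-base example from Table 1.
print(bed_to_position(127140000, 127140001))  # (127140001, 127140001)
print(position_to_bed(127140001, 127140001))  # (127140000, 127140001)
```

Only the start ever changes; the end coordinate is the same number in both systems.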

Section 1: Interval types

0-start vs. 1-start : Does counting start at 0 or 1?
Sometimes referred to as “0-based” vs. “1-based” or “0-relative” vs. “1-relative.”

Interval Types
For a counted range, is the specified interval fully-open, fully-closed, or a hybrid-interval (e.g., half-open)?

Ok, time to flashback to math class!
You might recall that specifying an interval type as open, closed (or a combination, e.g., “half-open”) refers to whether or not the endpoints of the interval are included in the set. For further explanation, see the
interval math terminology wiki article. Figure 1 below describes various interval types.


Figure 1. Description of interval types.

Section 2: Interval types in the UCSC Genome Browser

UCSC Genome Browser web interface = “1-start, fully-closed”

A common counting convention is a system that we all used when we first learned to count the fingers on our hands; this is referred to as the “one-based, fully-closed” system (Figure 2, below). Note that an extra step is needed to calculate the range total (5).

The “1-start, fully-closed” system is what you SEE when using the UCSC Genome Browser web interface. However, all positional data that are stored in database tables use a different system.


Figure 2. 1-start, fully-closed interval. Most common counting convention. Used within the UCSC Genome Browser web interface (but not used in UCSC Genome Browser databases/tables). We calculate that we have 5 digits because 5 (pinky finger, range end) – 1 (the thumb, range start) = 4. We then need to add one to calculate the correct range; 4+1= 5.

UCSC Genome Browser tables = “0-start, half-open”

While the commonly-used “one-start, fully-closed” system is more intuitive, it is not always the most efficient method for performing calculations in bioinformatic systems, because an additional step is required to calculate the size of the base-pair (bp) range.

To increase efficiency, the UCSC Genome Browser uses a “hybrid-interval” coordinate system for storing coordinates in databases/tables that is referred to as “0-start, half-open” (see Figure 3, below).

Although coordinates in the web browser are converted to the more human-readable “1-start, fully-closed” system, coordinates are stored in database tables as “0-start, half-open.” You may have heard various terms to express this 0-start system:

Synonyms for “0-start, half-open”

  • 0-based, half-open
  • 0-based start, 1-based end
    • Note: This is not technically accurate, but conceptually helpful. A “1-based end” refers to the end of the range being included, as in the common “1-based, fully-closed” system.
  • 0-start, hybrid-interval (interval type is: start-included, end-excluded)


Figure 3. The UCSC Genome Browser coordinate system for databases/tables (not the web interface) is “0-start, half-open” where start is included (closed-interval), and stop is excluded (open-interval). We calculate that we have 5 digits because 5 (range end after pinky finger) – 0 (the thumb, range start)  = 5.

Another example which compares 0-start and 1-start systems is seen below, in Figure 4. This figure describes the differences in defining and calculating the range for a specified sequence highlighted in yellow, “T, C, G, A.”


Figure 4. Calculation of genomic range for comparing “1-start, fully-closed” vs. “0-start, half-open” counting systems.
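
To make the "extra step" concrete, here is a minimal Python sketch of the range-size calculation in each system, using the hand from Figures 2 and 3. The function names are hypothetical and chosen only for this example.

```python
def size_one_start_fully_closed(start, end):
    """Both endpoints are included, so one must be added after subtracting."""
    return end - start + 1


def size_zero_start_half_open(start, end):
    """The end is excluded, so a plain subtraction already gives the size."""
    return end - start


# Five digits on one hand, counted in each system.
print(size_one_start_fully_closed(1, 5))  # 5
print(size_zero_start_half_open(0, 5))    # 5
```

The same arithmetic applies to the single-base example used in this post: BED coordinates 127140000 and 127140001 give a size of 1 by plain subtraction.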

Section 3: Formatting

Coordinate formatting indicates interval type

The UCSC Genome Browser and many of its related command-line utilities distinguish two types of formatted coordinates and make assumptions about each type.

The “Position” format (referring to the “1-start, fully-closed” system as coordinates are “positioned” in the browser)

  • Written as: chr1:127140001-127140001
  • No spaces.
  • Includes punctuation: a colon after the chromosome, and a dash between the start and end coordinates.
  • When in this format, the assumption is that the coordinate is 1-start, fully-closed.

The “BED” format (referring to the “0-start, half-open” system)

  • Written as: chr1 127140000 127140001
  • Spaces between chromosome, start coordinate, and end coordinate.
  • No punctuation.
  • When in this format, the assumption is that the coordinates are 0-start, half-open.
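
The formatting rule above (punctuation means “position”, whitespace means “BED”) can be sketched as a small Python parser. This is an illustration of the convention only, not the code the Genome Browser actually runs; the function name and the returned labels are assumptions made for this example.

```python
import re

# "chr1:127140001-127140001": colon after the chromosome, dash between coordinates.
POSITION_RE = re.compile(r"^(\S+):(\d+)-(\d+)$")


def parse_coordinates(text):
    """Return (chrom, start, end, assumed_system) for one formatted coordinate."""
    text = text.strip()
    match = POSITION_RE.match(text)
    if match:
        chrom, start, end = match.groups()
        return chrom, int(start), int(end), "1-start, fully-closed"
    fields = text.split()
    if len(fields) >= 3:
        # Whitespace-separated fields with no punctuation: treat as BED.
        return fields[0], int(fields[1]), int(fields[2]), "0-start, half-open"
    raise ValueError(f"unrecognized coordinate format: {text!r}")


print(parse_coordinates("chr1:127140001-127140001"))
print(parse_coordinates("chr1 127140000 127140001"))
```

Both calls describe the same single base; only the assumed coordinate system, and therefore the start value, differs.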

Section 4: Examples

SNP example

What we SEE in the Genome Browser interface itself is the “1-start, fully-closed” system. However, these data are not STORED in the UCSC Genome Browser databases and tables in the same way. The UCSC Genome Browser databases store coordinates in the “0-start, half-open” coordinate system.

Table 2. SNP coordinates in web browser (1-start) vs table (0-start)
rs782519173 (hg38) Start End
Positioned in web browser: 1-start, fully-closed  133255708  133255708
Stored in table: 0-start, half-open  133255707  133255708

LiftOver examples and coordinate formatting

Let’s take a look at the two types of coordinate formatting (“BED” and “position”) when using the UCSC Genome Browser web-based and command-line utility liftOver tools.

1) Web-based LiftOver example

Below is an example from the UCSC Genome Browser’s web-based LiftOver tool (Home > Tools > LiftOver). Depending on how input coordinates are formatted, web-based LiftOver will assume the associated coordinate system and output the results in the same format.

Table 3. UCSC Genome Browser web-based LiftOver and “position” coordinate formatting
  • Input: Assembly = panTro3
    chr1:127140001-127140001
  • Output: Lifts to this position in hg19:
    chr1:110255313-110255313
  • Notes: If your input is entered with the “position” formatted coords (1-start, fully-closed), the browser will also output the same “position” format. (Note positional format includes “:” and “-” and no spaces.)

Table 4. UCSC Genome Browser web-based LiftOver and “BED” coordinate formatting
  • Input: Assembly = panTro3
    chr1 127140000 127140001
  • Output: Lifts to this position in hg19:
    chr1 110255312 110255313
  • Notes: If your input is entered with the “BED” formatted coords (0-start, half-open), the browser will also output the same “BED” format. (Note BED format contains no punctuation and includes spaces.)

 * Note that the web-based output file extension is misleading for “position” input (Table 3); while titled “*.bed”, the positional output is not actually in “0-start, half-open” BED format, because the 1-start, fully-closed “position” format was used for input.

 2) Command-line liftOver utility example

When using the command-line utility of liftOver, understanding coordinate formatting is also important. Just like the web-based tool, coordinate formatting specifies either the “0-start half-open” or the “1-start fully-closed” convention. For example, if you have a list of 1-start “position” formatted coordinates and want to use the command-line liftOver utility, you must specify in your command that the input is “position” formatted.

To view the liftOver utility usage statement and options, enter “liftOver” on your command-line (with no other arguments, and without the quotes).

Table 5. UCSC Genome Browser command-line liftOver and “position” coordinate formatting
  • Input: chr1:127140001-127140001
  • Command: liftOver -positions panTro3.txt liftOver/panTro3ToHg19.over.chain.gz mapped unMapped
  • Output: chr1:110255313-110255313 (via “mapped” file for hg19)
  • Notes: Must specify “-positions” for 1-start “position” format in command-line liftOver.

Table 6. UCSC Genome Browser command-line liftOver and “BED” coordinate formatting
  • Input: chr1 127140000 127140001
  • Command: liftOver panTro3.bed liftOver/panTro3ToHg19.over.chain.gz mapped unMapped
  • Output: chr1 110255312 110255313 (via “mapped” file for hg19)
  • Notes: No special argument needed; 0-start “BED” formatted coordinates are the default.
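
If you script liftOver from Python, the one thing to get right is when to pass “-positions”. The sketch below only builds the argument list shown in Tables 5 and 6. The helper name and its defaults are hypothetical, and actually running the command assumes liftOver is installed and on your PATH.

```python
def liftover_command(input_path, chain_path, mapped="mapped",
                     unmapped="unMapped", positions=False):
    """Build the liftOver argument list; positions=True adds "-positions"
    for 1-start "position" formatted input (BED input needs no flag)."""
    command = ["liftOver"]
    if positions:
        command.append("-positions")
    command += [input_path, chain_path, mapped, unmapped]
    return command


chain = "liftOver/panTro3ToHg19.over.chain.gz"
print(" ".join(liftover_command("panTro3.txt", chain, positions=True)))
print(" ".join(liftover_command("panTro3.bed", chain)))

# To actually run it (assumes liftOver is installed):
#   import subprocess
#   subprocess.run(liftover_command("panTro3.bed", chain), check=True)
```

Keeping the flag tied to the input format in one place makes it harder to lift 1-start coordinates as if they were BED.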

Wiggle Files

The wiggle (WIG) format is used for dense, continuous data where graphing is represented in the browser. Wiggle files of variableStep or fixedStep data use “1-start, fully-closed” coordinates. Like all other UCSC Genome Browser data, these coordinates are positioned in the browser as “1-start, fully-closed.”

Note: Many other formats outside of the UCSC Genome Browser use 1-start coordinate systems, such as GTF/GFF.

Table 7. UCSC Genome Browser wiggle files & coordinate systems
  • bedGraph -> bigWig: file uses 0-start, half-open; positioned in UCSC Genome Browser as 1-start, fully-closed
  • wiggle variableStep -> bigWig: file uses 1-start, fully-closed; positioned in UCSC Genome Browser as 1-start, fully-closed
  • wiggle fixedStep -> bigWig: file uses 1-start, fully-closed; positioned in UCSC Genome Browser as 1-start, fully-closed
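
Because fixedStep wiggle data is 1-start while bedGraph is 0-start, half-open, converting between them needs the same "subtract 1 from start" step. Here is a minimal Python sketch. It assumes the usual fixedStep declaration line (chrom=, start=, step=, and an optional span= that defaults to 1) followed by one value per line; that layout comes from the wiggle format documentation rather than from this post, and the function name is made up.

```python
def fixed_step_to_bedgraph(lines):
    """Convert fixedStep wiggle lines (1-start) to bedGraph-style tuples
    (chrom, start, end, value) in 0-start, half-open coordinates."""
    chrom = start = step = span = None
    index = 0
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("fixedStep"):
            params = dict(field.split("=", 1) for field in line.split()[1:])
            chrom = params["chrom"]
            start = int(params["start"])       # 1-start coordinate
            step = int(params["step"])
            span = int(params.get("span", 1))  # assumed default of 1
            index = 0
            continue
        if chrom is None:
            raise ValueError("data value found before a fixedStep declaration")
        one_start = start + index * step
        zero_start = one_start - 1             # subtract 1 from start
        rows.append((chrom, zero_start, zero_start + span, float(line)))
        index += 1
    return rows


wiggle = ["fixedStep chrom=chr1 start=127140001 step=1", "0.5", "1.5"]
for row in fixed_step_to_bedgraph(wiggle):
    print(*row)
```

The first data value lands on chr1 127140000 127140001, the same single base used in the BED examples above.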

 Section 5: Resources

If after reading this blog post you have any public questions, please email All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to

GTEx Resources in the Browser

Have you been wondering when we’ll get some of that next-gen gene expression in human tissues up as tracks in the browser? The GNF Atlas microarray tracks are so 2004… Yes, we do have RNA-seq from ENCODE cell lines, but you can get only so far with cell lines (are they even human?). Well, wait no longer! Once we learned what the GTEx folks are up to – RNA-seq and genotyping of samples from 53 tissues in many hundreds of donors – we just had to get on board! Read on for details…

The NIH Genotype-Tissue Expression (GTEx) project was created to establish a sample and data resource for studies on the relationship between genetic variation and gene expression in multiple human tissues. In April this year the Genome Browser released the GTEx Gene Expression track, which showcases data from the GTEx midpoint milestone data release (V6, October 2015) – 8555 tissue samples obtained from 570 adult postmortem individuals. The track shows median expression level per tissue at each gene via a new bar graph display:


The height of each bar represents the median expression level across all samples for a tissue, and the bar color indicates the tissue (we are using GTEx publication color conventions). You can see the gene description and tissue name with expression level when you mouseover, and can view the tissue legend in glorious detail on the track configuration page. Above, notice the 3 highly expressed tissues for TCAP protein (titin-cap, used in muscle assembly) – unsurprisingly in this case, heart (2 sub-tissues) and skeletal muscle.

In the tissue mix sampled by GTEx, you’ll find a dozen brain sub-tissues, a handful of cardiovascular tissues, and bits from digestive, reproductive, and endocrine systems. For a nice summary of the tissues assayed, check out the GTEx project portal. Not so interested in all the tissues? Turn on the tissue filter and limit the graph to show just your faves!

Once you’ve found your favorite gene, you can drill down for more detail. A nice boxplot showing the range for all samples and the sample count is right here on the details page:


You’ll also see this plot on the new RNA-Seq Expression panel of the UCSC Genes detail page:


If gene-level calls aren’t your thing – you’re more of a deep diver and want to see the actual RNA-seq coverage – you might find the newly released GTEx Signal Hub just your style. We were fortunate to be able to team up with the Global Alliance crowd here within the UCSC Genomics Institute and convince them to pump all the available GTEx RNA-seq through their hot new Toil pipeline (along with twice as much cancer data) to produce signal graphs. A round of ‘biggification’, lifting and track configuration (gotta have those GTEx colors!) produced the hub. Find it on the Public Hubs panel of the Track Hubs page, which you can navigate to via the My Data > Track Hubs menu option in the top blue bar.

Did I mention you can find the GTEx gene track and the GTEx Signal hub on both the hg19 (GRCh37) and hg38 (GRCh38) genome browsers?

Give the new tracks a spin! To get you started, here’s a session:


Now enjoy!!








The new Genome Browser gateway

New UCSC Genome Browser gateway page design


The opinions expressed here are those of the author, Cath Tyner, and do not necessarily reflect those of the University of California Santa Cruz or any of its units.

Maybe it’s just me, but I can clearly remember the excitement of getting brand new sparkly shoes as a young kid.  Half the excitement was picking out the shoes – my siblings and I would try on every potential new-shoe option. There were high standards, of course; rigorous criteria that had to be thoroughly discussed and tested. Could they make you jump SUPER high? Low tops or high tops? Classic laces or cutting edge velcro? After finally picking out the perfect pair and racing to put them on with maniacal laughter, the reality set in. New shoes. Ahhhh. The excitement of showing off my new kicks at school was only one night away.

That’s a little bit how we all felt when we unveiled the brand new sparkly Genome Browser gateway page  earlier this week. This was a project that had been “in the works” for quite a long time, starting from ideas and drawings, moving into design phases, and finally maturing into many iterations of testable versions as the development process gained its own momentum. This project soon had a life of its own – we all became shepherds as we guided it into what we finally knew was a final product.

The things we are most excited about? We’ve already received feedback that the new human-centric phylogenetically ordered tree menu is downright awesome (and we think so too).  For me, the graphics and colors pull me in, inviting me to visually scroll through our entire genome species collection. With a flick of the scroll handle on the tree menu, I can zip from “us humans” all the way down to sea hare or Ebola virus; within two seconds, I’ve just traveled through millions and millions of evolutionary years. Based on NCBI’s taxonomy database, the “tree menu” provides an interactive way to explore our genome species collection. Little known fact: Try hovering over one of the “branches” of the tree (the horizontal and vertical lines connecting all species) and see what you find!

Example of mouse hover on tree menu branch


Another exciting new feature that makes our eyes light up is the autocomplete search function and “popular species” button shortcuts:

Button shortcuts & autocomplete search


We know that over 95% of you will benefit from our “popular species” buttons as quick access shortcuts to the genomes that you use most. We also believe that just about everyone will benefit from the autocomplete search function. For example, you can enter “fish” to see genomes from our aquatic friends, or you can enter something as specific as “hg38” to load a particular assembly version. With a whopping 276 genomes and counting, autocomplete search is a celebrated new feature! The same autocomplete function works great for our public genome hubs; try typing “plant” to see related hubs.

Want to jump to your favorite gene in the genome browser? The “position/search term” functionality remains just as efficient – just enter a genomic position, gene symbol, or search term, lean back in your comfy chair, and press “GO.” You’re there.

To see more details, including a few menu option changes, visit the gateway announcement on our news page and watch the short gateway video tour.

We sincerely hope you enjoy the new gateway page as much as we do – and as always, we invite you to contact us with questions, concerns, and compliments. 😉



How to share your UCSC screenthoughts

by Robert Kuhn      August 12, 2015

The UCSC Genome Browser is a great tool for visualizing your data alongside a ton of data from all over the place.  Perhaps, at long last, you have loaded up a gene set, the supporting mRNAs and maybe the SNPs from OMIM or dbSNP, and the Conservation track to make a great point.

Now you want to save that thought, or share it with a colleague, or make a slide for a meeting, or publish it in a paper. Saving your screenthought can take two forms: static or dynamic.  You can snap and save a picture of the screen, or you can share a link to an active Genome Browser.  We’ll talk about both approaches here and discuss some of the advantages and pitfalls of each.

Share a static image.    You can always take a screen grab and throw it onto a slide with little effort.  The screen resolution is fine for a slide, because your computer and your slide will both be 72 or 96 dpi.  But, if you try that for a publication, your image will have to be really small (scale down 3x in each dimension to get 300 dpi for print) or it will be unacceptably fuzzy.

To get high resolution images for publication, use the Browser’s .pdf export function to allow the vector-graphics image to scale to full journal size and resolution. Look for the .pdf output in the “View” pulldown menu at the top of the Browser page.  Both the chromosome ideogram and the main Browser graphic can be saved in this fashion.

Share a dynamic session, but DO NOT copy a URL.  To save a dynamic screen session that would allow you or others to look around, add more data tracks, check out other genes, etc., you might be tempted to simply copy the URL from your Firefox or Chrome web browser.  That might even seem to work OK at first, but it is in fact not a stable link and can lead to weird Browser behavior.  Worse, you may not even be sharing what you think you are, and will never know it.

Let’s break down a URL as copied directly from my Firefox and see how it plays out.


This URL contains a parameter, hgsid, which is actually a pointer to a row in a UCSC database identifying your session and keeping the state of all your variables (we borrowed the name “cart”).  If you send this URL to someone, yet keep browsing around, your cart will continue to change as you work, and your friend will see the latest state your Genome Browser is in when she clicks the link. The original state of your cart when you shared the URL is long gone before she sees it.

Your shared URL might even appear to work OK because two of the variables in the URL, db (database) and position, will override values stored in your cart (cart variables are separated by an ampersand).  Your friend will see the right genome assembly (db variable) and location (position variable) and think she’s seeing what you want.  But, if you have turned any data tracks on or off in the interim, or removed a custom track, those changes will also be part of what she sees. The original state is lost.  A different colleague could click the link at some other time and see something different still.

As an experiment, here is that same URL in a form you can click or copy/paste into your web browser:

Does it look like this?


That’s what it looked like when I shared the URL. Your click will show the 5’ end of the FGFR1 gene region on human assembly hg19 (because the URL has explicitly included db and position variables), but who knows what tracks might be turned on or off in the interim? Whatever the last person to click it did to it will rule. Every person who reads this blog and clicks the link can change the track configuration for whomever comes next. Only the db and position are going to persist.

Quick-and-dirty URL hack.    If you really want a quick-and-dirty way to share a link, here are a couple of suggestions.  You could send the link as it is above, then strip a few characters out of the hgsid in the URL in your own browser and refresh.  Because the new long hgsid string will not exist in our database, you will be assigned a new hgsid and the state of the old one will stick – until your friend starts messing with it.  Or you could strip out the hgsid parameter entirely and add in other parameters that define the tracks you want to turn on, e.g.:


That will better define the tracks you want, but it is neither as stable nor as easy as saving a Session. You can use “hide,” too, to be sure certain tracks are turned off. Read more about configuring your links here.
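
As a rough sketch of the quick-and-dirty approach, the Python below strips the hgsid parameter from a copied URL and builds a link from explicit db, position, and track settings instead. The base URL, the position, and the track names (knownGene, snp141) are placeholders chosen for illustration, not values taken from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed address of the main Genome Browser CGI.
BASE_URL = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def strip_hgsid(url):
    """Drop the hgsid parameter so the link no longer points at your cart."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != "hgsid"]
    return urlunsplit(parts._replace(query=urlencode(kept, safe=":")))


def build_link(db, position, tracks):
    """Build a link from db, position, and track=visibility settings."""
    params = [("db", db), ("position", position)] + list(tracks.items())
    return BASE_URL + "?" + urlencode(params, safe=":")


copied = BASE_URL + "?db=hg19&position=chr8:38000000-38500000&hgsid=123_abc"
print(strip_hgsid(copied))
print(build_link("hg19", "chr8:38000000-38500000",
                 {"knownGene": "pack", "snp141": "hide"}))
```

Any track you do not name in the link keeps whatever visibility the viewer already has, which is why a Saved Session remains the more reliable option.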

Share a stable dynamic Session.    The best way to save a train of thought in a stable fashion is via the Saved Session tools under the “My Data” pulldown menu. A Saved Session acts as a stable snapshot of all the details of your Browser view.  Saving a thought using this feature requires a login, but it allows you to save the state of a Browser session (semi)-permanently. Anyone viewing your session will be able to further browse around the genome without affecting the session you saved.  After you have saved a session, you will see a “Browser” link that can be copied and shared.

For example, to load the view above as a stable session, try this link (no login is required to view someone else’s Saved Session):

Although anyone with this URL can view this session, no one can change it unless logged in as user “SessionGallery.”

In the past we endeavored to save the Session for at least 3-4 months after the last time it was viewed, and custom tracks in sessions were kept for at least 48 hours after the last time they were viewed. Our current policy is to keep session data unless you delete it, and to keep custom tracks in sessions.  We still encourage people to save their Session cart to a local file using the “Save Settings” feature (and to keep backups of all their custom tracks on a local machine).  That way, you can load your Session settings any time and onto any copy of the Browser (such as to the European mirror or a local Genome Browser-in-a-Box) and avoid any possible loss of data due to unforeseen circumstances.  We do the best we can to maintain our servers so that you do not lose your sessions, but computers are only human and they break.

Really stable sessions.    If you are looking to create a permanent link for a publication, you should consider hosting your downloaded Session and any of your own custom data on a server you control (such as in a Track Hub). It will still be loaded onto the UCSC Genome Browser, but you are not at the mercy of California earthquakes, wildfires or crashed servers (except for your own).  You can read more about building links to remotely hosted user information here and on our Session’s Gallery page here.

On both pages you can learn about the following parameters for forming links to launch sessions from your hub:


We hope we have given you some food for thought about how to make the Genome Browser more useful in your work.  Using a reliable method for saving and sharing sessions is a great way to avoid the frustration of lost data and misleading links.  Stay tuned for more useful Browser tips in future blogs.

New default gene set on GRCh38: GENCODE Basic genes


Genome Browser screen shot of the GRCh38 (hg38) human assembly showing the GENCODE Basic track opened in the PTEN region on chromosome 10.

As of Monday, June 29, 2015, the UCSC Genome Browser will use the GENCODE v22 comprehensive gene set as its default gene set on the human genome assembly GRCh38 (hg38), replacing the previous default set of genes created here at UCSC using code written by Jim Kent. This track, which is labeled as “GENCODE Basic” in the Genes and Gene Predictions track group, replaces the UCSC Genes track as the default gene set.  We’re making this change in recognition of the value of reducing the number of competing gene sets used by the bioinformatics community.  With this change we will be using the same set of genes as Ensembl, reducing the potential for confusion, especially in clinical settings.

We’ve kept the same familiar UCSC Genes schema for the new gene set, using nearly all the same table names and fields that appeared in earlier versions of UCSC Genes. Hopefully this will make the transition to the new GENCODE models easier. Every transcript in the new set has both a UCSC ID and a GENCODE transcript ID. There are a couple of new tables: knownCds, which has the coding frame numbers for each gene, and knownToMrna, which captures the association to GenBank mRNAs. A couple tables are no longer present: knownGeneTxMrna and knownGeneTxPep.

By default, we display only the transcripts tagged as “basic” by the GENCODE Consortium. However, all the transcripts in the GENCODE comprehensive set are present in the tables. You can view them in the browser by selecting “show comprehensive set” in the “Show” section of the track’s description page. On that same page, you can also configure the browser to label the genes with the GENCODE transcript IDs by selecting the “GENCODE Transcript ID” label option.

The new gene set has 195,178 total transcripts, compared with 104,178 in the previous UCSC Genes version. The total number of canonical genes, now defined using the GENCODE gene loci (ENSG* identifiers), has increased from 48,424 to 49,534.

Comparing the previous gene set with the new version:

  • 9,459 transcripts are identical.
  • 22,088 transcripts were not carried forward to the new version.
  • 43,681 have consistent splicing, but changes in the UTR.
  • 28,950 transcripts overlap with those in the previous set, but have
    at least one different splice.
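
As a quick consistency check, the four categories above account for every transcript in the previous set. A few lines of Python confirm the arithmetic using only the counts quoted in this post; the dictionary labels are shorthand for the bullet points.

```python
previous_set = {
    "identical": 9459,
    "not carried forward": 22088,
    "consistent splicing, changed UTR": 43681,
    "overlapping, at least one different splice": 28950,
}

total = sum(previous_set.values())
print(total)  # 104178, the size of the previous UCSC Genes set
```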

We plan to continue using the previous UCSC computational pipeline to generate the default gene set on the mouse assembly, GRCm38 (mm10), for the foreseeable future. We will also periodically update the old UCSC-computed gene set on the human GRCh38 assembly as an ancillary track (“Old UCSC Genes”) without the rich set of link-outs we maintain for the default gene set.


Introducing the Genome Browser YouTube Channel

Here at the Genome Browser we’re constantly looking for ways to improve the Browser and make it more accessible. A big part of that is making it as easy as possible for people to learn how to use our tools to best serve their research. In the past this has included setup and maintenance of documentation, including our help docs as well as a dedicated wiki site, where browser staffers and external users alike have shared content. We also continue to offer real-time support on our mailing list.

Thanks to funding support from the NHGRI we were recently able to amp up our training efforts in two ways. We now have a program whereby interested groups can economically host a Genome Browser workshop at their institution. For more information, fill out our intake survey:

The other thing we have been able to do is launch a YouTube channel where you will find video tutorials explaining how to use various parts of the Browser. While static documents and email support are great, we realize some people learn better by seeing how something is done. We also hope this will be a good resource for those unable to physically attend one of our trainings. The video topics are meant to address some of the common workflows and questions we get from users. Each video is an illustration of how to answer a particular query, for example: “How do I identify exon numbers with the UCSC Genome Browser?”

The answer will follow a sequence of steps traversing different parts of the Browser. For those who want to jump straight to one of the steps/skills listed in the video, you will find a set of internal links to the timepoints within the video in the YouTube video description. There, you will also find a transcript of the video if you want to follow along or take notes:


You can find links to these resources on our training page. If you have a question that you’d like to see demoed in a video, we are always open to suggestions! You can reach the training department by email or tweet us an idea @GenomeBrowser.


New features & data – Winter 2015

We realize that it is sometimes difficult to keep up with all of the new features and data sets in the Genome Browser. After all, we release new annotation tracks almost daily, and we update our software every three weeks. This post highlights a smattering of the most recent updates.

Browser & Track Hub Features

– Personalize your view of GENCODE Genes

In addition to choosing which GENCODE Gene tracks to view (e.g. basic gene set, PolyA, pseudogenes), you can now filter and highlight transcripts within the tracks. Try it here (click on the “Genes” link).

– Display your bigWig data on the other strand in your track hub

Use the new trackDb setting, negateValues on, to allow your bigWig data to be displayed on the Crick strand. This setting negates the values in the wiggle file, meaning that positive values become negative and vice versa. This is useful for wiggles representing transcription or other activities on the Crick strand. Note that wiggles with negative values are drawn in the color specified in altColor, not color as positive values are.


– Disconnect your hub automatically

If you need to automatically disconnect your hub, you can use the hubClear variable in the URL. This is especially helpful for users who are  creating hubs dynamically. For example, to disconnect the urlOfHubToClear hub, use a URL constructed like so:

– Enable BLAT for your assembly hub

If you have created your own assembly hub you can now set up a BLAT server to enable quick mRNA/DNA and cross-species protein alignments. All you need is a server from which you can run gfServer, and the .2bit file containing the sequence of your assembly. Read the detailed instructions here.

Note that the BLAT and gfServer programs and source code are freely available from the University of California Santa Cruz for academic and non-commercial use. A license is required for commercial use.

Annotation Tracks & Assemblies

– dbSNP v141 for hg19/GRCh37 & hg38/GRCh38

We released four annotation tracks from human Build 141 of NCBI’s database of short genetic variations, dbSNP. This release marks the first set of data available for the newest human assembly, hg38/GRCh38. Read more.

Since then, NCBI has released the next database update: dbSNP Build 142. We have derived another four tracks from this release, which are currently undergoing our rigorous quality assurance process and will be released very soon.

– Proteomics data available for hg19/GRCh37: PeptideAtlas track & CPTAC data hub

Data from the National Cancer Institute’s (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) is now available in the UCSC Genome Browser as a public track hub. This track hub contains peptides that were identified by CPTAC in their deep mass spectrometry-based characterization of the proteome content of breast, colorectal and ovarian cancer biospecimens that were initially sequenced by The Cancer Genome Atlas (TCGA).

In addition, we have also released a PeptideAtlas track that displays peptide identifications from the PeptideAtlas August 2014 (Build 433) Human build. This build, based on 971 samples containing more than 420 million spectra, identified over a million distinct peptides covering more than 15,000 canonical proteins. Read more.

– GenBank track updates

We have reduced the frequency of GenBank data updates for assemblies other than human and mouse. The GenBank-based tracks for selected recent assemblies are now updated about once a week rather than daily. The remainder of the 150+ assemblies in the Genome Browser are updated whenever a newer assembly is released and after that, about once a month. The GenBank update schedule for the human and mouse assemblies remains unchanged. Read more.

– UniProt track for hg19/GRCh37

We have added a UniProt track to hg19/GRCh37, and merged the old PFAM (Protein Families) track into it. Check it out here.

– New assembly browsers:

  • Cow, Bos taurus, bosTau8 – sequenced/assembled by University of Maryland (UMD 3.1.1)
  • Fruitfly, D. melanogaster, dm6 – provided by the FlyBase Consortium/Berkeley Drosophila Genome Project/Celera Genomics
  • Ebola Virus, Sierra Leone 2014 outbreak, eboVir3 assembly browser and portal

New Product

– Genome Browser in a Box (GBiB)

In case you missed the previous blog post, we have created an easily installable version of the Genome Browser. You can set it up in just a few minutes on your laptop for private browsing of your own data alongside the native annotation tracks. It’s fine-tuned to work with hg19/GRCh37, but it works with all other assemblies as well. If you have genomic sequence for other organisms, you can add your own assembly hub. Read more.

If you would like to stay informed about our new features and data sets, you are welcome to subscribe to our low-volume announcement mailing list.

Ebola update

The opinions expressed here are those of the author, Jim Kent, and do not necessarily reflect those of the University of California Santa Cruz or any of its units. 

It’s been nearly a month since I wrote my first Ebola blog entry. Since then the world at large, and I in particular, have learned more about Ebola. We have seen clearly that the virus can be transmitted within hospitals in developed countries. We’ve gotten more data showing that good hospital care including hydration, survivor plasma, and electrolyte balancing can save 75% of the patients, perhaps more if applied early. We’ve seen that, from a political point of view, it’s better for the Centers for Disease Control (CDC) to overreact than under-react. We see the epidemic continue to grow, but we also see some signs of its growth rate slowing, at least in Liberia. It seems a good time for a follow-up post.

At UCSC we’ll be adding new data and editing the Ebola Portal in the coming weeks. Wikipedia has done such a great job synthesizing Ebola scientific knowledge that we’re dropping the Treatments and Vaccines section of the Ebola Portal in favor of a Wikipedia link. We’re continuing to encourage people to release Ebola viral and antibody sequences. We’ve added a new viral genome sequence from the smaller epidemic going on in the Democratic Republic of Congo (Maganga et al., 2014), and expect the first sequences from American patients soon.

In broader scientific terms, I think that most of the important medical, scientific, and epidemiological issues are now known. The challenge of how to formulate this knowledge into the most effective response is still a huge task. How can we minimize the loss of life with the resources at our disposal?

Many aspects of the epidemiology of Ebola are clear. In Africa as a whole, the time it takes to double the number of people who have been infected is about three weeks. In rural areas and affluent urban areas the doubling time is approximately four weeks, while in the shantytowns it is approximately two weeks. Epidemics in general follow an S-shaped curve, as shown in Figure 1 below. Initially there is a period of exponential growth. Approximately at the point where half of the people have become infected, the growth slows simply because there are fewer people left to infect. Even in the worst hit place in this epidemic, the shantytown of New Cru Town in Monrovia, the epidemic is still in the exponential growth phase on the left side of this curve. This is both good and bad: good because most of the people have not been subject to the certain pain and likely death of Ebola infection, but bad in that the epidemic will rapidly worsen.


Figure 1. A graph of the constrained growth equation that epidemics tend to follow in enclosed, freely mixing areas.
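For readers who want to reproduce a curve of this shape, here is a minimal sketch. It assumes that the "constrained growth equation" is the standard logistic curve, which the post does not state explicitly, and it uses a made-up population size and starting case count. The two-week doubling time is the shantytown figure quoted above.

```python
import math


def logistic_infected(t_weeks, population, initial_infected, doubling_weeks):
    """Cumulative infections at time t under logistic (constrained) growth."""
    # Early on the curve is nearly exponential, so the growth rate follows
    # from the doubling time: r = ln(2) / doubling time.
    r = math.log(2) / doubling_weeks
    ratio = (population - initial_infected) / initial_infected
    return population / (1 + ratio * math.exp(-r * t_weeks))


# Illustrative numbers only: 100,000 residents and 100 initial infections.
for week in range(0, 41, 4):
    print(week, round(logistic_infected(week, 100_000, 100, 2)))
```

In this model growth is fastest at the moment half of the population has been infected and slows afterwards, which matches the behavior described in the text.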

Within a single patient, the medical course of the disease is also relatively clear. After initial infection there is an incubation period of typically 9 days, which can be as short as three days and at least as long as three weeks before symptoms develop. The first symptoms are similar to those of many diseases – aches, fatigue, sometimes a headache, and sometimes stomach pains. After about three days of general malaise, usually a fever develops. The disease progresses rapidly in the next four days. During this phase there is intense diarrhea, usually vomiting, and sometimes bleeding. An adult patient will lose about 10 liters of fluid per day from these causes, if kept hydrated, and will often die from the effects of dehydration otherwise. After four days of intense symptoms patients will start improving if they are destined to recover, or deteriorate further if not. The recovery rates in Africa are only about 30%.

Taking care of an Ebola patient is a lot of work and is vastly complicated by the precautions caretakers must take to avoid becoming infected themselves. The patients are in considerable pain and subject to retching, spasms, and convulsions. For many patients, a madness sets in during the peak of the disease as well. Getting the patients to drink their 10 liters of electrolytes or stay attached to their IV lines, as well as cleaning up after them, is physically demanding and emotionally draining work. This is exacerbated by the need to wear a protective suit that gets so hot people can safely work in it for only 45 minutes without themselves getting dehydrated. In the U.S. hospitals, approximately 100 staff are required for a single Ebola patient. Doctors without Borders manages to get by with far fewer staff than this, but it is unrealistic to think that an Ebola patient can be managed with fewer than two staff per bed.

This is where we come to the fundamental conflict between the epidemiology and the medicine. Medically we want to treat every Ebola patient. The combination of hydration and plasma and/or antiviral treatment seems to raise the recovery rate from 30% to 75%, and is likely to improve further as our experience and tools for treatment grow. However, according to CDC estimates (corrected for under-reporting), as of 9/26/2014 there were 1500 people needing beds in Ebola treatment facilities in Liberia and Sierra Leone alone. We did not have the ~3000 support staff we needed then, and do not have the ~10,000 staff we would need for the ~5000 people estimated to need beds as I write this on Nov. 3.

In medicine, generally prevention is far easier than treatment. For Ebola the most important prevention is keeping the patient away from other people during the most infectious phase when the patient is sickest, typically starting the day after the first sign of a fever and continuing until the patient dies or recovers. If the patient dies, the body is also exceedingly infectious. By and large the Africans have accepted the need to treat the body as hazardous and to bypass traditional funeral practices as a result. The big controversy in Africa right now concerns what to do with the patient during the infectious stage.

Ideally, patients would be brought into a treatment facility a day or two before they become highly infectious. This would have the dual benefit of isolating the population at large from infection and more than doubling the patient’s chance of survival. Unfortunately, because we don’t have enough people to treat patients this way, we have to pursue other courses of action as well that are not ideal for the people currently infected, but at least reduce the number of people who will be infected in the future. Once we have vaccines in quantity, likely by March 2015, the situation will get much better. In the meantime though, to save lives, we have to consider a measure nobody really likes – quarantine.

Quarantine has become a bad word, in large part because most of the recent quarantines have been implemented so poorly. Quarantine is never going to be a joyful event, but if done carefully and with compassion, it need not be particularly unpleasant either. Certainly being quarantined is much more pleasant than catching Ebola or having friends and family die, and for the next several months at least, that is the alternative.

In general, people need food, water, and protection from extremes of temperature to live, and a degree of social contact with friends and family and a bit of entertainment to be happy. There is no reason that these can’t be provided inside of quarantine, and the cost of doing so is ever so much less than the cost of providing care for an Ebola patient.

The worst hit parts of Africa, and the ones in most need of quarantine, are the shantytowns. In a shantytown in the tropics, most structures are little more than a roof for shade and protection from the rain. Setting up structures such as these, capable of holding a family or social unit of about six with simple cots to sleep on, would not be hard and could be the basis of a quarantine unit. Food could be distributed in a central mess hall, and temperatures taken before one was allowed into the mess hall to eat. People showing fevers or other signs of sickness would be taken from the mess hall to a community care center where family could see patients. Ideally quarantine units of approximately 250 people could be set up in many places. The 250-person limit would reduce the spread of infection within a unit.

Once out of quarantine, ideally the dwellers of a shantytown would be moved into a refugee camp that would slowly grow to the size of the shantytown it is replacing. This camp would need a mess hall and a latrine system of some sort.

People would be invited, not forced, from the shantytown into the quarantine facility. If food, water, shelter, and minimal medical care are available, it is likely that the demand for going into such a quarantine facility would exceed the space available. A lottery would be a fair way to decide who gets in first.

After a certain point in time, everyone in the shantytown will either have passed through quarantine and into the refugee camp, have caught Ebola and either died or become non-infectious, proven naturally immune, or gotten very lucky. At this point the shantytown could be disinfected and the people from the refugee camp could move back home. It seems likely that we may have a vaccine deployed as well by then.

Outside of the shantytowns, needed quarantines could be done in people’s own homes. In villages, a community care center coupled with contact tracing is all that is necessary. The traditional methods of contact tracing do work well outside of dense urban settings lacking basic infrastructure.

What would a community care center look like? The goal would be to have a place where the patients could, to the best of their ability, take care of themselves with limited help from survivors of Ebola and the bravest volunteers from their friends and family. The crucial parts of a facility are:

  • Adequate stocks of oral rehydration fluids containing the correct balance of sugars, sodium, and potassium salts.
  • “Cholera cots” (see Figure 2) that can efficiently and safely collect the patient liquid hazardous waste.
  • A place to disinfect and dispose of the waste.
  • Basic protection equipment and disinfection facilities for the workers.
  • Water and simple food such as bananas and rice.
  • Lamivudine or other mass-produced antivirals that don’t require refrigeration, if available.
  • A fence so patients can’t exit until they’ve recovered and to keep out unprotected people.

Figure 2. A cholera cot – a must for treating diarrheal diseases in the tropics. (Image from Hesperian health guides.)

How well these community centers will work is perhaps the most uncertain part of this plan but, particularly with the cooperation of survivors, they may represent our best hope until vaccines are widely available. Socially they would need to be set up so that people could visit and talk through the fence to patients, but be located out of sight of the main habitations so as not to provoke despair. Community care centers have worked successfully in some Liberian towns, as described in detail in the Nov. 4, 2014 issue of Morbidity and Mortality Weekly Report (MMWR) from the CDC (Logan et al., 2014).

The CDC has done a lot of good work in containing this epidemic. Where they’ve faltered has been in portraying more certainty and perhaps more optimism than is warranted by what we know. Perhaps the CDC and leadership are worried that people won’t listen to them if they don’t convey absolute certainty; that if they don’t minimize statements of risk, people will panic. However, panic is normally a temporary condition. In the end, level heads that can reasonably appraise the situation will prevail. How can we appraise the situation, though, if we are not told the truth in all of its uncertainty and risk?

It is true that Ebola is mostly spread by contact with bodily fluids. It is true in previous, smaller epidemics that airborne spread between humans, if any, has played a minor role. However, it is wishful thinking, not science, to absolutely rule this out. With a disease as dangerous as Ebola, certainly it is better to err on the side of caution. Wearing a face mask on public transportation in an Ebola-infected area and washing one’s hands when one arrives back home or at work should be our advice, not — as Obama has said in videos aimed at West Africans — that you need not worry about catching Ebola on the bus if you live in an area where it is rampant. Wearing full body protection including a breathing apparatus should be the norm among Ebola medical personnel, and somewhat belatedly it has become so.

It is true that people with Ebola will mostly show a fever before the illness gets really serious, and vomiting and diarrhea start. However, the temperature increase one develops in response to an illness is highly variable across the population. Children in general spike higher fevers than adults. A noticeable fraction of adults, around 10%, don’t get fevers higher than 100 degrees even in the absence of medicine. A significant fraction of people are on anti-inflammatory medications for arthritis and other common conditions and don’t get fevers for this reason. In Africa, where presumably people tend to be less medicated than in the U.S., reports show that 11% to 13% of people sick enough with Ebola to take themselves to the hospital do not have a fever (Schieffelin et al., 2014; WHO Ebola Response Team, 2014).

It is true that Ebola is mostly non-contagious before people reach the stage of illness where they show a fever (if one is going to develop a fever). Using an RT-PCR test, we can’t detect virus in the blood before the initial pre-fever symptoms of malaise, aches, fatigue, etc. are felt. By the time fever shows, typically we do get solid RT-PCR results, but the viral levels measure only 10% of what they will the next day when the viral level typically peaks and the blood, at least, is maximally infectious (Towner et al., 2004). The viral loads in blood typically remain at the peak level for four days, and then either the patient dies, or the viral loads decrease and the patient recovers. If we assume (and it is an assumption) that a person’s level of contagiousness follows the blood viral load, then certainly most of the disease transmission occurs in the last four days, rather than the days leading up to and including the initial fever stage. Because there is a lag time before people notice that they have a fever and go to the hospital, how much of the transmission is likely to occur in the 8 hours after fever starts? Since the viral load will be rising from 10% to 100% over the course of the day, following an exponential progression, I’ll estimate the viral load on average during the first 8 hours after fever as 13% of peak, and the next 16 hours after fever as 33% of peak. With this I can estimate the share of transmission that falls in the 8-hour window as:

Initial-transmission/(initial-transmission + later-transmission)
(8 hours * 13%) / (8 hours * 13% + 16 hours * 33% + 4*24 hours * 100%)

which comes to almost exactly 1%. So, while it is scientifically reasonable to estimate that 99% of the transmission will be avoided if people go into isolation relatively promptly after they’ve reached the stage of the disease usually associated with a fever, it is also reasonable to estimate that 1% of the transmission occurs before this stage. The clinical and epidemiological data suggest that it could not be much higher than this, but are not strong enough to say that it could be lower. Given the deadliness of the disease, it is prudent to consider people infectious at a low level even before the illness becomes severe.
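The arithmetic behind that 1% figure can be checked with a few lines of Python. This is only a restatement of the formula above using the same estimated averages (13%, 33%, and 100% of peak). The variable and function names are mine, and treating viral-load-hours as a stand-in for transmission is the assumption already stated in the text.

```python
# Estimates from the text: (hours, average viral load as a fraction of peak).
first_window = (8, 0.13)      # first 8 hours after fever starts
rest_of_day = (16, 0.33)      # remaining 16 hours of that day
peak_phase = (4 * 24, 1.00)   # four days at peak viral load


def exposure(hours, load):
    """Viral-load-hours, used here as a stand-in for transmission."""
    return hours * load


early = exposure(*first_window)
later = exposure(*rest_of_day) + exposure(*peak_phase)
fraction_early = early / (early + later)
print(f"{fraction_early:.2%}")
```

Because the four days at peak dominate the denominator, the result is not very sensitive to the 13% and 33% estimates, and modest changes to them leave the early share at a few percent at most.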

If the world at large tended to under-react early in the course of this epidemic, for the most part this has changed. The CDC and others have tightened their recommendations and response in the USA. African nations and health organizations have been effective in keeping the spread of Ebola outside of Guinea, Liberia, and Sierra Leone to small, quickly extinguished outbreaks. The combination of popular education about how to avoid catching Ebola, contact tracing, and quarantine seems to be putting the brakes on the epidemic in the rural areas of West Africa. I do hope a system similar to the quarantine-into-refuge I describe here can be applied to the slums and shantytowns, and that these, together with community care centers, will help save many of those in even the hardest hit regions.


Logan G et al. Establishment of a Community Care Center for Isolation and Management of Ebola Patients — Bomi County, Liberia, October 2014. MMWR 2014;63(Early Release):1-3.

Maganga GD et al. Ebola Virus Disease in the Democratic Republic of Congo. N Engl J Med. 2014 Oct 15. [Epub ahead of print]

Schieffelin JS et al. Clinical Illness and Outcomes in Patients with Ebola in Sierra Leone. N Engl J Med. 2014 Oct 29. [Epub ahead of print]

Towner JS et al. Rapid diagnosis of Ebola hemorrhagic fever by reverse transcription-PCR in an outbreak setting and assessment of patient viral load as a predictor of outcome. J Virol. 2004 Apr;78(8):4330-41.

WHO Ebola Response Team. Ebola virus disease in West Africa–the first 9 months of the epidemic and forward projections. N Engl J Med. 2014 Oct 16;371(16):1481-95.

Genome Browser in a Box (GBiB) Origins

The opinions expressed here are those of the author, Jonathan Casper, and do not necessarily reflect those of the University of California Santa Cruz or any of its units.

I’m happy to say that we’ve finally released the Genome Browser in a Box (GBiB). GBiB is essentially a virtual machine image of a mirror of the UCSC Genome Browser. Download it, set it up, and voilà – instant mirror. This lets you do cool things like have a mirror of the browser on your own personal computer – more information on how to set it up is available on the help page. Here, however, I’m going to talk about the background of GBiB: how it got started, what kinds of decisions we faced, and what became my favorite feature.

At UCSC, we have known for a long time that it can be hard to set up a mirror server. We even have a separate mailing list devoted to the topic. Before the GBiB project began, one of our developers was working on a script to completely automate the process for a computer running stock Ubuntu Linux. Eventually the developer had an epiphany: why not just create a barebones mirror once on a virtual machine, and then make that available to our users? It wouldn’t solve the problem of allowing people to easily add the assemblies and tracks they wanted, but at least they wouldn’t have to wrestle first with setting up Apache and MySQL. The developer’s suggestion came at an opportune moment – we had just received several mailing list questions about using sensitive data with the UCSC Genome Browser website. We didn’t have a good answer.

The problem is that the UCSC Genome Browser has always been focused on being an academic research tool, not a clinical one. We aren’t designed to provide the kind of data security that HIPAA and Institutional Review Boards call for. The only answer we could give to people who wanted data security was “create your own secure mirror, or use another genome browser”. Knowing how difficult it could be to set up a mirror, that wasn’t much of a choice.

Into that mix, we were suddenly presented with a new option: give everyone a pre-installed mirror with the hardest parts already done. Just place it behind a firewall, load up your sensitive data, and enjoy! I thought it was a great idea, as did many other browser staff members.

From there, the idea quickly snowballed. UCSC already provides a public MySQL server and download site with most of the data from our browser. We realized that we could set up the virtual machine to take advantage of those resources and load our data over the internet. This was a great advantage over normal mirror servers. UCSC provides many terabytes of data. Most mirrors have to pick and choose which assemblies and tracks they make available; there’s far too much data to download and keep synchronized. By using our public internet resources, GBiB could provide all of it.

In practice, we discovered it wasn’t quite that easy. Latency issues meant that for anyone not on the west coast of the United States, GBiB ran very slowly. Just loading the default view of the human GRCh37/hg19 genome assembly could take over 10 seconds. We had to make a compromise: GBiB wouldn’t have to download track data to use it, but downloading would still be an available option for users in remote locations.

There is now a new CGI just for this purpose: “Mirror Tracks”. It combs through the list of database tables and files associated with browser tracks and allows you to download the data for any of them. If you’re interested in looking at, say, mRNA alignments in the Painted turtle (chrPic1) genome and GBiB is just too slow, a few clicks in Mirror Tracks will put them all on your own hard drive. If you really want, you can even then put GBiB into full-offline mode. You’ll lose access to any track data that you haven’t downloaded, but you’ll always have those Painted turtle mRNAs.

My favorite feature of GBiB, though, has to be what it does for track hubs. Track hubs are a feature we released in 2011 to allow users to view their own data files in the UCSC Genome Browser alongside our annotation. Unlike custom tracks, where all the data must be sent to our server at once, track hubs only send the data for the region you are looking at. That is much more manageable for something like a VCF file, which can be on the order of 10-100 GB.

There are two problems with track hubs. First, you must have web hosting space for your data to construct a track hub. Not everyone does. There are public hosting solutions like Dropbox, but they don’t always work. Second, once again there is the problem of sensitive data. Even if you are willing to send your sensitive data directly to our servers at UCSC, you may not be willing (or even allowed) to make it publicly available on a web server. GBiB solves both problems beautifully.

GBiB already has a built-in web server that it uses to communicate with your computer. With a few small adjustments, you can take advantage of that and let GBiB also host your data files. This means that you can build and use a track hub with GBiB, and none of the data will ever leave your computer or be accessible to anyone else, unless you grant them access to your GBiB.

Genome Browser in a Box is available from our web store. It is free for non-commercial use by non-profit organizations, academic institutions, and for personal use. Please see the store website for full terms and conditions.

2014 Ebola epidemic

The opinions expressed here are those of the author, Jim Kent, and do not necessarily reflect those of the University of California Santa Cruz or any of its units.

I first learned about Ebola in a microbiology course in 1997. This was a great course and the teacher was very funny. He joked that our mothers taught us to wash our hands after we went to the bathroom, but in this class it was a good idea to wash hands before going to the bathroom. He once held up a test tube of Clostridium botulinum and joked it was enough to wipe out a day care center. He didn’t make any jokes about Ebola.

Ebola has been around in the African rain forests for a long time. There is some animal, probably a fruit bat, but nobody is really sure, that is the primary host of this virus. The disease a virus causes in its primary host is usually pretty mild, since when the host dies, the virus dies too. The virus can infect other animals as well. Dogs in the regions often have antibodies indicating exposure, but apparently it does not make them very sick at all, and they clear the actual virus from their systems quickly. Pigs do get sick, but generally recover. For chimps, gorillas, and humans it is usually fatal, particularly the Zaire strain of Ebola that is currently ravaging West Africa.

In humans Ebola is a disease that moves fast. It tends to either kill you quickly, or you survive it and clear it from your system relatively quickly. For a long time Ebola outbreaks affected only humans in remote villages in the forest, the ones most likely to come in contact with infected animals in the wild. The disease would burn through the village, and by the time survivors, if there were any, managed to come into contact with other people, they were non-infectious.

As Africa grew more populated and villages turned into towns, Ebola became a greater problem. People infected with Ebola and still in a contagious state would manage to stagger into a medical facility seeking care. The Nova episode Ebola, the Plague Fighters documents one such case. The patient arrived in 1995 in a hospital in Kikwit, Zaire (now DRC) with extreme pain in the abdomen. The doctors suspected appendicitis and operated. In spite of the usual surgical care against infection, nearly all of the medical personnel at that operation got infected and spread it throughout the hospital. It took heroic efforts to contain the resulting outbreak, which resulted in 315 infections and 244 fatalities.

There have been 7 Ebola outbreaks infecting 100 or more people since Kikwit. With moats of bleach-water, aggressive measures to trace contacts and quarantine people infected, and with the growing experience and energetic activities of groups such as the Doctors without Borders, each of these outbreaks has been contained. These outbreaks have, fortunately, provoked enough concern in the research community that vaccine and drug development is far along.

The 2014 outbreak in West Africa started in Guinea, and initially it looked like it would be contained as well. From what we can reconstruct, this one started with a two-year-old in December of 2013. The two-year-old survived, but the virus infected his family and many other people. The Guinea local health officials noticed the problem before it was too large and called in Médecins sans Frontières, as the Doctors without Borders are known in French (and indeed they prefer to use the French initials internationally, MSF). The MSF responded promptly and, working with locals, were able to contain the outbreak to just over 100 infected. Baize et al., 2014, in the New England Journal of Medicine, describe the course of this outbreak well and include sequence from 3 viral genomes from this outbreak.

For more than 42 days, long enough for two of the usual quarantine periods for Ebola to pass, there were no more cases. Then, in approximately the same places, the disease resurged. The reason for this resurgence is unknown. Sequencing of the resurgent outbreak by Gire et al., 2014, makes it clear that the resurgence is a continuation of the previous outbreak. It is possible that in this recently war-torn region, near the borders of Guinea, Sierra Leone, and Liberia, some of the infected fled and hid where MSF and the Guinea health workers never even knew about them. It is possible that local wildlife, perhaps a ground-based secondary host, acted as a reservoir for the virus.

Regardless of the cause of the resurgence, it has happened, and it has grown large, to the point that this is not an outbreak, but an epidemic. Ebola has for the first time hit densely populated regions. The epidemic has grown large enough that for all of their dedication and talent, MSF and similar organizations simply don’t have enough doctors and other health workers to contain it. The doubling time inside the worst hit city, Monrovia, Liberia, is just 2 weeks. The CDC’s best estimate is that 21,000 people were infected as of September 30. There is a very real danger that this epidemic could spread throughout Africa. There is a possibility that can’t be discounted, that in spite of the better sewers and other sanitation systems, it could spread through the developed world as well.

There are two things that are necessary to avoid a global pandemic. First, aggressive quarantine and isolation measures must be taken to slow the spread. Second, vaccines and treatments developed in response to previous outbreaks must be quickly scaled up so that hundreds of thousands, and ideally millions of doses are available. If either of these two things fails, we will face a worldwide problem of a scope that has not been seen for generations.

Amidst this gloom and doom there is some hope. Vaccine developers have already pressed two vaccines forward as far as they could go without actually having a human epidemic to test against. They work well in non-human primate trials. Comparative genomics analysis of the virus over the course of many outbreaks and across many strains makes it clear that there are large, antibody-accessible parts of the virus that are highly constrained evolutionarily, and that the virus mutates slowly compared to HIV. This is easy to see in the UCSC Genome Browser. It is exceedingly likely from the slow rate of change that both vaccines will work, and that once they are manufactured at large scale it will become much easier to contain the epidemic.

Similar logic applies to the treatment options that are under development. The ZMapp antibody cocktail tested in non-human primates and found effective against earlier versions of this virus likely will work in humans against the current strain. Likely, but not certainly. The Tekmira liposome-encapsulated siRNA is also a hopeful option, and perhaps can be scaled up faster than ZMapp, and perhaps also will work in humans. There is also hope that an existing drug, Lamivudine, a nucleoside analog currently approved for use against HIV and Hepatitis B, will be effective. Dr. Gorbee Logan in Liberia has had success using it on Ebola patients, and NIAID is following up on this with investigation in the lab. I’m hopeful the Ebola Genome Browser that UCSC just released is helpful for others developing new treatments.

Still, without effective quarantine measures this epidemic will grow very large indeed before vaccines and treatments can be deployed. The CDC estimate is 1.4 million cases by January 20, 2015, in Liberia and Sierra Leone alone, unless the rate of infection is slowed. Obama has ordered the army to deploy, and this is likely to help. From what I understand the plan is for the army to build field hospitals, and to help distribute food, water, and other necessary things, including vaccines and medicines once they are available.
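To put the two CDC figures quoted in this post side by side (21,000 infected as of September 30, and 1.4 million projected by January 20, 2015), here is a rough back-of-the-envelope sketch. It assumes simple, unchecked exponential growth between the two dates, which is my simplification and not the CDC's actual model, and the function names are just illustrative.

```python
import math
from datetime import date


def implied_doubling_time_days(cases_start, cases_end, start, end):
    """Doubling time (in days) implied by two case counts,
    assuming pure exponential growth between the two dates."""
    days = (end - start).days
    doublings = math.log2(cases_end / cases_start)
    return days / doublings


def project_cases(cases_start, doubling_time_days, days):
    """Cases after `days` days, given a constant doubling time."""
    return cases_start * 2 ** (days / doubling_time_days)


# Figures quoted in this post (CDC estimates).
start, end = date(2014, 9, 30), date(2015, 1, 20)
doubling = implied_doubling_time_days(21_000, 1_400_000, start, end)
print(f"Implied doubling time: {doubling:.1f} days")
```

Under those assumptions the implied doubling time comes out at roughly 18.5 days, somewhat slower than the two-week figure quoted above for Monrovia alone.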

Nonetheless I remain deeply concerned. It doesn’t really seem like people realize just how contagious this disease is. There’s thinking that somehow what is happening in Africa won’t happen elsewhere, that the African-specific customs and lifestyles favor the spread. While there is a grain of truth in this, it’s only a grain. We are fortunate in the developed world to have much better sewage systems than in Africa. We tend to leave it to professionals to prepare a body for burial rather than washing it ourselves. However, you only need look at how fast a cold or the flu passes around in the USA to see that we are not immune to a quickly progressing epidemic.

There have been many statements by the CDC and others, based on rather thin scientific evidence, that Ebola spreads only by direct contact with bodily fluids.   From this people seem to get the notion that if you avoid touching pools of blood, diarrhea, or vomit, you are ok.   Please let me take the opportunity to correct this notion.

The scientific evidence, such as it is, shows that if one cage of monkeys is infected with Ebola by injecting the virus into muscle tissue, it won’t infect a nearby cage in the same room in a set-up made to minimize large droplet transmission. In contrast, with the same set-up, oro-nasally infected pigs were able to infect a nearby cage of monkeys in the same room (Weingartl et al., 2012). Earlier studies with a more casual layout (Jaax et al., 1995) did show indirect transmission between non-human primates. For me, at least, the question of whether Ebola can infect without direct touch or large droplet contact remains unanswered. At the least I would like to see if the monkeys infected by the oro-nasally infected pigs could in turn pass on the virus to other monkeys in the Weingartl et al. set-up. The site of initial infection has a profound effect on the progress and mode of transmission of many diseases.

Beyond this, even for the cold and the flu, 70% of infections are via contact. See the influenza B article by Cowling et al., 2014, and the review by La Rosa et al., 2013, for further information on the modes of transmission of common viruses. What’s more, Ebola is a robust virus. It survives well outside of the body, particularly indoors away from the sun, compared to many common viruses (Piercy et al., 2010). You really need only touch something that someone with Ebola has touched recently. Among the bodily fluids that contain the virus is sweat!

Fortunately, there is no evidence that people are infectious before symptoms show. In macaques, sensitive RT-PCR tests are unable to detect the virus outside of the site of infection until day three post-infection, the same day symptoms develop, and even on day three, the levels are low, 1/10,000 of the levels they reach by day six (Geisbert et al., 2003). In humans we don’t have presymptomatic RT-PCR data, but we see similar patterns of a massive increase in viral load from the first symptomatic day to later in the disease (Towner et al., 2004). In all, the evidence that people are not infectious before showing symptoms seems solid.

Still, in the light of the relative ease of transmission of this virus once symptoms show, and the deadliness of the virus once caught, I simply don’t understand why travelers from the Ebola-stricken nations are not being quarantined. A single air traveler from Liberia to Lagos managed to infect 11 people directly, and 8 more indirectly. The Nigerian CDC was able to contain the outbreak, but it involved the efforts of hundreds of people tracing 900 contacts, and in the end 8 people died. A quarantine is a small price to pay to avoid situations like this. Recently a traveler from Liberia has developed symptoms in Dallas, and 100 contacts are being traced. We can only hope no fatalities will result.

It is my hope that people will start giving the virus the respect it deserves. I hope that all flights out of the infected regions will cease other than military flights where decontamination of planes and quarantine of passengers can be ensured. I hope that sensible public health measures such as these will give us time to deploy vaccines and treatments, and develop backup vaccines and treatments in case the first set don’t work.

Meanwhile, back in West Africa, in Monrovia in particular, the epidemic has gotten so far that local quarantine is not sufficient. It has reached the stage where we have to start isolating the healthy into refuges. There are not enough health workers to handle the situation now, and the situation will rapidly get worse. The best we can do is to help people take care of their loved ones safely – distributing kits that ideally would include adult diapers, vomit buckets, gloves and breathing masks to help contain the spread of infected fluids; food, water, and electrolytes to support the patient; chlorine to disinfect the area and — if the Lamivudine treatment does hold up to its initial promise — a 5-day treatment of Lamivudine for the Ebola and a 5-day treatment of Amoxicillin to help contain the secondary infections. Kits such as these could end up saving hundreds of thousands of lives, especially if they can be delivered by people who know the local language and customs.

Many people in the world are doing all they can to fight this epidemic. It is inspiring to see. Through our combined efforts, I’m sure that the worst-case scenario won’t happen. Nonetheless what will happen is likely to be pretty bad if vaccine production lags or if people don’t become a bit more realistic about just how infectious this disease is.



Baize S, Pannetier D, Oestereich L, Rieger T, Koivogui L, Magassouba N, Soropogui B, Sow MS, Keïta S, De Clerck H et al. Emergence of Zaire Ebola Virus Disease in Guinea. N Engl J Med. 2014 Apr 16. doi: 10.1056/NEJMoa1404505

Cowling BJ, Ip DK, Fang VJ, Suntarattiwong P, Olsen SJ, Levy J, Uyeki TM, Leung GM, Peiris JS, Chotpitayasunondh T, Nishiura H, Simmerman JM. Modes of transmission of influenza B virus in households. PLoS One. 2014 Sep 30;9(9):e108850. doi: 10.1371/journal.pone.0108850. eCollection 2014.

Geisbert TW, Hensley LE, Larsen T, Young HA, Reed DS, Geisbert JB, Scott DP, Kagan E, Jahrling PB, Davis KJ. Pathogenesis of Ebola hemorrhagic fever in cynomolgus macaques: evidence that dendritic cells are early and sustained targets of infection. Am J Pathol. 2003 Dec;163(6):2347-70.

Gire SK, Goba A, Andersen KG, … Sabeti PC et al. Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science. 2014;345:1369-72. doi: 10.1126/science.1259657

Jaax N, Jahrling P, Geisbert T, Geisbert J, Steele K, McKee K, Nagley D, Johnson E, Jaax G, Peters C. Transmission of Ebola virus (Zaire strain) to uninfected control monkeys in a biocontainment laboratory. Lancet. 1995 Dec 23-30;346(8991-8992):1669-71.

La Rosa G, Fratini M, Della Libera S, Iaconelli M, Muscillo M. Viral infections acquired indoors through airborne, droplet or contact transmission. Ann Ist Super Sanita. 2013;49(2):124-32. doi: 10.4415/ANN_13_02_03.

Piercy TJ, Smither SJ, Steward JA, Eastaugh L, Lever MS. The survival of filoviruses in liquids, on solid substrates and in a dynamic aerosol. J Appl Microbiol. 2010 Nov;109(5):1531-9.

Towner JS, Rollin PE, Bausch DG, Sanchez A, Crary SM, Vincent M, Lee WF, Spiropoulou CF, Ksiazek TG, Lukwiya M, Kaducu F, Downing R, Nichol ST. Rapid diagnosis of Ebola hemorrhagic fever by reverse transcription-PCR in an outbreak setting and assessment of patient viral load as a predictor of outcome. J Virol. 2004 Apr;78(8):4330-41.

Weingartl HM, Embury-Hyatt C, Nfon C, Leung A, Smith G, Kobinger G. Transmission of Ebola virus from pigs to non-human primates. Sci Rep. 2012;2:811. doi: 10.1038/srep00811. Epub 2012 Nov 15.
