Users are often tempted to break axis values to present data of different orders of magnitude on the same graph (see here). While this may be convenient, it is not always the preferred way of displaying the data (it can be misleading at best). What are alternative ways of displaying data that differ by several orders of magnitude?
I can think of two ways: log-transforming the data or using lattice plots. What are other options?
I am very wary of using logarithmic axes on bar graphs. The problem is that you have to choose a starting point of the axis, and this is almost always arbitrary. You can choose to make two bars have very different heights, or almost the same height, merely by changing the minimum value on the axis. These three graphs all plot the same data:
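To make the point about the arbitrary starting value concrete, here is a minimal sketch that computes how tall two bars are drawn on a log axis for different axis minimums. The helper names (`log_bar_height`, `height_ratio`) and the two data values are made up for illustration; the only assumption is that a bar's drawn height on a log axis is proportional to the difference of the logs of its value and the axis minimum.

```python
import math

def log_bar_height(value, axis_min):
    """Drawn height of a bar on a log10 axis, in decades above axis_min."""
    if value <= 0 or axis_min <= 0:
        raise ValueError("log axes need strictly positive values")
    return math.log10(value) - math.log10(axis_min)

def height_ratio(a, b, axis_min):
    """How many times taller bar `a` is drawn than bar `b`."""
    return log_bar_height(a, axis_min) / log_bar_height(b, axis_min)

# Hypothetical data: the second value is ten times the first.
small, large = 20.0, 200.0

for axis_min in (10.0, 1.0, 0.001):
    ratio = height_ratio(large, small, axis_min)
    print(f"axis minimum {axis_min:g}: larger bar drawn {ratio:.2f}x as tall")
```

The data never change, yet the visual ratio of the two bars shrinks toward 1 as the axis minimum is lowered, which is exactly the freedom that makes log-scaled bars easy to misread.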
An alternative to discontinuous axes that no one has mentioned yet is to simply show a table of values. In many cases, tables are easier to understand than graphs.
Some additional ideas:
(1) You needn't confine yourself to a logarithmic transformation. Search this site for the "data-transformation" tag, for example. Some data lend themselves well to certain transformations like a root or a logit. (Such transformations--even logs--are usually to be avoided when publishing graphics for a non-technical audience. On the other hand, they can be excellent tools for seeing patterns in data.)
(2) You can borrow a standard cartographic technique of insetting a detail of a chart within or next to your chart. Specifically, you would plot the extreme values by themselves on one chart and all (or the rest) of the data on another with a more limited axis range, then graphically arrange the two along with indications (visual and/or written) of the relationship between them. Think of a map of the US in which Alaska and Hawaii are inset at different scales. (This won't work with all kinds of charts, but could be effective with the bar charts in your illustration.) [I see this is similar to mbq's recent answer.]
(3) You can show the broken plot side-by-side with the same plot on unbroken axes.
(4) In the case of your bar chart example, choose a suitable (perhaps hugely stretched) vertical axis and provide a panning utility. [This is more of a trick than a genuinely useful technique, IMHO, but it might be useful in some special cases.]
(5) Select a different schema to display the data. Instead of a bar chart that uses length to represent values, choose a chart in which the areas of symbols represent the values, for example. [Obviously trade-offs are involved here.]
Your choice of technique will likely depend on the purpose of the plot: plots created for data exploration often differ from plots for general audiences, for example.
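As a rough illustration of idea (1), the sketch below compares where values land along an axis under a few transformations. The function `axis_positions` and the sample numbers are hypothetical; it simply rescales the transformed values so that the smallest sits at 0 and the largest at 1, which is roughly what a plotting library does when it fits an axis to the data.

```python
import math

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is defined only for 0 < p < 1")
    return math.log(p / (1.0 - p))

def axis_positions(values, transform):
    """Relative position (0 = axis minimum, 1 = axis maximum) of each value
    after applying `transform`."""
    transformed = [transform(v) for v in values]
    lo, hi = min(transformed), max(transformed)
    if hi == lo:
        raise ValueError("need at least two distinct values")
    return [(t - lo) / (hi - lo) for t in transformed]

# Hypothetical counts spanning several orders of magnitude.
counts = [3.0, 40.0, 500.0, 60000.0]
for name, f in [("identity", lambda v: v), ("sqrt", math.sqrt), ("log10", math.log10)]:
    print(name, [round(p, 3) for p in axis_positions(counts, f)])

# Hypothetical proportions; the logit spreads out values near 0 and 1.
proportions = [0.001, 0.02, 0.5, 0.98]
print("logit", [round(p, 3) for p in axis_positions(proportions, logit)])
```

On the untransformed axis the three smaller counts are squeezed into the bottom of the range; the root and the log progressively pull them apart. Which transformation is appropriate still depends on the data and the audience, as noted above.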
Maybe it can be classified as lattice, but I'll try; plot all the bars scaled to the highest in one panel and put another panel showing zoom on lower ones. I used this technique once in case of a scatterplot, and the result was quite nice.
I'd separate the problem of log axes from the problem of bar charts.
Logarithmic axes IMHO are best suited for things that come or happen in multiples (... increased by a factor of 20 when treated with ...).
In that case, 1 = 10⁰ is the natural origin. There is a whole range of physical/chemical values which are in fact logarithmic, e.g. pH or absorbance $A = \lg I_0 - \lg I$, and which have "natural" origins. For $A$ that would be $I_0$ (i.e. $A = 0$ when $I = I_0$). For pH in aqueous solutions, e.g. 7.
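A small sketch of the absorbance example, using only the formula quoted above with lg taken as the base-10 logarithm. The function name `absorbance` and the intensity values are illustrative assumptions.

```python
import math

def absorbance(i0, i):
    """A = lg(I_0) - lg(I), where lg is the base-10 logarithm."""
    if i0 <= 0 or i <= 0:
        raise ValueError("intensities must be strictly positive")
    return math.log10(i0) - math.log10(i)

i0 = 100.0  # hypothetical reference intensity
for i in (100.0, 10.0, 1.0):
    # Each tenfold drop in transmitted intensity adds one unit of absorbance.
    print(f"I = {i:g}: A = {absorbance(i0, i):.1f}")
```

The origin of the scale is not arbitrary here: the absorbance is zero exactly when the measured intensity equals the reference intensity, and equal multiplicative changes in intensity become equal additive steps.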
Bar charts can never be sensible if there is no sensible and fixed origin which takes the role of a control (baseline, blank). But this doesn't have anything to do with the log axes.
The only regular use I have for bar charts is histograms. But I could imagine that they do well to show the difference to this origin (you also immediately see whether the difference is positive or negative). Because the bars depict an area, I tend to think of bar charts as a very discretized version of area under a curve. That is, the x-axis should have a metric meaning (which may be the case with time, but not with cities).
If I found myself wondering what origin to use for the log of something that had a "natural" origin at 0, I'd step back and think a bit about what is going on. Very often, such problems are just an indicator that the log is not a sensible transformation here.
Now a bar chart with log axes would emphasize increases or decreases that happen in multiples. Sensible examples that I can think of right now all have some linear relationship to a value of interest. But maybe someone else finds a good example.
So I think the data transformation should be sensible with respect to the meaning of the data at hand. This is the case with the physico-chemical units I mentioned above (A is proportional to concentrations, and pH has, for example, a linear relationship to the voltage in a pH-meter). In fact, it is so much the case, that the log unit gets a new name, and is used in a linear way.
Last, but not least, I come from vibrational spectroscopy, where broken axes are quite regularly used. And I consider this use one of the few examples where the breaking of the axes isn't deceiving.
However, we don't have changes in the order of magnitude. We just have an uninformative region covering 30 - 40 % of our x range. Here's an example:
For this sample, the part between 1800 - 2800 /cm cannot contain any useful information.
The uninformative spectral range is therefore removed (which also indicates the spectral ranges we actually use for chemometric modeling):
For the interpretation of the data, however, we need precise readings of the x-position. Generally we do not need multiples that span the different ranges (i.e. there are such relations, but most connections are more complicated. E.g.: signal at 3050/cm, so we have an unsaturated or aromatic substance. But no strong signal at 1000/cm, so no mono-, meta-, or 1,3,5-substituted aromatic ring ...)
So it is better to depict x with a larger scale (actually we often use millimeter-sheet like guides or label the exact locations). So, we break the axis, and get a larger x scaling:
Actually, it is very much like faceting:
but the broken axis IMHO emphasizes that the scale of the x-axis in both parts is the same, i.e. intervals within the plotted regions are the same.
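The property described here, that a broken axis keeps the same scale on both sides of the break, can be sketched as a simple coordinate mapping. The function `broken_axis_position` and its `gap_width` parameter are hypothetical names for illustration; the 1800 - 2800 /cm limits are the uninformative range mentioned above, and the sample positions are made up.

```python
def broken_axis_position(x, gap_start, gap_end, gap_width=0.0):
    """Map a data coordinate to a plot coordinate with (gap_start, gap_end) cut out.

    Points left of the gap keep their coordinate; points right of it are
    shifted left by the removed length, minus `gap_width` units reserved
    for drawing the break marker. The scale is the same on both sides.
    """
    if gap_end <= gap_start:
        raise ValueError("gap_end must be greater than gap_start")
    if gap_start < x < gap_end:
        raise ValueError("x lies inside the removed range")
    if x <= gap_start:
        return x
    return x - (gap_end - gap_start) + gap_width

GAP = (1800, 2800)  # uninformative range in /cm, taken from the text above

# Hypothetical band positions on either side of the break.
for wavenumber in (1000, 1100, 2950, 3050):
    print(wavenumber, "->", broken_axis_position(wavenumber, *GAP, gap_width=50))
```

A 100 /cm interval occupies the same plotted length on either side of the break, which is what separate facets with independently fitted axes would not guarantee.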
To emphasize small intensities (y-axis), we use magnified insets:
[... For details, see the magnified (x 20) νCH region in blue ....]
And this is certainly possible with the example in the linked plots as well.
The broken-axis solution works best when there is a clear break right across the plot and the ordinate is labeled so that the gap is obvious. The advantage of this is that the scale is preserved across the two sets of values. Panel plots with different scales may not convey the relative variation within the low and high groups. I do like the idea of the zoom-in plot, which I programmed for scatterplots but hadn't thought of using for bar plots.
Two ideas that were alluded to, but not explicitly described, when I looked at the excellent answers and comments were that you are using a bar chart "in a manner inconsistent with labelling", and normalized/dimensionless data.
The star/spider/radar-style chart (link)(link) is often very good for comparing several different things along multiple coordinates. There are a number of very useful plots that (sadly) are rare in business presentations, likely because leadership prefers to use conclusions to make decisions rather than using information to get understanding and then use the understanding to make the decisions. In business it is sometimes very difficult to build consensus and so the results-only approach can have higher yield in a consensus-first, decision-next environment. This informs the popularity of the bar/column chart. Please consider the examples of other graph types that are good for gaining understanding (link).
If you divide the values that you are charting by a "characteristic" value then you can transform the scaling to improve readability without losing information. Fluid dynamicists prefer dimensionless numbers because of their predictive utility and their elasticity in application. They look at things like the Buckingham Pi Theorem as sources for candidate dimensionless forms (link). Popular, and useful, dimensionless numbers include Reynolds number, Mach number, Biot number, Grashof number, Pi, Rayleigh number, Stokes number, and Sherwood number. (link) You don't have to be a physicist to love dimensionless numbers because they are useful in non-physics applications. Measures like density, homogeneity, circularity, and coplanarity can define images, pixel fields, or multivariate probability distributions. Don't just consider taking a logarithm, or a relative distance from a known value - you can also consider inverting the numbers or taking their square roots.
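A minimal sketch of dividing by a characteristic value follows. The function `normalize`, the two series, and the choice of each series' baseline as its characteristic value are all assumptions made for illustration; in practice the characteristic value should come from the meaning of the data.

```python
def normalize(values, characteristic):
    """Divide every value by a characteristic value to get dimensionless ratios."""
    if characteristic == 0:
        raise ValueError("characteristic value must be non-zero")
    return [v / characteristic for v in values]

# Hypothetical series on very different scales, each with its own baseline.
series = {
    "small-scale": {"baseline": 4.0, "values": [2.0, 4.0, 8.0]},
    "large-scale": {"baseline": 40000.0, "values": [30000.0, 40000.0, 100000.0]},
}

for name, s in series.items():
    ratios = normalize(s["values"], s["baseline"])
    # Report the baseline alongside the ratios so the original values
    # can be recovered by multiplying back.
    print(f"{name} (baseline {s['baseline']:g}):", ratios)
```

After normalization both series sit on a comparable scale around 1 and can share an unbroken axis, provided the characteristic values are stated in the caption or legend so that no information is lost.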
Best of luck. Please let us know how things turn out.