The more data the better??
The more information the better…right? Today’s quandary.
The FGGE dropsonde data
This story’s objective is to show that 1) more data is not necessarily better, 2) it is important to have an idea of what you should expect in your results, whatever your area of study, and 3) complementary information can be very helpful. Some other inferences might be drawn about the influence of junior and senior researchers, but that was not a serious problem in this case.
Most students today, and even many researchers, will not appreciate the effort made to collect global data during the First GARP Global Experiment (FGGE) in 1979. No such global effort has been carried out since, for many reasons, including cost, the increasing availability of satellite imagery, and improvements in numerical weather prediction. During FGGE, an enormous effort was made to obtain in-situ soundings over the tropical oceans, especially between 10˚N and 10˚S. Most soundings were made by large military aircraft operating out of Panama, Acapulco, Hawaii, and Ascension Island, and by NOAA aircraft operating from Diego Garcia. In all, the aircraft sampled a period of 30 days during the winter and 30 days during the summer, releasing approximately 6000 omega[1] dropsondes. In addition to the aircraft component, many ships used omega radiosonde systems during FGGE.
Within one month of the end of the FGGE summer dropsonde missions, a major regional subcomponent of FGGE, the Summer Monsoon Experiment (SMONEX), took place over the Arabian Sea and Bay of Bengal. Research aircraft using the same dropsonde systems made several hundred omega soundings during this experiment, and this author, a graduate student at the time, participated in many of the flights. During the flights, the scientists onboard would examine the incoming dropsonde data and plot the observations at different levels, so that by the end of the flight a synoptic analysis had been prepared at the standard pressure levels below the flight level of the aircraft. Upon landing, these analyses could be shared with other scientists.
By plotting the data from many soundings and then preparing the analyses, those participating in the flights learned something about the quality of the data and the smoothness of the meteorological fields. The quality of the wind data appeared very good, despite the limitation that the onboard dropsonde operator could select only three of the 8 omega transmitting stations to calculate the real-time winds. The operator usually chose the three strongest stations, though they were also aware of the geometry required for the best wind estimates. The signals from all 8 omega stations were recorded on the aircraft for later, complete processing of the winds.
After we returned to the US, the research community awaited the arrival of the FGGE final dropwindsonde data set, which was being prepared at NCAR. All omega stations would be used in the final wind estimation, since every station received by the dropsonde could in principle improve the wind estimate. Final research analyses were put on hold until these winds arrived.
When the final sounding data arrived, we compared them against the real-time winds that were available during the research flights. There were large differences (up to 10 m/s or more), and the final winds produced spatial patterns of the wind field that were much noisier and in poorer agreement with satellite cloud-drift wind estimates. Correspondence with those responsible for reprocessing the omega sounding data at NCAR indicated that, in their view, we were in error and that their procedure, using all available signals, could only be an improvement over what we had seen in real time.
During a FGGE workshop held in Tallahassee in January 1981 (18 months after the SMONEX observations) we were finally able to present our comparisons between the real-time and final winds to the key individual (Paul Julian) responsible for the reprocessing effort. Meeting in my office, with maps laid out comparing the two data sets, he grudgingly acknowledged that there seemed to be a problem. Several weeks later came a message: they had found “the problem”. Their procedure for wind calculation had assumed that the signals received followed the shortest great circle path from the transmitter to the dropsonde. However, under certain conditions, the strongest signal being received was actually coming via the long-path great circle route, making the geometry incorrect when these signals were included in the wind calculation. Since it was not easy to predict which sounding winds were contaminated by this long-path propagation, they recommended reprocessing all of the winds with a procedure to eliminate long-path signals from the calculation. This was done, and the corrected final winds were eventually distributed. Unfortunately, by this time the original “final” winds had been incorporated into a data assimilation procedure to produce the FGGE 3-B global analyses, which eventually had to be redone because they did not incorporate the correct dropsonde winds. Many papers were published using the incorrect FGGE 3-B data set before the revised version was produced. (One wonders how long the erroneous data would have contaminated research efforts if we had not been aware of the quality of the real-time data and had not insisted that something was wrong with the “final” data.)
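The short-path versus long-path geometry can be illustrated with a few lines of Python. This is only a minimal sketch on a spherical Earth, not the NCAR procedure; the function names, the mean Earth radius, and the example positions are assumptions introduced here for illustration. It shows that both routes lie on the same great circle, but the long-path signal reaches the dropsonde from the opposite direction, which is why treating it as a short-path signal gives the wrong geometry.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def short_path_km(lat1, lon1, lat2, lon2):
    """Haversine distance along the shorter great-circle arc, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def long_path_km(lat1, lon1, lat2, lon2):
    """Distance the 'wrong way round' on the same great circle, in km."""
    circumference = 2 * math.pi * EARTH_RADIUS_KM
    return circumference - short_path_km(lat1, lon1, lat2, lon2)


def arrival_bearing_deg(lat1, lon1, lat2, lon2, long_path=False):
    """Bearing at point 1 (the sonde) along which point 2 (the
    transmitter) is reached, measured clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(p2)
    x = (math.cos(p1) * math.sin(p2)
         - math.sin(p1) * math.cos(p2) * math.cos(dlam))
    bearing = math.degrees(math.atan2(y, x)) % 360.0
    if long_path:
        # The long path leaves the sonde in the opposite direction.
        bearing = (bearing + 180.0) % 360.0
    return bearing


# Hypothetical positions: a sonde on the equator at 60E and a
# transmitter on the equator at 150E (not a real omega station).
sonde = (0.0, 60.0)
station = (0.0, 150.0)
print("short path (km):", round(short_path_km(*sonde, *station)))
print("long path (km): ", round(long_path_km(*sonde, *station)))
print("short-path bearing:", arrival_bearing_deg(*sonde, *station))
print("long-path bearing: ",
      arrival_bearing_deg(*sonde, *station, long_path=True))
```

In this hypothetical case the transmitter lies due east of the sonde along the short path, while a long-path signal arrives from due west, a 180˚ error in the assumed direction.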
Several lessons are evident here. More information or data does not necessarily make a better product if some of those data degrade other information already present. And the best understanding of today cannot account for processes that are not yet understood. The example above is analogous in some ways to the controversy over “continental drift”, where Alfred Wegener and his supporters could see from much evidence that the continents must have moved, but they had no counterargument to the calculations of Lord Kelvin, who showed that, under certain assumptions, the earth’s internal heat was likely insufficient to drive such motion. Lord Kelvin, and all others, were unaware of the phenomenon of radioactive decay, which invalidated one of his critical assumptions, and which produces the heat required for the plate motion that is observed today.
[1] Winds were calculated by comparing the phases of the omega signals transmitted from the 8 omega stations located around the world, essentially a version of triangulation on a sphere. Not all stations could be heard at a given location; a sophisticated procedure had been developed to calculate winds from all stations that could be heard.
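To make the point about "more stations" concrete, here is a toy least-squares sketch in Python. It is a heavy simplification and not the actual omega processing: it works on a flat local plane, it assumes each station directly yields the rate at which the signal path to the sonde is shortening (the real system worked with signal phases), and every name, bearing, and wind value is hypothetical. The only physics kept is that a sonde drifting toward a station shortens the short path but lengthens the long path, so a long-path signal treated as short-path enters the solution with the wrong sign.

```python
import math


def closing_speed(wind, bearing_deg, long_path=False):
    """Rate (m/s) at which the signal path to a station shortens
    for a sonde drifting with `wind` = (u east, v north)."""
    u, v = wind
    b = math.radians(bearing_deg)
    along = u * math.sin(b) + v * math.cos(b)  # drift toward the station
    # Drifting toward the station lengthens the long path instead.
    return -along if long_path else along


def solve_wind(bearings_deg, speeds):
    """Least-squares (u, v), assuming every signal took the short path."""
    sxx = sxy = syy = bx = by = 0.0
    for bearing, m in zip(bearings_deg, speeds):
        ex = math.sin(math.radians(bearing))
        ey = math.cos(math.radians(bearing))
        sxx += ex * ex
        sxy += ex * ey
        syy += ey * ey
        bx += ex * m
        by += ey * m
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("station geometry is degenerate")
    # Solve the 2x2 normal equations directly.
    return ((syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det)


def wind_error(estimate, truth):
    """Magnitude of the vector difference between two winds (m/s)."""
    return math.hypot(estimate[0] - truth[0], estimate[1] - truth[1])


# Hypothetical setup: five audible stations, one received via long path.
TRUE_WIND = (10.0, -5.0)
BEARINGS = [20.0, 100.0, 170.0, 250.0, 320.0]
LONG_PATH = [False, False, False, True, False]

speeds = [closing_speed(TRUE_WIND, b, lp)
          for b, lp in zip(BEARINGS, LONG_PATH)]

# Use everything, as if all signals had followed the short path.
all_est = solve_wind(BEARINGS, speeds)

# Use only the signals that really did follow the short path.
kept = [(b, s) for b, s, lp in zip(BEARINGS, speeds, LONG_PATH) if not lp]
clean_est = solve_wind([b for b, _ in kept], [s for _, s in kept])

print("error using all stations (m/s):", wind_error(all_est, TRUE_WIND))
print("error without long path (m/s): ", wind_error(clean_est, TRUE_WIND))
```

In this noise-free toy case, the solution from fewer but uncontaminated stations recovers the assumed wind, while the solution using every station does not. In practice the contaminated signals are not labeled in advance, which is the hard part of the problem.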