2012-05-31

Preventing self-masks from diverging

Masks can be used in various ways in MAKEMAP. For instance, the dimmconfig_bright_extended.lis file specifies:

ast.zero_snr = 5

which causes the areas of the map that have a signal-to-noise ratio of less than 5 to be masked on each iteration. This is a dynamic mask, in that a new and potentially different mask is generated and used on each iteration. However, the added flexibility introduced by allowing such a mask to change on each iteration can result in eventual divergence of the iterative  map-making algorithm. To prevent this, a set of new configuration parameters has been added that result in the mask being frozen after a specified number of iterations. These parameters are AST.ZERO_FREEZE, COM.ZERO_FREEZE and FLT.ZERO_FREEZE. Each of these parameters may be set to a positive integer value, indicating that he corresponding mask should be frozen after that many iterations. They all default to zero, which reproduces previous behaviour (i.e. the mask is never frozen).

2012-05-25

Better core utilisation

Over the past couple of weeks I've modified several parts of the makemap command to split up their calculations between all available threads rather than doing them all in a single thread. This means that makemap should now run significantly faster on multi-core machines. The speed up will be greater for larger data sets than for smaller data sets. As an example, producing a map using a single subarray from a  40 minute observation took  11 mins 49 secs with the new scheme on my 12 core machine, and achieved an average CPU load of 791%. With the old scheme, the same map took 19 minutes 14 secs with an average CPU load of 461 %.

2012-05-17

Changes to the common-mode estimation

For the impatient: the new improved algorithm described below is enabled by default and requires no changes to scripts or configuration files. However, the old algorithm is still available, and can be used by setting COM.OLDALG=1 in the configuration file.

For extended sources, MAKEMAP usually uses the mean normalised change in map pixel values between iterations as a measure of convergence. This value drops rapidly over the first few iterations, and eventually levels out, typically somewhere between 0.01 and 0.05. However, if MAKEMAP is allowed to continue iterating beyond this point, the mean normalised change between iterations often starts to increase and decrease by large amounts in an erratic manner. Examining the maps shows that these changes reflect the creation of nasty artifacts in the map, which often take the form of individual bolometer tracks. As an example, Dave Nutter saw the following variations in mean normalised change with iterations for one particular observation:


A relatively smooth decrease until about iteration 50 and then madness. For illustration the following shows the map after 50 iterations (left) and 200 iterations (right). Notice the visible streaks that have appeared by iteration 200.



The cause of this wild behaviour seems to be the way the common-mode signal is estimated and used. On each iteration, the common mode value at each time slice is taken to be the mean of the bolometer values at that time slice, averaged over all remaining bolometers  (i.e. bolometers that have not previously been flagged as unusable for one reason or another). Each bolometer time series is then split into blocks of 30 seconds and compared to the corresponding block of the common-mode signal. This comparison takes the form of a least squares linear fit between the bolometer value and the common-mode value, and generates a gain, offset and correlation coefficient for each bolometer block. Bolometer blocks that look "unusual" - either because they have a low correlation coefficient (i.e. do not look like the common mode) or an unusually low or high gain - are flagged, and all the bolometer values within such blocks are rejected from all further processing. So on the next iteration, the common mode within each block is estimated from only the subset of bolometers that passed this check on all previous iterations. With the passing of iterations, more and more bolometer blocks are rejected, often resulting in the common mode in a block being estimated from a completely different subset of bolometers to those of the neighbouring blocks. This produces discontinuities in the common-mode signal at the 30 second block boundaries, which get larger as more iterations are performed. When the common-mode is subtracted from the data, these discontinuities are transferred into the residuals, which are then filtered within the FLT model. The sharp edges at the discontinuities cause the FFT filtering to introduce ringing (strong oscillations on the scale length of the filter that extend over long times). The following plot shows an example of these discontinuities in the common-mode estimate:


The horizontal axis is sample index. The red curve is the common mode at iteration 85 and the white curve is the common mode at iteration 99. Due to down-sampling, 30 seconds corresponds to 2400 samples, and discontinuities at intervals of 2400 samples are clearly visible. In addition, extensive oscillations are visible, particularly between samples 48000 and 50000.

In order to avoid this problem, changes to the way the common-mode is estimated and used have been made. The heart of the change is that bolometer blocks that are flagged as unusual are no longer excluded from the estimation of the common-mode on the next iteration - all such flags are cleared before the new COM estimate is made. So the common mode is always based on data from all bolometers, thus avoiding the discontinuities at the block boundaries. Other more minor changes in the code include:

  1. The common-mode value at each time slice can be estimated using a sigma-clipped mean, rather than a simple mean. The new configuration parameters COM.NITER (the number of iterations) and COM.NSIGMA (the number of standard deviations at which to clip on each iteration) control this algorithm. In practice, sigma-clipping does not seem to add much, and may slow down convergence. At the moment, both of these parameters default to 3. Changing COM.NITER to 1 results in a single simple mean being calculated with no clipping. This may become the default in the near future.
  2. No attempts are now made to refine the common mode. Previously, once the original COM signal was estimated and unusual bolo-blocks were flagged, a refined COM signal was formed by excluding the flagged bolo-blocks. Further unusual bolo-blocks were then flagged by comparing them with the refined COM signal. This refinement process was repeated until no further bolo-blocks were rejected. In the new code, no attempts are made to refine the original COM signal resulting in the new algorithm being significantly faster than the old algorithm.
Note, the new algorithm still uses COM.GAIN_BOX to define the size of the blocks in which the gain and offset of each bolometer is estimated. Also, "unusual" blocks are still flagged in the same way and excluded from the subsequent FLT model and map .

The new algorithm seems effective at overcoming the problem of divergence illustrated at the start of this post. For instance, with the new code the variation of mean normalised change between iterations for the same data looks as follows:


 Note that:

  • The wild variations above iteration 80 have gone.
  • The curve drops somewhat faster at the start (i.e. fewer iterations are needed to achieve a given maptol).
  • Each estimation of the common-mode signal is faster than before due to the lack of the refining process.
  • Far fewer samples are rejected from the final map ( 2.38 % of the data is now flagged by the COM model, as opposed to 17.37 % previously).
  • The resulting maps at iteration 50 and 200 are shown below.  There is very little difference between them, and they both look visually like the iteration 50 map from the old algorithm shown above.

2012-05-01

Interrupting makemap using control-C

So you're sitting there waiting for a long run of makemap to finish thinking "It's going to take ages! I asked for far more iterations than I need, but I can't restart it because it's taken so long to get to this point". As of git commit efde7077500 you can simply hit control-C and all will be well - makemap will continue running until the end of the next iteration. It will then create a map from the information it has, and terminate normally. Note, it needs to perform one more whole iteration before creating the map in order to get proper pixel values for the areas masked by ast.zeromask. One thing to beware of though - if the input data is split into more than one chunk, subsequent chunks that have not yet been started will not be included in the map.

If you want to abort NOW! without a map,  rather than waiting for the next iteration to complete, then press control-C a second time.

Note, if you are running makemap within a shell script, then the shell may handle the control-C signal itself, leading to some potentially odd behaviour. For instance,  the script may appear to terminate immediately, but in fact may leave the makemap process running in the background until the next iteration has completed. Most shells have ways of controlling what happens when an interrupt signal is detected. For instance, the (t)csh has the "onintr" command.