2012-12-04

A significant bug when using multi-chunked data

Today I have found and fixed a bug that could seriously upset the MAKEMAP "maptol" convergence test when used with data that has been split into more than one chunk. Basically, the map at the end of the first iteration on each new chunk was compared with the final map from the previous chunk, rather than with a blank map full of zeros. The effect of this could be to produce spurious values for the "normalized map change" reported at the end of each iteration, and thus cause the iterative loop to be left prematurely. 

Change to the way config files work

With today's cutting-edge Starlink, configuration parameters work slightly differently in cases where a value is assigned to a parameter twice. For instance, if your config file holds

450.flt.filt_edge_largescale=600
flt.filt_edge_largescale=500

then previously the value 600 would have been adopted when processing 450 um data. With the new changes, the value 500 will be adopted instead. That is, unqualified values now take priority over qualified values.

This will be most relevant if your config file inherits from another config file (as almost all do) in which parameters are specified with qualifiers. Previously, in order to override such a value in your own config file you needed to include the wavelength qualifier; now you do not.
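
For instance, suppose your config file inherits a standard file that qualifies its values by wavelength (a hypothetical fragment - the inherited file name is purely illustrative):

^/star/share/smurf/dimmconfig_bright_extended.lis   # may set 450.flt.filt_edge_largescale internally
flt.filt_edge_largescale=500          # unqualified, so now wins at both wavelengths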

These changes have been installed at Hilo. To use them locally, you will need to rsync from Hilo.

2012-10-26

Inside-out map-making

If the time-series data supplied to SMURF:MAKEMAP does not consist of a single continuous sequence of time samples, it will divide the data up into two or more "chunks" of continuous samples. A separate map is made from each chunk in turn, using the iterative procedure  described in SUN/258, and these maps are simply co-added at the end to create the final returned map. This procedure can be summarised as follows:

initialise final map to zero
for each chunk:
   initialise chunk map to zero
   for iteration = 1 to numiter
      update chunk map using time-series data in current chunk
   next iteration
   finalise chunk map
   add the chunk map into the total map
next chunk

In general, it seems that the more data is included in each iteration, the more constrained the map becomes, and so the fewer the spurious features. So it would be nice if we could turn these loops "inside out", so that the iteration loop occurred outside the chunk loop, causing data from all chunks to be included in the map at the end of each iteration.

initialise final map to zero
for iteration = 1 to numiter:
   for each chunk:
      update final map using time-series data in current chunk
   next chunk
next iteration
finalise final map

As an aid to experiment, I have added some simple features to MAKEMAP to allow something like this effect to be achieved, at the expense of considerably longer run-times and more effort on the part of the user. If experiments with these facilities show that there is a significant benefit in turning the loops "inside-out", then it may be possible to implement the idea in a more time-efficient manner.

The importsky configuration parameter:

The main feature in the new scheme is the facility to supply an initial estimate of the sky when running MAKEMAP. This is done by specifying 

importsky=ref

in the config file, and then assigning the NDF containing the  initial sky values to the "REF" ADAM parameter on the MAKEMAP command line:

% makemap ref=mysky.sdf

Before the first iteration starts, the specified sky map is sampled at the position of each bolometer value, and each sampled sky value is subtracted from the corresponding cleaned bolometer value. The remaining residuals are then processed as normal by estimating and subtracting (typically) COM and FLT models. The values sampled from the sky map are then added back onto the residuals, which are binned into a new map - the first "itermap". From then on, the algorithm proceeds to do the second and subsequent iterations in the usual manner.

This allows us to produce the "inside-out" looping described above by repeatedly re-running MAKEMAP, doing only one iteration on each run, and  supplying the output from one invocation as the initial sky map for the next invocation. In effect, you are doing the "iteration" loop by hand at the shell level.

Speeding things up:

This is slow. But with patience and some experimentation, inside-out looping can be performed. It can be made slightly faster by caching some of the processing performed by the first invocation for use by subsequent invocations. Two things can be done:
  1. Caching the cleaned bolometer values. If "exportclean=1" is included in the config file on the first invocation of MAKEMAP, the cleaned time-series data will be written to disk in NDFs that end with the suffix "_cln.sdf". These NDFs can be used in place of the raw data as input for all subsequent invocations of MAKEMAP. In this case you should create a new config file for the second and subsequent invocations and add "doclean=0" to it, so that the cleaned input data will not be re-cleaned.
  2. Caching the EXT model values. Calculation of the EXT model values happens only once, before the first iteration starts. So if you add "exportndf=ext"  to the config file for the first invocation, a set of NDFs with suffix "_ext.sdf" will be created holding the EXT model values. These can be supplied as input to the second and subsequent invocations by adding "ext.import=1" to the config file. Note, MAKEMAP uses fixed pre-defined names for the NDFs when writing and reading EXT model values, so the NDFs created on the first invocation should not be moved or re-named. We also add "noexportsetbad=1" to prevent EXT values for flagged bolometers being set bad in the exported NDFs (since a different set of bolometers may potentially be flagged as bad on subsequent invocations). 
There are other possibilities for speeding things up, such as caching the LUT model, but these require changes to the code, which have not yet been implemented.

Other parameter settings:

  • Setting "numiter=1" is required to ensure only one iteration is performed by each invocation of MAKEMAP.
  • The NOI model, which contains estimates of the noise in each bolometer time stream, is normally calculated at the end of the first iteration, once the final residuals are known. This is no good for us here since each invocation only performs one iteration. The simplest solution is to add "noi.calcfirst=1" to the configuration for every invocation. This forces the noise estimates to be made before the start of the first iteration.
  • Care needs to be taken if any AST masking is used. Firstly, since we have set "numiter=1", the first iteration is also the last iteration. So we need to add "ast.zero_notlast=0" to the configuration for every invocation. Without this, no masking would be performed on any of the invocations of MAKEMAP. However, we do want to suppress masking on the final invocation, and so we should revert to the default value of "ast.zero_notlast=1" for the final invocation.
  • If an external mask is to be used, it should be supplied as normal as the REF parameter on the first invocation. The output map from the first invocation will have this same mask and so will mask the AST model correctly when supplied as the REF parameter on the second and subsequent invocations. 

An example:

In this example, four invocations are performed. How to choose the best number of invocations is still to be determined. Continuing until the RMS change within the source regions between subsequent maps  is less than some target figure may be the way to go. 

First invocation:

% makemap in=/SCUBA2/s8\*/20111114/00055/\* ref=mymask config=^conf1 out=map1

We make an initial map from the raw 850 um data files for 20111114 obs. 55 using "mymask.sdf" as an external mask. The output map is put into "map1.sdf". The configuration file "conf1" contains:

^/star/share/smurf/dimmconfig_bright_extended.lis
numiter=1          # Only do one iteration
850.flt.filt_edge_largescale=600
450.flt.filt_edge_largescale=600
ast.zero_mask=1
ast.zero_notlast = 0          # Mask the final (i.e. first) iteration
noi.calcfirst=1          # Calculate NOI before iterating
exportNDF=ext          # Export the EXT model values to NDFs
noexportsetbad=1        # Export good EXT values for bad bolometers
exportclean=1          # Export the cleaned  time-series data to NDFs

Second invocation:

% makemap in=./\*_cln.sdf ref=map1 config=^conf2 out=map2

We make a new map from the cleaned data files "*_cln.sdf" created by the first invocation (we assume the current directory contains no other cleaned data, for instance from an earlier run). The map created by the first invocation is supplied as the REF image, and the new map is put in "map2.sdf". The "conf2" file sets the importsky parameter to indicate that the initial sky supplied by the REF image should be subtracted from the time-series before starting the first (and only) iteration. 

^conf1          # Inherit all the settings in "conf1"
exportNDF=0         #  Do not export the EXT model this time
exportclean=0          # Do not export the cleaned data this time
doclean=0          # No need to clean the data this time
importsky=ref          # Subtract off the REF values at the start
ext.import=1          # Import the EXT values created on the first invocation

Third invocation:

% makemap in=./\*_cln.sdf ref=map2 config=^conf2 out=map3

Later invocations use the map created by the previous invocation as REF, and use the same config file as the second invocation.

Last (=fourth) invocation:

% makemap in=./\*_cln.sdf ref=map3 config=^conf3 out=final-map

On the last invocation we need to prevent the mask being applied, so we use a new config:

^conf2          # Inherit all the settings in "conf2"
ast.zero_notlast = 1          # Do not mask the final (i.e. first) iteration
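
If you want to automate the whole sequence, it can be wrapped in a small (t)csh script. The sketch below assumes the conf1/conf2/conf3 files above and that SMURF has been initialised in the shell; the choice of ten invocations is arbitrary:

#!/bin/tcsh
# First invocation: clean the data, export the EXT model, make the initial map.
makemap in=/SCUBA2/s8\*/20111114/00055/\* ref=mymask config=^conf1 out=map1

# Intermediate invocations: feed each map back in as the initial sky estimate.
set n = 10
set i = 2
while ( $i < $n )
   @ j = $i - 1
   makemap in=./\*_cln.sdf ref=map$j config=^conf2 out=map$i
   @ i = $i + 1
end

# Final invocation: identical, but without masking the last iteration.
@ j = $n - 1
makemap in=./\*_cln.sdf ref=map$j config=^conf3 out=final-map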

2012-10-19

Using Multiple Masks

Today I made some changes to the way that MAKEMAP handles masks for the AST, FLT and COM models. Previously, only one mask could be used with each model. So for instance if the following configuration was used:

ast.zero_mask = 1
ast.zero_lowhits = 0.1

the AST.ZERO_MASK setting would take precedence over the AST.ZERO_LOWHITS setting, resulting in the external mask supplied via the REF parameter being used, and the AST.ZERO_LOWHITS setting ignored.

The current behaviour is now to combine such masks together. So in the above example the external mask and the lowhits mask would be combined to create a single mask. The combination is done in a manner determined by the AST.ZERO_UNION parameter. If this parameter is true (i.e. non-zero), a pixel in the combined mask is considered a "source" pixel if it is flagged as a source pixel in any of the individual masks (i.e. the combined source area is the union of the individual source areas). If AST.ZERO_UNION is false (i.e. zero), a pixel in the combined mask is considered a "source" pixel if it is flagged as a source pixel in all of the individual masks (i.e. the combined source area is the intersection of the individual source areas). The default for AST.ZERO_UNION is 1 - that is, the union of the masks is used by default.
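
For example, to mask using only pixels that are "source" in both the external mask and the lowhits mask, you could use:

ast.zero_mask = 1
ast.zero_lowhits = 0.1
ast.zero_union = 0          # use the intersection, not the default union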

In addition, a different external mask can now be used with each model. This is achieved by using a new syntax for the XXX.ZERO_MASK configuration parameters. Each of these configuration parameters can now be given the name of the ADAM parameter which is to be used to get the corresponding mask. [Just to remind you, "ADAM" parameters are different to configuration parameters - all the configuration parameters are obtained using a single ADAM parameter called CONFIG.] External masks can be specified using any of the ADAM parameters "REF", "MASK2" and "MASK3" ("REF" serves the purpose of "MASK1"). The REF parameter has been around for a long time, but the MASK2 and MASK3 parameters are new.

So for instance, to  use different external masks for the AST and FLT models, you could do:

ast.zero_mask = REF
flt.zero_mask = MASK2

and then run makemap as

% makemap ref=ast_mask.sdf mask2=flt_mask.sdf

Or to use the same mask for both models, you should do:


ast.zero_mask = REF
flt.zero_mask = REF

% makemap ref=mask.sdf


The old XXX.ZERO_MASK syntax is still supported. Supplying a positive integer value causes the mask to be accessed using the REF parameter, as before. The default for each XXX.ZERO_MASK parameter is still zero, meaning that no external mask is used.

One twist to beware of with the new scheme is that the REF parameter is treated somewhat differently to the MASK2 and MASK3 parameters. The REF parameter is still used, as it always has been, to define the pixel grid of the output map. This means that any mask supplied via the REF parameter will, by definition, be aligned with the output map. The same is not true of the MASK2 and MASK3 parameters. If using these parameters, you must ensure that the mask NDFs are aligned in pixel coordinates with the output map. The easiest way to do this is to create your mask from the output map of a previous similar run of makemap.
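
One way to create such an aligned mask (a sketch only - the SNR threshold and file names are illustrative) is to threshold the SNR of a previous output map using KAPPA:

% makesnr in=oldmap out=oldsnr
% thresh in=oldsnr out=mymask2 thrlo=5 thrhi=5 newlo=bad newhi=1
% makemap in=... config=^conf ref=oldmap mask2=mymask2 out=newmap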

2012-09-21

SCUBA-2 Data Reduction Manual 

A new cookbook, sc21,  is available describing the reduction of SCUBA-2 data. This cookbook focusses on the Dynamic Iterative Map-Maker (or map-maker), which performs the processing of raw data into a science grade map, along with common post-processing steps.

 http://www.starlink.ac.uk/docs/sc21.htx/sc21.html

An explanation of how the map-maker works is given, including details of the default parameters which control each stage. Specialised configuration files which alter these parameters to suit different science goals are also introduced.

The cookbook guides users through all the post-processing options, from cropping and co-adding maps using PICARD, to applying an FCF and calculating the noise. The science pipeline is also discussed, with instructions for running it on a local machine.

Chapter List:
1. Introduction
2. SCUBA-2 Overview
3. Raw SCUBA-2 Data
4. The Dynamic Iterative Map-Maker
5. Reducing your Data
6. Tweaking the Configuration File
7. Examples of Different Reductions
8. The SCUBA-2 Pipeline
9. SCUBA-2 Data Calibration

Processing CLS data with ORAC-DR

A dedicated recipe has been written to process data from the Cosmology Legacy Survey using a "jack-knife" method of estimating and removing residual large-scale noise. The recipe is based on a script written by Ed for S2SRO data (and used to produce the figures in the SMURF paper). In this article I will describe how to use the recipe, what steps it performs and what parameters may be used to control the processing. In the future, this technique may be made more widely available for applying to other data sets (not just blank field ones).

How do I use it?

To use this recipe, initialize the pipeline, create a text file with the names of all the files to process and a file with recipe parameters if necessary. Then add the recipe name REDUCE_CLS to the usual command line:

% oracdr -loop file -files filenames.lis -recpars params.ini 
  -log sf -nodisplay REDUCE_CLS

By default the recipe uses the "blank field" makemap config file (though this may be overridden with the MAKEMAP_CONFIG recipe parameter; see below). The recipe creates a slew of output files with the following suffixes:
  1. _fmos - signal map (1 for each observation)
  2. _mappsf - map-filtered PSF (1 for each observation). This is the same as the above signal map but with an artificial gaussian added to the time-series (and located at the map centre).
  3. _wmos - coadded signal map
  4. _wpsf - coadded map-filtered PSF
  5. _jkmap - jack-knife map
  6. _whiten - whitened signal map
  7. _whitepsf - whitened, map-filtered PSF map
  8. _cal - calibrated, whitened signal map
  9. _mf - above map processed with matched-filter (using the _whitepsf map as the PSF)
  10. _snr - signal-to-noise map created from above map (_mf)
Note that the per-observation files begin with s, and the coadds (and subsequent products) begin with gs. The ones you are probably most interested in are the _mf and _snr files.

What does it do?

The recipe works as follows.
  1. Each observation is processed separately to produce a signal map (_fmos). In addition, each observation is re-processed with an artificial gaussian source added at the map centre (this will be used to create the "map-filtered PSF" image, _mappsf).
  2. The signal maps are combined using inverse-variance weighting to create a coadded signal map (_wmos). The images with the artificial gaussians added in are also coadded to produce the "map-filtered PSF" (_wpsf).
  3. The data are split into two groups made from alternating observations. Each group is coadded, and the two coadds are subtracted to produce the jack-knife map (_jkmap).
  4. The SMURF command sc2filtermap is used to estimate the radial angular power spectrum (circular symmetry is assumed) within a region defined by twice the minimum noise in the coadded signal map. (The size of this region is written to the FITS header of the output map under the keyword WHITEBOX.)
  5. The inverse of this angular power spectrum is applied to the coadded signal and PSF images (_whiten and _whitepsf) to remove residual low spatial-frequency noise.
  6. The amplitude of the fake source in _whitepsf is compared with the input value to derive a corrected FCF. This new FCF is used to calibrate the whitened signal map to create the _cal map.
  7. A matched filter is applied to _cal using _whitepsf as the PSF to create _mf.
  8. Finally, a signal-to-noise ratio image is created (_snr).

What parameters are available?

The following recipe parameters can be used to control the processing:
  • FAKEMAP_SCALE - Amplitude of the fake source (in Jy) added to the timeseries to assess the map-making response to a point source.
  • MAKEMAP_CONFIG - Name of a config file for use with the SMURF makemap task. The file must exist in the current working directory, $MAKEMAP_CONFIG_DIR, $ORAC_DATA_OUT, $ORAC_DATA_CAL or $STARLINK_DIR/share/smurf.
  • MAKEMAP_PIXSIZE - Pixel size in arcsec for the output map. Default is wavelength dependent (4 arcsec at 850 um, 2 arcsec at 450 um).
  • WHITEN_BOX - Size of the region used to calculate the angular power spectrum for removing residual low-frequency noise in the data. Default is a square region bounded by the noise being less than twice the minimum value.
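
A recipe parameter file (the params.ini above) is a plain text file with the parameters listed under the recipe name; the values here are purely illustrative:

[REDUCE_CLS]
FAKEMAP_SCALE = 0.05
MAKEMAP_PIXSIZE = 4
WHITEN_BOX = 90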

What if I want to run it again?

In addition to the full pipeline recipe, there is a PICARD recipe called SCUBA2_JACKKNIFE which performs all of the post-map-making steps. This can be used to examine the influence of including or omitting individual observations (say ones with visible artefacts that the pipeline is not able to trap), or investigate the effect of varying the size of the whitening region, or trim the images to a specific size before re-running the jack-knife steps.

Note that the pipeline must have been run once to produce all the necessary files which go in to this recipe. And note that the full pipeline recipe must be run again if a different config file or input gaussian amplitude is to be used.

The majority of the flexibility in controlling the processing occurs after the individual signal maps have been created. As a refresher, running this recipe would mean typing:

% picard -log sf -recpars params.ini SCUBA2_JACKKNIFE myfiles*.sdf

All of the control is through the recipe parameters in the file params.ini. The most important item to note is that you should provide a map-filtered PSF (otherwise a default will be used). However, you already have one of these: the file ending _mappsf.sdf listed above.
  • PSF_MATCHFILTER - the name of the map-filtered PSF file
  • WHITEN_BOX - Size of the region used to calculate the angular power spectrum for removing residual low-frequency noise in the data. Default is a square region bounded by the noise being less than twice the minimum value.
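
For example, a minimal params.ini for the PICARD run might look like this (the PSF file name is illustrative - use the _mappsf file from your own pipeline run):

[SCUBA2_JACKKNIFE]
PSF_MATCHFILTER = gs20120501_00012_850_mappsf.sdf
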
This is only the first version of the recipe, and its limitations have yet to be determined. However, one case I have found where the recipe fails is in a large, multi-field map with highly-variable noise. In this case it might be better to process each field separately before creating the mosaic (though this has the disadvantage that different filtering may be applied to each field). Try it out and see what you get. All feedback welcome.

2012-09-12

FIT1D: A new SMURF command for ACSIS data

More time ago than I am willing to admit, I started coding a Starlink routine to fit spectral lines in ACSIS cubes. Got a long way until SCUBA-2 commissioning and calibration put a halt to it, but I have finally managed to finish the program, technically a beta-version, as part of the upcoming Kapuahi release of Starlink.

Usage:
% fit1d  in  out  rms  [config]  [userval]  [pardir]  [parndf]  [parcomp]

What distinguishes FIT1D from other profile fitting routines is that it specifically attempts to deal with two issues: non-gaussian profile shapes and the fact that ACSIS data-cubes have many, potentially very different, profiles to fit. Regarding the latter, there are many fitting routines that produce fitted profiles for data-cubes, but FIT1D also produces cubes with the fitted parameters themselves and can use such files as input to give control over the fit of, in principle, each individual profile in the cube. Thus it is e.g. possible to fit broad lines on the nucleus of a galaxy and narrow lines everywhere else. More about that below.

A section has been added to the SMURF documentation in SUN/258 about FIT1D, which I will try to summarize here. FIT1D is generic in that it can fit profiles along any axis of an up to 7-dim hyper-cube, but will be discussed here in the context of a default RA-Dec-Vel ACSIS cube. Note that the routine assumes that data have been baseline-subtracted, using e.g. MFITTREND, i.e. that the profiles have a zero-level at 0. 
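
For example, a typical invocation on a baseline-subtracted cube might look like this (the file names and rms value are illustrative):

% fit1d in=acsis_cube out=acsis_cube_fit rms=0.22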


[Figure: Gauss-Hermite shapes as a function of the 3rd-order skewness coefficient 'h3' and the 4th-order kurtosis (peakiness) coefficient 'h4'. The red box indicates the limits on acceptable values for h3 and h4 as defined in the default configuration file. Note that the fitted profile by default is restricted to positive values and will omit the negative features shown.]

Non-gaussian Profiles.

FIT1D essentially re-implements the fitting code for non-gaussian profiles from the GIPSY package (Kapteyn Institute, Groningen, The Netherlands). Function types that can be fitted are Gaussian, Gauss-Hermite, and Voigt profiles. In particular, Gauss-Hermite functions are a powerful extension when fitting profiles that are skewed, peaky, or only approximately gaussian. The figure above shows Gauss-Hermite profiles as a function of the skewness coefficient h3 and the kurtosis (peakiness) coefficient h4. See SUN/258 for further details, but note that the default setting in the configuration file is for FIT1D to suppress the negative features in the fitted profiles and to leave only the positive part of the Gauss-Hermites.

Because of their ability to fit distorted shapes, Gauss-Hermites are particularly well suited to "capture" the maximum amount of emission from a cube. The fits can be remarkably accurate, as shown in the figure below of a 3-component fit (i.e. up to 3 spectral lines) using gausshermite2 functions (i.e. fitting both h3 and h4). Collapsing the resulting cube with fitted profiles can thus result in an accurate and almost noise-free white-light or total-emission map.
[Figure: FIT1D - black: original profiles; red: results of a 3-component Gauss-Hermite2 fit (fitting both h3 and h4).]


FIT1D derives its ability to fit a complex line-shape both from the Gauss-Hermite function and from the fact that it can fit multiple (sub-)components to get the best match possible. However, that can make the interpretation of the fits in terms of physical characteristics and quantities difficult, hence for those you may also want to make a fit of the line-shape using a single standard Gaussian function.

Component Parameter files

Besides a data-cube with the fitted profiles, FIT1D also outputs so-called Component parameter files as NDF extensions in the header of the output file. These can also be copied out as independent data-cubes. There is a file for each component (i.e. line) that was fitted along the profile, up to the number of components requested by the user. Each plane of a Component parameter file has an image of the value of a fitted parameter across the field-of-view. For instance, the one resulting from a gaussian fit has images respectively showing the fitted Amplitude, Position (velocity), and FWHM, as well as a plane with an id-number of the function used.

Much of the (anticipated) use of FIT1D derives from the fact that Component parameter files can be used as input as well: either to provide initial estimates or fixed values to the fitting routine. The difference between values specified in the Component parameter files and ones declared in a User parameter values file is that the former can vary across the field-of-view whereas the latter will result in the same value being used for all profiles. E.g. for use with spectral-line surveys, the User parameter values file can be used to provide initial estimates of the frequencies or velocities at which lines are expected, or to fix fits at those frequencies.

By manipulating Component parameter files, e.g. ones resulting from an initial fit, the user can customize or correct subsequent fits. In the extreme, a Component parameter file could be made from scratch based on a model and be used to create a spectral-line data-cube of that model (config option: model_only=1), or be used as initial estimates for a fit. Of more practical use, Component parameter files can be used to correct problems associated with a fit, since the art of fitting is not in the fitting algorithm but in providing accurate initial estimates. For instance, the left image below shows a section of an Amplitude plane of a fit where there are problems in a few locations. Setting these locations to bad values and using FILLBAD to interpolate over them, the corrected Component parameter file was used as the initial estimate for a subsequent fit, resulting in the image on the right.

[Figure: FIT1D - left: section of a parameter file showing the originally fitted amplitudes; right: amplitudes after using a corrected parameter file from the original fit as initial estimates for a subsequent fit.]

More creative options are possible: after an initial fit with a gaussian, the function id can be changed to a gausshermite1 in part of the field and the resulting file used as initial estimates for a subsequent fit to account for skewed profiles there. Similarly, the initial guess of the FWHM can be made wide on e.g. the nucleus of a galaxy while leaving it narrower outside. As another example, the fit of multiple components can be limited to only part of the field by setting the parameter file for the second and higher components to bad values outside the relevant region (multiple component parameter files can be used as input: one for each component to be fitted).

In conclusion: please remember that this is a beta-release and you may run into unanticipated issues. Also, the chosen limits in the configuration file may need tweaking. If an initial fit looks poor, try adjusting minamp (in units of rms!) or, in particular, minfwhm (in units of pixels!) in the configuration file (see: $SMURF_DIR/smurf_fit1d.def). Also use range to limit the fit to a relevant region of the spectrum.
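
A minimal user configuration embodying such tweaks might look like this (the values are purely illustrative and should be tuned to your data):

^$SMURF_DIR/smurf_fit1d.def          # inherit the defaults
minamp = 2          # minimum acceptable amplitude, in units of rms
minfwhm = 1.5          # minimum acceptable FWHM, in units of pixels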

The released implementation of FIT1D can fit up to 7 components per profile per run, but the output of multiple runs each covering a range in velocities or frequencies can be combined. The fit itself is fully multi-threaded and will be much faster on a modern multi-core computer: a 3-component gausshermite2 fit of 1.1 million spectra (a 2 Gb input file) took 15 minutes on a dual-core, 16 Gb memory machine versus 4 minutes on one with 12 cores and 75 Gb of memory.

Happy fitting!

Remo



2012-08-23

SCUBA-2 reference publications webpage

Please see the following page for appropriate references for the SCUBA-2 instrument, calibration and data reduction:


http://www.jach.hawaii.edu/JCMT/continuum/scuba2/scuba2_references.html

This includes an arxiv link to the Dempsey et al 2012 SPIE paper on SCUBA-2 commissioning which can be used as the reference for SCUBA-2 calibration.
http://arxiv.org/abs/1208.4622

A trilogy of up-to-date SCUBA-2 papers on the instrument, data reduction and calibration is currently ready for submission. Please check back to the above link periodically; the new links will be added when the papers are in press.

2012-08-16

Updates to FCFs and extinction correction

Over the past couple of months the flux conversion factors and extinction corrections have been updated and we are now very happy with the answers. The main issue from the user perspective is that the FCF you apply depends critically on which version of SMURF was used to generate the map. At the time of writing, data products downloaded from CADC use an older extinction correction than the one used by the current SMURF. We plan to update CADC processing shortly but reprocessing the historical archive will take some time.

We have set up a web page at JAC listing the parameters that should be used and instructions on how to determine which version of the software was used to generate your map:

http://www.jach.hawaii.edu/JCMT/continuum/scuba2/scuba2_relations.html

2012-06-17

Displaying an outline of the mask used to create a map

The smurf:makemap command allows a mask to be specified that defines background areas on the sky. These background areas are forced to zero on all but the last iteration in order to suppress spurious large-scale structures. The user may supply an external NDF to be used as the mask, or alternatively makemap can be left to generate its own mask based (for instance) on the evolving SNR at each pixel. Either way, it is often useful to visualise the mask that was used to create a given map. This can be done using KAPPA commands as follows (assuming the map created by makemap is in file "map.sdf"):

% kappa
% lutgrey
% display map
% contour clear=no mode=free heights=0.5 comp=quality map style='colour=red'


This will display the final map as a greyscale image with the mask outlined in red. Other properties of the mask outline (colour, line thickness, etc.) can be specified by including other options in the "style" parameter value when running contour.


2012-05-31

Preventing self-masks from diverging

Masks can be used in various ways in MAKEMAP. For instance, the dimmconfig_bright_extended.lis file specifies:

ast.zero_snr = 5

which causes the areas of the map that have a signal-to-noise ratio of less than 5 to be masked on each iteration. This is a dynamic mask, in that a new and potentially different mask is generated and used on each iteration. However, the added flexibility introduced by allowing such a mask to change on each iteration can result in eventual divergence of the iterative map-making algorithm. To prevent this, a set of new configuration parameters has been added that result in the mask being frozen after a specified number of iterations. These parameters are AST.ZERO_FREEZE, COM.ZERO_FREEZE and FLT.ZERO_FREEZE. Each of these parameters may be set to a positive integer value, indicating that the corresponding mask should be frozen after that many iterations. They all default to zero, which reproduces the previous behaviour (i.e. the mask is never frozen).
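
For example, to freeze the SNR-based AST mask after (say) five iterations, you could use:

ast.zero_snr = 5
ast.zero_freeze = 5          # freeze the mask once 5 iterations have been done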

2012-05-25

Better core utilisation

Over the past couple of weeks I've modified several parts of the makemap command to split up their calculations between all available threads rather than doing them all in a single thread. This means that makemap should now run significantly faster on multi-core machines. The speed-up will be greater for larger data sets than for smaller ones. As an example, producing a map using a single subarray from a 40-minute observation took 11 mins 49 secs with the new scheme on my 12-core machine, and achieved an average CPU load of 791%. With the old scheme, the same map took 19 mins 14 secs with an average CPU load of 461%.

2012-05-17

Changes to the common-mode estimation

For the impatient: the new improved algorithm described below is enabled by default and requires no changes to scripts or configuration files. However, the old algorithm is still available, and can be used by setting COM.OLDALG=1 in the configuration file.

For extended sources, MAKEMAP usually uses the mean normalised change in map pixel values between iterations as a measure of convergence. This value drops rapidly over the first few iterations, and eventually levels out, typically somewhere between 0.01 and 0.05. However, if MAKEMAP is allowed to continue iterating beyond this point, the mean normalised change between iterations often starts to increase and decrease by large amounts in an erratic manner. Examining the maps shows that these changes reflect the creation of nasty artifacts in the map, which often take the form of individual bolometer tracks. As an example, Dave Nutter saw the following variations in mean normalised change with iterations for one particular observation:


A relatively smooth decrease until about iteration 50 and then madness. For illustration the following shows the map after 50 iterations (left) and 200 iterations (right). Notice the visible streaks that have appeared by iteration 200.



The cause of this wild behaviour seems to be the way the common-mode signal is estimated and used. On each iteration, the common mode value at each time slice is taken to be the mean of the bolometer values at that time slice, averaged over all remaining bolometers  (i.e. bolometers that have not previously been flagged as unusable for one reason or another). Each bolometer time series is then split into blocks of 30 seconds and compared to the corresponding block of the common-mode signal. This comparison takes the form of a least squares linear fit between the bolometer value and the common-mode value, and generates a gain, offset and correlation coefficient for each bolometer block. Bolometer blocks that look "unusual" - either because they have a low correlation coefficient (i.e. do not look like the common mode) or an unusually low or high gain - are flagged, and all the bolometer values within such blocks are rejected from all further processing. So on the next iteration, the common mode within each block is estimated from only the subset of bolometers that passed this check on all previous iterations. With the passing of iterations, more and more bolometer blocks are rejected, often resulting in the common mode in a block being estimated from a completely different subset of bolometers to those of the neighbouring blocks. This produces discontinuities in the common-mode signal at the 30 second block boundaries, which get larger as more iterations are performed. When the common-mode is subtracted from the data, these discontinuities are transferred into the residuals, which are then filtered within the FLT model. The sharp edges at the discontinuities cause the FFT filtering to introduce ringing (strong oscillations on the scale length of the filter that extend over long times). The following plot shows an example of these discontinuities in the common-mode estimate:


The horizontal axis is sample index. The red curve is the common mode at iteration 85 and the white curve is the common mode at iteration 99. Due to down-sampling, 30 seconds corresponds to 2400 samples, and discontinuities at intervals of 2400 samples are clearly visible. In addition, extensive oscillations are visible, particularly between samples 48000 and 50000.

In order to avoid this problem, changes to the way the common-mode is estimated and used have been made. The heart of the change is that bolometer blocks that are flagged as unusual are no longer excluded from the estimation of the common-mode on the next iteration - all such flags are cleared before the new COM estimate is made. So the common mode is always based on data from all bolometers, thus avoiding the discontinuities at the block boundaries. Other more minor changes in the code include:

  1. The common-mode value at each time slice can be estimated using a sigma-clipped mean, rather than a simple mean. The new configuration parameters COM.NITER (the number of iterations) and COM.NSIGMA (the number of standard deviations at which to clip on each iteration) control this algorithm. In practice, sigma-clipping does not seem to add much, and may slow down convergence. At the moment, both of these parameters default to 3. Changing COM.NITER to 1 results in a single simple mean being calculated with no clipping. This may become the default in the near future.
  2. No attempts are now made to refine the common mode. Previously, once the original COM signal was estimated and unusual bolo-blocks were flagged, a refined COM signal was formed by excluding the flagged bolo-blocks. Further unusual bolo-blocks were then flagged by comparing them with the refined COM signal. This refinement process was repeated until no further bolo-blocks were rejected. In the new code, no attempts are made to refine the original COM signal resulting in the new algorithm being significantly faster than the old algorithm.
Note, the new algorithm still uses COM.GAIN_BOX to define the size of the blocks in which the gain and offset of each bolometer is estimated. Also, "unusual" blocks are still flagged in the same way and excluded from the subsequent FLT model and map.
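
For reference, the relevant settings and their current defaults are shown below; setting com.oldalg = 1 restores the previous behaviour:

com.oldalg = 0          # use the new common-mode algorithm (the default)
com.niter = 3          # iterations for the sigma-clipped mean (1 = simple mean)
com.nsigma = 3          # clipping threshold, in standard deviations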

The new algorithm seems effective at overcoming the problem of divergence illustrated at the start of this post. For instance, with the new code the variation of mean normalised change between iterations for the same data looks as follows:


 Note that:

  • The wild variations above iteration 80 have gone.
  • The curve drops somewhat faster at the start (i.e. fewer iterations are needed to achieve a given maptol).
  • Each estimation of the common-mode signal is faster than before due to the lack of the refining process.
  • Far fewer samples are rejected from the final map (2.38% of the data is now flagged by the COM model, as opposed to 17.37% previously).
  • The resulting maps at iteration 50 and 200 are shown below.  There is very little difference between them, and they both look visually like the iteration 50 map from the old algorithm shown above.

2012-05-01

Interrupting makemap using control-C

So you're sitting there waiting for a long run of makemap to finish, thinking "It's going to take ages! I asked for far more iterations than I need, but I can't restart it because it's taken so long to get to this point". As of git commit efde7077500 you can simply hit control-C and all will be well - makemap will continue running until the end of the next iteration. It will then create a map from the information it has, and terminate normally. Note, it needs to perform one more whole iteration before creating the map in order to get proper pixel values for the areas masked by AST zero-masking. One thing to beware of though - if the input data is split into more than one chunk, subsequent chunks that have not yet been started will not be included in the map.

If you want to abort NOW! without a map,  rather than waiting for the next iteration to complete, then press control-C a second time.

Note, if you are running makemap within a shell script, then the shell may handle the control-C signal itself, leading to some potentially odd behaviour. For instance,  the script may appear to terminate immediately, but in fact may leave the makemap process running in the background until the next iteration has completed. Most shells have ways of controlling what happens when an interrupt signal is detected. For instance, the (t)csh has the "onintr" command.

2012-04-17

Masking of FLT and COM models

In previous blogs (e.g. Suppressing large-scale noise with zero-masks), Ed has described how masking the AST model - the astronomical signal - between iterations can reduce the amplitude of low frequency structures in the final map. Over the last month or two, I have extended the uses of masks to allow the COM and FLT models to be masked in a similar way to the AST model.

COM masking: The common-mode signal is the average of all available bolometer time streams. Once found, a least-squares linear fit is performed between the average and each individual bolometer time stream. This fit generates a scale factor and offset, together with a correlation coefficient. Bolometers that have a low correlation with the common-mode are rejected as bad. At any one time, the astronomical source is seen only by a subset of the bolometers, and so will be largely absent in the common-mode estimate. This means that bolometers that see the source will have a lower correlation to the common mode than those that do not see the source, which can result in bolometers being rejected unnecessarily. To avoid this, it is now possible to exclude samples that see the source from each time stream when finding the common-mode signal, and also when performing the least squares linear fit. This can produce a better estimate of the common-mode, unpolluted by bright source samples, and also better estimates of the correlation coefficients, resulting in fewer bolometers being rejected.

The masking of the COM model is controlled by the new configuration parameters COM.ZERO_MASK, COM.ZERO_CIRCLE, COM.ZERO_LOWHITS and COM.ZERO_SNR. These are all analogous to the corresponding set of AST.xxx parameters used for masking the AST model. As an example, a configuration of:

^/star/share/smurf/dimmconfig_bright_compact.lis
com.zero_circle = 0.02

was used to produce an 850 um map of Mars from observation 40 on 20120202. This excludes samples from the COM model that are within a circle of radius 0.02 degrees, centred on the source. The resulting final map contained 6744407 samples (or 2877.31 bolos). Without the COM masking, the final map contained 4676093 samples (1994.92 bolos).

Sadly, not all observations show such an improvement using COM masking. Indeed, some show little or no improvement, and some actually show a decrease in the number of samples in the final map. For this reason, COM masking is not switched on by default in any of the dimmconfig files distributed with SMURF.


FLT masking: A similar set of configuration parameters - FLT.ZERO_MASK, FLT.ZERO_CIRCLE, FLT.ZERO_LOWHITS, etc. - are now available to control masking of the FLT model. The FLT model holds the low frequency background for each bolometer time stream, and is estimated by smoothing the time stream using a filter defined by FLT.FILT_EDGE_LARGESCALE, etc. On the first iteration, any bright astronomical source in the data will result in the smoothed background being over-estimated in the vicinity of the source. This in turn produces a negative bowl around the source when the background is removed. It can take many iterations for the map-maker to reduce this bowling to acceptable levels. Using the new FLT masking parameters allows the source to be omitted before estimating the background in each bolometer time stream (the source samples are replaced by values interpolated from the neighbouring data, with added noise). Masking the FLT model in this way on the first two iterations can have a dramatic effect on the number of iterations needed to reach convergence. For compact sources, it can typically reduce the number of iterations required to 50% of the original number, and sometimes fewer.

For this reason, FLT masking is switched on by default in the dimmconfig_bright_compact.lis file, using the settings:

flt.zero_circle = 0.0167
flt.zero_niter = 2

The effect of FLT masking on extended sources is marginal, and so is not switched on by default in dimmconfig_bright_extended.


2012-02-03

SCUBA-2 Calibration: REDUX.

The short story:

  • The heater coupling factors have been adjusted to more realistic values. In practice, this does not change the performance of the instrument - however it does change the absolute value of the FCFs. These values were adjusted in the software in mid-December.
  • The WVM tau algorithm has been fixed and improved. This will not affect you directly: though the nightly plots now look extremely good and are officially used for weather band determination.
  • This has allowed a new and better calculation of the relations between the 225 GHz tau derived from the WVM and the opacities at the two SCUBA-2 filter-bands. They are now as follows:
TAU_[850] = 4.6 * (TAU_[225] - 0.0043)
TAU_[450] = 26.0 * (TAU_[225] - 0.019)
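For example (illustrative numbers only), a WVM tau of TAU_[225] = 0.065 gives:
TAU_[850] = 4.6 * (0.065 - 0.0043) = 0.28
TAU_[450] = 26.0 * (0.065 - 0.019) = 1.20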
  • The FCFs (flux conversion factors) have been derived for both wavelengths from an extensive reduction of calibrator sources observed over eight months of SCUBA-2 commissioning and science verification observations. They are as follows:
850um:
FCF_[arcsec] = 2.42 +/- 0.15 Jy/pW/arcsec**2
FCF_[peak] = 556 +/- 45 Jy/pW/beam
Beam area = 229 arcsec**2

450um:
FCF_[arcsec] = 6.06 +/- 0.32 Jy/pW/arcsec**2
FCF_[peak] = 606 +/- 55 Jy/pW/beam
Beam area = 97 arcsec**2


  • Reminder on how to calibrate your data:

Other posts discuss how best to reduce your data (and what recipes are needed). The latest software releases (since January 2012) all include extinction correction (with the relations above) and the changed coupling factors. If you reduced your data prior to this, you will need to reduce them again to account for these changes. Applying the FCFs reported here to old reductions of your data will be wrong.
  • The arcsec FCF: (when you want integrated fluxes)

The arcsec FCF is the factor by which you should multiply your map if you wish to use the calibrated map to do aperture photometry.

  • The peak FCF: (the FCF-formerly-known-as-beam):

This FCF is the number by which to multiply your map when you wish to measure absolute peak fluxes of discrete sources.
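
For example, to calibrate an 850 um map in Jy/beam you could multiply it by the peak FCF using KAPPA's cmult (the file names are illustrative):

% cmult in=mymap scalar=556 out=mymap_cal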


The whys and the wherefores (the gory details):

  • Reductions of the calibration observations:

Uranus and Mars were used as the primary calibrators for these results. In addition, CRL 618, CRL 2688 were also predominant secondary calibrators. All of the calibrators were reduced in January 2012 using the updated dimmconfig_bright_compact.lis. The improvements in the recipe (mostly in how it chooses to stop iterating) have resulted in extremely flat maps with nearly no evidence of the 'bowling' seen around strong sources during S2SRO reductions. To improve the accuracy of the peak-fitting and aperture photometry, the maps were reduced with 1 arcsecond pixels at both wavelengths.

Once the maps were reduced they were then analysed using the PICARD script SCUBA2_FCFNEFD. There have been some changes to this script: a few bugs have been fixed, the average FCFs have been adjusted, as has the (now empirically derived) beam area, and a few reference fluxes have been adjusted. FCF_beamequiv has been removed entirely, and all calculations of integrated values are now done using AUTOPHOTOM, with a defined aperture and annulus for background subtraction.

Following extensive analysis of the most optimal parameters, all calibrations were reduced using a 60" diameter aperture (at both wavelengths) with an annulus between 90" and 120" from the source position.


  • Questions? Let's provide answers to a few we've already seen:


  • "These FCFs are very different to the old numbers quoted!"

Yes they are. The heater coupling factor change and the new tau relations play a significant part in this. But in addition, the large sample of observations has allowed for a much more accurate determination of the beam area. At 450um in particular, optical effects of the telescope show that the error beam is large and the beam is not gaussian. This results in an effective FWHM that is much broader than the 7.5" quoted previously (though that is the approximate FWHM of the fit to the centre of the beam) - it is more like 9.5", taking into account the error beam. Therefore, the measured (and fitted) peak is relatively lower, requiring a higher FCF to calibrate the peak flux in your data.


  • "There is a lot of scatter when I calculate the 'beam' or peak FCFs for my calibrators (particularly at 450um)"

No kidding. Peak values are obtained either from reading off the peak of the map (in gaia or by another method) or by fitting to the peak using beamfit (as is done in PICARD). The beam shape (particularly at 450um) can be extremely susceptible to changes in focus and atmospheric instability, amongst other things. The integrated value (FCF_arcsec) is more robust against such changes. If you are measuring a peak fit from a calibrator and see a strong deviation from the expected value things to check are:

- how 'focussed' does the image look? If you see distortion in the shape of a source that should be point-like, or distortion or 'shoulders' in the beam then it is likely that the peak value will be unreliable.

- was the observation taken early in the evening? Focus and atmospheric effects are known to be worst in the early evening hours and sometimes in the morning after sunrise. If you are looking at calibrators, try and look at ones taken later in the night and see if there is improvement.

A 'trap' has been set in PICARD to warn you if the attempted fit to the peak misses the actual peak value by more than 10%. Looking at the fit to the shape also helps in this instance. In any case, the quoted peak FCF value at the top of the post is derived from the arcsec FCF and the empirical beam area derived from nearly 500 observations at both wavelengths and this number has been shown to be robust.


  • "How stable are these FCFs? (read: do I need to reduce my own calibrators?)"

Very. The absolute errors at both wavelengths are within 5% and no significant trends have been seen in the last six months. Instrument performance is being monitored very closely and any deviations are likely to be noted specifically. However, we do not discourage you from taking calibrators from the nights your data was taken and reducing them yourself - we appreciate the sanity checks! Another handy rule: if you do it to your data, do it to your calibrator. If you have specific methods you plan to use on your data, apply the same methods to your calibrator in order to ensure your calibration is correct. We are now happy to say, though, that these FCFs look stable and correct, so using these numbers should provide you with well-calibrated data.


  • "What happened to FCF_beamequiv?"

FCF_beamequiv is a seductive, evil little value that tempted us to stray to the dark side, albeit temporarily. In essence it was created to use as a comparison to SCUBA performance, but should never have been used to actively calibrate SCUBA-2 data as it assumed a perfect gaussian beam. The statements above explain that this is patently untrue, especially at 450um. The beamequiv number was quoted previously, and incorrectly, as the true FCF, and it is largely the reason that the new (and correct) numbers seem so much larger. We have banished it from PICARD and it shall now be known as the FCF-that-shall-not-be-named.