2014-06-06

Change to FLT Ringing Filter

If you use the AST.SKIP and FLT.RING_BOX1 configuration parameters to prevent artificial blobs of emission appearing in your SCUBA-2 maps, then you may have seen cases where  the centres of real bright sources are badly affected, as in the following example spotted by Harriet:


This issue may also cause complete holes to appear around bright sources. These problems are caused by the fact that the ringing filter is applied on the initial "pre-iterations" specified by AST.SKIP, as well as on all subsequent "real" iterations. If the ringing filter is restricted to just the "real" iterations, the following map is produced (left uses the same scale as above, right is a close up of the central region):


The stardev system at Hilo has been updated to include this fix.

For those interested in the details... The problem is caused by the fact that no FLT masking can be done on the first pre-iteration, since there is no map available at that point to use as a mask. Bright sources therefore cause heavy ringing on the first iteration. This ringing is then detected and the corresponding samples flagged by the ringing filter, which means that the bright sources are blanked out in the map created at the end of the first iteration. On the second pre-iteration, this map is used to locate the bright sources that should be masked out when estimating the FLT model - but the bright sources are not present in this map and so are not masked out on the second iteration either, leading to heavy ringing again. And round it goes, doing the same thing on each iteration.

Restricting the ringing filter to "real" iterations solves the problem because, by the time the ringing filter is used for the first time, the bright sources have already been masked out in the FLT model, and so have not caused the heavy ringing that would otherwise have been detected and flagged. In fact, it may have been sufficient just to inhibit the ringing filter on the first pre-iteration, rather than on all pre-iterations, but this would have been of little benefit since blobs only start to develop once the AST model is included in the iterative scheme (i.e. once all the pre-iterations have completed).

2014-06-05

Interpreting SCUBA-2 map Quality Arrays

SCUBA-2 maps created using "AST-masking", such as is done by  dimmconfig_bright_extended, will contain a Quality array indicating the background pixels that were masked (i.e. forced to zero). In itself this is fairly simple - background pixels have a non-zero Quality value and source pixels have a Quality value of zero. However, it is also possible to have independent masks for the FLT and COM models, in addition to the AST mask. For instance, if the "auto-FLT masking" scheme is used the Quality array contains both a FLT and an AST mask and so some care needs to be used when interpreting the Quality array.

Each value in the Quality array is restricted to taking integer values between 0 and 255, and so can be thought of as 8 separate bits. Each "bit plane" within the Quality array holds a single mask - AST, FLT or COM.   The showqual command can be used to find out which mask is held by which bit plane:


% kappa
% showqual fred.sdf
   AST (bit 1) - "Set iff AST model is zeroed at the output pixel"
   FLT (bit 2) - "Set iff FLT model is blanked at the output pixel"

This means that the AST mask is stored in bit 1 (the least significant bit), the FLT mask is stored in bit 2, and there is no COM mask. Note, if a map was produced using FLT masking but no AST masking, then the FLT mask would be stored in bit 1.

The decimal integer value of any element of the Quality array is equal to the binary value formed from the bits listed by showqual. So in the above case the maximum Quality value is 3 (the decimal equivalent of binary "11" - both bits set). Remembering that a bit is set (i.e. is 1) for background pixels and cleared (i.e. is 0) for source pixels, it follows that the four possible decimal Quality values in the above case (0-3) are:

0 - neither bit set, so the pixel is inside both the AST and the FLT mask (a source pixel).

1 - bit 1 set but not bit 2, so the pixel is outside the AST mask but inside the FLT mask (a border-line pixel).

2 - bit 2 set but not bit 1, so the pixel is inside the AST mask but outside the FLT mask (a border-line pixel).

3 - both bits set, so the pixel is inside neither mask (a background pixel).
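As a quick sanity check, this decoding can be sketched in a few lines of Python (illustrative only; the bit numbering follows the showqual convention above, with bit 1 being the least significant):

```python
# Decode a SCUBA-2 Quality value into per-mask flags.
# Bit numbering follows showqual: bit 1 is the least significant bit.
# A set bit means the pixel is OUTSIDE that mask (a background pixel).
def decode_quality(value, planes):
    """planes maps a mask name (e.g. 'AST') to its showqual bit number."""
    return {name: "outside" if value & (1 << (bit - 1)) else "inside"
            for name, bit in planes.items()}

planes = {"AST": 1, "FLT": 2}   # as reported by showqual above

for q in range(4):
    print(q, decode_quality(q, planes))
```

Running this reproduces the four cases listed above: 0 is inside both masks, 3 is inside neither.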

In the following screen shot of a Quality array displayed using GAIA, the black areas have value zero and are thus inside both masks, the dark brown areas have value 2 and are thus inside the AST mask but outside the FLT mask. The light brown areas have value 3 and are inside neither mask. In this particular case, there are no  areas with a quality value of 1, so the FLT mask is contained entirely within the AST mask.


There are several commands within kappa that manipulate Quality arrays in various ways. For instance, the setbb command allows pixel data values to be set bad if the associated Quality value has a specified collection of set bits. Thus:

% setbb fred 1

will set all pixels bad except for those inside the AST mask. Likewise, 

% setbb fred 2

will set all pixels bad except for those inside the FLT mask.  Note, the change made by setbb is temporary - it can be undone by doing:

% setbb fred 0

To display the fred.sdf map and then overlay the AST mask in blue and the FLT mask in red, do:


% display fred mode=perc percentiles=\[2,98\]
% setbb fred 1
% contour fred clear=no mode=good labpos=! style='colour=blue'
% setbb fred 2
% contour fred clear=no mode=good labpos=! style='colour=red'
% setbb fred 0

2014-04-03

Map Variances when using FLT.FILT_EDGE_LARGESCALE_LAST

The FLT.FILT_EDGE_LARGESCALE_LAST makemap configuration parameter allows you to create a map in which the background regions (i.e. the regions zeroed out by the AST mask) are filtered on a shorter scale than the source regions, thus producing a pleasingly flat background without removing structure from the source regions.

However, when using this option care needs to be taken when interpreting the variances stored with the map. Setting FLT.FILT_EDGE_LARGESCALE_LAST  causes the filter size to be changed on the last iteration. Whilst this does not affect the data values in the source regions much (since the source signal has been extracted into a separate model by the time the final iteration occurs), it does affect the variances. This is because the variances are formed from the time-stream residuals that are left after removal of the astronomical signal estimated on the penultimate iteration, and all these residuals - no matter where they are on the sky - are filtered using the smaller filter size given by   FLT.FILT_EDGE_LARGESCALE_LAST.

The upshot of all this is that the basic scheme described above results in artificially low variances in the source regions.  To avoid this, a change was introduced into SMURF on 13th February so that now, if the FLT.FILT_EDGE_LARGESCALE_LAST parameter is set, the variances stored in the map are the ones generated on the penultimate iteration (that is, the final iteration that used the basic filter size rather than the special last-iteration filter size).  This should give more realistic variances for the source regions, but at the cost of  unrealistically high variances in the background regions.

2014-01-22

CHECK_RMS: comparing SCUBA-2 map noise with the ITC

There's a new SCUBA-2 analysis tool in town, and its aim is to help observers get a feel for whether their maps are reaching the intended sensitivity levels. It comes in two forms:
  • PICARD recipe: SCUBA2_CHECK_RMS
  • Offline pipeline recipe: REDUCE_SCAN_CHECKRMS
At its core, the "CHECK_RMS" analysis estimates the RMS and NEFD for a given observation to compare with values from the SCUBA-2 integration time calculator (ITC). To obtain access to these new recipes, you must be using /stardev in Hilo or update your own version using rsync (see http://starlink.jach.hawaii.edu/starlink/rsyncStarlink).

PICARD recipe: SCUBA2_CHECK_RMS

As usual, the PICARD recipe is designed to be run on processed data. To run it, type:

% picard -log sf -nodisplay SCUBA2_CHECK_RMS myfiles*.sdf
in the directory containing the files to be analyzed. You may notice a warning (cyan text) about missing NEP data: that doesn't affect the main analysis and can safely be ignored. However, if run in Hilo, the recipe can query the archive of log files generated by the QL pipelines for NEP data. These NEP data are used to determine an effective noise and NEFD from the timeseries noise calculations, which can then be compared with the other values.

The recipe can be given calibrated or uncalibrated maps: the default FCF will be applied to uncalibrated data. Note that these maps must be made from a single complete observation, either by hand with makemap or the offline pipeline. Single observations must be used because the recipe relies on FITS header information that will be incorrect for coadded files. Note that maps made by the SUMMIT pipeline should not be given to this recipe; the SUMMIT pipeline will soon start recording the same CHECK_RMS data itself.

The input images are cropped to a circle of radius 90 arcsec from which the median noise and NEFD (the values RMS_MAP and NEFD_MAP in the log file) are calculated. The mean exposure time is estimated over the same map area (TEXP). It's important to note that the RMS_MAP is derived from the variance component in the map, and not from the map itself. This means that the results should be largely independent of the chosen makemap config parameters, and are more likely to reflect changes in the number of samples that go into the map rather than any differences in spatial structure introduced by applying different high-pass filters in the map-maker.
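As an aside, the map-based RMS and NEFD are linked through the exposure time. A minimal sketch, assuming the usual relation NEFD = RMS x sqrt(texp), with RMS in mJy/beam and texp in seconds:

```python
import math

def nefd_from_rms(rms_mjy, texp_sec):
    """NEFD (mJy sqrt(sec)) from an RMS noise (mJy/beam) and a mean
    exposure time (sec), assuming NEFD = RMS * sqrt(texp)."""
    return rms_mjy * math.sqrt(texp_sec)

# e.g. a 2 mJy/beam map with a 100 s mean exposure time
print(nefd_from_rms(2.0, 100.0))   # 20.0 mJy sqrt(sec)
```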

The map size and scan type are read from the FITS header and used to define the observation type for the ITC. The RMS corresponding to this observation type is calculated from the SCUBA-2 ITC (RMS_ITC); the integration time is used to convert that to a corresponding NEFD (NEFD_ITC). However, it is important to note that if you have chosen a non-standard map size for a PONG then this recipe will be unable to calculate values from the ITC, rendering it somewhat less useful.

If the NEP data are available (in Hilo only, and for dates after 20120515), the average NEP for a given observation is calculated from the archived QL pipeline log file (log.nep) corresponding to the date and wavelength of the input map. The FCF (either from the file or the standard value) and mean transmission over the observation is used to convert this to a zenith NEFD (NEFD_NEP). The ITC scaled elapsed time is used to convert that to an RMS noise (RMS_NEP).

The results are written to a log file called log.checkrms, the format of which is described below.

The following recipe parameters are supported:

  • KEEPFILES - if set to 1, the _crop files created during processing will be retained on disk; otherwise all intermediate files are deleted.
  • MAP_RADIUS - radius (in arcsec) of the region used for calculating the map noise and NEFD. Default is 90 arcsec. Note that poor results can be obtained if the map radius is too small, and it may be more useful to use a larger radius for larger maps.
  • STATS_ESTIMATOR - statistical estimator for the NEFD and RMS calculated from the input map. May be mean or median; default is median.
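These can be supplied in the usual PICARD way, via a recipe parameter file given with the -recpars option. A hypothetical example (the file name and the exact syntax are assumptions - check the PICARD documentation):

```
[SCUBA2_CHECK_RMS]
MAP_RADIUS = 120
STATS_ESTIMATOR = mean
```

followed by:

% picard -log sf -nodisplay -recpars mypars.ini SCUBA2_CHECK_RMS myfiles*.sdf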

Offline pipeline: REDUCE_SCAN_CHECKRMS

An alternative to the PICARD recipe, this recipe performs all the necessary steps on the raw data to do a complete self-consistent analysis for each observation, from calculating the timeseries noise to processing the data and comparing results with the ITC. The steps performed are:

  • Calculate timeseries noise from the entire observation;
  • Determine various values from the FITS headers (elapsed time, average airmass, elevation, optical depth, transmission, date etc);
  • Make a map using the given config file (default if not specified);
  • Calculate the noise (from the variance component), NEFD and exposure time within a 3-arcmin diameter circle at the map centre;
  • Use the mapping parameters and elapsed time to derive the expected RMS and NEFD from the ITC;
  • Write results to log file.
Since a custom makemap config file can be given (see below), it is possible to use this recipe as your main data reduction recipe (if the standard processing is sufficient), to avoid re-reducing data. Note that the recipe does not yet do any coadding, so currently output maps will have to be combined with MOSAIC_JCMT_IMAGES.

To use it, run it like any other pipeline recipe. Initialize the pipeline in the usual way:

% oracdr_scuba2_XXX -cwd
Add the names of the relevant files to a text file (say, myfiles.lis), and run the pipeline with:
% oracdr -log sf -nodisplay -loop file -files myfiles.lis REDUCE_SCAN_CHECKRMS

All of the recipe parameters supported by the default REDUCE_SCAN recipe can be given to this recipe, the most important of which is likely to be MAKEMAP_CONFIG for specifying an alternative config file for makemap. If not given, the default config file (dimmconfig.lis) is used. Note that the same config file will be used for all data, so be sure to use an appropriate config.

Log file format

The log file, called log.checkrms, produced by each method contains the following entries for each map (observation):

  • UT - UT date including day fraction
  • Source - object name, upper case with spaces removed
  • Obs - observation number
  • FILTER - filter (wavelength)
  • telapsed - elapsed time of observation (sec)
  • texp - mean exposure time, derived from EXP_TIME NDF component (sec)
  • trans - mean line-of-sight transmission
  • nep_av - mean NEP for current observation (W/sqrt(Hz))
  • nep_av_err - uncertainty in nep_av (W/sqrt(Hz))
  • rms_nep - RMS derived from nefd_nep (mJy/beam)
  • nefd_nep - NEFD derived from nep_av and scaled elapsed time (mJy sqrt(sec))
  • rms_map - RMS noise in map, obtained from median of error component (mJy/beam)
  • nefd_map - NEFD derived from combination of variance and exposure time images (mJy sqrt(sec))
  • rms_itc - RMS noise estimated by SCUBA-2 integration time calculator (ITC) (mJy/beam)
  • nefd_itc - NEFD derived from rms_itc and scaled elapsed time (mJy sqrt(sec))
  • itc_obstype - observation type used by the ITC to derive the RMS
  • rms_ratio - ratio of rms_map to rms_itc
  • El - mean elevation in degrees
  • CSO - mean zenith optical depth at 225 GHz, derived from WVM
  • Tau - mean zenith optical depth at the current wavelength (given by FILTER above)
  • Radius - radius in arcsec of image used in analysis
  • pixscale - pixel scale in arcsec
  • f - ITC f parameter
  • project - project ID

Note that ITC-derived values will be undefined if the PONG map size is non-standard.
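As an illustration of how you might pull the headline numbers out of the log, here is a small Python sketch. It assumes a simple whitespace-separated layout with a header line naming the columns described above - check your actual log.checkrms before relying on this:

```python
def read_checkrms(lines):
    """Parse a whitespace-separated table whose first line names the
    columns (a simplifying assumption about the log.checkrms layout)."""
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:] if line.strip()]

# Toy example with just the columns of interest here
toy = ["rms_map rms_itc",
       "2.10 1.95"]
for row in read_checkrms(toy):
    ratio = float(row["rms_map"]) / float(row["rms_itc"])
    print(f"map/ITC RMS ratio = {ratio:.2f}")
```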

So what should I look for?

The log file contains all the results of the analysis. It's worth comparing RMS_ITC with the number you determined at the time your proposal was written. It's important to be aware that the PICARD and ORAC-DR recipes estimate values from the ITC using the actual observed airmass/opacity, rather than an average obtained from the source declination. This will undoubtedly lead to differences in the ITC-derived parameters.

Perhaps the most important number is the rms_ratio, which is the ratio of RMS_MAP to RMS_ITC. Ideally this value should be 1, though in practice that's rarely the case. Sometimes it's down to the region used for calculating the RMS being too small (in this case, try increasing the MAP_RADIUS recipe parameter). If the results are very far from the expected values then contact your Friend of the Project to see if it's possible to work out why.

SUMMIT pipeline

While not yet released, the SUMMIT pipeline will also perform its own CHECK_RMS analysis and will write a corresponding log.checkrms with the same format as for the other methods.

A few things to keep in mind when this is released:

  • The NEP and derived values will be undefined (NaN) in the log file, as the SUMMIT pipeline does not have access to the QL log file data.
  • An entry is written each time a new image is created, which may yield multiple entries for a given observation. However, each entry should be self-consistent.
This blog entry will be updated with the relevant documentation when the SUMMIT pipeline starts to produce these data.

2014-01-21

Temporary problems with SKYLOOP

If you use the SKYLOOP script to create SCUBA-2 maps, you need to be aware that some issues have been reported recently with SKYLOOP that are under active investigation. One is that it can use more memory than is allowed by your system, with the result that it can be killed without warning in order to protect the rest of your system. No error or warning messages are issued when this happens - it just suddenly stops for no apparent reason. One way to avoid this is to limit the amount of memory used by makemap. To do this, run skyloop as follows:

% skyloop extra="maxmem=120000"

where the number is in units of MiB and should be changed to suit your system.

Sadly, even with this setting, things can sometimes still go wrong. This is because makemap is currently under-estimating the amount of memory it needs.

A second issue is that if you use skyloop to create a map from a single contiguous chunk of data, and then use makemap to create a second map from the same chunk of data, there are significant differences between the two maps. In theory, the two maps should be the same - skyloop should produce better results when making maps from multiple chunks of data, but they should give the same results for single chunk maps. This indicates that something - maybe several things - is wrong with skyloop. 

I'm currently investigating both these issues and am making good progress, but have not yet completely solved them. So in the meantime, beware of using SKYLOOP and watch this space...

David

2014-01-16

Updates to SCUBA2_CHECK_CAL: 2-component fitting

Happy New Year to everyone reading this (or Hau‛oli Makahiki Hou as we say in Hawai‛i), and welcome to 2014.

The PICARD recipe SCUBA2_CHECK_CAL has been updated to use a two-component gaussian profile to fit the beam when determining an FCF from a calibration observation, as long as the signal-to-noise ratio exceeds 100. Previously, a single-component fit was used which was not constrained to be a gaussian. (If the signal-to-noise ratio is not high enough it will fall back on the old behavior.) This two-component fit is based on the FWHM and relative amplitudes of the two components derived in Dempsey et al. (2013).

This change has no effect on the ARCSEC FCFs, but results in a small (~1%) but consistent reduction in the BEAM FCFs. The effect should be small enough to be negligible, and with this change we now have slightly higher confidence in the resulting FCFs than with the old one-component fits. BEAMMATCH FCFs are likewise unaffected by the change, as testing showed they were best fit with profiles not forced to be gaussian.

You can set the behavior of SCUBA2_CHECK_CAL manually using the FIT_GAUSSIAN parameter in your config file. The default value of 2 uses a two-component gaussian fit (when S/N > 100), a value of 1 uses a one-component gaussian fit, and a value of 0 recovers the old behavior of a one-component fit not constrained to be a gaussian.
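For example, a recipe parameter file (syntax assumed - check the PICARD documentation) that recovers the old unconstrained one-component fit would contain:

```
[SCUBA2_CHECK_CAL]
FIT_GAUSSIAN = 0
```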

At the moment, to use this feature you will need to update your Starlink to the current development version. Linux users can simply rsync Starlink from JAC following the instructions at http://starlink.jach.hawaii.edu/starlink/rsyncStarlink

2013-10-22

How to lose fewer samples with NOI.BOX_SIZE

At the end of the first iteration, when the common-mode signal (COM model), low frequency drift (FLT model), and the astronomical signal (AST model) have all been removed from the data, the remaining residuals are used to estimate the variance of the high frequency noise in each bolometer (the NOI model). The reciprocals of these variances are used as weights on the second and subsequent iterations, and are applied to the time-series values that are binned to form the map.

By default, the NOI model contains a single variance value for each bolometer, and so all samples from a single bolometer get the same weight in the map. However, the NOI.BOX_SIZE parameter can be used to force each bolometer to have multiple variance estimates, with one value for each box of samples. See this post.

Whilst this is often an improvement over the default situation, in that it allows for varying noise levels, it has two potential issues:
  1. All samples in a box are given the same variance value, thus introducing the potential for steps in the weights and corresponding artifacts in the map.
  2. The noise in each box is found by taking the FFT of the data and finding the average power in the 2 to 10 Hz band. Using an FFT is problematic on short time streams, as any missing values (such as caused by steps, jumps, common-mode flagging, etc) have a proportionally larger effect, causing the FFT to be less representative of the data. A requirement that the time stream should have a majority of unflagged values in order to be usable means that a lot of data is often rejected because a reliable FFT cannot be obtained.
An alternative scheme that avoids these two issues has been added to makemap, and is enabled by setting "NOI.BOX_TYPE=1" in your configuration. This option causes the noise at a sample to be set equal to the variance of the neighbouring sample values - those that are less than half a box size (i.e. NOI.BOX_SIZE/2) away from the sample. This scheme does not involve taking an FFT and produces a continuously varying estimate of the variance, and so avoids steps in the weights used to make the map. It is also minimally affected by flagged samples, and so results in fewer data samples being flagged as unusable (as shown by the "NOISE" entry in the sample statistics at the end of each iteration).
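As an illustration of the idea (this is a sketch, not the actual SMURF implementation), the new estimator amounts to taking, for each sample, the variance of the values within half a box on either side:

```python
from statistics import pvariance

def local_variance(data, box_size):
    """For each sample, return the variance of the neighbouring samples
    within box_size/2 on either side - a sketch of the NOI.BOX_TYPE=1
    idea, not the actual SMURF code. Near the ends of the time stream the
    window is simply truncated."""
    half = box_size // 2
    out = []
    for i in range(len(data)):
        lo, hi = max(0, i - half), min(len(data), i + half + 1)
        out.append(pvariance(data[lo:hi]))
    return out

data = [0.0, 1.0, 0.0, 1.0, 0.0, 5.0]
print(local_variance(data, 4))
```

Because the window slides one sample at a time, neighbouring samples get smoothly varying variance estimates rather than the per-box steps of the FFT scheme.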