2010-09-30

Suppressing large-scale noise with zero-masks

Over the last couple of weeks we have been implementing and testing two new masking features that significantly improve the large-scale noise properties. As mentioned in an earlier post, the common-mode signal (and also the low frequencies removed using a high-pass filter in the FLT model) is degenerate with angular scales larger than the array footprint on the sky.

In that earlier post, I described a method of specifying a zero boundary condition in which the regions of the map with low hits (the edges) are set to zero after each iteration. While this helped considerably with smaller maps (particularly pointing observations), it was of little use for much larger maps (i.e., those covering far more area than the array footprint): "saddle" shapes would appear all over the map as the number of iterations was increased. We have therefore added two new masking options.

The first is to define a circular masked region:

ast.zero_circle = (LON,LAT,RADIUS)
or
ast.zero_circle = (RADIUS)

In this case, the portion of the map beyond a circular region centred on LON,LAT with the given radius (all quantities in degrees) is constrained to zero. Alternatively, a single value may be supplied, in which case it is interpreted as the radius, and LON,LAT are taken to be the pointing centre of the map (usually the compact source of interest, for which this option is most useful).
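
For example, to constrain everything beyond 60 arcsec of the pointing centre to zero (the radius used in the bright/compact configuration described below), the radius is given in degrees:

ast.zero_circle = (0.0167)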

The second method automatically determines the masked region based on a SNR threshold.

ast.zero_snr = SNR

In this case, after each iteration of the map, all map pixels above the threshold SNR are included in the "source mask", and map pixels outside this mask are constrained to zero. As bright structures "grow" with respect to the neighbouring zero-constrained map pixels, the source mask will also grow with each iteration to include more pixels.

For both of these options, the ast.zero_notlast flag applies: if it is set, the zero constraint is not applied on the final iteration.
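
For example, the SNR-masked reduction shown below uses:

ast.zero_snr = 5
ast.zero_notlast = 1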

So from a numerical point of view, what does this constraint do in the regions that were constrained to zero until the last iteration? In its simplest form the map-maker is trying to solve for several signal components in the bolometer time series:

Signal = ASTronomical + COMmon + FiLTered

where AST is the signal produced by astronomical sources, COM is the average signal seen by all of the bolometers, and FLT is generally residual low-frequency noise. By setting AST to zero at any location in the map, the only other places that signal can go are COM and FLT. COM contains only the average signal seen by all of the detectors, and FLT contains only low-frequency signal components. Therefore this operation, perhaps surprisingly, will not remove any compact structures in the zeroed regions; the effect seems to be roughly equivalent to setting the mean value of array-footprint-sized patches within those regions to zero.
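
Conceptually, the SNR-based constraint applied after each iteration looks something like the following Python sketch (illustrative only, not the actual SMURF implementation):

import numpy as np

def zero_constrain(ast_map, variance, snr_thresh=5.0):
    # Pixels below the S/N threshold are constrained to zero, so any
    # signal there can only be absorbed by the COM and FLT models on
    # the next iteration. Pixels above the threshold form the "source
    # mask", which grows as bright structures grow between iterations.
    snr = ast_map / np.sqrt(variance)
    source_mask = snr > snr_thresh
    return np.where(source_mask, ast_map, 0.0), source_mask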

The updated dimmconfig_bright_compact.lis is a good default configuration for small/bright sources; in this case a circular mask is used with a radius of 60 arcsec. Similarly, there is an updated dimmconfig_bright_extended.lis that masks based on an SNR threshold of 5-sigma. We illustrate the effect of the SNR masking using the short observation of Orion mentioned in the previous blog post, s4a20100219_00039.
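
For reference, such a reduction might be run with something like the following (the input file specification is illustrative):

; makemap in='s4a20100219_00039*.sdf' out=orion850 config='^dimmconfig_bright_extended.lis'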

First, a map produced without any zero constraints and 20 iterations, clearly showing the large-scale degeneracy that results in a strong gradient in the map:



Here is the same reduction, but setting ast.zero_snr=5 and ast.zero_notlast=1:



and the mask indicating which pixels are constrained to zero until the final iteration (white) -- this is the QUALITY component of the final image:



Note that 20 iterations were used in this example to illustrate the improved convergence properties when using the zero mask; in fact, only about 8 iterations are needed in this particular case. A better automatic convergence test is still an outstanding issue for SMURF, and at present we recommend checking the maps visually after each iteration.

2010-09-21

SCUBA-2 Pointings in the science archive

This is just a quick note to explain what has been happening to pointing observations in the science archive. Pointings are public data, so reduced images of pointing sources account for a significant fraction of the publicly available SCUBA-2 reduced data. We have realised that these data are problematic: we were running the standard pointing recipe at CADC, which filters out large-scale structure and clips the image in order to make it easier to detect the pointing source. This is not obvious to the casual user of the archive and has recently caused some confusion.


The image above (a 20 second 450 micron observation from 2010-02-19 #39) is the standard product for a pointing of OMC-1. As is obvious, a lot of extended emission has disappeared. This week we are going to start reprocessing pointing observations using the standard recipes so that the archive contains science products rather than products that aid telescope observing. As an example the image below is the same observation using the standard recipe and it is much more representative of what SCUBA-2 is capable of:

2010-09-13

Spatial noise vs noise maps

I've been comparing the spatial noise in the two SRO SASSY maps (0.5 degrees by 0.5 degrees each) to the noise maps produced by the pipeline and the matched-filter PICARD script. I've been told that this is a known issue, but that I should share my findings anyway. To do this I made histograms of the signal-to-noise pixel values within the two maps that were processed using the matched-filter PICARD script. Since the maps are mostly empty, and the matched-filter script should remove most of the artificial large-scale structure introduced by the map-maker, the spatial noise should agree well with the noise maps produced by the pipeline.
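
The check itself is simple; something like the following Python fragment (illustrative, not the actual script used) captures the idea:

import numpy as np

def snr_histogram_width(signal, variance):
    # For a mostly empty map with an accurate noise map, the
    # signal-to-noise pixel values should form a Gaussian of standard
    # deviation 1. A width of, say, 1.17 means the noise map
    # underestimates the spatial noise by about 17%.
    snr = signal / np.sqrt(variance)
    return np.nanstd(snr)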


850μm maps: For the two SASSY fields (one with 29 mJy rms and another with 47 mJy rms), the noise map underestimates the spatial noise by about 17% (± about 2%, since I'm doing this by eye).

450μm maps (we don't expect to see any astronomical signal in these): For the NGC 2559 field with 1700 mJy rms, the noise map underestimates the spatial noise by about 29%. For the cirrus field with 2400 mJy rms, the noise map overestimates the spatial noise by about 13%.

Hopefully this is helpful for others who are trying to figure out the amount of noise in their maps and are wondering about the accuracy of the noise maps produced by the pipeline.

2010-09-09

S2SRO data processing notes

A bunch of S2SRO people are asking whether they can just download the JSA products and go, without doing any of the data processing themselves. The answer to that is obviously yes, as long as you have convinced yourself that the JSA processing (which is also the default ORAC-DR processing) in combination with your data characteristics will support your scientific conclusions.

In aid of this, Ed Chapin has provided the following note summarising what we are doing to your data (thanks Ed).

The S2SRO data maps

At the time the data were acquired, SCUBA-2 was operating with single subarrays at each of 450um and 850um (the full complement is four subarrays in each band). In addition, there were a number of challenging data artifacts to overcome, including glitches in the bolometer time-series, and severe low-frequency noise caused, primarily, by oscillations in the SCUBA-2 fridge and magnetic field pickup (for further details, see SC/19 Appendix A). Despite these problems, it has been possible to produce science-grade maps for most of the S2SRO projects using one of two standard sets of configuration parameters for the map-making software (see the SMURF manual).

The first set is designed to produce maps of extremely faint fields containing point sources, using strong high-pass filtering to remove most of the low-frequency noise in the data (see dimmconfig_blank_field.lis in SC/19). It should be emphasized that any structures larger than the point spread function (PSF) are removed by the processing. In addition, these maps are run through a matched-filter to assist with the identification of point-sources (see further description below in "Noise Estimates").

The second set of parameters is used to reduce all of the remaining data, including maps of extended and/or bright objects (see dimmconfig.lis in SC/19). The filtering is less aggressive in this case, and it is performed iteratively, which reduces ringing around bright sources but may produce large-scale patchy or saddle-shaped structures in the maps.

For both types of processing, a common-mode signal, the average signal recorded by all of the detectors at each instant in time, has also been removed iteratively from the data. This processing, again, helps to suppress low-frequency noise, but removes all structure in the maps that is larger than the array footprint (approximately 2' x 2').


Calibration

Extinction correction has been performed on all of the data using the line-of-sight water vapour radiometer (WVM) on the JCMT. It is not currently believed that the calibration uncertainties are dominated by noise in these measurements.

Flux conversion factors (FCFs) have also been applied to all of the data. These factors have been established using regular observations of point-like flux calibrators. While large variations have been observed in these factors, it is not presently understood whether they are real systematic variations (i.e. as a function of time of day, surface quality, etc.) or simply reflect measurement uncertainties (see SC/19 Appendix B.1). We therefore apply a single FCF at each wavelength within the JSA pipeline and, assuming the errors are not systematic, estimate the total calibration uncertainty to be about 20% for any given observation.

Unless there are reasons to question the calibration factors for a particular observation, we believe that the current values are the best that can be applied.


Noise Estimates

The map-making process provides an estimate of the noise in each pixel (the VARIANCE component). Its value is calculated by:

  • (i) estimating the weighted sample variance of the N bolometer samples that land in the pixel (the samples are averaged together, weighted by the inverse of the noise in the bolometers from which they came, to estimate the mean flux in that pixel);

  • (ii) dividing the result of (i) by N to calculate the variance on the mean (i.e. the variance on the estimated flux); a sketch of this calculation in code is given below.
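
The sketch (in Python, assuming inverse-variance weighting; not the actual SMURF implementation):

import numpy as np

def pixel_variance(samples, noises):
    # samples: the N bolometer samples landing in this map pixel
    # noises:  noise estimate for the bolometer each sample came from
    w = 1.0 / noises**2                      # inverse-variance weights
    mean = np.sum(w * samples) / np.sum(w)   # weighted mean flux
    # (i) weighted sample variance of the contributing samples
    var = np.sum(w * (samples - mean)**2) / np.sum(w)
    # (ii) variance on the mean: divide by the number of samples
    return var / samples.size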
On small scales, this noise estimate appears to be accurate. For example, in maps produced using the faint-field processing, which are run through a matched-filter that correlates samples on the scale of the PSF, the noise is well-behaved (see Section 6.1 and Figs. 13-14 in SC/19).
However, this noise map does not accurately characterize the uncertainty in aperture-photometry measurements on scales larger than the PSF, especially in the case of bright-field processing, since low-frequency noise is correlated between different pixels. In these cases, the noise does not integrate down as sqrt(M) as one would expect for independent pixels, where M is the number of pixels in the aperture. For example, if a cluster of pixels all happen to reside on a local positive noise patch, the uncertainty in the photometry of that cluster is dominated by the error in the baseline of the patch, not the small-scale noise estimated for each pixel. It is therefore necessary to estimate the noise for larger-scale measurements empirically, by placing measurement apertures at random locations in the map to obtain the real (and larger) uncertainties.
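
A minimal sketch of such a random-aperture estimate (a hypothetical helper, not part of the pipeline):

import numpy as np

def aperture_noise(image, radius_pix, n_apertures=500, seed=None):
    # Drop circular apertures at random positions and return the
    # scatter of the summed fluxes. Unlike the per-pixel VARIANCE,
    # this captures noise that is correlated between pixels. (Masking
    # out real sources first is omitted from this sketch.)
    rng = np.random.default_rng(seed)
    ny, nx = image.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    sums = []
    for _ in range(n_apertures):
        cy = rng.integers(radius_pix, ny - radius_pix)
        cx = rng.integers(radius_pix, nx - radius_pix)
        mask = (yy - cy)**2 + (xx - cx)**2 <= radius_pix**2
        sums.append(np.nansum(image[mask]))
    return np.std(sums)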

Finally, we re-iterate that a common-mode signal has been removed from the data, so in practice the SCUBA-2 maps only contain information on scales ranging from the PSF (6"--14") up to the size of the array (2').

2010-09-08

Processing date now visible in JSA web UI

By popular request, you now have access to the processing date of your product when you do a JSA search - it's the field at the bottom of the "Program Constraints" section called, unsurprisingly, "Processing Date". Like all the other date fields, you can enter a range to search, or if you just leave it blank but tick the box, you will get the extra column of information in your results page.





Many thanks to Dustin and Sharon for rolling this out so quickly.

2010-09-07

SCUBA-2 products update - fresh batch

Just to let folks know, all the S2SRO data in the JSA has been reprocessed to pick up any observations marked as bad (and also to pick up some infrastructure changes I won't bore you about). Whether it is worth your while or not to re-download these products depends on how diligent you were about marking any bad observations since the last re-run a month or so ago.

Feel free to continue flagging bad observations as you find them - the data will be re-processed again in the future, or you can just drop us a line and ask for a reprocess of your particular project any time.

A few of the groups failed reduction. Thanks to the aforementioned infrastructure improvements we are in a much better position to keep track of these failures, and we are working through the list to resolve the issues. If you are desperate for something you don't see in the archive, don't be afraid to ask to be bumped up the list.

By the way, a few of you have noticed that the JSA "house" products look better than what you are managing at home. Rest assured that there isn't some secret sauce we are feeding the hamsters: maps will look better when processed on high-memory (16 GB+) nodes because they will have a higher effective integration time, due to the way the map-maker works. And of course the JSA processing always uses the most recent version of the software, which, unless you are one of our bleeding-edge Starlink rsync users, will net you a considerable improvement.

2010-09-01

JSA FAQ: examining data reduction parameters

If you know what hislist and provshow do, you can skip this entry.

Say you have downloaded your most excellent project product from the JSA and you are interested to see what went into it. Here's what you do (assuming, of course, that you have already sourced your Starlink startup files).

Convert the FITS file back to NDF (we'll call the result example for brevity):

; convert

; fits2ndf jcmts20100228_00040_850_reduced001_pro_000.fits example

So we now have our example.sdf.

To see the history of the file, run the KAPPA hislist (history list) command:

; kappa

; hislist example

In the voluminous output, you will see the full parameters of any command run on this file, for example, the makemap parameters that were used to generate this SCUBA-2 product:



In order to see which observations went into this product, use the KAPPA provshow (provenance show) command:

; provshow example

which will show you the entire provenance tree for your file.

Thanks to the NDF history and provenance mechanism, every operation on the data files is automatically recorded, so you have a complete record of what was done to the data.