2010-03-25

SMURF and quality flagging

As promised in one of the recent operations telecons, a blog post about quality flagging, and how all the new quality flagging statistics can help you understand what's going on.

First some background. The raw data form a cube with two spatial dimensions and one time dimension. There is a single quality array with the same dimensions, which the map-maker uses to keep track of various problem areas identified both before and during the iterations. Each element of this array is an unsigned byte, and each bit is a flag. You can list the meaning of each bit using the Kappa showqual command on an exported "_res" model component (with "qua" also specified in exportndf in the dimmconfig file):

BADDA (bit 1) - "Set iff a sample is flagged by the DA"
BADBOL (bit 2) - "Set iff all data from bolo to be ignored"
SPIKE (bit 3) - "Set iff a spike is detected"
DCJUMP (bit 4) - "Set iff a DC jump is present"
PAD (bit 5) - "Set iff data are padding"
APOD (bit 6) - "Set iff data are apodized/boundary"
STAT (bit 7) - "Set iff telescope was stationary"
COM (bit 8) - "Set iff data common-mode rejected"
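To make the bit layout concrete, here is a minimal sketch of how such a quality byte could be decoded. This is an illustration only, not SMURF code; I assume "bit 1" in the listing above means the least-significant bit, so BADDA corresponds to the value 1 and COM to 128.

```python
# Sketch: decoding the per-sample quality byte (assumed bit order:
# "bit 1" in the showqual listing = least-significant bit).
QUALITY_BITS = [
    "BADDA", "BADBOL", "SPIKE", "DCJUMP",
    "PAD", "APOD", "STAT", "COM",
]

def decode_quality(q):
    """Return the names of the quality flags set in an unsigned byte."""
    return [name for i, name in enumerate(QUALITY_BITS) if q & (1 << i)]

# A sample flagged as both padding and stationary (bits 5 and 7):
print(decode_quality(0b01010000))  # -> ['PAD', 'STAT']
```

A sample only contributes to the map when its quality byte is exactly zero, i.e. `decode_quality(q)` returns an empty list.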

Note that "BADDA" used to be called "BADSAM", and "COM" is also relatively new (common-mode rejection was using BADDA previously, which got confusing).

When you run the iterative map-maker you will now get some information about the quality. When the first iteration starts you will see something like this:

--- Size of the entire data array ------------------------------------------
bolos : 1280
tslices: bnd:2000(0.2 min), map:150000(12.5 min), tot:152000(12.7 min)
Total samples: 194560000
--- Quality flagging statistics --------------------------------------------
BADDA: 68400000 (35.16%), 450 bolos
BADBOL: 69312000 (35.62%), 456 bolos
SPIKE: 0 ( 0.00%),
DCJUMP: 0 ( 0.00%),
PAD: 2560000 ( 1.32%), 2000 tslices
APOD: 0 ( 0.00%), 0 tslices
STAT: 2560000 ( 1.32%), 2000 tslices
COM: 0 ( 0.00%),
Total samples available for map: 123600000, 64.38% of max


The first part of this output simply describes the dimensions of the entire chunk that is currently being processed. With a single sub-array, the number of bolos is 1280 (ignoring for a moment how many are actually working), and "tslices" is the number of time slices (200 Hz samples). The tslices count is broken down: "bnd" is the number of samples in the boundary region -- the parts of the data that are flagged as either padding or apodization. "map" is all of the other samples -- the theoretical maximum amount of data that could go into the final map if all the bolometers were working and there were no spikes, steps or other glitches. "tot" is simply the sum of the two. The "Total samples" line is then the product of the number of bolometers and the total number of time slices in the array.

The second part of the report gives a breakdown of the quality flagging bits in the data array after loading the data and applying the flatfield, but before any of the pre-conditioning steps or the first iteration of the model components. The first column of numbers gives the total number of samples with each quality flag, followed by the percentage of the total data array size in brackets. Since both BADDA and BADBOL apply to entire detectors, the third column gives the total number of flagged samples divided by the number of time slices -- i.e. a number of bolometers. The "PAD" and "STAT" flags are already set since padding and apodization happen as the data are loaded. Since these flags apply to all bolometers at a given time slice, the third column instead gives the number of samples divided by the number of bolometers -- i.e. a number of time slices.
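The arithmetic behind the report can be checked directly. The following sketch (plain Python, using the values from the sample output above) reproduces the totals and the bolos/tslices conversions:

```python
# Reproducing the numbers in the sample report above.
bolos = 1280
tslices_bnd, tslices_map = 2000, 150000
tslices_tot = tslices_bnd + tslices_map       # 152000
total_samples = bolos * tslices_tot           # 194560000

# Per-detector flag (BADDA): samples / tslices -> number of bolometers
badda_samples = 68400000
print(badda_samples // tslices_tot)           # -> 450 (bolos)

# Per-timeslice flag (PAD): samples / bolos -> number of time slices
pad_samples = 2560000
print(pad_samples // bolos)                   # -> 2000 (tslices)

# Percentage of the whole data array
print(round(100 * badda_samples / total_samples, 2))  # -> 35.16
```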

After each iteration, a similar report will be provided:


--- Quality flagging statistics --------------------------------------------
BADDA: 68400000 (35.16%), 450 bolos ,change 0 (+0.00%)
BADBOL: 80104000 (41.17%), 527 bolos ,change 10792000 (+15.57%)
SPIKE: 0 ( 0.00%), ,change 0 (+0.00%)
DCJUMP: 2113837 ( 1.09%), ,change 2113837 (+inf%)
PAD: 2560000 ( 1.32%), 2000 tslices,change 0 (+0.00%)
APOD: 5120000 ( 2.63%), 4000 tslices,change 5120000 (+inf%)
STAT: 2560000 ( 1.32%), 2000 tslices,change 0 (+0.00%)
COM: 10640000 ( 5.47%), ,change 10640000 (+inf%)
Total samples available for map: 107990669, 57.79% of max
Change from last iteration: -15609331, -12.63% of previous


At the end of the first iteration, this will report quality flags that were set both during pre-processing and during the first calculation of the model components. While some of the quality bits are static throughout the iterations (BADDA, PAD, APOD), the others can change each time. The final columns give the change, both as an absolute number of samples since the last iteration and as a percentage of the count from the previous iteration (+inf if the previous iteration had none flagged). Here we see that the number of bad bolometers (BADBOL) has increased from 456 to 527 (about 15%) relative to the initial list of good bolometers. These extra detectors are almost exclusively rejected during calculation of the common-mode signal (their shape is a significant outlier from all the other detectors); for this reason the increase in COM (in samples) is nearly identical to that in BADBOL. No spikes were flagged during this iteration (SPIKE), but about 1% of the data was flagged as having DC steps (in addition to the steps themselves, a small region around each step is also flagged). Finally, "STAT" identifies parts of the data where the telescope was not moving (threshold speed given by "flagstat" in the dimmconfig file). For these data it only flags the padding at the start and end of the data! (This is clearly not useful behaviour, so I will fix it to ignore those parts of the data in the future.)
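The "change" columns can be sketched as a simple pair of numbers, shown here for the BADBOL values from the two reports above (illustrative helper, not SMURF code):

```python
# Sketch of how the "change" columns are derived: the absolute change
# in flagged samples since the previous iteration, and that change as
# a percentage of the previous count (infinite when starting from 0).
def flag_change(prev, curr):
    delta = curr - prev
    pct = float("inf") if prev == 0 else 100.0 * delta / prev
    return delta, pct

# BADBOL between the first report and the post-iteration report:
delta, pct = flag_change(69312000, 80104000)
print(delta, round(pct, 2))   # -> 10792000 15.57
```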

At the end of the individual quality stats is a summary for the entire array. "Total samples available for map" is the number of samples with no quality bits set, and it is also expressed as a percentage of the theoretical maximum discussed above for the "Size of the entire data array" at the start. The final line shows how this number has changed since the previous iteration.

In summary: watch the number of samples (and percentage) of the data going into the final map. If we typically have about 50% working bolometers, we should expect a number about that large going into the map at the end! If that number falls precipitously (or goes to 0), this report will help you to identify the problem. The usual culprits are DC step flagging, and common-mode rejection.

2010-03-18

SMURF cookbook

For the benefit of those not at the telecon this morning (or those that zoned out...)

The question was "is there a DR cookbook". The answer is yes, but it is about two months out of date. It should serve as an introduction, with the material in this blog providing an update. Of course, we will bring the cookbook up to date as things stabilise.

I have added a link to the SMURF cookbook in the "Useful links" sidebar section to the right. If you have a current Starlink installation you can also type "showme sc19" for the cookbook and "showme sun258" for the SMURF manual.

2010-03-17

Flatfielding updates

Summary: Flatfield ramps work really well and SMURF can now automatically handle them in the map-maker.

SCUBA-2 bolometers need to be calibrated to understand how they respond to varying signal coming from the sky and astronomical object. The original plan was to calibrate in the dark (shutter closed). The sequence goes something like:

  1. Select a reference heater value, take a dark frame
  2. Choose a new heater setting, take a dark frame
  3. Take a dark frame at the reference heater value
  4. Choose a different heater setting, take a dark frame
  5. Take a dark frame at the reference heater value
and continue until you have covered a reasonable range of heater settings.

As the heater is changed the bolometers read out a different current. Any drifts in the instrument are compensated for by averaging the surrounding reference frames and subtracting, so you end up with a curve that goes through zero power at the reference heater value. To convert this to a flatfield you either fit a polynomial as a function of measured current (so that you can look up the power), or else use "TABLE" mode and linearly interpolate between the measurements either side of the measured current. The gradient of the curve (how the bolometer responds to changes in power) is the "responsivity", measured in amps per watt. The responsivity image can be calculated using the SMURF calcflat command.

When you open the shutter the idea is that you "heater track" to the sky: you adjust the heater power such that the sky power detected by a bolometer results in the same current as was measured in the dark. We do this by looking at the signal from a set of tracking bolometers and assuming that those bolometers are representative of the others on the array. In reality about 80% of the bolometers do read more or less the same signal before and after opening the shutter, but the other 20% end up in a completely different place. This would not be a problem if the responsivity didn't change for those 20%, but unfortunately it does. We verified this by taking finely-spaced pong maps of Mars covering a 6x6 arcmin area; this takes about 15 minutes but gives us a beam map of every single bolometer. Analysing the Mars images showed that the bolometers with the lowest responsivity also measure a very low integrated flux for Mars, so the calibration does change when the shutter is opened.

The solution for this was to change flatfielding to work on the sky rather than in the dark. This works in just the same way as previously, using reference sky measurements to compensate for drift, and the top plot in the figure shows a sky flatfield that is working pretty much perfectly. Finely-spaced maps of Mars confirm that all the bolometers are calibrated to within 10% with no drop off for the low responsivity bolometers.

At this point things were looking good, but we still had the issue that a sky flat takes a few minutes and really has to be done every time you do a new setup, and probably at least once an hour. Sky flats are also very dependent on observing conditions, as could be seen on 20100310 and a few days beforehand, when the sky was terribly unstable despite brilliantly low opacity (0.03 CSO tau). The middle plot below shows a sky flat from 20100310, and it is immediately obvious that the sky is varying very fast, over a much larger range than the heater is adjusting for. This flatfield failed to calibrate any bolometers at all and we had to resort to dark flatfields to get a baseline calibration (with the associated worries described above).

We had known this was going to be an issue, so in the early part of the year we modified the acquisition system to do fast flatfield ramps. Rather than setting the heater, doing an observation, changing the heater and doing another observation, we can now change the heater value at 200 Hz (currently we take 3 measurements at each setting). On 20100223 we enabled sky flatfield ramps at the start and end of every single mapping observation, and a few days later we added them to focus, pointing and sky noise observations. The bottom plot shows the flatfield ramp for the observation that immediately followed the discrete sky flatfield shown in the middle plot. There is an issue with the very last ramp, but the flatfielding software in SMURF had no problem calculating a flatfield for 850 bolometers (SMURF does compensate for drift in the reference heater values). The flatfield ramps are going to help enormously with calibration.

Actually using these flatfields in the map-maker took some work, but yesterday I committed changes to SMURF so that flatfield ramps will be calculated and used when flatfielding data in the map-maker (and other SMURF commands). All you need to do is give SMURF all the files from an observation and it will sort everything out.

I have updated the /stardev and /star rsync servers in Hilo (64-bit and 32-bit). There is also a new nightly build available for 64-bit OS X Snow Leopard in the usual place.

One final caveat: we have not yet calibrated the resistance of each bolometer relative to the nominal 2 ohms. We have taken data looking at a blackbody source, which should give us a way of tweaking the resistances. When that happens the flatfielding will change slightly and maps will need to be remade (although how critical that is will depend on how much we tweak the resistances).


2010-03-09

JLS/ACSIS DR telecon – 20100304

Attendance: GAF, JH, CW, PR, JDF, TJ, FE, HST, ACC

Actions from last meeting:
  • JAC to make a more compact and readable QA report format and make this log available to observers/co-Is following nightly reduction via the OMP (as a downloadable file).
    • update: there had been correspondence between JennyH and BradC and although there had been some work done, this item had not been completed before BradC left – ONGOING
  • JLS teams to provide feedback on what should go in parameter file (once JAC have made initial list). – ONGOING
  • BradC to work through SLS pipeline and QA requirements as provided to him (and available on SLS wiki)
    • update: following BradC's departure, this will have to be picked up by his replacement (hopefully by next month) – ONGOING
  • For JLS teams to provide JAC with images/data/log of spikes when they come across them in their data.
    • update: JAC had received some spike examples from SLS. These have been forwarded to DaveB but given his current priorities we may not see any movement on this for a few months yet (summer maybe?) – ONGOING
  • ChristineW to pass on NGS QA and moment map making scripts to JAC. – DONE
    • ACTION: TimJ to look into whether BradC coded this into the pipeline.
  • AntonioC to organise the production of pipeline documentation
    • update: some minor organisation of documentation but this has fallen down the list of priorities since the start of SCUBA-2 commissioning – ONGOING/STALLED
  • AntonioC to email coords asking them to provide email contacts of people who are reducing ACSIS data and are willing to act as ACSIS DR contacts. – DONE
  • JLS coords to provide email contacts of people who are reducing ACSIS data and are willing to act as survey DR contacts (send to FrossieE). – DONE

News from JAC
  • The software group took a big hit to its available effort when BradC left for new employment in February 2010. We are working on buying in some effort from an experienced ex-Starlink programmer to pick up on many of the tasks that Brad left. This person's effort should start becoming available to us from April 2010.
  • since the commissioning of SCUBA-2 started and the subsequent onset of SCUBA-2 Shared Risks Observing, JAC has been inundated with SCUBA-2 data
  • we can now re-reduce data at the JSA across different UT nights to create so-called "project products". This is a very important step and these products are essentially the Advanced Data Products (ADPs) that the JSA will serve.
  • FrossieE continues to submit data reduction jobs to the JSA but is experiencing problems as makecube has been timing out (in project mode) when cubes are very, very large
Report from GBS
  • a meeting of the GBS was held in Leiden in February 2010; QA was discussed at that meeting
  • there are still missing QA steps in the pipeline reduction. For instance, bad scans are propagating through to the final product (bad scans = one or a combination of: high rms, many bad pixels, high variation across spectra)
    • ACTION JAC: this is a high priority for the ACSIS pipeline at the JSA and JAC will look into this when effort materialises
Report from NGS
  • no issues reported
  • observing on the ACSIS portion of the survey has completed.
  • the team is reducing the data themselves via student effort; they will then want to upload the final products to the archive
Report from SLS
  • the team is focussing on getting all data reduced so that it can then attack QA issues consistently
    • ACTION: GaryF to direct FrossieE to SLS wiki where QA requirements are published
  • PaulR and HollyT have enough data now to compare pipeline reduced data with hand reduced data
    • ACTION : PaulR to forward DR documentation to JAC – DONE

Next meeting May 6th.

2010-03-01

Data processing at CADC

Right now we are having trouble reducing SCUBA-2 data for the JCMT Science Archive (JSA). Briefly, the problems seem to be:

  • The JSA wrapper is reporting success even in some cases of failure
  • Some maps take so long to make that MAKEMAP times out (this will also affect processing on private computers)
  • There is a bizarre NFS problem on the processing nodes that causes required perl modules to "disappear" when the pipeline is looking for them.
We're working through fixing these, so until then product availability is a bit patchy. The good news is that processing throughput is great, so catching up with the backlog does not seem to be a problem.