2008-02-29

Preliminary Wrapper script released

A first stab at a wrapper script (jsawrapdr) has been released to users in Hilo. It matches the interface specified by CADC.

What it does:

  • retrieves data files from the supplied list
  • converts them to NDF if required
  • determines the correct ORAC-DR instrument name based on the data
  • checks that PRODUCT information matches for all files
  • determines whether to run ORAC-DR or PICARD
  • converts products back to FITS
What it doesn't do yet:

  • Provenance is not quite correct. It is possible to refer to a parent that will not be archived.
  • There is no standardised approach to logging standard output and standard error.
  • dpCapture does not automatically copy products to the CADC transfer directory.
It's enough to get us started.
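
To make the header checks concrete, here is a minimal sketch (in Python) of the PRODUCT-consistency test and the raw-versus-product decision. It is an illustration only: the real jsawrapdr is Perl-based and reads the headers from the data files themselves, and the BACKEND-to-instrument mapping shown here is an assumption.

    # Illustration only -- not the real jsawrapdr logic.
    # Hypothetical mapping from a backend/instrument header value to the
    # ORAC-DR instrument name.
    BACKEND_TO_INSTRUMENT = {"ACSIS": "ACSIS", "SCUBA-2": "SCUBA2"}

    def classify(headers):
        """Given one header dict per file, check consistency and pick a pipeline."""
        products = {h.get("PRODUCT") for h in headers}
        if len(products) != 1:
            raise ValueError("PRODUCT information differs between files: %r" % products)

        instruments = {BACKEND_TO_INSTRUMENT[h.get("BACKEND", h.get("INSTRUME"))]
                       for h in headers}
        if len(instruments) != 1:
            raise ValueError("mixed instruments: %r" % instruments)

        # Raw data (no PRODUCT header yet) goes to ORAC-DR; existing products
        # are post-processed with PICARD.
        pipeline = "ORAC-DR" if products.pop() is None else "PICARD"
        return instruments.pop(), pipeline

    # Example: two raw ACSIS files with matching (absent) PRODUCT headers.
    print(classify([{"BACKEND": "ACSIS"}, {"BACKEND": "ACSIS"}]))
    # -> ('ACSIS', 'ORAC-DR')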

CADC data transfers now working again

Data transfers to CADC are functioning again. We've had real problems reconfiguring replication to CADC and to our standby server (which use different techniques), but now everything seems to be working. Now that CADC have headers from recent observations, they will again start accepting our raw data. Transfers have been initiated and are currently complete up to 20080215 (there are quite a few files to transfer). Data retrieval requests from users via the OMP will shortly be redirected back to CADC.

2008-02-25

ORAC-DR: CADC+batch mode

A brief mind-dump on the processing steps ORAC-DR will probably take at CADC when run in batch mode (a rough command-level sketch follows the list):


  • _cube files are created. These are then forgotten about by ORAC-DR but are stored by CADC.
  • Run initial steps of Remo's script on time-series data. Removes any gross time-series signal through collapsing and rudimentary linear baselining.
  • Run MAKECUBE using every member observation of a Group, creating tiles.
  • Run remainder of Remo's script on each tile, which uses a combination of smoothing and CUPID to create baseline region masks and to remove baselines.
  • Take the baseline region mask from the previous step along with the original input time-series data, and throw them through UNMAKECUBE. This will create time-series masks.
  • Apply the time-series masks to the original time-series data.
  • Run MFITTREND with a high-order polynomial (or spline, or whatever) on the masked time-series data. These cubes shouldn't have any signal and should be pure baseline.
  • Subtract the baselines determined in the previous step from the original time-series data.
  • Run MAKECUBE on the baselined time-series data for each observation to create the _reduced / _rimg / _rsp files.
  • If necessary, WCSMOSAIC the _reduced files for each observation to create an "even better" group, which can then be used to determine a better mask and then possibly iterate through the UNMAKECUBE to _reduced generation steps.
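
A rough command-level sketch of that sequence, assuming the Starlink tasks are driven through their usual $KAPPA_DIR/$SMURF_DIR links from Python. File names, group lists and parameter values are placeholders, and the UNMAKECUBE/MFITTREND parameter names in particular are assumptions; in practice the ORAC-DR recipes (plus Remo's script) handle all of this internally.

    # A sketch only: parameter names (especially for UNMAKECUBE and MFITTREND)
    # are assumptions, and all file names are placeholders.
    import os
    import subprocess

    KAPPA = os.environ["KAPPA_DIR"]   # Starlink task links, e.g. $KAPPA_DIR/mfittrend
    SMURF = os.environ["SMURF_DIR"]

    def star(pkg, task, *params):
        """Run a Starlink task with keyword=value parameters."""
        subprocess.check_call([os.path.join(pkg, task)] + list(params))

    # Group cube built from every member observation of the Group (tiled).
    star(SMURF, "makecube", "in=^cleaned_ts.lis", "out=group_cube")

    # ... smoothing + CUPID turn the group cube into a baseline-region mask ...

    # Project that mask back into time-series space against the original data.
    star(SMURF, "unmakecube", "in=group_mask", "ref=^raw_ts.lis", "out=*_tsmask")

    # Fit high-order baselines to the masked (signal-free) time series, then
    # subtract them from the original data (one SUB call per time-series file).
    star(KAPPA, "mfittrend", "in=^masked_ts.lis", "out=*_base", "axis=1",
         "order=5", "subtract=false")
    star(KAPPA, "sub", "in1=obs1_ts", "in2=obs1_ts_base", "out=obs1_ts_bl")

    # Per-observation cubes (_reduced / _rimg / _rsp) from the baselined data.
    star(SMURF, "makecube", "in=^obs1_baselined.lis", "out=obs1_reduced")

    # Optionally mosaic the per-observation cubes into a better group cube and
    # iterate from the UNMAKECUBE step.
    star(KAPPA, "wcsmosaic", "in=^reduced.lis", "out=group_better")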

The Wrapper

Background: there is a system called, imaginatively, the wrapper. Its purpose is to wrap the data processing specifics so as to present a generic interface to the CADC data processing system that is under development.

The wrapper is on TimJ's to-do list, and so is at the mercy of his higher-priority SCUBA-2 work. In an attempt to push something out to CADC before working on the SCUBA-2 translator, he is writing a prototype with the following functionality (a skeletal sketch follows the list):


  • has a stub dpRetrieve, emulating the system that will eventually fetch the data needed from the CADC database
  • examines the data to determine whether it is raw or already a product
  • converts any FITS files to NDF
  • runs ORAC-DR or PICARD as appropriate given the above information
  • converts any NDF products back to CADC-compliant FITS
  • calls a stub dpCapture (the real dpCapture imports any products into the CADC system)
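
The sketch below shows that flow in skeletal form; the real prototype is Perl-based, so the helper names and invocations here are purely illustrative, with the dpRetrieve/dpCapture stubs left empty and the FITS/NDF conversion done through the Starlink CONVERT tasks.

    # Skeleton only: function names and invocation details are illustrative,
    # not the real wrapper code.
    import os
    import subprocess

    CONVERT = os.environ["CONVERT_DIR"]   # Starlink CONVERT task links

    def dp_retrieve(file_list):
        """Stub: will eventually fetch the requested files from the CADC database."""
        with open(file_list) as fh:
            return [line.strip() for line in fh]

    def dp_capture(products):
        """Stub: the real dpCapture imports products into the CADC system."""
        pass

    def looks_raw(ndfs):
        """Stub: examine the data to decide raw versus product."""
        return True

    def collect_products():
        """Stub: gather the NDF products written by the pipeline."""
        return []

    def wrap(file_list):
        files = dp_retrieve(file_list)

        # Convert any FITS inputs to NDF for processing.
        ndfs = []
        for f in files:
            if f.lower().endswith(".fits"):
                ndf = os.path.splitext(f)[0]
                subprocess.check_call([os.path.join(CONVERT, "fits2ndf"),
                                       "in=" + f, "out=" + ndf])
                f = ndf
            ndfs.append(f)

        # Raw data goes to ORAC-DR, existing products to PICARD.
        pipeline = "oracdr" if looks_raw(ndfs) else "picard"
        subprocess.check_call([pipeline] + ndfs)   # hugely simplified invocation

        # Convert NDF products back to CADC-compliant FITS and hand them over.
        products = []
        for n in collect_products():
            subprocess.check_call([os.path.join(CONVERT, "ndf2fits"),
                                   "in=" + n, "out=" + n + ".fits"])
            products.append(n + ".fits")
        dp_capture(products)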


The main problem that stops this from being more than a prototype is the provenance system. In our NDF-based systems provenance is a time series: file A turns into file B, which turns into file C, and so on, eventually resulting in file E, the final product. So the provenance looks like this: A, B, C, D, E. In the CADC system, provenance is the nearest parent existing in the archive. So, if only A, B, D and E exist in the database (because C happens to be an intermediate file of no lasting importance), the provenance for E is D, but the provenance for D is B. Therefore, the wrapper has to make sure that at the end of any processing the provenance is correctly fixed to display only parents existing in the CADC archive.

The intended solution for this is for DavidB to commit some NDG patches to allow TimJ/the wrapper to remove C (in the previous example) from the provenance. Also, the wrapper needs to rename A, B, D and E to the CADC naming convention so they can be matched to entries in the archive.
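
As a toy illustration of the pruning involved (not the real NDG/provenance interface), take the A-to-E chain from the previous paragraph: each product gets re-linked to its nearest ancestor that will actually exist in the archive, and non-archived intermediates drop out.

    # Toy model of provenance pruning; the real fix will use the NDG patches
    # mentioned above, not this code.
    def prune(parents, archived):
        """Map each file to its nearest ancestor that exists in the archive.

        `parents` maps each file to its immediate parent (None for raw data);
        `archived` is the set of files that will exist in the CADC archive.
        """
        pruned = {}
        for child, parent in parents.items():
            while parent is not None and parent not in archived:
                parent = parents.get(parent)   # skip non-archived intermediates
            pruned[child] = parent
        return pruned

    # The example from the text: C is an intermediate of no lasting importance.
    chain = {"A": None, "B": "A", "C": "B", "D": "C", "E": "D"}
    print(prune(chain, archived={"A", "B", "D", "E"}))
    # -> {'A': None, 'B': 'A', 'C': 'B', 'D': 'B', 'E': 'D'}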

So as not to hold other parts of the project up, the intent is for the prototype to be delivered to CADC in the next few days without this provenance-related functionality, and to come back and fix this when the SCUBA-2 translator work allows.

2008-02-13

OMP to CADC connection is down

At the tail end of last week we had a hardware failure at the summit with our primary OMP database server. We switched to the new Sybase 15 64-bit servers, but they have not been configured correctly to replicate the JCMT header table to CADC (full database replication is working to the backup 64-bit server in Hilo). Until we get the CADC replication up and running there will be no transfers of raw data to CADC. This is because CADC validates transfers against its copy of the header table and rejects observations that are unknown to them.

We hope to have replication running by early next week, but in the meantime we have reconfigured OMP data retrievals to serve the raw data files from JAC.

2008-02-11

specx2ndf now creates variance

Forgot to mention that I've modified specx2ndf so that it creates a Variance component in the output NDF based on the Tsys value in the specx file. The variance values in the output NDF are constant, since each specx file seems to contain only a single Tsys value.
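
For reference, the standard way to turn a system temperature into a per-channel noise estimate is the radiometer equation; something along these lines is presumably what specx2ndf does, although the exact expression and scaling factors it applies have not been checked here and should be treated as an assumption.

    # Radiometer-equation sketch (an assumption, not verified against specx2ndf).
    import math

    def variance_from_tsys(tsys_k, channel_width_hz, integration_time_s):
        """Per-channel variance (K**2) implied by a single Tsys value."""
        sigma = tsys_k / math.sqrt(channel_width_hz * integration_time_s)
        return sigma * sigma

    # One Tsys per specx file means one constant variance for the whole spectrum.
    print(variance_from_tsys(tsys_k=300.0, channel_width_hz=61.0e3,
                             integration_time_s=30.0))
    # -> roughly 0.05 K**2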

2008-02-04

Processing 3D cubes with FFCLEAN

I've just modified kappa:ffclean so that it can:
1) process 3D cubes. It will do this either by processing the cubes as a set of independent 1D spectra, or as a set of independent 2D images (see new parameter AXES)
2) store the calculated noise level in the output variance array (see new parameter GENVAR)

This was motivated by my experiments with the new smurf:unmakecube command as a means of getting an estimate of the noise level in each residual spectrum.
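
For example, something along these lines should clean each spectrum of a (ra,dec,spec) cube independently and write the estimated noise into the output Variance array; the AXES, CLIP and BOX values shown are placeholders rather than recommendations.

    # Sketch of calling the updated FFCLEAN on a (ra,dec,spec) cube; the AXES,
    # CLIP and BOX values are placeholders, not recommendations.
    import os
    import subprocess

    ffclean = os.path.join(os.environ["KAPPA_DIR"], "ffclean")
    subprocess.check_call([ffclean,
                           "in=residual_cube", "out=cleaned_cube",
                           "axes=3",        # treat the cube as independent 1D spectra
                           "genvar=true",   # store the estimated noise as Variance
                           "clip=[3,3,3]", "box=51"])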

2008-02-01

Creating artificial time series from a sky cube

I've just added a new command to smurf called UNMAKECUBE. It takes a (ra,dec,spec) sky cube and generates artificial time series by resampling the sky cube at the detector positions in a set of existing reference time series NDFs. It's a sort of inverse to MAKECUBE. It should be useful for investigating baselines, and for iterative map making. I'm currently playing around with it, using data from Christine Wilson.
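
A minimal example of the sort of invocation involved, assuming the parameters are IN (the sky cube), REF (the existing reference time-series NDFs) and OUT; check the SMURF documentation for the real interface.

    # Resample a sky cube at the detector positions of existing time-series
    # data.  Parameter names are assumptions; see the SMURF help for UNMAKECUBE.
    import os
    import subprocess

    unmakecube = os.path.join(os.environ["SMURF_DIR"], "unmakecube")
    subprocess.check_call([unmakecube,
                           "in=sky_cube",          # the (ra,dec,spec) cube
                           "ref=^timeseries.lis",  # existing time-series NDFs
                           "out=*_sim"])           # artificial time series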