2009-06-03

Pipeline now generates thumbnails

The data reduction pipeline now automatically generates PNG thumbnails of rimg and rsp files. The thumbnails are generated in three sizes: 64x64, 256x256, and 1024x1024. Exif information is also written to each thumbnail, embedding the RA, Dec, source name, orientation, and pixel scale. Only the astro: namespace is currently used (see this ROE page for more information).
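In outline, the thumbnail step works something like the sketch below (Python with Pillow). This is an illustration only: the function name is invented, and for simplicity the metadata is stored as PNG text chunks under an "astro:" prefix rather than as true Exif tags, since plain Pillow has limited Exif-in-PNG support. The real pipeline is not this code.

```python
from PIL import Image
from PIL.PngImagePlugin import PngInfo

THUMB_SIZES = (64, 256, 1024)  # the three standard sizes

def write_thumbnails(image, basename, meta):
    """Write a PNG thumbnail of `image` in each standard size.

    `meta` is a dict of header values (e.g. RA, Dec, source name,
    orientation, pixel scale), stored here as PNG text chunks with
    an "astro:" prefix.  Returns the list of filenames written.
    """
    info = PngInfo()
    for key, value in meta.items():
        info.add_text("astro:" + key, str(value))
    names = []
    for size in THUMB_SIZES:
        thumb = image.copy()
        thumb.thumbnail((size, size))  # in place; preserves aspect ratio
        name = "%s_%d.png" % (basename, size)
        thumb.save(name, pnginfo=info)
        names.append(name)
    return names
```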

Here's a (rather boring) example from last night:

[example thumbnail image]

In the near future they will be sent to CADC for automatic ingest.

2009-05-12

Pipeline running on CADC grid engine

Last week I (Tim J) visited CADC to work on integrating ORAC-DR into the CADC grid engine system. This involved making sure that the pipeline wrapper interfaced properly with CADC and that the data retrieval and data capture routines were given the correct inputs.

On Wednesday 6th May we successfully ran four jobs in parallel on the compute cluster. This is a terrific result, and it paves the way to running the nightly processing at CADC in short order and then following up with processing of project coadds. In the next few weeks I will be working on the code that will query the data archive and submit job requests to grid engine.
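The submission side can be sketched roughly as follows (Python). The script name and job-name prefix are invented for illustration; only the standard grid engine `qsub` options (`-N` for the job name, `-q` for the queue) are real, and the actual CADC configuration will differ.

```python
def qsub_command(obs_id, script="run_oracdr.sh", queue=None):
    """Build the argument list for submitting one reduction job to
    grid engine via qsub.

    `script` and the "jsa-" job-name prefix are placeholders, not
    the real CADC setup.
    """
    cmd = ["qsub", "-N", "jsa-" + obs_id]
    if queue is not None:
        cmd += ["-q", queue]
    cmd += [script, obs_id]
    return cmd
```

Something upstream would query the data archive for unprocessed observations and hand each one to a call like this.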

It also means that, in principle, survey teams could request that jobs be submitted to grid engine to make use of the processing nodes.

2009-05-11

CADC network outage

CADC, and therefore JCMT data retrievals, will be off the air on Wednesday afternoon (HST) due to scheduled network maintenance. If you are having trouble getting through, just try again later.

2008-12-19

Near-minutely raw data movement

Raw observation data is now placed in both the jcmt database and the CADC staging area shortly after it arrives.

A new program[0] runs endlessly, checking (currently) about every minute whether there are any observations to process. If there are, the database is fed first, followed by the creation of symbolic links for CADC's consumption. This should help avoid the massive twice-daily data transfers to CADC. Note that the previously involved programs will keep running concurrently until everybody involved is satisfied that raw data is being entered and moved as desired.
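The check-feed-link loop looks roughly like this sketch (Python; the real program is Perl, and the function and directory names here are illustrative only, with `enter_into_db` standing in for the JSA::EnterData step):

```python
import os
import time

POLL_INTERVAL = 60  # seconds; matches the roughly once-a-minute check

def watch_for_observations(incoming_dir, staging_dir, enter_into_db,
                           stop=lambda: False, interval=POLL_INTERVAL):
    """Endlessly look for new raw files: feed each one to the database,
    then create a symbolic link in the CADC staging area."""
    seen = set()
    while not stop():
        for name in sorted(os.listdir(incoming_dir)):
            if name in seen:
                continue
            path = os.path.join(incoming_dir, name)
            enter_into_db(path)              # database first ...
            link = os.path.join(staging_dir, name)
            if not os.path.lexists(link):
                os.symlink(path, link)       # ... then expose to CADC
            seen.add(name)
        time.sleep(interval)
```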

All this started yesterday, on a slightly wet, cloudy Hawaiian evening.

[0] enterdata-cadc-copy.pl is a wrapper around the JSA::EnterData & JSA::CADC_Copy modules, which were generated from jcmtenterdata.pl & cadcopy.pl respectively.

2008-12-04

QA-enabled pipeline released in Hilo

ORAC-DR has been updated in Hilo to include quality-assurance testing. Based on a number of QA tests, observations are given a pass, questionable, or fail status. QA is run automatically on all science observations, and survey-specific QA parameters can be supplied.
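As a rough illustration of how per-test results might reduce to a single pass/questionable/fail status (all names, limits, and the slack factor below are invented; the real QA tests and survey-specific parameters live in the ORAC-DR configuration):

```python
PASS, QUESTIONABLE, FAIL = "pass", "questionable", "fail"

# Illustrative limits only -- not the actual survey QA parameters.
DEFAULT_LIMITS = {"tsys": 600.0, "rms": 1.5}

def qa_status(measurements, limits=DEFAULT_LIMITS, slack=1.2):
    """Reduce QA measurements to one status: a value more than
    `slack` times its limit fails the observation outright, a value
    merely over its limit makes it questionable, and otherwise the
    observation passes."""
    status = PASS
    for key, limit in limits.items():
        value = measurements.get(key)
        if value is None:
            continue  # metric not measured for this observation
        if value > limit * slack:
            return FAIL
        if value > limit:
            status = QUESTIONABLE
    return status
```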

This version will eventually be released to the summit, where it will give telescope operators feedback on which surveys are suitable to do, along with enhancing the JCMT Science Archive pipeline.

2008-06-27

SCUBA-2 DR pipeline

A belated announcement that the SCUBA-2 data reduction pipeline passed its "lab acceptance" earlier this month. Full report at http://docs.jach.hawaii.edu/JCMT/SC2/SOF/PM210/04/sc2_sof_pm210_04.pdf

2008-04-01

Initial results of "better" ORAC-DR reduction

As previously mentioned, ORAC-DR is improving how ACSIS data are reduced. To show the progress between "summit" and "better":

["summit" vs. "better" comparison images]

A few notes:
  • By the "summit" pipeline I mean the pipeline currently running at the summit. That pipeline will be replaced by an "improved" pipeline, pending JCMT support scientist approval. The "improved" pipeline will not be run at CADC; CADC will run the "better" pipeline that created the "better" images linked above.

  • The "group" summit integrated intensity map is not generated by the pipeline; it is created by manually running wcsmosaic to mosaic the individual baselined cubes (the _reduced CADC products) together, then collapsing over the entire frequency range. This is, however, how the summit pipeline would create those files.

  • Ditto for the group summit velocity map, except that the pipeline wouldn't create one in the first place, as it doesn't know which velocity ranges to collapse over to produce a proper velocity map. This example was made by naively collapsing over the entire velocity range. The "better" pipeline automatically finds these regions and creates velocity maps -- not only for the coadded group cube, but also for individual observation cubes.

  • The difference between the "better" pipeline (which is what will be running at CADC) and the "improved" pipeline (which is what will be running at the summit) is very small for this given dataset.
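The idea of automatically finding the velocity ranges to collapse over can be sketched as follows (Python/NumPy; the function names, the crude noise estimate, and the thresholds are invented for illustration and are not the pipeline's actual line-finding algorithm):

```python
import numpy as np

def emission_ranges(spectrum, nsigma=3.0, min_channels=5):
    """Find contiguous channel ranges whose signal exceeds nsigma
    times a crude noise estimate (which assumes the line occupies a
    small fraction of the band).  Returns [(start, end), ...]."""
    noise = np.std(spectrum)
    mask = spectrum > nsigma * noise
    ranges, start = [], None
    for i, flagged in enumerate(mask):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if i - start >= min_channels:
                ranges.append((start, i))
            start = None
    if start is not None and len(mask) - start >= min_channels:
        ranges.append((start, len(mask)))
    return ranges

def collapse_over_ranges(cube, ranges):
    """Sum a (nchan, ny, nx) cube over the detected channel ranges
    to make an integrated-intensity (or velocity-weighted) map."""
    return sum(cube[a:b].sum(axis=0) for a, b in ranges)
```

In this sketch the detected ranges would be applied both to the coadded group cube and to each individual observation cube, mirroring what the "better" pipeline does.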