2009-11-05

Watching the data reduce

Thanks to the folks at CADC, Dustin Jenkins in particular, JAC now has a really nice interface that allows us to monitor the jobs submitted to their Grid Engine, look for faults and browse the thumbnails of the products in order to spot problems - or just sit back and admire the results :-)


2009-10-13

JLS/ACSIS DR telecon

Attendance: ACC, BEC, TJ, FE, JVB, RP, CW, JdiF
Date: 15 Oct, 2009

Minutes:
1. Review of actions from previous meeting:
  1. JAC will provide information on how to rsync the starlink releases to get latest patches/fixes. Information will also include for which operating system these patches/fixes are available. – DONE
  2. JAC to make a more compact and readable QA report format and make this log available to observers/co-Is following nightly reduction via the OMP (as a downloadable file). – ONGOING
  3. For SLS to provide JAC (i.e. Brad) with a list of statistics and requirements for their QA, and also what they want their reduction recipes to do. – ONGOING. Material has been received from SLS (also available on SLS wiki) and Brad is working through it.
  4. For JLS teams to provide JAC with images/data/log of spikes when they come across them in their data. – ONGOING
  5. ACC to organise the production of pipeline documentation. – STARTED. ONGOING.
  6. ACC to poll for a date and time for next telecon and make these meeting notes available – DONE
2. News from JAC
  • Nightly reductions are now being carried out at CADC.
  • There have been tweaks to raster map production such that the pipeline trims off the ragged ("tasselled") edges of maps. Note that this applies only to the maps and not the cubes.
  • We can now regrid to specific WCS coordinates (e.g. galactic).
  • The infrastructure for controlling the pipeline via a parameter system and a config file is in place. Now need to decide which parameters are to be accessible before implementing the system. JAC will come up with an initial list and then invite feedback from users.
  • There will be two relevant newsletter articles in Autumn edition of JCMT Newsletter. One on JSA and another on the ORAC-DR pipeline.
ACTION : JLS teams to provide feedback on what should go in parameter file (once JAC have made initial list).
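The galactic regridding mentioned above rests on the standard J2000 equatorial-to-galactic coordinate transform. The pipeline handles this through its WCS machinery rather than hand-rolled code, so the following is just a standalone sketch of the underlying spherical trigonometry:

```python
import math

# J2000 galactic frame constants (standard values):
A_NGP = math.radians(192.85948)   # RA of the north galactic pole
D_NGP = math.radians(27.12825)    # Dec of the north galactic pole
L_NCP = math.radians(122.93192)   # galactic longitude of the celestial pole

def equatorial_to_galactic(ra_deg, dec_deg):
    """Convert J2000 RA/Dec (degrees) to galactic l, b (degrees)."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    sin_b = (math.sin(D_NGP) * math.sin(dec)
             + math.cos(D_NGP) * math.cos(dec) * math.cos(ra - A_NGP))
    b = math.asin(sin_b)
    y = math.cos(dec) * math.sin(ra - A_NGP)
    x = (math.cos(D_NGP) * math.sin(dec)
         - math.sin(D_NGP) * math.cos(dec) * math.cos(ra - A_NGP))
    l = (L_NCP - math.atan2(y, x)) % (2.0 * math.pi)
    return math.degrees(l), math.degrees(b)

# Sanity check: the galactic centre (RA ~266.405, Dec ~-28.936)
# should land near l = 0, b = 0.
l, b = equatorial_to_galactic(266.40500, -28.93617)
```

This is only the point transform; an actual regrid also resamples the pixel data onto the new grid.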

3. Updates from survey teams
  • GBS : working towards getting a consistent reduction on the data set. Need QA format finalised first.
  • NGS : data taking almost complete. Data is currently run through manual QA - flagging of bad data and receptors is critical and its subtlety requires human interaction. As the results of this are documented and the recorded state of the receptors isn’t going to change, we should be able to transfer that knowledge to an automated system in the future when we have a triggered re-reduction. Reduced products generated by NGS can be uploaded to the JSA. Improvements in moment maps are based on utilising SNR and noise maps, which works well for data taken in different weather conditions.
ACTION : CW to pass on NGS QA and moment map making scripts to JAC.
  • SLS : biggest issue is manpower. GAF has sent information to us to kick off the SLS pipeline QA process. Meeting @ JAC with Paul Ruffle tomorrow to discuss SLS DR issues.
4. Designated DR contacts
JAC would like to have email contacts with individuals in survey teams who are regularly looking at data and have the time and inclination to correspond and work with JAC between meetings. It is important that these individuals do have the time available (e.g. students, post-docs) to run tests and feed back ideas and improvements. Although these telecons are proving useful, progress is slow if we wait for monthly meetings to get feedback on more minor (yet potentially critical) issues. Everybody agreed that this is a good idea.

ACTION : ACC to email coords asking them to provide email contacts of people who are reducing ACSIS data and are willing to act as ACSIS DR contacts.

ACTION : JLS coords to provide email contacts of people who are reducing ACSIS data and are willing to act as survey DR contacts (send to FE).

2009-10-02

Automated advanced processing at CADC

As previously mentioned, the ORAC-DR data reduction pipeline can be run at CADC to generate basic nightly products. Until now, this processing was run at JCMT each night, with the products transferred to CADC.

This processing is now being run at CADC on their processing system. Processing requests are made at 0900 HST every day, and nightly products will be available sometime after that (depending on the amount of processing needed -- scans take longer to reduce than jiggles).

In the near future, effort will be undertaken to process the backlog of ACSIS data.

2009-09-28

SCUBA-2: baby steps

As most of you will know, SCUBA-2 is now on the telescope being put through its paces. Given the high data rate of the instrument, the number of people working on it (in Canada and the UK as well as here at the JAC) and the upcoming early science call, we are racing to make raw data downloadable from the JSA by the beginning of November.

This is a challenge, both because the instrument systems themselves are in flux, and because everybody is so busy with commissioning. Still, thanks to the great efforts of Sharon and her team at CADC, we have coaxed some data into the JSA and searched for it using the test interface. It might not seem very exciting, but this has given most of the infrastructure a good workout, so it is a promising sign that we can meet our target.


2009-09-04

ORAC-DR and Starlink on Twitter!

Much to Tim and Frossie's chagrin, I've created two Twitter accounts for ORAC-DR and Starlink. I haven't completely sorted out what will be done with them at this time, but for now I'll probably use them to disseminate short tips and tricks for using ORAC-DR and Starlink software.

Follow them at http://twitter.com/oracdr and http://twitter.com/starlinksoft!

2009-09-03

Thumbnails in search results



As mentioned previously, the pipeline generates little thumbnails based on the representative image and the representative spectrum products. Now, CADC can show these thumbnails in search results. This hopefully will allow people to quickly identify search results of interest prior to downloading. Clicking on the thumbnails will launch the full preview of the representative image or spectrum as before.

If you have recent data, check it out. We hope to re-reduce the backlog at some point in order to generate products and thumbnails for older data too.

2009-09-02

JLS DR telecon - 1st meeting

Attendance: A. Chrysostomou, R. Tilanus, T. Jenness, R. Plume, M. van der Wiel, J. Di Francesco, G. Fuller, B. Cavanagh, H. Thomas, D. Johnstone, H. Roberts, D. Nutter, J. Hatchell, F. Economou

- initial discussion on whether we will have a SCUBA-2 pipeline ready. There will be something basic in place for shared-risk observing. More development will have to wait until we have all arrays in place, as it is not worth sinking effort into this at this time.

- some people are having issues getting the pipeline installed, and there is a lack of documentation. If people/institutes are having issues installing (any) Starlink software, then please inform the JAC (stardev@jach.hawaii.edu), providing the relevant details.

ACTION 1: JAC will provide information on how to rsync the starlink releases to get latest patches/fixes. Information will also include for which operating system these patches/fixes are available.

DONE(!): Instructions are available on the starlink web site (http://starlink.jach.hawaii.edu/).
To download the most recent release go to: http://starlink.jach.hawaii.edu/starlink/Releases
To keep up to date with the latest fixes and patches go to:
http://starlink.jach.hawaii.edu/starlink/rsyncStarlink


- GAF requested for more statistics to be made available from the QA. GAF will follow up with specific request to Brad (see Action 3 below)

- it was clarified that the summit pipeline (during normal night-time observing) only runs basic QA on calibrations. After the end of observing, all data taken that night is re-reduced by a “nightly pipeline” which executes the full QA and advanced processing. The reduced data products which result from this are shipped to CADC and can be downloaded with (or without) the raw data in the normal way.

ACTION 2a: JAC to make QA log available to observers/co-Is following nightly reduction via the OMP (as a downloadable file).
ACTION 2b: JAC to make a more compact and readable QA report format.

ACTION 3: For SLS to provide JAC (i.e. Brad) with a list of statistics and requirements for their QA, and also what they want their reduction recipes to do.

- JH raised some existing issues from the GBS: flatfielding (striping) of early HARP data; some bad baselines are not being picked up by QA; although not as prevalent as in older data, spikes are not trapped by the QA; an investigation is needed on how the gridding should best be done.

+++ the flatfielding problem is on Brad’s worklist

+++ we need more feedback from the teams on which bad baselines are not being filtered out

+++ de-spiking data is not a problem that JAC has been able to tackle as yet. Part of the issue is that these do not seem to be as prevalent in data any more and observers (PI as well as JLS) are not reporting the issue any longer. GAF reported that spikes are still present but at a small level, which is an issue for SLS who are looking for weak, narrow lines.

ACTION 4: For JLS teams to provide JAC with images/data/log of spikes when they come across them in their data.

- RPT raised a few issues from the NGLS:

+++ need ability to baseline fit both wide and narrow lines in same data set

+++ need ability to restrict e.g. moments analysis to known velocity range.

+++ QA generally fails for (at least) early NGLS data. We will need to investigate this more, but need an easy means to switch QA off in the recipes. This is easy in the main recipe, but less so in the advanced iterative part.
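The velocity-range restriction requested above amounts to summing only the channels inside a known window when building a moment map. A minimal sketch of the idea, assuming a uniformly gridded velocity axis (the names here are illustrative, not NGLS recipe code):

```python
def moment0(velocities, spectrum, vmin, vmax):
    """Integrated intensity (moment 0) over channels with vmin <= v <= vmax.

    Assumes a uniform velocity grid: the channel width is taken from the
    first pair of channels. Units: spectrum in K, velocities in km/s.
    """
    dv = abs(velocities[1] - velocities[0])
    return sum(t * dv
               for v, t in zip(velocities, spectrum)
               if vmin <= v <= vmax)

# Example: a 5-channel toy spectrum; only the inner three channels count.
vels = [0.0, 1.0, 2.0, 3.0, 4.0]
spec = [0.0, 2.0, 4.0, 2.0, 0.0]
m0 = moment0(vels, spec, 1.0, 3.0)   # (2 + 4 + 2) * 1.0 = 8.0
```

The same per-pixel windowing applied across a cube gives a moment map restricted to the known line velocities.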

- there is a blog available for data reduction and pipeline activities (you’re probably looking at it right now!): http://pipelinesandarchives.blogspot.com/

- the issue of making the pipeline more controllable through a config file to set parameters was discussed. TJ announced that he is developing infrastructure so that the pipeline can be parameterised.

- ACC received several emails prior to the meeting. A common theme was the lack of documentation explaining what the pipeline does to data, and how to use the pipeline. JH repeated this concern at the meeting.

ACTION 5: ACC took an action following the close of the meeting to organise the production of pipeline documentation. These will probably take the form of a detailed account of what the pipeline does, and a separate cookbook which explains how to run the pipeline with the different options available.

ACTION 6: ACC to poll for a date and time for next telecon and make these meeting notes available.

2009-07-27

Starlink Software Collection - Nanahope (Pollux) version released

The Nanahope version of the Starlink Software Collection was just released and can be downloaded from here. Highlights include:

  • GAIA can now visualise 2-D and 3-D clumps created by the CUPID package.
  • GAIA now has full support for the Virtual Observatory and has been modified to support the SAMP protocol to enable it to communicate with other VO tools.
  • Automated provenance propagation can now track HISTORY information in addition to provenance. The PROVSHOW command can now list the history of all of the parents in the processing history. HISLIST (and NDF history propagation) has not been changed and still only examines the history of a single path through the processing.
  • The software can now be built with gfortran 4.4.

2009-07-13

Database Replication outage

Over the weekend we lost database replication to CADC. Until we can sync up the tables we are unable to transfer any raw data to CADC (since CADC only accept files that their systems know about) so data will not be retrievable from the weekend. I'll post an update when transfers are enabled again.

UPDATE: Replication server crashed on Friday night. It's now back up and the tables have been synced with CADC. Transfers have been restarted.

2009-07-09

Hierarchical history for NDFs

The recording of processing history has been part of the NDF library for many years. When an application uses one or more input NDFs to create an output NDF, the NDF library creates a record of the application and its parameter values, and stores this record in the output NDF. It also copies all the history information from the "primary" (usually the first) input NDF into the output NDF.

Whilst it was recognised at the time that it would be nice to copy history from all input NDFs, the exponential growth of history information this could cause was seen to be prohibitive. But 16 years is a long time and we typically now have far greater computing resources. So we've taken the plunge and changed things so that history from all input NDFs is copied into the output NDF. However, to preserve backward compatibility, the new facilities are provided by the provenance routines in the NDG library - the NDF library itself remains unchanged.

This means that applications such as KAPPA:HISLIST, etc, that use the NDF library directly to manipulate history information are unchanged. Instead, the extended history information is stored in the PROVENANCE extension of each NDF, and can be examined using the KAPPA PROVSHOW command. Since there can be quite a lot of history information, it is not shown by default - set the new HISTORY parameter to "YES" when running PROVSHOW to change this default behaviour. Needless to say, NDFs created before these changes were made will not contain any extended history.

A common use for this extended history will be finding the value used for a particular parameter when a selected ancestor was created. We're toying with the idea of a GUI that would make this sort of thing easier by allowing an NDF's "family tree" to be navigated and searched, but for the moment the best thing is probably to use grep on the output of PROVSHOW.
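Grepping the PROVSHOW output can be scripted too. In this sketch the record layout is made up for illustration (PROVSHOW's real output format differs), but the filtering idea is the same: capture the text, then pull out the lines recording the parameter you care about.

```python
# Toy sketch: extract a parameter value from captured PROVSHOW-style text.
# The record layout below is hypothetical, purely for illustration.
def find_parameter(history_text, param):
    """Return all values recorded for `param` in the history text."""
    values = []
    for line in history_text.splitlines():
        line = line.strip()
        if line.startswith(param + "="):
            values.append(line.split("=", 1)[1])
    return values

sample = """\
Application: MAKECUBE
   SYSTEM=GALACTIC
   PIXSIZE=6
Application: WCSMOSAIC
   METHOD=NEAREST
"""
pixsize = find_parameter(sample, "PIXSIZE")   # -> ["6"]
```

In practice the `sample` text would come from running PROVSHOW with its HISTORY parameter set to "YES" and capturing the output.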

2009-07-08

GAIA goes all clumpy

In the next release GAIA will display CUPID catalogues and masks so you can inspect your clumps in all their detail. This all works in 2D and 3D, which you can see in more detail on the GAIA support site.

2009-07-07

CADC network outage

From John Ouellette at CADC:

The CADC will be undergoing extensive maintenance from July 18th 0800 PDT to July 19th 1800 PDT. All CADC services, including user access and etransfer, will be unavailable during this period.

During the outage, users will be redirected to a web page stating the reason for the outage and, if possible, we will provide a status update during the work.

2009-06-09

CUPID now creates STC-S polygons

I've added an option to CUPID:FINDCLUMPS to allow it to create an STC-S description (either polygonal or elliptical) for each clump it finds, and add them into the output clump catalogue. I've also modified KAPPA:LISTSHOW so that it can display the STC-S shapes over a displayed image. Below is an example. In the first image, the greyscale (and contours) are the data, and the red lines are the polygonal clump outlines. They overlap slightly in some cases because each polygon is only allowed to have up to 15 vertices, and so only approximates the clump pixel mask.



The next image is the corresponding pixel mask (each colour represents the pixels assigned to a single clump).
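The overlap comes from the vertex cap: an outline with few vertices can only approximate a ragged pixel mask. CUPID's actual polygon construction is internal to FINDCLUMPS, but as a toy illustration, the convex hull of a clump's pixels is one bounded-vertex outline, and for a non-convex clump it necessarily covers pixels outside the mask, which is why neighbouring outlines can overlap:

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); <= 0 means no left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# An L-shaped "clump" mask: its hull is a triangle that also covers
# pixels NOT in the mask -- the approximation the post describes.
mask = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
hull = convex_hull(mask)
```

A real implementation would also need to decimate an arbitrary outline down to the vertex limit; the hull here is just the simplest case that shows why the approximation arises.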

2009-06-03

Pipeline now generates thumbnails

The data reduction pipeline now automatically generates PNG thumbnails of rimg and rsp files. These thumbnails are generated in three different sizes, 64x64, 256x256, and 1024x1024. Exif information is also written to these thumbnails, embedding the RA, Dec, source name, orientation, and pixel scale. Only the astro: namespace is currently used (see this ROE page for more information).
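Fitting an image into the three square sizes while preserving aspect ratio is the only arithmetic involved. A sketch of that step (illustrative only; the pipeline's actual code also handles the PNG rendering and Exif tagging):

```python
def thumb_size(width, height, box):
    """Largest (w, h) with the same aspect ratio fitting a box x box square."""
    scale = box / max(width, height)
    return (max(1, round(width * scale)), max(1, round(height * scale)))

# The three sizes the pipeline produces for each rimg/rsp product.
sizes = {box: thumb_size(1200, 800, box) for box in (64, 256, 1024)}
```

For a square input the result is simply the box size itself; a 1200x800 image shrinks to 64x43, 256x171 and 1024x683.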

Here's a (rather boring) example from last night:



In the near future they will be sent to CADC for automatic ingest.

2009-05-12

Pipeline running on CADC grid engine

Last week I (Tim J) visited CADC to work on integrating ORAC-DR into the CADC grid engine system. This involved making sure that the pipeline wrapper interfaced properly with CADC and that the data retrieval and data capture routines were given the correct inputs.

On Wednesday 6th May we were successful in running four jobs in parallel on the compute cluster. This is a terrific result and leads the way to being able to do the night processing on CADC in short order and then to follow up with processing of project coadds. In the next few weeks I will be working on the code that will query the data archive and submit job requests to grid engine.

It also means that in principle survey teams could request that jobs be submitted to grid engine to make use of the processing nodes.

2009-05-11

CADC network outage

CADC and therefore JCMT data retrievals will be off the air on Wednesday afternoon (HST) due to scheduled network maintenance. If you are having trouble getting through, just try again later.