Pipelines and Archives: 201008

2010-08-11

About the JSA Products

Judging from my inbox, there is quite a lot of enthusiasm about the S2SRO products available from the JSA. We sure love this project, and it's great to see it hitting its stride and being useful to our external users. Still, I just want to throw in a couple of cautions so that people don't get caught out.

We do not have official data releases, as most people understand them. By this I mean that the data is not vetted by anybody at the JAC - we simply don't have the effort for that. Since right now we are still developing, we do trawl for obvious problems and bugs, but there's no scientific oversight of what the processing churns out, and the data is immediately exposed for download. So of course you may take the products, and a lot of them do seem to be publication quality, but you should still work through the reduction cookbook and make sure you understand what was done to your data. The result you download shouldn't be different from what you would get if you run our latest software at home with the recommended parameters. The idea is to get you the best data we can give you now, rather than a perfect version of your data later. The processed products can also help you prioritise which datasets to spend most time on.

As our data pipeline matures, or as we fix new bugs, we do re-reduce the data - and every time we do this the new product replaces the old one. So if you intend to post-process and/or publish using downloaded products, make sure you retain the version of the product that you used in case you need to reproduce your work later.
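One way to keep track of exactly which version of a product you used is to record a checksum for every file you download. The following is only a sketch: the function names and the manifest layout are made up for illustration and are not part of any JSA or CADC tool.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(product_dir, pattern="*"):
    """Describe every downloaded file in product_dir (hypothetical layout).

    The result records when the manifest was made plus the name, size and
    checksum of each file, so a later download can be compared against it.
    """
    entries = []
    for path in sorted(Path(product_dir).glob(pattern)):
        if path.is_file():
            entries.append({
                "file": path.name,
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return {
        "recorded_utc": datetime.now(timezone.utc).isoformat(),
        "products": entries,
    }
```

Saving the returned dictionary (for example with `json.dump`) next to an untouched copy of the downloaded files gives you something to check against if the product in the archive is later replaced.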

We do have a plan to allow users to upload their own versions of products into the JSA, but this is still a way off.

2010-08-05

SMURF Update (August 5th 2010)

It's been a couple of months since the last SMURF news entry so I thought I'd bring people up to date.

All the configuration files for the iterative map-maker have been tweaked and some have been renamed. For example the '_faint' config is now called "dimmconfig_blank_field.lis". We also have a new config file for bright calibrators called "dimmconfig_bright_compact.lis".
The SMO (time series smoothing) model has been improved. A few bugs have been fixed and it's been parallelized and so is much faster. SMO is not enabled in any of the default configuration files.
The size of the apodization and padding for the FFTs can now be calculated dynamically based on the requested filter. This is now the default behaviour. A new Fourier filter has also been written that does not require apodization (which may be important for very short maps) but is still being tested.
Quality handling has been revamped inside the map-maker to allow us to report more than 8 different types of flagging. The report at the end of each iteration is now more compact and if you use the SHOWQUAL command to look at exported models you may see that the bit numbers assigned to a particular quality are no longer fixed. Additionally if more than 8 flags are used the exported model will combine related flags (for example PAD and APOD will be merged into ENDS).
Very noisy bolometers will now be discarded before the iterative map-maker starts. This can help with convergence. See the "noiseclip" config parameter to adjust this.
The map-maker now compares flatfield ramps taken at the start and end of each observation and disables bolometers whose calibration has varied too much. This will not help data taken prior to 20100223, where flatfield ramps are not available.
The step correction algorithm continues to be improved.
SC2CLEAN will now report quality statistics in verbose mode.
SC2FFT can now be given a bad bolometer mask.
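To give a feel for what discarding very noisy bolometers means, here is a toy illustration of clipping on a list of per-bolometer noise values. It is not SMURF's algorithm: the robust median-based threshold, the function name and the `nsigma` argument are all assumptions made for this sketch, and the real behaviour is controlled by the "noiseclip" config parameter.

```python
import statistics


def clip_noisy_bolometers(noise, nsigma=4.0):
    """Return indices of bolometers whose noise is far above the typical value.

    Assumed rule (for illustration only): flag anything more than nsigma
    robust standard deviations above the median noise.
    """
    median = statistics.median(noise)
    # Median absolute deviation, scaled to match a Gaussian standard deviation.
    mad = statistics.median([abs(value - median) for value in noise])
    sigma = 1.4826 * mad
    threshold = median + nsigma * sigma
    return [index for index, value in enumerate(noise) if value > threshold]


if __name__ == "__main__":
    # Five well-behaved bolometers and one very noisy one.
    print(clip_noisy_bolometers([1.0, 1.1, 0.9, 1.05, 0.95, 10.0]))
```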

The cookbook has also been updated and can be read online.

2010-08-04

New version of the SCUBA-2 cookbook

Version 1.1 of the SMURF SCUBA-2 Data Reduction Cookbook (or SC21, as it is fondly known), is out. We recommend that everybody who works with SCUBA-2 data read through it.

To follow the cookbook, you will need to update your local Starlink release to the latest version.

You can always get to SC21 from the sidebar of this blog.

Updated to change reference from SC19 to SC21

JSA FAQ: Product grouping types

When you search the CADC archive for processed data (either public or proprietary), you will have the option of selecting any of four different types of product. These are:

Simple: This is the most processed state of a single observation
Night: This is the product resulting from all observations taken in one night on the same field
Multi-night: This is the product resulting from all observations taken in one field, even if they were taken on different nights. We sometimes call this the "project" co-add, because once observing for that project is finished, it represents all the (groupable) data taken for the project in that field
Public: This is a product consisting of all public observations of a particular field, even if they were taken for multiple projects. At this time, we have not generated any products of this type.

We sometimes refer to the simple product as the "obs" product (after the filename suffix that you will get when you download it). We also refer to the other three (night, multi-night and public) as "aggregate" products, because they consist of more than one observation. You should always get an obs product for each observation in your project; aggregate products are made excluding any observations marked as bad.
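As a rough picture of how the four product types relate to each other, the grouping can be sketched as below. This is not the archive's code: the `Observation` fields, the example identifiers and the choice to group night products per project are assumptions made for the illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    obs_id: str            # hypothetical identifier
    project: str
    field: str
    utdate: str            # e.g. "20100306"
    status: str = "good"   # "good", "questionable", "rejected" or "bad"
    public: bool = False


def group_products(observations):
    """Sort observations into simple, night, multi-night and public groups."""
    # Every observation gets its own simple ("obs") product.
    simple = {obs.obs_id: obs for obs in observations}
    night = defaultdict(list)
    multi_night = defaultdict(list)
    public = defaultdict(list)
    for obs in observations:
        if obs.status == "bad":
            continue  # aggregate products leave out bad observations
        night[(obs.project, obs.field, obs.utdate)].append(obs)
        multi_night[(obs.project, obs.field)].append(obs)
        if obs.public:
            public[obs.field].append(obs)  # may span several projects
    return {
        "simple": simple,
        "night": dict(night),
        "multi_night": dict(multi_night),
        "public": dict(public),
    }
```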

Here's where you can find this option in the JCMT Science Archive:

Bad, bad, BAD observation!

This post is an explanation of where we are at the moment with handling quality in the JSA, and what the medium and long-term plan is. Since the topic is bad observations, this should be of particular interest to S2SRO users with very early SCUBA-2 data, since happily, ACSIS observing doesn't result in many of those!

What happens right now

At this time, when we process observations in batches to create night and multi-night (project) co-adds, the system does not use any observation that has been marked as bad in the OMP obslog (observations are good by default). In the case where only one sub-system is bad (for example in the SCUBA-2 case of the 450 being good and the 850 being marked bad) both observations are excluded. Questionable and rejected observations are included in the co-add.

People who are able to set an observation's status to BAD include JAC staff, the active observer at the telescope and the data owners (PIs and co-Is associated with the project). We track who changes the status of an observation, and people are encouraged to leave an explanatory comment as to why they did so.
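The selection rule described above can be written down in a few lines. This is a sketch only, not the OMP or pipeline code: the function name and the data layout (subsystem files as `(observation number, subsystem)` pairs, obslog status as a dictionary) are invented for the example.

```python
def select_for_coadd(subsystem_files, obslog_status):
    """Pick the subsystem files that would go into a co-add.

    subsystem_files: list of (observation_number, subsystem) pairs.
    obslog_status:   dict mapping observation_number to a status string.

    Status is stored per observation, so one bad subsystem removes every
    subsystem of that observation. Observations with no entry count as good,
    and questionable or rejected observations are still included.
    """
    return [
        (obs, subsystem)
        for obs, subsystem in subsystem_files
        if obslog_status.get(obs, "good") != "bad"
    ]


if __name__ == "__main__":
    files = [(68, "450"), (68, "850"), (69, "450"), (69, "850"), (70, "850")]
    status = {68: "bad", 70: "rejected"}
    # Observation 68 is dropped at both wavelengths; 69 and 70 are kept.
    print(select_for_coadd(files, status))
```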

In the near term

The problem right now is that we do not have enough eyeballs to look closely at the S2SRO data and assess whether every observation is good. In the near term, we know people are inspecting their data, and we would really like the data owners to take the time to mark bad observations as bad in their OMP project pages. Then you can either ask for your products to be re-generated right away, or wait for the next re-processing run (SCUBA-2 data is being reprocessed frequently as the data reduction pipeline improves).

I also think that rejected observations (those that are technically good, but failed to meet a survey's particular QA criteria) should be excluded from the aggregate products - this is an issue we will take up with the surveys.

This is how you can mark your observation as bad when you are not at the JAC:

[In this example, let's say we have already identified that we don't want observation 68 taken on UT 2010-03-06 included in our aggregate products. For the impatient: OMP home page -> Access a project -> Pick your UT date -> Click on Comment]

The real plan

Obviously the current state of affairs is sub-optimal. The two major improvements that are planned in observation quality are:

Allow individual sub-systems to be marked as bad, rather than throwing the baby out with the bathwater. The problem with this is that obslog (which long predates archive processing) only understands the observation level, and so there are significant OMP infrastructure changes that need to be made to allow this.
Develop an interface between ORAC-DR and obslog that will allow the pipeline itself to mark observations as bad (not surprisingly, the most sure-fire way to find bad observations is to read what ORAC-DR is telling you).

Both of these are on the cards, but the reality is that they are a lower priority than the main SCUBA-2 work, so it's not possible to promise a timeline for their delivery.

If you zoned out reading the above:

If you find bad observations in your data, take the time to mark them as bad in your project web pages. Watch the video to find out how.

2010-08-03

JSA FAQ: Finding your products

A few folk have had trouble figuring out how to get their proprietary products (processed data). You can do this by making the appropriate JSA query.

Here's how to do this starting from the JCMT home page:

And here is how to do this starting from the CADC home page:

Don't forget that you have to use your CADC credentials for this operation, and they have to be associated with your JAC/OMP userid (so that the OMP can tell the CADC system that it is okay for you to access that data). If the above instructions do not work for you, it is likely that this linking of the two accounts has not been done; contact a JCMT staff member to do the deed. You get a CADC userid by applying to their site and picking a username of your choice; your OMP userid was issued to you when you successfully applied for time and typically consists of your last name followed by your first initial.
