Fast Offline Analysis Guide

Revision as of 18:01, 20 November 2024 by Daniel.morcuende (talk | contribs) (Quick-look next-day analysis)

Quick-look next-day analysis

This documentation is a work in progress.

Description of the tasks for the fast offline analysis (FOA)

Perform the analysis of data from alerts, targets of opportunity (ToOs), and transient observations, which comprise:

  • Gamma-ray burst observations
  • Gravitational wave alerts
  • Neutrino alerts and ToOs
  • Galactic transients ToOs
  • AGN ToOs

The FOA should also perform the standard next-day analysis whenever a hardware or software expert asks for a quick look at any data, either to assess the telescope performance or to check for problems with data-taking.

For bookkeeping, please also post the results on a new wiki page following Fast_Offline_Analysis_Template.

Analysis setup

To get higher priority in SLURM, submit your jobs under the "foa" account using the account option of sbatch/srun (-A, --account), for example:

sbatch -A foa script_name.sh

Production of DL3 files for a quick-look analysis

Analysis guide made by Abelardo Moralejo (IFAE) after the LST-1 Analysis 2024 School



Outline of the procedure:

1. Start from DL1 files automatically produced by the on-site analysis, LSTOSA (each night's data is typically ready by noon the next day).

2. Select the relevant files for a given source, period, and zenith distance range.

3. Use the existing standard RF models to create the real-data DL2 files, choosing the models for the declination line closest to the source.

4. Use the already existing MC test DL2 files (corresponding to the same "production" as the RF models) to create the IRF files.

5. Produce the DL3 files.

6. Perform high-level analysis (theta2 distribution, sky-maps, SED and light curve).

7. Post the results and analysis details in a new wiki page following the Fast_Offline_Analysis_Template.


1. Selection of the DL1 real data files

Use the notebook cta-lstchain/notebooks/data_quality.ipynb

  • The version of the notebook used at the school (from lstchain v0.10.7) is at:

 /fefs/aswg/workspace/analysis-school-2024/

  • Copy it to a directory of your choice on the IT cluster and open it with Jupyter Notebook (check these instructions if you don’t know how to run a Jupyter notebook on a remote machine).
  • Look for the “USER INPUT” cells in the notebook and do the following:
      • In the “path to the necessary datacheck files” cell, set “files” to load all the datacheck files of 2023 data processed with version v0.10 of lstchain (explained in the notebook).
      • Check that the source is set to “Crab Nebula”.
      • Set max_zenith = 35 * u.deg
      • Set first_date = 20230117
      • Set last_date = 20230117

This will select Crab observations (in standard wobble mode) up to 35 deg in zenith taken on the night of Jan 17, 2023. At the end of the notebook you will get a list of “good runs”:

/fefs/aswg/data/real/DL1/20230117/v0.10/tailcut84/dl1_LST-1.Run11699.h5
/fefs/aswg/data/real/DL1/20230117/v0.10/tailcut84/dl1_LST-1.Run11700.h5
/fefs/aswg/data/real/DL1/20230117/v0.10/tailcut84/dl1_LST-1.Run11701.h5
/fefs/aswg/data/real/DL1/20230117/v0.10/tailcut84/dl1_LST-1.Run11702.h5
/fefs/aswg/data/real/DL1/20230117/v0.10/tailcut84/dl1_LST-1.Run11703.h5

Copy this list and, using a text editor, paste it into a file called “file_list.txt”, which you will need later (yes, I know, the file could be written by the notebook directly…). If you want to select data from any other period in the whole sample, change the input in the notebook cell “path to the necessary datacheck files”. The directory below contains links to the most recent version of all datacheck files:
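
The copy-and-paste step can also be scripted. The following is only a minimal sketch, not part of the school material: the helper name write_file_list is hypothetical, the run list is simply the notebook output shown above, and the demonstration writes into a temporary directory rather than your analysis directory.

```python
from pathlib import Path
import tempfile

# "Good runs" as printed at the end of data_quality.ipynb (copied from above).
DL1_DIR = "/fefs/aswg/data/real/DL1/20230117/v0.10/tailcut84"
GOOD_RUNS = [
    f"{DL1_DIR}/dl1_LST-1.Run11699.h5",
    f"{DL1_DIR}/dl1_LST-1.Run11700.h5",
    f"{DL1_DIR}/dl1_LST-1.Run11701.h5",
    f"{DL1_DIR}/dl1_LST-1.Run11702.h5",
    f"{DL1_DIR}/dl1_LST-1.Run11703.h5",
]


def write_file_list(paths, output="file_list.txt"):
    """Write one DL1 path per line (hypothetical helper, not part of lstchain)."""
    # Drop blank entries and duplicates while keeping the original order.
    cleaned = list(dict.fromkeys(p.strip() for p in paths if p.strip()))
    Path(output).write_text("\n".join(cleaned) + "\n")
    return cleaned


# Demonstration in a temporary directory; point it at your own directory instead.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "file_list.txt"
    written = write_file_list(GOOD_RUNS, target)
    print(f"{len(written)} files listed in {target.name}")
```

Writing the list from code avoids stray blank lines or duplicated entries that can slip in when pasting by hand.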

/fefs/aswg/data/real/OSA/DL1DataCheck_LongTerm/night_wise/all/
  • See this school session for more details on the data quality selection

2. Production of the real data DL2 files

  • For this you need to execute the lstchain_dl1_to_dl2 script. It reads the DL1b data (image parameters like width, length, intensity…) from the DL1 files and reconstructs the “physics parameters” (direction, energy, gammaness).
  • Besides the DL1 files (those selected by data_quality.ipynb) you will need Random Forests (RFs, built from MC simulations). We will use existing RFs, created from the standard (“base”) MC, that is, with the default level of NSB (night sky background), i.e. approximately dark-sky conditions. They are stored under:
/fefs/aswg/data/models/AllSky/20240131_allsky_v0.10.5_all_dec_base/

The “training” MC used to build the RFs is generated in pointings along “declination lines”. Select the declination line closest to the declination of your source. Since Crab is at δ ≃ 22 deg, we select the line:

/fefs/aswg/data/models/AllSky/20240131_allsky_v0.10.5_all_dec_base/dec_2276/

The directory above contains the RFs and also the .json configuration file that was used to create them. It is important to use that same configuration file in the DL1 to DL2 stage.
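
For sources other than Crab, the choice of declination line can be sketched as below. This is an illustration under stated assumptions: it assumes the directory name encodes the declination in hundredths of a degree (dec_2276 read as +22.76 deg) and that a "min" infix marks negative declinations. Only dec_2276 appears in this guide; the other names in the list are illustrative placeholders, so list the models directory to see the lines that are actually available.

```python
import re

# Declination-line directory names. Only dec_2276 is taken from this guide;
# the remaining entries are illustrative placeholders.
DEC_LINES = ["dec_931", "dec_2276", "dec_3476", "dec_4822"]


def dec_from_name(name):
    """Parse 'dec_2276' as +22.76 deg and 'dec_min_413' as -4.13 deg (assumed convention)."""
    match = re.fullmatch(r"dec_(min_)?(\d+)", name)
    if match is None:
        raise ValueError(f"unexpected directory name: {name}")
    value = int(match.group(2)) / 100.0
    return -value if match.group(1) else value


def closest_dec_line(source_dec_deg, names):
    """Return the directory name whose declination is nearest to the source."""
    return min(names, key=lambda n: abs(dec_from_name(n) - source_dec_deg))


# Crab Nebula, declination of about 22 deg.
print(closest_dec_line(22.0, DEC_LINES))  # prints dec_2276
```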

  • The jobs to convert DL1 into DL2 (which are quite memory-consuming) should be launched using SLURM. You can use the following script to do it:
/fefs/aswg/workspace/analysis-school-2024/helpful_scripts/launch_dl1_dl2.sh

It takes two arguments: the directory where the RFs are, and the JSON file used in their production. You must execute it in the directory that holds the file_list.txt file containing the list of DL1 files. That directory must also have a DL2 subdirectory, where the DL2 files will be created. Copy the script to the same directory and execute it:

MCMODELS=/fefs/aswg/data/models/AllSky/20240131_allsky_v0.10.5_all_dec_base
./launch_dl1_dl2.sh $MCMODELS/dec_2276 \
$MCMODELS/dec_2276/lstchain_config_2024-01-31.json

You will get a few messages saying that the jobs were submitted. If you want to check the status of your jobs just do:

squeue -u xxxx.yyyy

(where xxxx.yyyy is your username at the IT cluster)
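
Before launching the jobs, it may help to check that the working directory has everything the launch script expects. The sketch below is an assumption-laden convenience, not part of the school scripts: the helper names missing_inputs and launch_command are hypothetical, and the code only prints the command line rather than running it.

```python
from pathlib import Path

MCMODELS = Path("/fefs/aswg/data/models/AllSky/20240131_allsky_v0.10.5_all_dec_base")


def missing_inputs(workdir):
    """List what launch_dl1_dl2.sh needs in the working directory but is absent."""
    workdir = Path(workdir)
    problems = []
    if not (workdir / "file_list.txt").is_file():
        problems.append("file_list.txt")
    if not (workdir / "DL2").is_dir():
        problems.append("DL2/")
    if not (workdir / "launch_dl1_dl2.sh").is_file():
        problems.append("launch_dl1_dl2.sh")
    return problems


def launch_command(dec_line="dec_2276", config="lstchain_config_2024-01-31.json"):
    """Build the argument list: script, RF directory, JSON configuration file."""
    rf_dir = MCMODELS / dec_line
    return ["./launch_dl1_dl2.sh", str(rf_dir), str(rf_dir / config)]


problems = missing_inputs(".")
if problems:
    print("Missing in current directory:", ", ".join(problems))
else:
    print("Ready to run:", " ".join(launch_command()))
```

Taking the configuration file from the same declination-line directory as the RFs keeps the two consistent, as required above.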

3. Exploration of the DL2 data (theta2 and significance of detection)

  • With the DL2 files you can already check for the possible detection of a source. Use the notebook below to obtain a θ² plot around a given sky direction:

   cta-lstchain/notebooks/explore_DL2.ipynb

  • In the notebook you can easily see how to load the DL2 information into Pandas dataframes, and how to access the main reconstructed parameters: gammaness, direction and energy.
  • Note that for datasets longer than a few tens of hours the notebook may be rather slow and memory-hungry. In such cases, you can move directly to the DL3 level (see below) and make the θ² plots from there using Gammapy.
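
For reference, θ² is the squared angular distance between the reconstructed direction of an event and the assumed source position. The sketch below computes it with the standard library only; it is not the code of explore_DL2.ipynb. The dictionary keys, the cut values and the four toy events are made-up placeholders, while the Crab Nebula coordinates are the usual J2000 values.

```python
import math

CRAB_RA_DEG, CRAB_DEC_DEG = 83.633, 22.014  # Crab Nebula, J2000


def theta2_deg2(ra_deg, dec_deg, src_ra_deg, src_dec_deg):
    """Squared angular separation (deg^2) between an event and the source."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    ra0, dec0 = math.radians(src_ra_deg), math.radians(src_dec_deg)
    # Haversine formula: numerically stable for the small angles of interest.
    hav = (math.sin((dec - dec0) / 2) ** 2
           + math.cos(dec) * math.cos(dec0) * math.sin((ra - ra0) / 2) ** 2)
    theta = 2 * math.asin(min(1.0, math.sqrt(hav)))
    return math.degrees(theta) ** 2


def count_on_events(events, theta2_cut=0.04, gammaness_cut=0.7):
    """Count events passing global gammaness and theta2 cuts (placeholder values)."""
    return sum(
        1 for ev in events
        if ev["gammaness"] >= gammaness_cut
        and theta2_deg2(ev["ra"], ev["dec"], CRAB_RA_DEG, CRAB_DEC_DEG) <= theta2_cut
    )


# Toy events (not real data), only to exercise the functions.
toy_events = [
    {"ra": 83.633, "dec": 22.014, "gammaness": 0.90},  # on source
    {"ra": 83.633, "dec": 22.114, "gammaness": 0.95},  # 0.1 deg away
    {"ra": 83.633, "dec": 22.514, "gammaness": 0.99},  # 0.5 deg away, fails theta2 cut
    {"ra": 83.633, "dec": 22.020, "gammaness": 0.30},  # fails gammaness cut
]
print(count_on_events(toy_events))
```

With real DL2 data the same selection would be applied to the dataframe columns holding the reconstructed sky coordinates and gammaness.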

Post-DL3 analysis

Significance of the detection (theta^2 distribution)

Starting from DL2 files you can use the following notebook (global selection cuts):

https://github.com/cta-observatory/cta-lstchain/blob/main/notebooks/explore_DL2.ipynb

You can also use Gammapy to calculate the theta2 distributions:

https://indico.cta-observatory.org/event/5272/contributions/42843/attachments/25274/36920/plot_theta2_from_dl3.ipynb
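
As a cross-check of the numbers quoted by the notebooks, the detection significance can be recomputed by hand from the ON and OFF counts. The sketch below uses the Li & Ma (1983) formula (their eq. 17), which is the usual choice in gamma-ray astronomy; that the notebooks above use exactly this expression is an assumption, and the counts in the example are made up.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983), eq. 17. alpha is the ON exposure divided by the OFF exposure."""
    if n_on <= 0 or n_off <= 0 or alpha <= 0:
        raise ValueError("this sketch needs n_on > 0, n_off > 0 and alpha > 0")
    total = n_on + n_off
    term_on = n_on * math.log((1 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1 + alpha) * n_off / total)
    # The sum is non-negative analytically; guard against rounding below zero.
    significance = math.sqrt(max(0.0, 2 * (term_on + term_off)))
    # Report a negative value when ON holds fewer counts than expected from OFF.
    return significance if n_on >= alpha * n_off else -significance


# Made-up counts with a single OFF region of equal exposure (alpha = 1).
print(f"{li_ma_significance(n_on=200, n_off=100, alpha=1.0):.2f} sigma")
```

With several OFF regions of the same size as the ON region, alpha would be one over the number of OFF regions.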

Sky maps

Sky maps are important for transient alerts, which may not be well localized. The acceptance model library and a notebook that uses it are linked below:

https://github.com/mdebony/acceptance_modelisation

https://indico.cta-observatory.org/event/5272/contributions/43476/attachments/25278/36935/skymap_LST_analysis_school_2D_3D_ring_FoV_pointlike.ipynb

1D spectral analysis and light curve

You can follow this notebook:

https://indico.cta-observatory.org/event/5272/contributions/42843/attachments/25274/36922/post_DL3_analysis.ipynb

or the Gammapy tutorials for 1D spectral analysis using energy-dependent directional cuts and light curves:

https://docs.gammapy.org/1.2/tutorials/analysis-1d/spectral_analysis_rad_max.html

https://docs.gammapy.org/1.2/tutorials/analysis-time/light_curve_flare.html