Difference between revisions of "Logbook CameraCommissioning ORM Oct18"

m (Enter comments in reverse time order)
 
(127 intermediate revisions by 7 users not shown)
Line 4: Line 4:
 
{| class="wikitable" border=1 cellpadding="5" cellspacing="0"  
 
{| class="wikitable" border=1 cellpadding="5" cellspacing="0"  
 
|- style="background:#efefef;"
 
|- style="background:#efefef;"
! style="width: 100%;" | Date !! style="width: 5%;" | Actor/Author !! style="width: 10%;" | Action summary !! style="width: 80%;" | Comments !! style="width: 70%;" | Documents  
+
! style="width: 100%;" | Date !! style="width: 5%;" | Actor/Author !! style="width: 10%;" | Action summary !! style="width: 15%;" | Comments !! style="width: 30%;" | Documents  
 +
|- valign="top"
 +
 
 +
| 2019-01-29 || Taka, Mitsunari || style="background: #CCFFCC;" | Moving to ELOG
 +
||
 +
 
 +
We have moved the logbook to the ELOG system at 10.200.100.102:9090.
 +
 
 +
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-17 || Léa Jouvin, Oscar Blanch, Shunsuke Sakurai || style="background: #FFFFCC;" | Rate Scans
+
 
 +
| 2019-01-28 || Taka, Mitsunari || style="background: #FFCCCC;" | Checking down of Osaka ports
 
||
 
||
  
We go for the planned rate scans
+
We repeatedly turned the Data switches inside the camera off and on. After each cycle we checked with arp-scan whether the ports on Osaka were alive or dead, 20 times in total.
 +
At the first trial, we found that p1p1, p1p2 and p3p1 were dead. Port p3p1 corresponds to the Data5 fibre, which we reconnected this morning. We swapped the pair (up and down) of the patch cord between the patch panel and the Maxa switch inside the drive container. Then the connection on p3p1 worked. Including the first test for p1p1 and p1p2, each port died the following number of times: p1p1: 5, p1p2: 2, p2p1: 1, p2p2: 1, p3p1: 0, and p3p2: 0.
  
*
+
||
  
 +
|- valign="top"
 +
 +
| 2019-01-28 || Taka, Armand, Riccardo, Mitsunari || style="background: #FFFFCC;" | Measuring the optical fibre
 
||
 
||
  
| 2018-12-17 || Léa Jouvin, Oscar Blanch, Dirk (remote), Julien (remote) and Luis Angel (remote) || style="background: #FFFFCC;" | TIB - EVB data
+
We checked the optical throughput of the DATA5 fibre and the UCTS fibre. The DATA5 fibre was fine, but the throughput of both fibres of the UCTS cable was too low.
 +
We connected the DATA5 cable to the connector for DATA5 on the camera and the DATA5 spare cable to the connector for the DATA5 spare.  
 +
 
 
||
 
||
  
* We set Analogue random trigger, Analogue Trigger (AT) = 1500 (we got rates between 300 Hz and 3000 Hz)
+
|- valign="top"
  
* We reset and start a run
+
| 2019-01-24 || Daniel M., Elena, Mitsunari || style="background: #CCFFCC;" | Implementing a changeable anode current limit and testing the anode current limiter
 +
||
  
* The p1p2 and p2p2 ports at Osaka were not running; we brought them up and started a pedestal run
+
* Daniel M. implemented a function in ClusCo, suggested by Riccardo R., that changes the anode current limit to a lower value. The limit cannot be changed to a higher value.
 +
* We supplied 1000 V to all modules and set the limit to 0.010 mA. The maximum anode current then decreased from 0.012 mA to 0.010 mA, confirming that the anode current limiter works.
 +
||
  
* Random pedestal run (AT = 1600, since with 1500 we were sometimes getting too high a rate)
+
|- valign="top"
** Rate was about 1000 Hz, but camera rate only 100 Hz. The other 900 Hz were busy rate.
 
** The BusyMap for the modules showed almost no busy.
 
** The RunNumber is Run 0076 (https://cta.cppm.in2p3.fr/LSTCAM/ZFITS/)
 
  
* Periodic pedestal run (we keep same run while increasing rate)
+
| 2019-01-24 || Mitsunari || style="background: #FFCCCC;" | Reproducing dead pixel
** Rate 1 kHz, all Ok
+
||
** Rate 2 kHz, all Ok
+
 
** Rate 4 kHz, buffer filling increasing
+
* I supplied HV 400V for all pixels.
** When buffer at ~80%, rate moved to 100 Hz to free it.
+
* I opened the camera shutter. The mean anode current was 0.12 uA.
** Two slowControl triggers in the run "interleaved" at the end
+
* I increased the HV to 1000 V in 100 V steps and also supplied the nominal HV. The mean anode current was 8.31 uA with the nominal HV. The HV was successfully supplied to all pixels with a precision of about 20 V (one-side full width), except for the off module. The anode current of Pixel 2 of Module 135 did not increase.
** The RunNumber is Run00077
+
* I supplied 1000 V only for Pixel 2 of Module 135. The read HV value was OK. The max anode current was 0.24 uA, and that of the pixel was lower than or equal to this value. This pixel appeared to be out of order.
 +
 
 +
||
  
* Random pedestal with AT=1600 and 500 Hz pulse injection in module 265
+
* Log files in TCS01:/var/log/clusco/reports
** Camera rate about 500Hz from module + ~100 Hz from pedestal, the other pedestals are those mainly contributing to busy rate ~1KHz
+
** monitor-190124-195808.txt
** The RunNumber is Run00078
+
** monitor-190124-201256.txt
 +
** monitor-190124-202103.txt
 +
** monitor-190124-203336.txt
  
* Periodic pedestal rate for speed tests
+
* Screen shots of the camera monitor
** Periodic pedestal from TIB 1000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW 84.5MB/s
+
[[File:NominalHV-AllModules 2019-01-24 at 20.21.08.png|thumb|200px|Nominal HV for all modules]]
** Periodic pedestal from TIB 2000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW 170 MB/s
+
[[File:1000V-Module135-AllPixels 2019-01-24 at 20.26.54.png|thumb|200px|1000 V only for Module 135]]
** Periodic pedestal from TIB 3000 Hz, busy rate 0 Hz, buffer filling 1% stable, waiting speed ZFW 255 MB/s
+
[[File:1000V-Module135-Pixel2 2019-01-24 at 20.29.17.png|thumb|200px|1000 V only for Pixel 2 of Module 135]]
** Periodic pedestal from TIB 4000 Hz, busy rate 0 Hz, buffer filling slowly increasing , waiting speed ZFW 340 MB/s
 
** Periodic pedestal from TIB 100 Hz to free buffer
 
  
||
 
 
|- valign="top"
 
|- valign="top"
| 2018-12-17 || Léa Jouvin, Oscar Blanch  || style="background: #FFFFCC;" | Rate Scans
+
 
 +
| 2019-01-24 || Elena, Mitsunari || style="background: #FFFFCC;" | Opening/closing shutter
 
||
 
||
* We do three rate scans for DAC 1 and trigger mod 1: the first one with step 10, the other two with step 2.
 
** All three look fine, with no strange behaviour.
 
  
* We reconfigure the adder and we take a rate scan, this is done two times (for the first time, the rate scan is taken twice without reconfiguring adder)
+
* We confirmed the camera shutter can be opened/closed via the OPCUA client.
** All three look fine, with no strange behaviour.
+
* Opening the shutter via CaCo v1.0 failed.
 +
 
 +
||
  
* We enable/disable the test pulse (only disable/enable, no reconfiguration) and we take rate scan:
+
|- valign="top"
** Previous ones were finishing at DT~120
+
 
** First one, module 13 stays at 300 Hz until DT~170   
+
| 2019-01-23 || Mitsunari, Oscar, Shunsuke  || style="background: #CCFFCC;" | Check on HV supply
** Second one, module 13 stays at 300 Hz until DT~170   
+
||
** Third one, module 13 stays at 300 Hz until DT~170   
 
  
* We do full reconfiguration of modules (ClusCo init) and enable again Pulse Injection and we take one rate scan
+
Checks on HV supply with the shutter closed. HV was correctly supplied to all pixels except for the one off module.
** The rate scan finished back at DT~120, with module 13 being the highest one
 
  
Since we are not able to reproduce the strange results observed on Friday, we decide to go for the list of rate scans prepared by Gustavo and Elena. If at any point we see strange behaviours we will check it.
 
 
||
 
||
[https://drive.google.com/file/d/1c1AhN-HX2Wi2yRO9cbye9LXI7Z09cMYy/view?usp=sharing Files]
 
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-17 || Léa Jouvin, Oscar Blanch || style="background: #CCFFCC;" | Start up
+
 
 +
| 2019-01-23 || Elena || style="background: #FFFFCC;" | Ratescans L0
 
||
 
||
1) 2 modules OFF!: 10.1.5.16 and 10.1.5.3
 
  
2) 1 module OFF!: 10.1.5.3
+
Checks on the modules that showed wrong L0 output. From a first look it doesn't seem to affect L1 sum.  
  
3) 1 module OFF!: 10.1.5.3
+
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-17 || Léa Jouvin, Oscar Blanch || style="background: #CCFFCC;" | Camera Inspection
+
 
 +
| 2019-01-23 || Shunsuke, Mitsunari || style="background: #CCFFCC;" | ClusCo monitoring
 
||
 
||
  
Visual inspection once the drive tests had finished. We checked that nothing had fallen down. We also checked some connectors and screws. Everything looks fine.
+
We confirmed that Shunsuke's script for monitoring a ClusCo report works with the latest ClusCo.
  
 +
||
  
 +
|- valign="top"
  
|- valign="top"
+
| 2019-01-23 || Elena || style="background: #CCFFCC;" | Start-up
| 2018-12-14 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Data taking
 
 
||
 
||
In 20181214:
 
  
- Run00001.0000 to Run00001.0008 ('''Run00073'''): Camera 27 degree in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT
+
All modules up.  
  
Run stopping, start again
+
||
- Run00001.0008 to Run00001.0013 ('''Run00074'''): Camera 27 degree in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT
 
 
 
Run stopping, start again
 
- Run00001.0014 to Run00001.0027 ('''Run00075'''): Camera 27 degree in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT
 
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-14 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Rate scan
 
||
 
During the afternoon, shutter closed
 
  
- mode 1: without TP
 
- mode 3: without TP
 
  
-mode 1: TP, 300 Hz, 5 pe -> TP saturation issue for some modules presenting a rate of 300 Hz at high DT
 
-mode 3: TP, 300 Hz, 5 pe -> module 44 strange as on Wednesday night, but others fine
 
  
|- valign="top"
+
| 2019-01-22 || Elena || style="background: #FFFFCC;" | L0 ratescans
| 2018-12-14 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Start up
 
 
||
 
||
  
1) Two modules off: 10.1.5.3 and 10.1.5.16
+
* We started with a rate scan at L0 without pulse, init0 and starting up temperature ~16. Module 45 does not show the correct behaviour.
 
+
* We power cycled the camera a couple of times because at first module 10.1.5.13 did not come back. After an L0 rate scan all modules and trigger mezzanines were fine.
2) One module off: 10.1.5.3
+
* We injected a 300 Hz pulse and took a couple of rate scans at L0 with gain 40 and 32. The scans are mostly fine, but some pixels of some modules show less signal than expected.
 +
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-14 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Test IR On/OFF one by one by ECC
 
||
 
  
1) From safe, impossible to switch ON the relay
 
  
2) From ready:
 
-  (-1, false) means all IR at OFF: only IR 0,1,2,3,4 was OFF
 
  
- (-1, True) means all ON : all ON but IR 1 that remains at 0 so we did (1, True) again and then IR 1 also ON.
+
| 2019-01-22 || Elena  || style="background: #FFFFCC;" | Start-up
 +
||
  
- Then we switch them all again with (-1,off): all off but IR 6. we did again (-1,off) and then all OFF.
+
At the first start-up the module 10.1.6.28 did not come up. Power cycled and all the modules powered up. Taking an L0 rate scan to check the behaviour of the trigger.
  
- Again (-1,True): all OFF but IR 6 so we did again (-1,True) and then ECC went to error due to the PDB
+
||
 +
 
 +
|- valign="top"
  
-> hard reset
 
  
From ready:
+
| 2019-01-21 || Elena, Riccardo  || style="background: #FFFFCC;" | L0 ratescans
- OFF one by one: from 0 to 7, all ok
+
||
- ON one by one: all ok until IR 7. We started switching ON from 0 to 7 and for 7 the ECC went to error again with the PDB
 
  
-> hard reset
+
* We started with a rate scan at L0 without pulse, init0 and starting up temperature ~13. Modules 25, 64, 80, 108, 123, 144, 155, 166, 174, 180, 196, 206, 210, 225, 232, 250 do not show the correct behaviour.
+
* We moved to the icrr_dev ClusCo version (in the validation folder). We did a L0 scan without pulse (and init8 reloaded) to confirm the results seen before.
- ECC still in Error and the error says PDB communication error, so -> hard reset again
+
* We checked the registers 23 and 31 of the L0 ASIC. All the modules had them set to the same values: 23: x970000 and 31: x9f0000.
 +
* We finally took a L0 rate scan with the pulse injected to all the modules. Modules 25, 34, 64, 80, 108, 110, 123, 144, 155, 166, 174, 180, 196, 206, 210, 225, 232, 250 did not show a good behaviour (2 more than the previous rate scan).
  
- ECC safe but then after one minute went to error
 
  
- hard reset again
+
||
  
- ECC safe and seems stabilized
+
|- valign="top"
  
  
 +
| 2019-01-21 || Elena, Riccardo  || style="background: #FFFFCC;" | Start-up
 +
||
  
 +
At the first start-up the module 10.1.6.28 did not come up. Power cycled, but the module 10.1.5.16 did not come up. At the third start-up all the modules powered up.
  
 +
||
  
 
|- valign="top"
 
|- valign="top"
  
| 2018-12-13 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Start up
+
 
 +
| 2019-01-18 || Oscar, Elena || style="background: #CCFFCC;" | Chiller
 
||
 
||
  
1) one module off: 10.1.5.3
+
Carlos Diaz called us to check an alarm on the chiller. We found a blinking orange message on the display saying: "b 1AC". Carlos indicated (and sent us) a "Chiller Setting Procedure" that we followed to reset the chiller.
 +
Once reset we restarted the camera and waited for a while for the temperature to rise and the Chiller to reach the point of operation. All worked well.  
 +
***The Chiller setting Procedure is now hanging on the wall behind the computers but we should collect all these kinds of documents in a useful folder.
  
2) one module off: 10.1.5.3
+
Checking the historical monitoring we saw that the humidity has risen during the bad weather days. We told Carlos Diaz who replied he is looking into it and he thinks it is that the chiller was not working.  
  
3) Two modules off: 10.1.5.3 and 10.1.6.28
 
  
4) Two modules off: 10.1.5.3 and 10.1.6.28
+
||
  
5) Two modules off: 10.1.5.3
+
|- valign="top"
  
 
+
| 2019-01-18 || Oscar, Elena || style="background: #FFFFCC;" | ECC
|- valign="top"
 
| 2018-12-13 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Datataking
 
 
||
 
||
  
 +
* We are forcing ECC to go in error while in Safe by changing minimum Temperature for Safe. We leave the ECC in error for a defined time and then we recover and try to go to ready
 +
** 10 seconds -> recover to safe ok -> go to ready ok
 +
** 1 minute -> recover to safe ok -> go to ready ok
 +
** 5 minutes -> recover to safe ok -> go to ready ok
 +
** 20 minutes -> recover to safe ok -> go to ready ok
 +
** 60 minutes -> recover to safe ok -> go to ready ok
  
In 20181213
 
  
- Run0001.0000 - Run0001.0001 : TP in all modules, 300Hz, change module by module generated trigger. But TIB didn't send any trigger during the test, so we tried this test again
+
||
  
- Run0001.0002 - Run0001.0009 :  TP in all modules, 300Hz, change module by module generated trigger.
+
|- valign="top"
  
- Run0001.0010: we wanted to do some random trigger tests from TIB, but for an unknown reason the EVB didn't receive any trigger, whereas the collected rate was 600 Hz
+
| 2019-01-18 || Oscar, Elena  || style="background: #FFFFCC;" | L0 Trigger
 +
||
  
From here, we copy again the config xml file in ~hoffman/20181212
+
* L0 Rate scan after starting the modules at 7 deg (L0_scan_Init8_PIOff_HVOff_7deg_b.dat)
- - Run0001.0010: one test EVB recorData=False to confirm that for pedestalfrequency higher than 6500 HZ, we have a collected rate that doesn't match the camera rate and busy rate
+
** Modules 42, 45, 180, 225 and 231 show problems
 +
** Main register and register 4 for L0 is the expected one for all modules/pixels (19988 and 8651958)
 +
** After reconfiguring L0 to send majority to L1 (register 4 to 8650870), an L1 scan is done for module 45 with DT599, which gives a 65 kHz rate for one pixel
 +
** L1 scan shows a rate of 65 kHz until DT~15
 +
* Power off/on and init at 10 deg (L0_scan_Init8_PIOff_HVOff_10deg.dat)
 +
** Module 166 and 210 show problems
 +
** After reconfiguring L0 to send majority to L1:
 +
*** L1 scan is done for module 45 with L0DT599, which gives 0 Hz for all pixels -> L1 scan goes to 0 Hz at L1DT~2
 +
*** L1 scan is done for module 45 with L0DT517, which gives 65 KHz for one pixel -> L1 scan goes to 0 Hz at L1DT~15
 +
*** L1 scan is done for module 210 with L0DT599, which gives 65 KHz for three pixels -> L1 scan goes to 0 Hz at L1DT~15
 +
* Power off/on and init at 13 deg (L0_scan_Init8_PIOff_HVOff_13deg.dat)
 +
** All modules fine at L0 level
 +
** L1 scan done for module 45 and 210 with L0DT=599, which provide L0 rate 0, for both rate goes to 0 at L1DT~2, for module 45 it is consistent with previous one
 +
** L1 scan done for module 210 with L0DT=520, which provide 65KHz for three pixels -> L1 scan goes to 0 Hz at L1DT~20
  
- Run0001.0011 - Run0001.0044: AnalogPedestal Run
+
This shows that the problem is already at the level of the discriminator, not the LVDS copy. Being discussed with L0 experts.
  
In 20181212
+
||
- Run0001.0000 : TP in all modules, 1HZ, change module by module generated trigger
 
  
- Run0001.0000: test new pixel id scheme -> Run confusion
+
|- valign="top"
  
- Run0001.0001: Park position, shutter open, HV 1000 V, threshold for L1 DT was 10% of NSB level,
+
| 2019-01-18 || Oscar, Elena || style="background: #CCFFCC;" | Start Up
   
+
||
- Run0001.0002, Run0001.0003, 00001.00004: Park position, shutter open, Nominal HV 1000 V; first we triggered on noise for the EVB to receive a high rate (1000 Hz), and then we moved the threshold for L1 DT to 10% of NSB level, so the trigger rate was between 5 and 10 Hz
 
  
 +
Camera was in safe and the modules powered up at once.
  
- Run00001.00005: Park position, shutter open, Nominal HV 1000 V,threshold for L1 DT at 10% of NSB level so trigger rate was between 5 and 10 Hz
+
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-12 || || style="background: #CCFFCC;" | UCTS:new MOS on tcs01
+
 
 +
| 2019-01-17 || Oscar, Elena || style="background: #CCFFCC;" | UAExpert
 
||
 
||
  
- Stopped the MOS on Osaka; it is now on tcs01 and we can configure the UCTS properly
+
The address of the UCTS has been changed from 10.1.4.12:48010 to 10.1.4.1:48010 in the project CACO_ECC_TOB_UCTS_2, the one recommended for use.
 +
This allows connecting to the OPC-UA for the UCTS running on TCS01
 +
 
 +
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-12/2018-12-13 || Léa, Nadia, Julie, Jan Luc || style="background: #CCFFCC;" | ECC test
+
 
 +
| 2019-01-17 || Oscar, Elena || style="background: #FFFFCC;" | ECC
 
||
 
||
  
Short summary of tests done at ORM this week on the ECC:
+
* We are forcing ECC to go in error while in Safe by changing minimum Temperature for Safe. We leave the ECC in error for a defined time:
 +
** 10 seconds -> recover to safe ok
 +
** 1 minute -> recover to safe ok
 +
** 5 minutes -> recover to safe ok
 +
** 20 minutes -> recover to safe ok
 +
** 60 minutes -> recover to safe ok. Then tried to go to Ready and:
 +
*** ECC went to error indicating "IR RS485 Error". Fans were running.
 +
*** We acknowledged the error and the error changed to "Switch, PDB, Cable Error". Fans stopped running.
 +
** Tomorrow we will check to force the error and go to ready for the shorter intervals of time.
  
-Remote loading is now understood and available.
+
||
  
-Release “V32 patch” is available. This version is similar to the one used since September, however it:
+
|- valign="top"
  
·        Corrects the issues met with the intelligent relays in the transition alarm to safe.
+
| 2019-01-17 || Oscar, Elena  || style="background: #FFFFCC;" | L0 Trigger
 +
||
  
·        Adds more understandable shutter datapoints
+
* We took several L0 scans with no Pulse Injection, no HV:
 +
** L0_scan_Init6_PIOff_HVOff.dat and L0_scan_Init8_PIOff_HVOff.dat, only running init without powering off-> modules 42, 45, 157, 180 and 225 showed problems
 +
** L0_scan_Init8_PIOff_HVOff_b.dat: powering on/off (and init) -> module 231 showed problems. Reconfiguring, resetting L0, resetting L0 delays, ... did not solve it
 +
** L0_scan_Init8_PIOff_HVOff_{c,d,e,f,g}.dat: powering on/off (and init)-> all modules fine. Temperatures between 20 and 25 degrees.
 +
* We stop and let the camera cool down and take one scan powering up at 7 degrees
 +
** Module 225 showed problems
  
·        Heart beat with CaCo is temporarily disabled to avoid disconnection with CaCo as seen last week.
+
||
  
This version has been extensively tested on Thursday and was used during the previous night.
+
[https://drive.google.com/file/d/1KEbIzwQhhGjncZtz7riYAQnFEG6x0e4V/view?usp=sharing RateScanFiles]
  
+
|- valign="top"
  
-A new ECC version called V34 has been finalized. In addition to the “V32 patch” features,
+
| 2019-01-17 || Oscar, Elena  || style="background: #CCFFCC;" | Start Up
 +
||
  
·        More data points are available to improve the camera monitoring & control (individual power control of IR, PSB (avoid), TIB, UCTS, data switches, …)
+
When arriving all was fine and everything worked smoothly to get the camera in ready. No error happened.
  
·        A configuration file is available. It allows configuring different parameters without recompiling the ECC (delays, CaCo heart beat …)
+
||
  
·        Better alarm identification & recovery is also available
+
|- valign="top"
  
This version has been tested ~1h on the camera. More tests will be done in the coming days to gain enough confidence before using it in the night runs.
+
| 2019-01-16 || Oscar, Elena  || style="background: #FFFFCC;" | ECC
 +
||
  
+
* Individual Switch On Vertical Bus bar relays
 +
** Off one by one, ok
 +
** On one by one, ok
 +
** Move to Safe and back to ready, ok
 +
** Off/On one by one. ok
 +
* PSB relays
 +
** TIB and UCTS, OK
 +
** General ShutDown (done twice with same result)
 +
*** Only half camera on
 +
*** PSB1 and PSB2 -> 24 V went off
 +
*** But, ECC went in error saying: "PDB Com Error:TRUE PDB description error: IR RS485 Error", Error Number 0
 +
*** Acknowledge Error, ECC went to Safe for 1 second and then to error again with : "PDB Com Error:TRUE PDB description error: Switch, PDB, Cable Error", Error Number 0
 +
*** Not able to recover
 +
* PDB relays
 +
** Ethernet switches, ok
 +
** Front Fans, ok
 +
 
 +
||
  
-The 3 exe versions (V32, V32 patch, V34) are available for the shifters. A script will be given to the shift leader to facilitate the exe loading.
 
  
 
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-11 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | start up
+
 
 +
| 2019-01-16 || Oscar, Patricia, Elena || style="background: #CCFFCC;" | Ethernet Cabling inside camera
 
||
 
||
1) one module off: 10.1.5.3
 
  
2) one module off: 10.1.5.3
+
* Switch 1:
 +
** Changed Ethernet Cables
 +
*** BP1015 moved from port 3 to port 5
 +
*** BP1008 moved from port 5 to port 7
 +
*** BP1009 moved from port 7 to port 9
 +
*** BP1011 moved from port 9 to port 11
 +
*** BP1013 moved from port 11 to port 13
 +
** Port 13 was free, now port 3 is free, so both normal and spare optical fibre can be used.
 +
** Port 2 is now connected to switch 3 instead of switch 2
 +
* Switch 2:
 +
** Changed Ethernet Cables
 +
*** BP0716 moved from port 3 to port 4
 +
*** Ethernet cable to Switch 3 In port 2, removed
 +
*** Ethernet cable to Switch 1 in port 4 removed
 +
*** Additional cable to Control Switch (across the camera) connected in port 2
 +
** Port 3 is free, so both normal and spare optical fibre can be used.
 +
* Switch 3:  
 +
** Changed Ethernet Cables:
 +
*** BP0401 moved from port 3 to 6
 +
*** BP0408 moved from port 6 to 8
 +
*** BP0409 moved from port 8 to 10
 +
*** BP0411 moved from port 10 to 12
 +
*** BP0405 moved from port 12 to 14
 +
*** BP0407 moved from port 14 to 16
 +
*** BP0403 moved from port 16 to 18
 +
*** BP0309 moved from port 18 to 20
 +
*** BP0312 moved from port 20 to 22
 +
*** BP0305 moved from port 22 to 24
 +
*** BP0310 moved from port 24 to 26
 +
*** BP0306 moved from port 26 to 28
 +
*** BP0308 moved from port 28 to 30
 +
*** BP0107 moved from port 30 to 32
 +
*** BP0108 moved from port 32 to 34
 +
*** BP0203 moved from port 34 to 36
 +
*** BP0102 moved from port 36 to 38
 +
*** BP0103 moved from port 38 to 40
 +
*** BP0205 moved from port 40 to 42
 +
*** BP0105 moved from port 42 to 44
 +
*** BP0106 moved from port 44 to 46
 +
*** BP0202 moved from port 46 to 48
 +
** Port 48 was free, now port 3 is free, so both normal and spare optical fibre can be used.
 +
** Port 4 is now connected to switch 1 instead of switch 2
  
3) Two modules off: 10.1.5.3 and 10.1.6.28
+
|
  
4)  one module off: 10.1.5.3
+
[[Media:NewCablingInsideCamera.jpg]]
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-11 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Operation with HV
+
 
 +
| 2019-01-16 || Oscar, Elena || style="background: #FFFFCC;" | ECC
 
||
 
||
  
- Shutter closed: 265 modules to 400 V and then nominal HV. Everything went smoothly so then:
+
* ECC when arriving in the morning was in error:
 +
** Error Description: "Control Regulation Chain", minimum temperature for safe was set to 5, at the moment we arrived temperatures were between 6 and 10 in the web monitoring
 +
** Looking at the history in UA-Expert, sensor 8 is and has been below 5; none of the others are or were
 +
** Acknowledge error -> ECC stays on error and Error Number oscillates between 0 and 2
 +
** Minimum Temperature set to 0 degrees and Acknowledge error, ECC went to safe. It was due to temperature sensor 8, which seems not to be displayed in the web monitoring.
 +
* Few seconds after recovering ECC to safe, ECC went again to error
 +
** Error Description: "PDB Com Error:TRUE PDB description error: IR RS485 Error", Error Number 0
 +
** Set Acknowledge -> ECC to Safe for 1 second and back to Error, and fans went off.
 +
** Not able to recover it
 +
* Hard reboot
 +
** ECC at recovering went back to error with Error Description: "Control Regulation Chain" and Error Number oscillates between 0 and 2, Fans off
 +
** SetMinimum temperature to 0 and acknowledge error brings ECC to safe with Fans turning
 +
** ECC to ready, ok
 +
 +
 
  
- Shutter open: 1 module (central one): 400 V then 800 V then nominal HV. Everything went smoothly so we went for 265
+
||
  
- L1 and L0 scan. For Data taking we went from 60 to 40 in the DT in steps of 5. From 40, some modules present too high a rate for L1
 
  
|- valign="top"
 
| 2018-12-11 || Léa, Seiya, Shunsuke (Daniel, Daniela from remote) || style="background: #CCFFCC;" | Monitoring fix
 
||
 
Now we use only ClusCo on tcs01 and L0 and L1 internal and external are also monitored
 
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-11 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Data Taking
 
||
 
In 20181211:
 
  
- All TP synchronised, 1 Hz, trigger sent module by module, 10 ns additional delay in TIB and 40 ns in the UP trigger propagation of the trigger for the central BP: from run0001
 
  
- run0002 should be deleted; it was to test ZFW writing
 
  
- At night with HV ON and shutter open: run 0004
+
| 2019-01-15 || Oscar, Elena  || style="background: #FFFFCC;" | Test ECC version
 +
||
 +
 
 +
* ECC V34
 +
** Installed and started, running for 3 hours without problems
 +
** Checking functionalities:
 +
*** Error recover, due to too  high minimum temperature from safe: OK
 +
**** ECC in safe
 +
**** Minimum temp for Safe set above current temperature->ECC goes to error
 +
**** Minimum temp for Safe set below current temperature -> ECC stays in error
 +
**** Acknowledge error -> ECC goes to Safe
 +
*** Error recover, due to too high minimum temperature from ready:
 +
**** Minimum temp for Ready set above current temperature->ECC goes to error
 +
**** Minimum temp for Ready set below current temperature -> ECC stays in error
 +
**** Acknowledge error -> ECC stays in error while it should go to safe, ErrorResolution says Control Regulation Chain, ErrorNumber 1, ErrorDescription ...empty
 +
**** Rearm 400 V -> ECC stays in error
 +
**** Acknowledge error -> ECC goes to safe
 +
*** Repeat error recovery, due to too high minimum temperature from ready, two more times: OK, recovering to safe and then ready without the need of rearming.
 +
*** Turn off, relay by relay:
 +
**** ok for 0,1,2,3
 +
**** When turning off relay 4, ECC went to error, saying: "PDB Com Error:TRUE PDB description error: IR RS485 Error"
 +
**** Not able to recover from that error by rearming 400 V. Fans were turning and they went off when rearming 400 V. Using the setPDB_contactors method did not help either.
  
|- valign="top"
 
| 2018-12-11 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | start up
 
 
||
 
||
  
  
1) Two modules off: 10.1.5.3 and 10.1.6.28
 
  
2) one module off: 10.1.5.3
+
|- valign="top"
 +
 
  
3) Two modules off: 10.1.5.3 and 10.1.6.28
 
  
4) Two modules off: 10.1.5.3 and 10.1.6.28
+
| 2019-01-15 || Oscar, Elena, Patricia  || style="background: #FFFFCC;" | Preparing Access to Camera
 +
||
  
5) Two modules off: 10.1.5.3 and 10.1.6.28
+
* Check how to access the rear part of the camera safely to re-cable the ethernet cables so that the spare fibres are functional at the same time as the main ones.
 +
* Switch on was easy to re-arrange and it has already been done.
 +
* Work will continue tomorrow
  
6) Two modules off: 10.1.5.3 and 10.1.6.28
+
||
  
7) one module off: 10.1.5.3
 
  
8) one module off: 10.1.5.3
 
  
9) Two modules off: 10.1.5.3 and 10.1.6.28
+
|- valign="top"
  
10) Two modules off: 10.1.5.3 and 10.1.6.28
 
  
|- valign="top"
+
| 2019-1-9 || Pepa, Elena, Cristobal (remote)  || style="background: #FFFFCC;" | validation of calibration software
| 2018-12-10 || Shunsuke, Léa, Seiya || style="background: #CCFFCC;" | HV
 
 
||
 
||
We perform tests as follows.
+
We tested the IPRscan calibration, it seems to work but some more output is needed to fully validate it. We also understood the problem of no module and pixel monitoring while working with caco.  
  
- HV Supplying Test for Central Module with shutter closed.
+
||
  
--We supplied 400 V, 500 V, 600 V, 700 V, 800 V, 900V, 1000 V and Nominal Voltage to central module (module:133).
 
  
--During the test, the HV was switched off by a script because of a mistake by Shunsuke, but we confirmed that his script works well. No other problems were found.
 
  
- HV Supplying for central 19 modules as before test.
+
|- valign="top"
 +
 
 +
| 2019-1-10 || Pepa, Elena || style="background: #FFCCCC;" | turning on camera
 +
||
 +
We turned on the camera 4 times because the module 10.1.6.28 did not come up the first 3 times.
  
- HV Supplying for all modules as before test.
+
||
  
- L0 & L1 rate scan with all modules applied with nominal HV with shutter closed
 
  
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-10 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Data taking
+
 
 +
 
 +
| 2019-1-9 || Pepa, Elena, Cristobal (remote)  || style="background: #FFFFCC;" | validation of calibration software
 
||
 
||
In 2018/12/10
+
We turned on the camera and the first time the module 27 (IP: 10.1.6.28) did not come up; the second time the module 10.1.5.13 did not come up; while the third time all turned on.
  
-TP synchronised in all the modules, trigger sent by all the modules, 10 ns external delay added in the TIB: 0001.0000, 0001.0001 and 0001.0002.
+
We tested the DTN calibration over the whole camera and it works, although to make it work we had to activate the InitModules(0,0) function. We know that this function is not activated (with a valid reason) in the master version, but we need it to test the calibration software.
  
-TP synchronised in all the modules, trigger sent by all the modules, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0004, 0001.0005 and 0001.0006.
+
The DTN calibration could not find the noise region for some pixels in modules 102, 182 and 188, and had some trouble with other pixels. We then ran an ipr_scan from ClusCo for the pixels of module 188 and found that some pixels have zero rate over the whole DT range.
  
-TP synchronised in all the modules, trigger sent only by module 265, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0007 to 0001.0009
 
  
-TP synchronised in all the modules, trigger sent only by module 100, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0010 to 0001.0012
 
|- valign="top"
 
| 2018-12-10 || Léa, Seiya, Sunsuke || style="background: #CCFFCC;" | start up
 
 
||
 
||
1) one module off: 10.1.5.3
 
  
2) Two modules off: 10.1.5.3 and 10.1.6.28
 
  
3) Two modules off: 10.1.5.3 and 10.1.6.28
 
  
4) Two modules off: 10.1.5.3 and 10.1.6.28
+
|- valign="top"
 +
 
  
5) one module off: 10.1.5.3
+
| 2018-12-19 || Oscar, Lea, Shunsuke  || style="background: #FFFFCC;" | Rate Scans
 +
||
  
6) one module off: 10.1.5.3
 
  
7) one module off: 10.1.5.3
+
(Files available through the Documents field)
  
 +
We do several rate scans with pulse injection. We aim to cross-check results from 2018-12-18. To make sure of the configuration, registers 100, 101 and 102 of L1 are read after each L1 adder configuration (RegistersCheckPulseInjection.txt and RegistersCheckHVNominal.txt)
  
|- valign="top"
+
* Rate Scans with Pulse Injection with gain 20 (PI14 was meant to mean gain 14, but finally we used 20)
| 2018-12-07 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Rate scan
+
** They were launched through the dic file: L1Scan_InputsInAdders_20181219.uic
||
+
** No Input in any adder
- L0 and L1 scan
+
*** File Name: L1_PI14_step2_OnlyReset_AdderN_20181219_1.dat, N={A,B,C}
 +
** Local Input only in one adder
 +
*** File Name: L1_PI14_step2_LocalInM_AdderN_20181219_1.dat, M={A,B,C}, N={A,B,C}
 +
** LocalMode
 +
*** File Name: L1_PI14_step2_LocalMode_AdderN_20181219_1.dat,  N={A,B,C}
 +
** Other Neighbours than local needed for Mod 3, individually added.
 +
*** File Name: L1_PI14_step2_NXInY_AdderY_20181219_1.dat, (X,Y)={(A,2),(A,4),(B,4),(B,5)}
 +
** Registers were fine for all settings and signal is observed in all adders where it is expected.
  
- With No TP, L0 from 300 to 650 step=5 and L1 from 0 to 200 step=2
+
* Rate Scans with nominal HV and shutter closed
 
+
** They were launched through the dic file: L1Scan_Mode3_ShutterClose_HVNominal_20181219.uic
- TP, 300 Hz, gain=40 (50 p.e.): L0 from 400 to 900 step=5 and L1 from 0 to 200 step=2
+
** Mode 3:
 +
*** File Name: L1_HVNominalShutterClose_step2_Mode3_AdderX_20181219_1.dat, X=(A,B,C)
 +
*** Adder C shows nothing as expected, since no input is linked to it.
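
As a minimal sketch (not part of the scan macros), the pulse-injection file-name patterns listed above can be expanded into explicit names. The helper names and the constants are hypothetical; only the patterns themselves come from the log.

```python
from itertools import product

# Adder labels and date stamp as they appear in the logged file-name patterns.
ADDERS = ("A", "B", "C")
DATE = "20181219"


def only_reset_files():
    """Names for the 'No Input in any adder' scans, N = A, B, C."""
    return [f"L1_PI14_step2_OnlyReset_Adder{n}_{DATE}_1.dat" for n in ADDERS]


def local_input_files():
    """Names for the 'Local Input only in one adder' scans, all (M, N) pairs."""
    return [
        f"L1_PI14_step2_LocalIn{m}_Adder{n}_{DATE}_1.dat"
        for m, n in product(ADDERS, ADDERS)
    ]


def local_mode_files():
    """Names for the 'LocalMode' scans, N = A, B, C."""
    return [f"L1_PI14_step2_LocalMode_Adder{n}_{DATE}_1.dat" for n in ADDERS]


if __name__ == "__main__":
    for name in only_reset_files() + local_input_files() + local_mode_files():
        print(name)
```

Such a list is convenient for checking that every expected file is present in the archive linked in the Documents field.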
  
- TP, 300 Hz, gain=20 (5 p.e.): L0 from 400 to 700 step=5 and L1 from 0 to 200 step=2
+
Everything seems reasonable this time. The strange features observed last night (never anything fully crazy) may have been due to a bad configuration. Some of them may have been due to human mistakes, but at least the fact that some rates were observed for adder C in mode 3 does not seem to be one, since that scan was executed by launching the file L1Scan_ShutterClose_20181214.uic.
  
|- valign="top"
 
| 2018-12-07 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Datataking
 
 
||
 
||
  
- TP synchronisation test with the 1 us window of the legacy DAQ -> adding a 10 ns delay for the 10 MHz clock to all modules seems to fix the problem
+
[https://drive.google.com/file/d/1_qZT7axv7zR-ZXDvcCw_F4-tXbN5XPp2/view?usp=sharing RateScanFiles]
 
+
[https://drive.google.com/file/d/15uN8f_NNUk8M0X_oto2DfvCrhKkkmNkK/view?usp=sharing RegistersPI]
- Random trigger.
+
[https://drive.google.com/file/d/1gQw8bAq2DlijaBHAk_CCkGlN1QtvkjV2/view?usp=sharing RegistersHV]
1) Ran for 2 minutes; several modules remained busy and no more trigger rate was coming -> restart
+
[https://drive.google.com/file/d/1rqjjAPJvFrjNqpqCy-vB9fzS94jQztJp/view?usp=sharing MacroPI]
 +
[https://drive.google.com/file/d/1GF0Ouq8KdyM-qIHmizuFq6uBGUuasnrW/view?usp=sharing MacroHV]
 +
[https://drive.google.com/file/d/1dtqe4rmI5NcVX-7AKq60G0XpAYh99fld/view?usp=sharing MacroHV20181218]
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-06 || Léa, Seiya, Shunsuke|| style="background: #FFCCCC;" | Osaka interface p1p2
 
||
 
  
Something strange happened. As often, p1p2 was down, but this time it was impossible to bring it up again. We had to switch OFF the camera.
 
  
|- valign="top"
+
| 2018-12-19 || Oscar, Lea, Shunsuke and Taka (remote) || style="background: #FFFFCC;" | DataTaking for capacitor calibration
| 2018-12-07 || Léa, Seiya, Daniel and Daniela (from remote) || style="background: #FFFFCC;" | Test of slow control from Japan
 
 
||
 
||
- Pb monitor/slow control seems solved
 
  
- temperature monitoring of each pixel in the monitor function
+
* Taking 1 kHz test pulse data with ROI 40, DRS internal clock, using external trigger.
 +
Trigger was created from module ID 265.
 +
* I performed a short run first (~20 seconds). The files are (.../onsite/data/20181218/Run00001.0004 and Run00001.0005)
 +
* The long run was performed with EVB; files are in .../onsite/data/20181219/Run00001.0000 - Run00001.0020.
 +
* According to Taka's quick analysis, there was no pulse in the ROI, so another data taking was performed with the legacy DAQ, with just 30 events.
 +
 
  
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-12-07 || Léa, Seiya, Daniel and Daniela (from remote) || style="background: #CCFFCC;" | start up
+
| 2018-12-19 || Oscar, Lea, Shunsuke || style="background: #FFFFCC;" | ECC version: v32 patch
 
||
 
||
1) Two modules off: 10.1.5.3 and 10.1.6.28
 
  
2) One module off: 10.1.5.3
+
ECC was in error due to a PDB communication error when we arrived this morning
  
3) One module off: 10.1.5.3
+
-> Decided to go back to the version V32patch
  
4) Two modules off: 10.1.5.3 and 10.1.6.40
 
  
5) One module off: 10.1.5.3
+
|- valign="top"
  
6) Two modules off: 10.1.5.3 and 10.1.6.40
 
  
7) One module off: 10.1.5.3
+
| 2018-12-19 || Oscar, Lea, Shunsuke, Cristobal (remote) and Elena (remote) || style="background: #FFFFCC;" | Filling MongoDB
 +
||
  
8) One module off: 10.1.5.3
+
* Test with one BP variable and one pixel variable successful
 +
* ECC variable being filled to the MongoDB
  
9) One module off: 10.1.5.3
+
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | Module remaining OFF during the whole day
+
| 2018-12-19 || Oscar, Lea, Shunsuke || style="background: #FFFFCC;" | Rate Scans
||  
+
||
 +
We continue the planned rate scans
  
10.1.5.3
+
*  L1 rate scan, step 2, Pulse Injection (PI) 20 (~5Phe), HV OFF:
 +
** Only local L0, adders A, B and C
 +
*** No crazy things, all seems reasonable, Adder A finishes at DT ~120
 +
*** Details to be checked
 +
*** File names: L1_NoHV_PI5Phe_ModeLocal_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
 +
*** Third repetition for A and first for B and C are full range, others until it stops
 +
*** Adder B and C have no signal, rate scan done only once. But they should have if the local mode was correctly set.
 +
** Trigger Mode 3, adders A, B and C
 +
*** No crazy things, all seems reasonable, Adder A and B finishes at DT ~220 and ~230 respectively
 +
*** Details to be checked
 +
*** File names: L1_NoHV_PI5Phe_Mode3_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
 +
*** First repetition is always full range, others until it stops
 +
*** Adder C no signal as expected, hence only one rate scan done
  
|- valign="top"
+
*  L1 rate scan, step 2, Shutter Close, HV Nominal:
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | ECC/CAco: Error and undefined
+
** Only local L0, adders A, B and C
||
+
*** No crazy things, all seems reasonable
- After one hour with modules ON, ECC went to error and Caco to undefined state. This required a hard reset of the ECC to get current back in the bus bar
+
*** Details to be checked
 +
*** File names: L1_shutterclose_step2_modelocal_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
 +
*** First repetitions are full range, others until rates are 0
 +
** Trigger Mode 3, adders A, B and C
 +
*** No crazy things, all seems reasonable
 +
*** Details to be checked
 +
*** File names: L1_shutteroclose_step2_mode3_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
 +
*** First repetition is always full range, others until it stops
 +
*** Adder C no signal expected, hence only one rate scan done. Still, 1 Hz rate is observed from time to time for basically any DT value.
  
- SwitchOn() from Caco, Caco fine in state 2 but ECC in error state...
+
*  L1 rate scan, step 2, Shutter Open, HV = 800 V (moon quite bright):
 +
** Only local L0, adders A, B and C
 +
*** No crazy things, all seems reasonable
 +
*** Details to be checked
 +
*** File names: L1_shutterclose_step2_modelocal_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
 +
*** First repetitions are full range, others until rates are 0
 +
** Trigger Mode 3, adders A, B and C
 +
*** No crazy things, all seems reasonable
 +
*** Details to be checked
 +
*** File names: L1_shutteroclose_step2_mode3_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
 +
*** First repetition is always full range, others until it stops
 +
*** Adder C no signal expected. Still, 1 Hz rate is observed from time to time for basically any DT value.
  
- ECC went to error, and this time we saw Caco fine in state 3 while ECC was in error, and then Caco going to undefined as expected since ECC was OFF. This ECC error state appeared twice today, in the middle of data taking and after around one and a half hours of camera ON.
+
We will continue tomorrow. One thing we want to check is a rate scan with HV and no input to any adder, to see if the 1 Hz is still there.
  
|- valign="top"
+
||
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | Data taking
 
||
 
- With legacy DAQ to have the 1024 ns window to work on BP synchronisation
 
  
- Tried random trigger with writing with EVB for 7 minutes: fine, even if at one moment the TIB rate went to 0, but it came back. ECC error, so the camera stopped and we couldn't test more
+
[https://drive.google.com/file/d/1c7n5ffrysFe9usXUVAHr-D1qf_DpXCV9/view?usp=sharing Files]
  
- Tried a legacy DAQ run with random trigger without writing, but module 10.1.6.38 reached a connected-but-busy state and so no more triggers were sent. Tried again.
 
Legacy DAQ data present the same result of too high a busy rate, even higher than with EVB
 
  
- Tried a long run of data with EVB and random trigger: we ran a one-hour run, in 20181206
+
|- valign="top"
 +
| 2018-12-18 || Oscar, Mau, Michele || style="background: #CCFFCC;" | Calibration Box On ground
 +
||
  
 +
* Calibration IP address and mask set according to [https://docs.google.com/spreadsheets/d/16FzSagSH0YlhC03u8K3C20uvV3Rxcpo3e8fHASWkAQs/edit#gid=0 LST1NetworkOrganization]
 +
** IP : 10.1.4.69
 +
** Mask: 255.255.255.128
 +
** Gateway: 10.1.4.1
  
 +
* Communication with UA-expert established and wheels moved
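
A quick consistency check of the address settings above, as a sketch using Python's standard `ipaddress` module (the variable names are illustrative, the values are those listed in this entry): the gateway must lie in the same subnet as the calibration box.

```python
import ipaddress

# Address and mask of the calibration box, as listed above.
box = ipaddress.ip_interface("10.1.4.69/255.255.255.128")
gateway = ipaddress.ip_address("10.1.4.1")

# The subnet implied by the address and mask.
network = box.network
print("network:", network)

# The box can only reach its gateway directly if both share the subnet.
gateway_reachable = gateway in network
print("gateway in subnet:", gateway_reachable)
```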
 +
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | start up
+
| 2018-12-18 || Léa, Oscar, Sunsuke, || style="background: #FFFFCC;" | Datataking
 
||
 
||
1) One module off: 10.1.5.3
+
In 20181218
  
2) One module off: 10.1.5.3
+
*Run00079: Test of ZFW writing speed, periodic frequency from the TIB 1000 Hz
  
3) One module off: 10.1.5.3
+
*Run00080: Test of ZFW writing speed
 +
** Periodic pedestal from TIB 1000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW  ??? MB/s
 +
** Periodic pedestal from TIB 2000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW ??? MB/s
 +
** Periodic pedestal from TIB 3000 Hz, busy rate 0 Hz, buffer filling 1% stable, waiting speed ZFW ??? MB/s
 +
** Periodic pedestal from TIB 4000 Hz, busy rate 0 Hz, buffer filling 1% stable , waiting speed ZFW ??? MB/s
 +
** Periodic pedestal from TIB 5000 Hz, busy rate 0 Hz, buffer filling 1% stable , waiting speed ZFW ??? MB/s
 +
** Periodic pedestal from TIB 6000 Hz, busy rate 0 Hz, buffer filling slowly  , waiting speed ZFW ??? MB/s
 +
** Periodic pedestal from TIB 100 Hz, to free the buffer.
  
4) One module off: 10.1.5.3
+
* It looks like the IPs and ports were mixed up in the configuration; they should be:
 +
**tcs03:* 10.200.100.70:13820, 10.200.100.72:13840
 +
**tcs04:* 10.200.100.71:13830, 10.200.100.73:13850
 +
**But it did not work. It needs to be investigated.
  
5) One module off: 10.1.5.3: Today we will perform operation without this module
+
||
  
Caco to undefined, ECC to error state -> hard reset
+
|- valign="top"
6) Two modules OFF: 10.1.5.3 and 10.1.6.28
+
| 2018-12-18 || Oscar, Shunsuke, Lea, Elena (remote) and Cristobal (remote) || style="background: #FFFFCC;" | Internal Camera Calibrations
 +
||
  
7) One module off: 10.1.5.3
+
* Launching calibrations from CaCo: some initialisation problems. Cristobal will check with modules at IFAE and we will try again tomorrow.
 +
||
  
8) ECC went to error at the switchon() -> hard reset then One module off: 10.1.5.3
+
|- valign="top"
 +
| 2018-12-18 || Léa, Oscar, Sunsuke,  || style="background: #CCFFCC;" | Disconnect 10.1.5.3
 +
||
 +
- IR 2 switch on from safe: 23.4, 23.5, 23.6
  
9) After one and a half hours of camera ON, ECC suddenly went to error -> hard reset. Then two modules OFF: 10.1.5.3 and 10.1.6.28
+
- On ready, all IR ON, current in IR 2: 25.1 - 25.2, both before and after unplugging the module.
 +
||
 +
[[Media:UnpoweredCluster.jpg]]
 +
|- valign="top"
 +
| 2018-12-18 || Léa, Oscar, Sunsuke, Jean luc and Nadia from remote  || style="background: #CCFFCC;" | ECC intelligence relay test
 +
||
  
10) One module off: 10.1.5.3
+
- From safe, not possible to switch on the relay
  
11) Human mistake -> current went too high because Dragon was reset before DAQ disconnection... -> switch OFF/ON the camera
+
- GotoReady: IR 2, 3, 6 not switched on
 +
 +
- switch them OFF one by one: 0, 1, 4, 5, 7 went to OFF
  
12) One module off: 10.1.5.3
+
- switch them all ON: all the IRs ON
  
|- valign="top"
+
- switch them all OFF (-1, false): All off
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | start up
+
 
||
+
- switch them all On (-1, true): All on
1) Multiple start-ups this morning, but with bus bar current at 0
 
  
2) startup in the afternoon with normal version of ECC: ALL modules ON, All pixels ON
+
- GO to safe then Ready: IR 1, 2, 4, 5 OFF. switch them all ON (-1, true) -> All ON
  
3) One module OFF: 10.1.6.28
+
- GO to safe then Ready: IR 1, 2, 6 OFF, switch them all ON (-1, true) -> All ON
  
4) ALL modules ON, All pixels ON
+
- Bug fixed, running with the new version now
 +
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | Data taking
+
| 2018-12-17 || Léa Jouvin, Oscar Blanch, Sunsuke Sakurai  || style="background: #FFFFCC;" | Rate Scans
||  
+
||
With legacy DAQ to have 1024 ns window to work on BP synchronisation
 
- TP synchronised in all modules and all sending trigger. Some BP setting was updated since it was previously not done
 
  
- TP synchronised in all modules and all sending trigger with default BP setting from the ring distribution
+
We go for the planned rate scans
  
- Random Trigger with EVB, all runs in 20181205
+
*  L1 rate scan, step 2, Pulse Injection (PI) OFF, HV OFF:
+
** Only local L0, adders A, B and C
- Random trigger with writing: some TIB crashes where all the rates and the digital pedestal frequency go to 0 at one moment. It really depends on the trial, but sometimes we can reach 6500 Hz.
+
*** No crazy things, all seems reasonable
 +
*** Details to be checked
 +
*** File names: L1_NoHV_NoPI_ModeLocal_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
 +
** Trigger Mode 3, adders A, B and C
 +
*** Three first rate scans for Adder A showed crazy modules
 +
*** We reconfigured L1 manually:  the reset and set of the three adders was done in BBMenu
 +
*** Afterwards, the rate scans showed no crazy issues.
 +
*** Details to be checked
 +
*** File names: L1_NoHV_NoPI_Mode3_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
  
- Random trigger without writing: running for 13 minutes without any crash, reaching 6500 Hz camera rate with 1700 Hz of busy rate.
+
||
 +
[https://drive.google.com/file/d/1c1AhN-HX2Wi2yRO9cbye9LXI7Z09cMYy/view?usp=sharing Files]
  
- Last try: random trigger with writing and no crash for more than 10 minutes, so we are lost now...
+
|- valign="top"
 +
| 2018-12-17 || Léa Jouvin, Oscar Blanch, Dirk (remote), Julien (remote) and Luis Angel (remote)  || style="background: #FFFFCC;" | TIB - EVB data
 +
||
 +
( You can find the [https://portal.cta-observatory.org/WG/lst/DAQ/SitePages/RunCatalog.aspx Catalogue] for the runs taken)
  
- 3 initialisations of the modules without monitoring, with no problem of busy modules
+
* We set Analogue random trigger, Analogue Trigger (AT) = 1500 (we got rates between 300 Hz and 3000 Hz)
  
|- valign="top"
+
* We reset and start a run
| 2018-12-05 || Léa + Jean Luc and Nadia from remote || style="background: #CCFFCC;" | ECC version
 
||
 
- Installed current version v32 + small update (delay applied when coming back from error state) with an executable they produced themselves. ECC to ready but current in the bus bars at 0
 
->hard reset but then fan off so second hard reset
 
  
- Second try, still 0 in the bus bars -> hardreset and installation of the new ECC version v34 -> fan at 0 so new hard reset -> fan still at 0
+
* p1p2 and p2p2 ports at Osaka were not running, we put them up and start a pedestal run
  
- Went back to the old version v32 we are using since two months and fan ON but current in the busbars to 0 ->hardreset
+
* Random pedestal run (AT = 1600, since with 1500 we were sometimes getting too high a rate)
 +
** Rate was about 1000 Hz, but camera rate only 100 Hz. The other 900 Hz were busy rate.
 +
** The BusyMap for the modules showed almost no busy.
 +
** The RunNumber is Run 0076 (https://cta.cppm.in2p3.fr/LSTCAM/ZFITS/)
  
- Fan ON. Tried to kill the ECC program and start it again -> Fan OFF. This test indicates that there is a communication issue between ECC and PDB, and that when the PDB loses the ECC connection it goes to a safe mode
+
* Periodic pedestal run (we keep same run while increasing rate)
 +
** Rate 1 kHz, all Ok
 +
** Rate 2 KHz, all Ok
 +
** Rate 4 KHz, buffer filling increasing
 +
** When buffer at ~80%, rate moved to 100 Hz to free it.
 +
** Two slowControl triggers in the run "interleaved" at the end
 +
** The RunNumber is Run00077
  
- Tried again to kill ECC and start it again -> Fan ON, so the PDB got the ECC connection back without hard reset. Then tried kill/start ECC again -> fan ON for a few seconds and then went to 0. Then tried kill/start ECC again -> fans still at 0 -> hard reset
+
* Random pedestal with AT=1600 and 500 Hz pulse injection in module 265
 +
** Camera rate about 500Hz from module + ~100 Hz from pedestal, the other pedestals are those mainly contributing to busy rate ~1KHz
 +
** The RunNumber is Run00078
  
- New version v32 that cuts the communication with the PDB -> fan ON after the hard reset. Two attempts of kill/start ECC and fan still ON, so this clearly indicates that Fan OFF comes from a communication problem between ECC and PDB.
+
* Periodic pedestal rate for speed tests (not kept on disk!)
+
** Periodic pedestal from TIB 1000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW 84.5 MB/s
- Now coming back to the old version we have been using for two months. Restart -> Fan OFF -> hard reset -> Fan OFF -> second hard reset -> fan ON
+
** Periodic pedestal from TIB 2000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW 170 MB/s
 +
** Periodic pedestal from TIB 3000 Hz, busy rate 0 Hz, buffer filling 1% stable, waiting speed ZFW 255 MB/s
 +
** Periodic pedestal from TIB 4000 Hz, busy rate 0 Hz, buffer filling slowly increasing , waiting speed ZFW 340 MB/s
 +
** Periodic pedestal from TIB 100 Hz to free buffer
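
The speed figures above scale linearly with the pedestal rate. As a rough cross-check (a sketch only: it assumes 1 MB = 1000 kB and that the quoted ZFW speed refers to the same data stream at each rate; the variable names are illustrative), the implied data volume per trigger can be computed as:

```python
# (TIB periodic pedestal rate in Hz, quoted ZFW speed in MB/s), from the list above.
measurements = [(1000, 84.5), (2000, 170.0), (3000, 255.0), (4000, 340.0)]


def kb_per_trigger(rate_hz, speed_mb_s):
    """Data volume per trigger in kB, assuming 1 MB = 1000 kB."""
    return speed_mb_s * 1000.0 / rate_hz


per_trigger = [kb_per_trigger(rate, speed) for rate, speed in measurements]

for (rate, speed), kb in zip(measurements, per_trigger):
    print(f"{rate} Hz: {speed} MB/s -> {kb:.1f} kB per trigger")
```

A roughly constant value per trigger across the four rates is what one would expect if the writer keeps up with the incoming data at each setting.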
  
Two things learned:
+
||
- Fan OFF seems to be due to a lack of communication between PDB and ECC. It is possible that the PDB doesn't connect to the ECC after the first hard reset and then goes into safe mode. This may be due to a connection timing problem of the ECC, to be investigated since this problem has appeared only in the last week... Moreover, normally after a hard reset this is the first connection from ECC to PDB, so the PDB should wait as long as it needs and not go into safe mode.
+
|- valign="top"
 
+
| 2018-12-17 || Léa Jouvin, Oscar Blanch  || style="background: #FFFFCC;" | Rate Scans
- The ECC executable created remotely seems to work for some functions, like controlling the mode, controlling the fans, etc., but for the current in the bus bars, for example, it always went to 0
+
||
 +
* We do three rate scans, the first one with step 10, the other two with step 2, for DAC 1 and trigger mode 1.
 +
** All three look fine, with no strange behaviour.
  
 +
* We reconfigure the adder and we take a rate scan, this is done two times (for the first time, the rate scan is taken twice without reconfiguring adder)
 +
** All three look fine, with no strange behaviour.
  
 +
* We enable/disable the test pulse (only disable/enable, no reconfiguration) and we take rate scan:
 +
** Previous ones were finishing at DT~120
 +
** First one, module 13 stays at 300 Hz until DT~170   
 +
** Second one, module 13 stays at 300 Hz until DT~170   
 +
** Third one, module 13 stays at 300 Hz until DT~170   
  
 +
* We do full reconfiguration of modules (ClusCo init) and enable again Pulse Injection and we take one rate scan
 +
** The rate scan finished back at DT~120, with module 13 being the highest one
  
 +
Since we are not able to reproduce the strange results observed on Friday, we decide to go for the list of rate scans prepared by Gustavo and Elena. If at any point we see strange behaviours we will check it.
 +
||
 +
[https://drive.google.com/file/d/1c1AhN-HX2Wi2yRO9cbye9LXI7Z09cMYy/view?usp=sharing Files]
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | Data taking
+
| 2018-12-17 || Léa Jouvin, Oscar Blanch  || style="background: #CCFFCC;" | Start up
||  
+
||
- Data taken with TP synchronised but trigger sent by only one module
+
1) 2 modules OFF!: 10.1.5.16 and 10.1.5.3
  
- Data taken with TP only in central. Up to 15000 Hz there is no busy, since we are below the maximal writing speed. At 15 kHz we have: 265 modules * 1344 octets * 15 kHz = 5.3 GB/s, about 40 Gbit/s, which is what we expect with the four links at 10 Gbit/s.
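
The bandwidth estimate in the entry above can be reproduced with a few lines. This is a sketch under one assumption: the 1344 figure is taken as bytes per module per event, which is the reading that reproduces the quoted 5.3 GB. The constant names are illustrative.

```python
MODULES = 265                   # number of modules quoted in the entry
BYTES_PER_MODULE_EVENT = 1344   # assumption: payload per module per event, in bytes
TRIGGER_RATE_HZ = 15_000        # 15 kHz
LINKS = 4
LINK_GBIT_S = 10.0              # four links at 10 Gbit/s

# Data volume per second at the quoted trigger rate.
bytes_per_s = MODULES * BYTES_PER_MODULE_EVENT * TRIGGER_RATE_HZ
gbit_per_s = bytes_per_s * 8 / 1e9

# Trigger rate at which the aggregated link capacity would be reached.
capacity_bytes_per_s = LINKS * LINK_GBIT_S * 1e9 / 8
max_rate_hz = capacity_bytes_per_s / (MODULES * BYTES_PER_MODULE_EVENT)

print(f"{bytes_per_s / 1e9:.2f} GB/s = {gbit_per_s:.1f} Gbit/s")
print(f"link capacity reached at about {max_rate_hz:.0f} Hz")
```

Under this assumption the data rate at 15 kHz sits right around the aggregate capacity of the four links, in line with busy appearing at that rate.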
+
2) 1 module OFF!: 10.1.5.3  
  
In /fefs/onsite/data/20181204
+
3) 1 module OFF!: 10.1.5.3
- Run0001.0000 to .00101: one TP in the central BP, triggering the whole camera
+
||
 +
|- valign="top"
 +
| 2018-12-17 || Léa Jouvin, Oscar Blanch  || style="background: #CCFFCC;" | Camera Inspection
 +
||
  
- Run0001.00103, .00104, .00105: TP in every module synchronised, only module 265 sent trigger, new BP delay: it seems we can now see the pulse in the central module but we don't see it in the others, to investigate...
+
Visual inspection once the drive tests had finished. We checked that nothing had fallen down. We also checked some connectors and screws. Everything looks fine.
  
- Run0001.0106 to .00107, .00108: old BP delay, TP in every module synchronised, all modules sending trigger: no TP visible in the data, to investigate....
+
||
  
- Run0001.0109 to Run0001.0154: random trigger pedestal run. At the pedestal frequency of 3100 Hz, all pedestal frequency, collected rate, Camera rate and busy rate went to 0 at the same time.
+
|- valign="top"
 +
| 2018-12-14 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | Data taking
 +
||
 +
In 20181214:
  
 +
- Run00001.0000 to Run00001.0008 ('''Run00073'''): Camera 27 degree in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT
  
 +
Run stopped, started again
 +
- Run00001.0008 to Run00001.0013 ('''Run00074'''): Camera 27 degree in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT
 +
 +
Run stopped, started again
 +
- Run00001.0014 to Run00001.0027 ('''Run00075'''): Camera 27 degree in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | Caco-ECC error/undefined state
+
| 2018-12-14 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | Rate scan
||  
+
||
- First power up, ECC went to error state whereas Caco was fine
+
During the afternoon, shutter closed
  
- For one of the power up in which we stay 1h30 on ready mode ECC went to error and Caco to undefined (we don't know in which order).
+
- mode 1: without TP
ECC to state 1 after disabling the _error_heart_bit but then no current in the busbar->hardreset
+
- mode 3: without TP
Then impossible to get Caco back in a normal mode even after ECC to state 1-> kill and restart Cacolaucher...
 
 
 
- After the second hard reset, the fan was off so we had to do another hard reset... Then everything OK
 
  
 +
-mode 1: TP, 300 Hz, 5 pe -> TP saturation issue for some modules presenting a rate of 300 Hz at high DT
 +
-mode 3: TP, 300 Hz, 5 pe -> module 44 strange as on Wednesday night but others fine
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | TIB 255 issue fixed
+
| 2018-12-14 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | Start up
||  
+
||
  
- We forgot to call the reset() method between each initialization of the modules. Now it works fine.
+
1) Two modules off: 10.1.5.3 and 10.1.5.16
  
 +
2) One module off: 10.1.5.3
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | start up
+
| 2018-12-14 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | Test IR On/OFF one by one by ECC
||  
+
||
  
3)All modules ON and all pixel ON (data taking but then TIB state 255)
+
1) From safe, impossible to switch ON the relay
  
4)All modules ON and all pixel ON (data taking but then TIB state 255)
+
2) From ready:
 +
(-1, false) means all IR at OFF: only IR 0, 1, 2, 3, 4 were OFF
  
5)All modules ON and all pixel ON (Caco went to undefined and ECC to error state, we don't know in which order): For this start up we stayed one and a half hours in ready mode
+
- (-1, True) means all ON : all ON but IR 1 that remains at 0 so we did (1, True) again and then IR 1 also ON.
  
7)All modules ON and all pixel ON
+
- Then we switch them all again with (-1,off): all off but IR 6. we did again (-1,off) and then all OFF.
  
 +
- Again (-1,True): all OFF but IR 6 so we did again (-1,True) and then ECC went to error due to the PDB
  
|- valign="top"
+
-> hard reset
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | start up
 
||
 
1) SwitchOn() from Caco, went to state 2 normal, but ECC to state 4 -> hard reset
 
  
2) Caco and ECC fine but one module OFF: 10.1.6.28 -> OFF/ON again
+
From ready:
 +
- OFF one by one: from 0 to 7, all ok
 +
- ON one by one: all OK until IR 7. We started switching ON from 0 to 7 and for 7 ECC went to error again with the PDB
  
6) ECC went to error state and Caco to undefined state (we don't know in which order...). Before the hard reset, we tried to go to state ready directly from ECC, but current at 0 in the bus bars -> hard reset
+
-> hard reset
Then the fan speeds were at 0, so hard reset again. Then it worked fine
+
|- valign="top"
+
- ECC still in error, and the error says PDB communication error, so -> hard reset again
| 2018-12-03 || [[User:DirkHoffmann|Dirk]] || style="background: #CCFFCC;" | CamerasToACTL v1.7
+
 
||
+
- ECC safe but then after one minute went to error
  
Installed on tcs03 and tcs04 from repository (https://cta.cppm.in2p3.fr/repo/x86_64/) and tested with/by Léa and Seiya.
+
- hard reset again
  
|- valign="top"
+
- ECC safe and seems stabilized
| 2018-12-03 || Léa,Seiya || style="background: #FFCCCC;" | increase of the bus bars current
 
||
 
  
Due to the BP reset while EVB was still connected to the modules, the current increased to 40. It is now known that the current increases when the DAQ is connected and no clock is distributed... We still don't know why; under study!
 
  
|- valign="top"
 
| 2018-12-03 || Léa, Seiya || style="background: #CCFFCC;" | Data Taking
 
||
 
- TP synchronised with legacy and EVB data
 
  
- Too many files created compared to the number of ZFW instances
+
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-12-03 || Léa,Seiya || style="background: #FFCCCC;" | TIB/UCTS
 
||
 
  
- TIB went to state 255 even after reset. so we shut down and off the camera...
+
| 2018-12-13 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | Start up
 +
||
  
- Again, TIB went to 255 5 seconds after reaching state 5, all rate at 1444O.
+
1) one module off: 10.1.5.3
  
- TIB went to 255 from state 4 directly. The feeling after one day is that after 3 cycles of the TIB going to 5 and then reset, it goes to 255 and we have to switch off the camera
+
2) one module off: 10.1.5.3  
  
|- valign="top"
+
3) Two modules off: 10.1.5.3 and 10.1.6.28
| 2018-12-03 || Léa || style="background: #CCFFCC;" | Fix UCTS configuration
 
||
 
- a virtual machine was using the IP 10.4.8.4 of the UCTS.... this is why it was not possible to configure it.
 
  
- I changed it to the IP it should take in the future: 10.1.4.4 and now it works
+
4) Two modules off: 10.1.5.3 and 10.1.6.28
  
 +
5) Two modules off: 10.1.5.3
  
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-12-03 || Léa || style="background: #CCFFCC;" | Power up
+
| 2018-12-13 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | Datataking
||  
+
||
- 1): All modules ON
 
  
-2) All module ON
 
  
-3) All module ON
+
In 20181213
  
-4) All module ON
+
- Run0001.0000 - Run0001.0001 : TP in all modules, 300Hz, change module by module generated trigger. But TIB didn't send any trigger during the test, so we tried this test again
  
-5) All modules ON
+
- Run0001.0002 - Run0001.0009 :  TP in all modules, 300Hz, change module by module generated trigger.
  
-6) module 10.1.6.40 off
+
- Run0001.0010: we wanted to do some random trigger tests from the TIB, but for an unknown reason the EVB didn't receive any trigger whereas the collected rate was 600 Hz
  
-7) All module ON
+
From here, we copy again the config xml file in ~hoffman/20181212
 +
- Run0001.0010: one test with EVB recorData=False to confirm that for pedestalfrequency higher than 6500 Hz, we have a collected rate that doesn't match the camera rate and busy rate
  
-7) All module OFF (I think due to the previous increase of the current in the bus bars)
+
- Run0001.0011 - Run0001.0044: AnalogPedestal Run
  
-8) All modules ON
+
In 20181212
|- valign="top"
+
- Run0001.0000 : TP in all modules, 1 Hz, change module by module generated trigger
| 2018-11-30 || Léa || style="background: #FFCCCC;" | TIB/UCTS
 
||
 
  
- TIB remains in state 2 even when UCTS is configured
+
- Run0001.0000: test new pixel id scheme -> Run confusion
  
|- valign="top"
+
- Run0001.0001: Park position, shutter open, HV 1000 V, threshold for L1 DT was 10% of NSB level,
| 2018-11-29 || Léa || style="background: #FFFFCC;" | Data Taking
+
||
+
- Run0001.0002, Run0001.0003, 00001.00004: Park position, shutter open, Nominal HV 1000 V, first we triggered on noise for the EVB to receive a high rate (1000 Hz) and then we moved the threshold for L1 DT to 10% of NSB level, so the trigger rate was between 5 and 10 Hz
- Procedure of TP synchronised in all the modules
 
  
- EVB configuration file in /home/dragon/EVB/20181130
 
  
- First try, EVB connected but one module busy: 10.1.6.10 -> initialisation again of the modules
+
- Run00001.00005: Park position, shutter open, nominal HV 1000 V, threshold for L1 DT at 10% of NSB level, so the trigger rate was between 5 and 10 Hz
 +
||
 +
|- valign="top"
 +
| 2018-12-12 || Léa  || style="background: #CCFFCC;" | UCTS: new MOS on tcs01
 +
||
  
- Second try, EVB connected, no module busy, but the TIB remains in state 2 even though the UCTS configuration seems OK -> I switched the camera OFF and ON...
+
- Stopped the MOS on Osaka; it is now on tcs01 and we can configure the UCTS properly
 +
||
 +
|- valign="top"
 +
| 2018-12-12/2018-12-13 || Léa, Nadia, Julie, Jan Luc  || style="background: #CCFFCC;" | ECC test
 +
||
  
- Third try, same as before. We then tried disconnecting the cable from the WR switch to TCS07: same problem, the TIB remains in state 2 when the UCTS is configured.
+
Short summary of tests done at ORM this week on the ECC:
  
 +
-Remote loading is now understood and available.
  
|- valign="top"
+
-Release “V32 patch” is available. This version is similar to the one used since September, however it:
| 2018-11-29 || Léa || style="background: #CCFFCC;" | Power up
 
||
 
- Fourth startup: ECC and Caco work well, all modules ON
 
  
- Fifth startup: ECC and Caco work well, all modules ON
+
·        Corrects the issues met with the intelligent relays in the transition alarm to safe.
  
- Sixth startup: ECC and Caco work well, all modules ON
+
·        Adds more understandable shutter datapoints
  
|- valign="top"
+
·        Heart beat with CaCo is temporarily disabled to avoid the disconnection from CaCo seen last week.
| 2018-11-30 || Léa || style="background: #FFCCCC;" | Power up
 
||
 
- Second startup: ECC and Caco work well, but module 10.1.6.28 was OFF, so I started again
 
  
- Third startup: ECC and Caco work well, all modules ON. But due to a human mistake (mine), the ECC went to Error state and then there was no current in the pulse bar -> hard reset
+
This version has been extensively tested on Thursday and was used during the previous night.
  
|- valign="top"
+
   
| 2018-11-30 || Léa || style="background: #FFCCCC;" | Power up
 
||
 
  - From Caco, switchON(): it went to its state 2, with good communication with the ECC. Then GetCameraStanby(): the ECC was fine and went to state ready with all modules ON, but Caco was in an undefined state, so I called sleep(); Caco recovered its ready state (state=3) and the ECC was still ready. I called sleep() a second time to start from a clean environment and everything was fine: Caco went to state safe, and so did the ECC.
 
  
|- valign="top"
+
-A new ECC version called V34 has been finalized. In addition to the “V32 patch” features,
| 2018-11-29 || Léa || style="background: #CCFFCC;" | Power up
 
||
 
- After the second hard reset, fans ON and the ECC went to ready from Caco; day finished (-:
 
  
- All modules ON, configuration for TP synchronisation in all the modules seems fine. EVB segfault in GOTOREADY s
+
·        More data points are available to improve the camera monitoring & control (individual power control of IR, PSB (avoid), TIB, UCTS, data switches, …)
|- valign="top"
 
| 2018-11-29 || Léa || style="background: #FFCCCC;" | Fan Off
 
||
 
  
Following the hard reset after the ECC went to error, the fans were down again, as yesterday morning... It is a heart_beat problem between the PDB and the ECC
+
·        A configuration file is available. It allows different parameters (delays, CaCo heart beat …) to be configured without recompiling the ECC
  
- Second hard reset
+
·        Better alarm identification & recovery is also available
  
|- valign="top"
+
This version has been tested for ~1h on the camera. More tests will be done in the coming days to gain enough confidence before using it in the night runs.
| 2018-11-29 || Léa || style="background: #FFCCCC;" | Power up
 
||
 
1) Power up from Caco, powerON and GetCameraStandby(), ECC to Ready and all modules ON. Monitoring issue, so we went back to safe until it is fixed
 
  
2) From Caco, SwitchON, then the ECC went to error state 4 with _error_heart_bit at 4 without any clear reason. Caco was OK in state 2
+
  
3) Fixed the _error_heart_bit issue of the ECC and tried again to switch on from Caco. Same issue: Caco state fine, but the ECC went to error state 4 due to _error_heart_beat being true.
+
-The 3 exe versions (V32, V32 patch, V34) are available for the shifters. A script will be given to the shift leader to facilitate the exe loading.  
 
 
4) Fixed _error_heart_beat and tried to switch on directly from the ECC. Works fine, the ECC went to Ready, but there was no current in the pulse bar; only the 4th one had current.
 
  
I did a hardreset
+
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-11-29 || Léa || style="background: #CCFFCC;" | WR switch
+
| 2018-12-11 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | start up
||  
+
||
RJ45 port installed on port 9 of the WR switch for the connection to tcs07
+
1) one module off: 10.1.5.3
 +
 
 +
2) one module off: 10.1.5.3
  
|- valign="top"
+
3) Two modules off: 10.1.5.3 and 10.1.6.28
| 2018-11-28 || Daniel K., Léa || style="background: #FFFFCC;" | test of cluscolauncher
 
||
 
We tried the connection between Caco and Clusco: all fine. The current monitoring was not active because different files had been updated. This will be fixed soon and then tested again.
 
  
 +
4)  one module off: 10.1.5.3
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-28 || Daniel K., Léa || style="background: #FFCCCC;" | fans stopped
+
| 2018-12-11 || Léa, Seiya, Sunsuke  || style="background: #CCFFCC;" | Operation with HV
||  
+
||
This morning around 8:45am the fans stopped running before we arrived on site. When we arrived we noticed the ECC was still in safe state (we expected error state, but that was not the case). We checked the rest of the ECC variables and everything looked fine. Using a multimeter we checked that the 400 V was properly arriving at the PDB inside the camera. We contacted the ECC experts, who asked for screenshots of the ECC datapoints regarding the PDB for later evaluation of the problem. Then we hard reset the ECC and the fans started just fine.
 
  
 +
- Shutter close: 265 modules to 400 V and then nominal HV. Everything went smooth so then:
  
|- valign="top"
+
- Shutter open: 1 module (central one): 400 V then 800 V then nominal HV. Everything smooth so we went for 265
| 2018-11-27 || Daniel K., Cristobal (remote) || style="background: #CCFFCC;" | fix of a compilation problem for ClusCo
 
||
 
Small fix for compilation, tested and merged to the master branch. Compilation on site works again.
 
  
 +
- L1 and L0 scan. For data taking we went from 60 to 40 in the DT in steps of 5. From 40, some modules show too high a rate for L1
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-27 || Daniel K. || style="background: #FFCCCC;" | test of new ECC version
+
| 2018-12-11 || Léa, Seiya, Sunsuke (Daniel, Daniela from remote) || style="background: #CCFFCC;" | Monitoring fix
||  
+
||
Further and more extensive tests of the control of the individual intelligent relays with the new version of the ECC. No improvement. A detailed description of the tests performed will be emailed to the experts. As a consequence the old ECC version was reinstalled for now.
+
Now we use only ClusCo on tcs01, and L0 and L1 (internal and external) are also monitored
 +
||
 +
|- valign="top"
 +
| 2018-12-11 || Léa, Seiya, Sunsuke || style="background: #CCFFCC;" | Data Taking
 +
||
 +
In 20181211:
  
|- valign="top"
+
- All TP synchronised, 1 Hz, trigger sent module by module, 10 ns additional delay in TIB and 40 ns in the UP trigger propagation of the trigger for the central BP: from run0001
| 2018-11-27 || Daniel K., Otger (remote) || style="background: #CCFFCC;" | installation of new version of libcluster
 
||
 
Following the successful test of last week, the fixes of libcluster were merged into the master branch and installed on site.
 
  
|- valign="top"
+
- run0002 should be deleted; it was only to test ZFW writing
| 2018-11-26 || Daniel K. || style="background: #FFCCCC;" | test of new ECC version
 
||
 
After some small fixes of the data point "Error description" and of the fan control, the new ECC version was tested. The control of the individual intelligent relays (the main update in this version) was unstable. As a consequence the old ECC version was reinstalled for now.
 
  
 +
- At night with HV ON and shutter open: run 0004
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" | too high temperature
+
| 2018-12-11 || Léa, Seiya, Sunsuke || style="background: #CCFFCC;" | start up
||  
+
||
The status of ECC monitoring went from "green" to "red" around 16:30.
 
The change of the temperature we are monitoring was quite different from usual.
 
It may be related to the water pressure of the chiller. It is above 1 and stable as usual, but it was too low in the morning and rose during the day.
 
  
|
 
[[Media:bptemp.JPG]]
 
[[Media:Tempbad.JPG]]
 
[[Media:ECCTemp.png]]
 
[[Media:CameraPressure14-16.png]]
 
  
|- valign="top"
+
1) Two modules off: 10.1.5.3 and 10.1.6.28
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" |some network interface of osaka sometimes not running
 
||
 
Some network interfaces of the osaka server do not come up at the first attempt each day... We activated p1p2 manually.
 
  
 +
2) one module off: 10.1.5.3
  
|
+
3) Two modules off: 10.1.5.3 and 10.1.6.28
  
|- valign="top"
+
4) Two modules off: 10.1.5.3 and 10.1.6.28
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" |bad behaviour of mezzanine
 
||
 
After configuration of the modules (init7 & pulse_injection_all), three modules showed bad mezzanine behaviour.
 
* mod115: L0 & L1 trigger rate was 0; there had been no problem until yesterday.
 
* mod167: L0 & L1 trigger rate was 0; there had been no problem until yesterday.
 
* mod226: L0 trigger rate was 65535, but the L1 trigger rate had no problem. So there may be a problem only on the line for IPR. This happened several times this week.
 
  
|
+
5) Two modules off: 10.1.5.3 and 10.1.6.28
  
|- valign="top"
+
6) Two modules off: 10.1.5.3 and 10.1.6.28
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFFFCC;" | take data for TP synchronization
 
||
 
We took data with ClusCo monitoring for the test pulse synchronization.
 
  
1) 300Hz, ROI=1024, trigger was generated by mod265, 3000events
+
7) One module off: 10.1.5.3
* This was almost the same condition as yesterday; the only difference was that ClusCo monitoring was running.
 
* The file is /mnt/cs1/store/DragonDaqData/Data20181123/TP300HzTrigMod265RD1024Delay3028_RD1024_FEB...
 
  
2) 300Hz, ROI=40, trigger was generated by all modules, 3000events
+
8) One module off: 10.1.5.3
* During the operation I made a mistake in the DAQ setting, so I restarted after the TIB state went to 5. The PPS and the 10 MHz counter should therefore be synchronized, but this result may be worse.
 
* The file is /mnt/cs1/store/DragonDaqData/Data20181123/TP300HzTrigModAllRD40Delay3528_again_RD40_FEB...
 
  
 +
9) Two modules off: 10.1.5.3 and 10.1.6.28
  
|
+
10) Two modules off: 10.1.5.3 and 10.1.6.28
 +
||
 +
|- valign="top"
 +
| 2018-12-10 || Shunsuke, Léa, Seiya || style="background: #CCFFCC;" | HV
 +
||
 +
We perform tests as follows.
  
|- valign="top"
+
- HV Supplying Test for Central Module with shutter closed.
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFFFCC;" | validation test of ClusCo
 
||
 
- the strange value of humidity
 
  
- SiTCP reset
+
--We supplied 400 V, 500 V, 600 V, 700 V, 800 V, 900V, 1000 V and Nominal Voltage to central module (module:133).
* This function worked well, but due to a bug in DragonFPGA it worked on only half of the camera and took too long (~5 min) to complete.
 
* Seiya will fix this problem from DragonFW side.
 
|
 
  
|- valign="top"
+
--During the test, the HV was switched off by the script. This was due to Shunsuke's mistake, but we confirmed that his script works well. No other problems were found.
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" | new version of ECC
 
||
 
We installed the new version of the ECC. After rebooting the ECC the fans did not start working.
 
* We did a hardware reset (13:40), but the problem remained and the ECC state went to 4 (error state).
 
* We changed the default settings of the ECC: 1) T_safe_min -> 5, 2) light sensor disabled. After a hardware reset (13:50), the situation was the same...
 
* We then changed the default value of T_safe_min to 2. After a hardware reset (14:00), the result was the same (error state and fans still stopped)
 
  
As a result we decided to go back to the current version of the ECC. After rebooting the ECC, all functions worked well.
+
- HV supply test for the central 19 modules, as in the previous test.
  
|
+
- HV supply test for all modules, as in the previous test.
  
|- valign="top"
+
- L0 & L1 rate scan with nominal HV applied to all modules and shutter closed
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" | monitoring plots were not updated
 
||
 
ECC monitoring plots were not updated after 9:30am. We could still read the various values (temperature etc.) in the OPCUA client; only the monitoring plots were not updated.
 
After the ECC was rebooted for the version update, the monitoring plots started updating again.
 
|
 
  
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-22 || Yuki, Seiya || style="background: #CCFFCC;" | take data for TP synchronization study
+
| 2018-12-10 || Léa, Seiya, sunsuke || style="background: #CCFFCC;" | Data taking
||  
+
||
I discussed with Taka, then I tried to take data as below;
+
In 2018/12/10
* set test pulse frequency for external reference clock
 
* after that start TP synchronization
 
  
We took data under the following conditions and finally managed to synchronize the test pulse in all modules.
+
-TP synchronised in all the modules, trigger sent by all the modules, 10 ns external delay added in the TIB: 0001.0000, 0001.0001 and 0001.0002.
  
0)
+
-TP synchronised in all the modules, trigger sent by all the modules, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0004, 0001.0005 and 0001.0006.
* I wanted to take data with 300 Hz at first, but L1_local trigger rate was ~22Hz after initialization even though we set 300Hz as TP frequency.
 
* So I decided to take data without changing test pulse frequency from the default one(444 444 counts for 10MHz = 22Hz)
 
  
1) 22Hz, ROI=1024, trigger was generated by mod265, 1000events
+
-TP synchronised in all the modules, trigger sent only by module 265, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0007 to 0001.0009
* During initialization we didn't change test pulse frequency, so TP frequency was 22 Hz at that time.
 
* The data is in /mnt/cs1/store/DragonDaqData/Data20181122/Trigger22HzRD1024...
 
  
2) 300Hz, ROI=1024, trigger was generated by mod1, 3000events
+
-TP synchronised in all the modules, trigger sent only by module 100, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0010 to 0001.0012
* Before TP synchronization, we changed the test pulse frequency with "SET_TP_FREQUENCY 0 Off 33333" from ClusCo instead of "SET_TP_FREQUENCY 0 On 300". Then L1_local trigger was 300Hz with external reference clock.
+
||
* The data is in /mnt/cs1/store/DragonDaqData/Data20181122/Trigger300HzRD1024...
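The counts-to-frequency relation quoted above (444 444 counts of the 10 MHz clock giving ~22 Hz) can be checked with a short sketch. It assumes that the last argument of "SET_TP_FREQUENCY 0 Off 33333" is also a period in counts of the 10 MHz reference clock; the function names are hypothetical and are not part of ClusCo.

```python
# Sketch only: assumes the test-pulse period is given in counts of the
# 10 MHz reference clock, as the "444 444 counts for 10MHz = 22Hz" note suggests.
REFERENCE_CLOCK_HZ = 10_000_000


def tp_frequency_hz(period_counts: int, clock_hz: int = REFERENCE_CLOCK_HZ) -> float:
    """Frequency obtained when one test pulse is emitted every `period_counts` ticks."""
    if period_counts <= 0:
        raise ValueError("period_counts must be positive")
    return clock_hz / period_counts


def period_counts_for(frequency_hz: float, clock_hz: int = REFERENCE_CLOCK_HZ) -> int:
    """Nearest integer number of clock ticks for a requested frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return round(clock_hz / frequency_hz)


if __name__ == "__main__":
    print(tp_frequency_hz(444_444))  # default period, about 22.5 Hz
    print(tp_frequency_hz(33_333))   # period used for the 300 Hz run
    print(period_counts_for(300))    # counts needed for 300 Hz
```

Under this assumption, 33333 counts is the closest integer period to 300 Hz, which matches the observed L1_local rate.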
+
|- valign="top"
 +
| 2018-12-10 || Léa, Seiya, Sunsuke || style="background: #CCFFCC;" | start up
 +
||
 +
1) one module off: 10.1.5.3
  
|
+
2) Two modules off: 10.1.5.3 and 10.1.6.28
[[Media:TPSynchNov22nd22Hz.gif]]
 
[[Media:TPSynchNov22nd300Hz.gif]]
 
  
|- valign="top"
+
3) Two modules off: 10.1.5.3 and 10.1.6.28
| 2018-11-21 || Seiya || style="background: #FFFFCC;" | home directory of osaka server was full
 
||
 
The home directory of osaka (/home) became full today.
 
<pre>
 
Osaka ~ > df -h
 
Filesystem                  Size  Used Avail Use% Mounted on
 
/dev/mapper/scientific-root  50G  22G  29G  43% /
 
devtmpfs                    252G    0  252G  0% /dev
 
tmpfs                        252G    0  252G  0% /dev/shm
 
tmpfs                        252G  50M  252G  1% /run
 
tmpfs                        252G    0  252G  0% /sys/fs/cgroup
 
/dev/sdb                      15T  8.5T  5.3T  62% /mnt/cs1
 
/dev/sda1                    497M  272M  226M  55% /boot
 
/dev/mapper/scientific-home  504G  504G  20K 100% /home
 
tmpfs                        51G  12K  51G  1% /run/user/42
 
tmpfs                        51G  4.0K  51G  1% /run/user/1000
 
tmpfs                        51G    0  51G  0% /run/user/1001
 
tmpfs                        51G    0  51G  0% /run/user/1002
 
</pre>
 
  
Almost all of the files (~80%) are data taken by LegacyDAQ for the tests, located in /home/dragon/IACMiniCamSetup/DragonDaqM
+
4) Two modules off: 10.1.5.3 and 10.1.6.28
<pre>
 
Osaka DragonDaqM > du -sh .
 
417G
 
</pre>
 
So I moved the data taken by LegacyDAQ to /mnt/cs1/store/DragonDaqData temporarily. (We could transfer those data to the Lustre system (/fefs/ on tcs) later.)
 
  
|- valign="top"
+
5) one module off: 10.1.5.3
| 2018-11-21 || Daniel K., Seiya || style="background: #FFFFCC;" | take data with LegacyDAQ for EVB tests
 
||
 
Julien wants to use raw data of full camera for EVB debug tests.
 
We took data with LegacyDAQ using a random trigger (300 Hz), i.e. the digital pedestal trigger generated by the TIB.
 
These files are in /mnt/cs1/store/DragonDaqData/Data20181121.
 
  
I wanted to take 30 min of data (300 Hz * (60*30) s = 540,000 events), but the disk of the osaka server became full during the test.
+
6) one module off: 10.1.5.3
The size of each file is ~219MB, which is equivalent to ~168,000 events and ~10min data.
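The event-count estimate above can be reproduced with a minimal sketch. It uses only the figures quoted in this entry (300 Hz, 30 min, ~168,000 events per file) and assumes a constant trigger rate; the helper names are hypothetical.

```python
# Sketch only: constant trigger rate assumed; figures taken from the log entry.

def expected_events(rate_hz: float, duration_s: float) -> int:
    """Number of events collected at a constant trigger rate."""
    return int(rate_hz * duration_s)


def duration_minutes(n_events: int, rate_hz: float) -> float:
    """Run length in minutes needed to collect n_events at a constant rate."""
    return n_events / rate_hz / 60.0


if __name__ == "__main__":
    planned = expected_events(300, 30 * 60)
    print(planned)                                   # planned 30 min run
    print(round(duration_minutes(168_000, 300), 1))  # length actually recorded
    print(round(168_000 / planned, 2))               # fraction of the planned run
```

With these numbers the recorded ~168,000 events correspond to a little over 9 minutes, consistent with the "~10min" quoted above.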
 
  
 +
7) one module off: 10.1.5.3
 +
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-11-21 || Seiya || style="background: #FFFFCC;" | how to run again the network interface
+
| 2018-12-07 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Rate scan
||  
+
||
Some network interfaces of the osaka server sometimes stop running.
+
- L0 and L1 scan
+
 
<pre>
+
- With No TP, L0 from 300 to 650 step=5 and L1 from 0 to 200 step=2
p2p2: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
 
        inet 10.1.6.192  netmask 255.255.255.128  broadcast 10.1.6.255
 
        inet6 fe80::a236:9fff:fef0:ccd6  prefixlen 64  scopeid 0x20<link>
 
        ether a0:36:9f:f0:cc:d6  txqueuelen 1000  (Ethernet)
 
        RX packets 68478703  bytes 95858361688 (89.2 GiB)
 
        RX errors 1  dropped 9  overruns 0  frame 1
 
        TX packets 30112278  bytes 1622848602 (1.5 GiB)
 
        TX errors 0 dropped 0 overruns 0  carrier 0  collisions 0
 
</pre>
 
  
When this happens, the interface can be restarted with:
+
- TP, 300 Hz, gain=40 (50 p.e.): L0 from 400 to 900 step=5 and L1 from 0 to 200 step=2
* sudo ifconfig <name of interface> down
 
* sudo ifconfig <name of interface> up
 
  
 +
- TP, 300 Hz, gain=20 (5 p.e.): L0 from 400 to 700 step=5 and L1 from 0 to 200 step=2
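The threshold ranges of the three scans listed above can be enumerated with a small sketch. It assumes that both end points of each range are included in the scan, which the log does not state; the names below are hypothetical and unrelated to the actual scan scripts.

```python
# Sketch only: assumes inclusive end points for every scan range.

def scan_points(start: int, stop: int, step: int) -> list[int]:
    """Threshold values from start to stop (inclusive) in fixed steps."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start, stop + 1, step))


# (label, L0 range, L1 range) as given in the log entry
SCANS = [
    ("No TP", (300, 650, 5), (0, 200, 2)),
    ("TP 300 Hz, gain=40", (400, 900, 5), (0, 200, 2)),
    ("TP 300 Hz, gain=20", (400, 700, 5), (0, 200, 2)),
]

if __name__ == "__main__":
    for label, l0_range, l1_range in SCANS:
        l0 = scan_points(*l0_range)
        l1 = scan_points(*l1_range)
        print(f"{label}: {len(l0)} L0 points, {len(l1)} L1 points")
```

If the last threshold is not actually included, each range has one point fewer.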
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-21 || Otger, Daniel K., Seiya || style="background: #FFCCCC;" | ECC went to error state
+
| 2018-12-07 || Léa, Seiya, Shunsuke || style="background: #CCFFCC;" | Datataking
||  
+
||
I used ClusCo@tcs01 for the monitoring, all the plots except "Amp. Temp" was updated indeed.
 
After that, I did init7 from ClusCo@cacooperator and waited the update of "Amp. Temp" plot.
 
At that time, ClusCo@tcs01 showed a timeout, and I realized that I could not ping these modules, the relay current had gone to 0 and the ECC had gone to error state (4).
 
I powered up again and ECC status went to 2(ready) as usual, but relay current was 0.
 
  
Taka explained why relay current was still 0 as below;
+
- TP synchronisation test with the 1 us window of the legacy DAQ -> adding a 10 ns delay to the 10 MHz clock for all modules seems to fix the problem
<pre>
 
When ECC goes to Error state, relay modules are also in a strange state. You need to reset relay modules as well.
 
However, even if you go to "safe" state in ECC, relays are still powered (not bus bars, but relay modules). That means, "safe" does not reset relays.
 
</pre>
 
  
Lea explained why ECC went to error as below;
+
- Random trigger.
<pre>
+
1) Ran for 2 minutes, then several modules remained busy and no more triggers came -> restart
Maybe what is possible also is that you lost the slow control connection during few seconds and then get it back without realising. Then If the modules are ON and that we lost the slow control connection, ECC goes to error and the relay current will remain at 0 as Taka explained.
+
||
</pre>
+
|- valign="top"
 +
| 2018-12-06 || Léa, Seiya, Shunsuke|| style="background: #FFCCCC;" | Osaka interface p1p2
 +
||
 +
 
 +
Something strange happened. As often, p1p2 was down, but this time it was impossible to bring it up again. We had to switch OFF the camera.
 +
||
 +
|- valign="top"
 +
| 2018-12-07 || Léa, Seiya, Daniel and Daniela (from remote) || style="background: #FFFFCC;" | Test of slow control from Japan
 +
||
 +
- Pb monitor/slow control seems solved
  
We did a hardware reset three times (15:00, 15:50, 16:45), but the situation remained the same.
+
- temperature monitoring of each pixel in the monitor function
This ECC error state seems to be caused by the loss of the CaCo heart beat.
+
||
We managed without CaCo (using the ECC directly) for data taking today.
+
|- valign="top"
 +
| 2018-12-07 || Léa, Seiya, Daniel and Daniela (from remote) || style="background: #CCFFCC;" | start up
 +
||
 +
1) Two modules off: 10.1.5.3 and 10.1.6.28
  
 +
2) One module off: 10.1.5.3
  
<pre>
+
3) One module off: 10.1.5.3
1228269 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] ERROR com.prosysopc.ua.client.UaClient - Exception in ServerStatusListener
 
java.lang.ClassCastException: cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAVariable$DataInformation cannot be cast to java.lang.Integer
 
at cat.ifae.cta.cameracontrol.server.base.clients.ecc.OPCUAECCControl$ECCVariableStatus.update(OPCUAECCControl.java:25)
 
at java.util.Observable.notifyObservers(Observable.java:159)
 
at cat.ifae.cta.opcua.dataaccess.basicobjects.BasicCallbackVariable$ObservableVariable.setValue(BasicCallbackVariable.java:36)
 
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly._newStateWarn(OPCUAAssembly.java:533)
 
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly.consumeMessage(OPCUAAssembly.java:526)
 
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.statusChanged(OPCUAServerStatusListener.java:59)
 
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.onStateChange(OPCUAServerStatusListener.java:33)
 
at com.prosysopc.ua.client.UaClient.a(Unknown Source)
 
at com.prosysopc.ua.client.UaClient.updateServerStatus(Unknown Source)
 
at com.prosysopc.ua.client.UaClient$a.run(Unknown Source)
 
at java.lang.Thread.run(Thread.java:745)
 
1228371 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] WARN com.prosysopc.ua.client.Subscription - Server sent a previously acknowledged sequence number 0 for Subscription 47786
 
1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.SecureChannelTcp - 47856 Closed
 
1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed
 
1228373 [TcpConnection/Read] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed (expected)
 
  
</pre>
+
4) Two modules off: 10.1.5.3 and 10.1.6.40
  
|- valign="top"
+
5) One module off: 10.1.5.3
| 2018-11-21 || Otger, Daniel K., Seiya || style="background: #FFFFCC;" | dhcpd server for TIB restart
+
 
||
+
6) Two modules off: 10.1.5.3 and 10.1.6.40
DHCPd server for TIB stopped due to the shutdown of tcs01 yesterday, so we activated the server as below,
 
  
<pre>ifae@tcs01 ~]$ sudo service dhcpd status
+
7) One module off: 10.1.5.3
Redirecting to /bin/systemctl status  dhcpd.service
 
● dhcpd.service - DHCPv4 Server Daemon
 
  Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
 
  Active: inactive (dead)
 
    Docs: man:dhcpd(8)
 
          man:dhcpd.conf(5)
 
[ifae@tcs01 ~]$ sudo service dhcpd start
 
Redirecting to /bin/systemctl start  dhcpd.service
 
[ifae@tcs01 ~]$ sudo service dhcpd status
 
Redirecting to /bin/systemctl status  dhcpd.service
 
● dhcpd.service - DHCPv4 Server Daemon
 
  Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
 
  Active: active (running) since Wed 2018-11-21 09:14:08 WET; 2s ago
 
    Docs: man:dhcpd(8)
 
          man:dhcpd.conf(5)
 
Main PID: 453 (dhcpd)
 
  Status: "Dispatching packets..."
 
  CGroup: /system.slice/dhcpd.service
 
          └─453 /usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -group dhcpd --no-pid
 
  
Nov 21 09:14:08 tcs01 dhcpd[453]: All rights reserved.
+
8) One module off: 10.1.5.3
Nov 21 09:14:08 tcs01 dhcpd[453]: For info, please visit https://www.isc.org/software/dhcp/
 
Nov 21 09:14:08 tcs01 dhcpd[453]: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in...ig file
 
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 deleted host decls to leases file.
 
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 new dynamic host decls to leases file.
 
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 4 leases to leases file.
 
Nov 21 09:14:08 tcs01 dhcpd[453]: Listening on LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
 
Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on  LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
 
Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on  Socket/fallback/fallback-net
 
Nov 21 09:14:08 tcs01 systemd[1]: Started DHCPv4 Server Daemon.
 
Hint: Some lines were ellipsized, use -l to show in full.
 
</pre>
 
  
 +
9) One module off: 10.1.5.3
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-20 || Daniel K., Seiya|| style="background: #FFFFCC;" | ClusCo monitoring restart
+
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | Module remaining OFF during the whole day
|| The ClusCo monitoring map was not updated after the shutdown of tcs01. We contacted Carlos and Carlos and they restarted it. Now it works.
+
||  
* http://wwwae.ciemat.es/~delgadom/CTA/monitor/MODULE/
 
* http://wwwae.ciemat.es/~delgadom/CTA/monitor/PIXEL/
 
  
 +
10.1.5.3
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-20 || || style="background: #FFFFCC;" | TCS01 shutdown
+
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | ECC/CAco: Error and undefined
|| One of the memory cards of tcs01 is damaged and will be replaced
+
||  
by an authorized technician today starting 9am La Palma time.
+
- After one hour with modules ON, the ECC went to error and Caco to an undefined state. This requires a hard reset of the ECC to get current back in the bus bar
We will shutdown the server before that and once the card is exchanged we start it up again.
+
 
 +
- SwitchOn() from Caco: Caco fine in state 2, but ECC in error state...
  
 +
- The ECC went to error; this time we saw Caco fine in state 3 while the ECC was in error, and then Caco going to undefined, as expected since the ECC was OFF. This ECC error state appeared twice today, in the middle of data taking and about one and a half hours after switching on the camera.
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-19|| Seiya, Daniel K.|| style="background: #FFFFCC;" | cannot connect with some modules
+
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | Data taking
|| With configuration 2 (100 Hz, ROI=1024) we could not connect to some modules (IP 10.1.6.148-173) and they remained busy (busy state=1).
+
||  
After re-initialization, this problem disappeared.
+
- With legacy DAQ to have the 1024 ns window to work on BP synchronisation
  
|- valign="top"
+
- Tried random trigger with EVB writing for 7 minutes: fine, even though at one moment the TIB rate went to 0 before coming back. ECC error, so the camera stopped and we could not test more
| 2018-11-19 || Seiya, Daniel K. || style="background: #FFFFCC;" | Test pulse data with DragonDaqM(LegacyDAQ)
 
|| We took test pulse data with the following conditions:
 
1) 300Hz, ROI=1024, trigger was generated by mod265 (for reproducing the problem)
 
*File name is "TP300HzTrigMod265RD1024Delay3028RD1024_***"
 
  
2) 100Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)
+
- Tried a legacy DAQ run with random trigger without writing, but module 10.1.6.38 reached a connected-but-busy state, so no trigger was sent anymore. Tried again.
*File name is "TP100HzTrigMod265RD1024Delay3028RD1024_***"
+
Legacy DAQ data show the same result of too high a busy rate, even higher than with the EVB
  
3) 300Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)
+
- Tried a long data run with EVB and random trigger: we ran for one hour, in 20181206
*I sent each command by hand and checked the registers (register8 & scalar) after PPS disable.
 
*It seems PPS disable worked well.
 
*File name is "TP300HzTrigMod265RD1024Delay3028_CHECKEDRD1024***".
 
  
4) 100Hz, ROI=1024, trigger was generated by mod265
+
||
*I set test pulse frequency before PPS synchronization.
 
 
 
*File name is "TP300HzTrigMod265RD1024Delay3028_TPconfigSynchroRD1024***".
 
  
 
|- valign="top"
 
|- valign="top"
| 2018-11-19 || Seiya, Daniel K. || style="background: #FFCCCC;" | 24V supply problem
+
| 2018-12-06 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | start up
|| We powered up the camera with the usual procedure, but only one busbar (the 4th one) worked. We repeated the procedure with the same result (only the 4th busbar worked), so we switched the camera breaker off and on around 15:00. The fans did not start at first, so I cycled the breaker again and the fans started. After that we could power up the whole camera.
+
||
 +
1) One module off: 10.1.5.3
  
 +
2) One module off: 10.1.5.3
  
|- valign="top"
+
3) One module off: 10.1.5.3
| 2018-11-13 || Mitsunari, Daniel K. || style="background: #FFFFCC;" | Software deployment
 
|| All the setup (except the uaexpert for ecc, tib and ucts) to control, monitor and take data with the camera was moved to the LST_CALP iMac (+ 1 screen) of the commissioning container.
 
  
|- valign="top"
+
4) One module off: 10.1.5.3
| 2018-11-12 || Mitsunari, Daniel K. || style="background: #FFFFCC;" | Test pulse data with DragonDaqM
 
|| Test pulse data were taken by DragonDaqM triggering by the module 264, which did not have a test pulse on 11-09.
 
  
|- valign="top"
+
5) One module off: 10.1.5.3: Today we will perform operation without this module
| 2018-11-12 || Mitsunari, Satoshi || style="background: #FFCCCC;" | Connect tcs07 to White Rabbit
 
|| The WR switch management port and the management switch (mgtsw2 port 42) are connected by an Ethernet cable. Mitsunari tried to change the IP of the WR switch to 10.200.10.140, which is in VLAN 1001, but failed. The WR interface file dot-config was not found, contrary to what the WR manual says. Even when we created the file ourselves, it was lost after rebooting.
 
  
|- valign="top"
+
Caco to undefined, ECC to error state -> hard reset
| 2018-11-12 || Mitsunari, Daniel K., Carlos Diaz || style="background: #FFFFCC;" | Software deployment
+
6) Two modules OFF: 10.1.5.3 and 10.1.6.28
|| Installing and compiling caco, cacoconsole, cacogui on tcs01 under /home/ifae/development. Compiling /home/ifae/clusco on tcs01 and adapting monitoring from CIEMAT. Setting up one additional screen for monitoring to the imac (monitoring computer), adding two forms (one for powering on the camera, one for shutting it down) to be filled by the operators.
 
  
|- valign="top"
+
7) One module off: 10.1.5.3
| 2018-11-09 || Mitsunari, Daniel K. || style="background: #FFFFCC;" | Test pulse data with EVB
 
|| Test pulse data were taken by EVB waiting PPS reaching all modules for 2 s. For the read depth 40, DAQ seemed to be successful. For the read depth 1024, however, the data were not stored.
 
  
|- valign="top"
+
8) ECC went to error at the switchon() -> hard reset then One module off: 10.1.5.3
| 2018-11-09 || Mitsunari, Daniel K. || style="background: #CCFFCC;" | Test pulse data with DragonDaqM
 
|| Test pulse data were taken with DragonDaqM after waiting 2 s for the PPS to reach all modules. The waveform data of six modules besides the central one were checked: five modules had test pulses, but the remaining one (No. 0) did not.
 
  
|- valign="top"
+
9) After an hour and a half with the camera on, ECC suddenly went to error -> hard reset. Then two modules OFF: 10.1.5.3 and 10.1.6.28
| 2018-11-03 || Mitsunari || style="background: #FFFFCC;" | Test pulse injection timing
 
|| Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is distributed over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 40, Sampling speed: 1 GHz
 
  
|- valign="top"
+
10) One module off: 10.1.5.3
| 2018-11-03 || Mitsunari || style="background: #FFFFCC;" | Test pulse injection timing
+
 
|| Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is distributed over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 1, Sampling speed: 5 GHz
+
11) Human mistake -> current went too high because Dragon was reset before the DAQ was disconnected... -> switch the camera off/on
  
 +
12) One module off: 10.1.5.3
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-02 || Mitsunari || style="background: #FFFFCC;" | Test pulse data with EVB
+
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | start up
|| Data for investigating the test pulse issue were taken with EVB, but the acquisition seems to have failed. This should be inspected.<br>
+
||
Pulse rate: 300Hz, Read depth: 1024, Event number: ~9000, /fefs/onsite/data/20181102
+
1) Multiple this morning but with current busbar to 0
 +
 
 +
2) startup in the afternoon with normal version of ECC: ALL modules ON, All pixels ON
  
|- valign="top"
+
3) One module OFF: 10.1.6.28
| 2018-11-01 || Mitsunari || style="background: #CCFFCC;" | Large data with random trigger
 
|| Data of ~10^5 events were taken with the pedestal random trigger, EVB, a read depth of 40 slices, and a delay of 3528 ns. The data are stored in /fefs/onsite/data/20181101. 
 
* 1kHz: Run 0001.0275-0001.0288
 
* 2kHz: Run 0001.0289-0001.0315
 
  
 +
4) ALL modules ON, All pixels ON
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-11-01 || Mitsunari || style="background: #CCFFCC;" | Avoiding TIB State 255
+
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | Data taking
|| The TIB state can go to 5 without resetting at state 255 by a combination of resetting TIB at state 0 and configuring dragons without resetting BPs. <br>
+
||  
* ECC->SetMode(2)
+
- With legacy DAQ to have 1024 ns window to work on BP synchronisation
* TIB->Reset()
+
- TP synchronised in all modules and all sending trigger. Some BP setting was updated since it was previously not done
*TIB->DisablePPS()
+
 
* TIB->ResetRun()
+
- TP synchronised in all modules and all sending trigger with default BP setting from the ring distribution
* ClusCo->Main->@config/init7_woBPreset
 
* UCTS->XMLConfiguration
 
* UCTS->Start()
 
* TIB->EnableTrigger()
 
  
Mitsunari repeated this procedure four times and succeeded for all of them. DAQ also seemed to be successful at the last trial. (At the first three trials, DAQ failed because of another reason.)
+
- Random Trigger with EVB, all runs in 20181205
 +
 +
- Random trigger with writing: some TIB crashes where all the rates and the digital pedestal frequency go to 0 at the same moment. It really varies from trial to trial, but sometimes we can reach 6500 Hz.
  
|- valign="top"
+
- Random trigger without writing: ran for 13 minutes without any crash, reaching a camera rate of 6500 Hz with a busy rate of 1700 Hz. 
| 2018-10-31 || Mitsunari || style="background: #FFFFCC;" | TIB State 255 problem
 
|| init7 without BP reset at the beginning was tested. The first trial failed, namely, the state turned out to be 255. However, in the second trial, when the TIB was reset just after turning on the camera, the TIB state went directly to 5. This behavior should be confirmed later.
 
  
|- valign="top"
+
- Last try: random trigger with writing and no crash for more than 10 minutes, so we are lost now...
| 2018-10-31 || Mitsunari || style="background: #FFFFCC;" | Check for test pulse synchronization
 
|| It should be confirmed whether the TenMHz counter value is identical among the modules for each test pulse event. Data for the check were taken with DragonDaqM at 300 Hz. The L1 threshold was set so that only the central module sent triggers. The data were stored in /home/dragon/IACMiniCamSetUp/DragonDaqM/Data20181031. The TenMHz counter appeared to be synchronized, but this should be confirmed.
 
  
 +
- 3 initialization of modules without monitoring and no problem of busy modules
 +
||
 
|- valign="top"
 
|- valign="top"
| 2018-10-31 || Oscar, Mitsunari || style="background: #CCFFCC;" | PDB Fixation
+
| 2018-12-05 || Léa + Jean Luc and Nadia from remote || style="background: #CCFFCC;" | ECC version
 
||  
 
||  
 +
- Installed the current version v32 + small update (delay applied when coming back from error state) with an executable they produced themselves. ECC went to ready but the current in the busbars stayed at 0
 +
->hard reset but then fan off so second hard reset
  
PDB fixation: the front plate is now fixed through a screw and nut attached to the back plate using a metal-bonding compound (Pattex Nural 21), plus an additional nut to fix the front plate.
+
- Second try, still 0 in the bus bars -> hardreset and installation of the new ECC version v34 -> fan at 0 so new hard reset -> fan still at 0
  
We have started Modules twice with one hour break in between. Both times all Dragons and BP went up.
+
- Went back to the old version v32 we have been using for two months: fan ON but current in the busbars at 0 -> hard reset
  
|- valign="top"
+
- Fan ON. Tried to kill the ECC program and start it again -> Fan OFF. This test indicates that there is a communication issue between ECC and PDB, and that when the PDB loses the ECC connection it goes to a safe mode
| 2018-10-30 || Taka, Mitsunari, Julien, [[User:DirkHoffmann|Dirk]] || style="background: #CCFFCC;" | Random trigger runs with EVB
 
||
 
  
Two runs (#30, #31) taken at various trigger rates as documented in [https://portal.cta-observatory.org/WG/lst/DAQ/SitePages/RunCatalog.aspx#ORM Run Catalog] and [https://cta-north.slack.com/messages/C4JB24UMC/p1540909546003100 Slack].
+
- Tried again to kill ECC and start it again -> Fan ON, so the PDB got the ECC connection back without a hard reset. Then tried kill/start ECC again -> fan ON for a few seconds, then went to 0. Then tried kill/start ECC again -> fans still at 0 -> hard reset
  
Corrected pixel map implemented (spiral numbering).
+
- New version v32 that cuts the communication with the PDB -> fan ON after the hard reset. Two attempts of kill/start ECC and fan still ON, so this clearly indicates that Fan OFF comes from a communication problem between ECC and PDB.
 +
 +
- Now coming back to the old version we have been using for two months. Restart -> Fan OFF -> hard reset -> Fan OFF -> second hard reset -> fan ON
  
 +
Two things learned:
 +
- Fan OFF seems to be due to a lack of communication between PDB and ECC. It is possible that the PDB doesn't connect to ECC after the first hard reset and then goes into safe mode. This may be due to a connection timing problem in ECC; to be investigated, since this problem has only appeared in the last week... Moreover, after a hard reset this is normally the first connection from ECC to PDB, so the PDB should wait as long as it needs and not go into safe mode.
  
|- valign="top"
+
- The ECC executable created remotely seems to work for some functions, like controlling the mode, controlling the fans etc., but the current in the bus bars, for example, always went to 0
| 2018-10-29 || Oscar, Taka, Mitsunari || style="background: #CCFFCC;" | Power up
 
||
 
 
 
The Dragon with IP 10.1.6.28 (3rd column from the left as seen from outside, 5th module from below) was put on the busbar powered by relay 1 instead of relay 0. In exchange, the module in the 4th column, 5th from below, was put on relay 0 instead of relay 1. The camera was powered up only once and all modules and BPs went up.
 
  
 +
||
  
 
|- valign="top"
 
|- valign="top"
| 2018-10-27 || Taka, Mitsunari  || style="background: #FFFFCC;" | Random Trigger
+
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | Data taking
 
||  
 
||  
 +
- Data taken with TP synchronised but trigger sent by only one module
  
We took the random trigger. Following the instruction with Lea, random trigger could be easily produced.
+
- Data taken with TP only in central. Up to 15000 Hz there is no busy, since we are below the maximal writing speed. At 15 kHz we have: 265 modules * 1344 bytes * 15 kHz = 5.3 GB/s ≈ 43 Gbit/s, close to the 40 Gbit/s we expect with the four links at 10 Gbit/s.
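
The throughput estimate in this entry can be checked with a few lines of Python. This is only a sketch: it assumes the 1344 figure is a per-module event size in bytes (octets), which is what makes the quoted 5.3 GB come out, and the function name is made up for illustration.

```python
def camera_data_rate(n_modules, bytes_per_module, trigger_rate_hz):
    """Return (bytes per second, bits per second) produced by the camera."""
    bytes_per_s = n_modules * bytes_per_module * trigger_rate_hz
    return bytes_per_s, bytes_per_s * 8


# Values from the logbook entry; the byte unit is an assumption.
rate_bytes, rate_bits = camera_data_rate(265, 1344, 15_000)
link_capacity_bits = 4 * 10e9  # four links at 10 Gbit/s

print(f"{rate_bytes / 1e9:.2f} GB/s = {rate_bits / 1e9:.1f} Gbit/s")
print(f"fraction of nominal link capacity: {rate_bits / link_capacity_bits:.2f}")
```

With these inputs the raw payload alone slightly exceeds the nominal 40 Gbit/s, which would be consistent with busy appearing around 15 kHz.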
With DragonDaqM,  
+
 
 +
In /fefs/onsite/data/20181204
 +
- Run0001.0000 to .00101: one TP in the central BP, triggering the whole camera
  
300 Hz injection -> 300 Daq rate.
+
- Run0001.00103, .00104, .00105: TP in every module synchronised, only module 265 sent triggers, new BP delay: it seems we can now see the pulse in the central module but not in the others, to be investigated...
  
1k Hz-> 783 Hz
+
- Run0001.0106 to .00107, .00108: old BP delay, TP in every module synchronised, all modules sending triggers: no TP visible in the data, to be investigated....
  
3k Hz-> 1162 Hz
+
- Run0001.0109 to Run0001.0154: random trigger pedestal run. At the pedestal frequency of 3100 Hz, all pedestal frequency, collected rate, Camera rate and busy rate went to 0 at the same time.
  
6.5k Hz -> 1303 Hz.
+
||
 +
|- valign="top"
 +
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | Caco-ECC error/undefined state
 +
||
 +
- First power up, ECC went to error state whereas Caco was fine
  
With EVB, we first tried with 6.5 kHz. Then EVB crashed because of buffer full.
+
- For one of the power ups, in which we stayed 1h30 in ready mode, ECC went to error and Caco to undefined (we don't know in which order).
But busy state of modules was 03, which means EVB are connected and modules were busy.
+
ECC to state 1 after disabling the _error_heart_bit but then no current in the busbar->hardreset
To recover from this state, we had to reboot Dragons. A few minutes later, Carlos Diaz called us.
+
Then impossible to get Caco back in a normal mode even after ECC to state 1-> kill and restart Cacolaucher...
The current consumption at the bus bars was ~10 A higher than usual. Normally it is 25-27 A, but after rebooting the Dragons it was 35 A.
 
We shut down the 24V. After 10 min or so, Carlos allowed us to restart.
 
All Dragons could be reached from cacoserver, but not from Osaka. ip link set p*p* down/up didn't help.
 
We rebooted Osaka. Then Osaka could ping to all (but one) modules.
 
However, EVB didn't work. Later we learned from Dirk and Julien that we had to do
 
  
sudo modprobe -r ixgbe; sudo modprobe ixgbe
+
- After the second hard reset, the fan was off so we had to do another hard reset... Then everything was ok
 
+
||
 
+
|- valign="top"
 
+
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | TIB 255 issue fixed
|- valign="top"
+
||
| 2018-10-27 || Oscar, Laia , Taka, Mitsunari || style="background: #CCFFCC;" | Power up
+
 
||  
+
- We forgot to call the reset() method between each initialization of the modules. Now it works fine.
 
+
||
After checking that the Dragon and BP regulators can stand an input voltage above 30 V, we increased the voltage provided by the Power Supplies to 27.5V (the same for the 8 Power Supplies). 
+
|- valign="top"
 
+
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #CCFFCC;" | start up
With this configuration, the voltage while ramping up increases up to 20.3 V and then only dips to 19.8 V for about 1 ms. This should be completely fine for the Dragons.
+
||
 
+
 
We powered up the camera with the ECC 10 times. All BPs went up every time. Only one Dragon (always the same) did not power up the first time after a ~1 hour break (tried two times); after this first power up, all Dragons powered up.
+
3)All modules ON and all pixel ON (data taking but then TIB state 255)
 
+
 
|- valign="top"
+
4)All modules ON and all pixel ON (data taking but then TIB state 255)
| 2018-10-26 || Taka, Mitsunari  || style="background: #FFFFCC;" | TIB state machine.
+
 
||  
+
5)All modules ON and all pixel ON (Caco went to undefined and ECC to error state, we don't know in which order): For this start up we stayed one hour and a half in ready mode
 
+
 
We tried to solve the "State 255" problem in TIB. Luis Angel suggested to configure modules at state 2. We followed his instruction, but we reached state 255.
+
7)All modules ON and all pixel ON
So we tried modules configuration at state 0. Same result. We tried module configuration at state 4, resulting in the same state 255.
+
 
 
+
||
We also tried to see the test pulse position with respect to the center of the readout window. But we could not see the test pulse at all. The delay setting in the TIB or backplane is not correct.
+
|- valign="top"
 
+
| 2018-12-04 || Léa, Seiya, Shunsuke, Satoshi || style="background: #FFCCCC;" | start up
 
+
||
|- valign="top"
+
1) SiwthOn() from Caco, went to state 2 normal, but ECC to state 4 -> hard reset
| 2018-10-26 || Oscar, Laia , Taka, Mitsunari || style="background: #FFFFCC;" | Power up
+
 
||  
+
2) Caco and ECC fine but one module OFF: 10.1.6.28 -> OFF/ON again
 
+
 
The drop in the voltage is due to a current limit in the circuitry of the relay. Increasing the voltage of the power supplies should raise the value of the dip in the voltage so that it does not reach 18V.
+
6) ECC went to error state and Caco to state undefined (we don't know in which order...). Before the hard reset, we tried to go to state ready directly from ECC, but the current in the bus bars was 0 -> hard reset
 
+
Then the fan speeds were at 0, so hard reset again. Then it worked fine
We measure again the transients for relay 0 with Power Supply at 24.98 V as reference.
+
||
We increase the voltage of Power Supplies to 25.25 V, the dip is about 100 mV higher.
+
|- valign="top"
 
+
| 2018-12-03 || [[User:DirkHoffmann|Dirk]] || style="background: #CCFFCC;" | CamerasToACTL v1.7
|- valign="top"
+
||
| 2018-10-25 || Taka, Mitsunari Yusuke  || style="background: #CCFFCC;" | Event Mixing
+
 
 +
Installed on tcs03 and tcs04 from repository (https://cta.cppm.in2p3.fr/repo/x86_64/) and tested with/by Léa and Seiya.
 +
||
 +
|- valign="top"
 +
| 2018-12-03 || Léa,Seiya || style="background: #FFCCCC;" | increase of the bus bars current
 +
||
 +
 
 +
Due to the BP reset while EVB was still connected to the modules, the current increased to 40. It is now known that the current increases when the DAQ is connected and no clock is distributed... We still don't know why; under study!
 +
||
 +
|- valign="top"
 +
| 2018-12-03 || Léa, Seiya || style="background: #CCFFCC;" | Data Taking
 +
||
 +
- TP synchronised with legacy and EVB data
 +
 
 +
- Too many files created compared to the number of ZFW instances
 +
||
 +
|- valign="top"
 +
| 2018-12-03 || Léa,Seiya || style="background: #FFCCCC;" | TIB/UCTS
 +
||
 +
 
 +
- TIB went to state 255 even after reset, so we shut down and switched off the camera...
 +
 
 +
- Again, TIB went to 255 5 seconds after reaching state 5, all rate at 1444O.
 +
 
 +
- TIB went to 255 from state 4 directly. The impression after one day is that after 3 cycles of the TIB going to 5 and then being reset, it goes to 255 and we have to switch off the camera
 +
||
 +
|- valign="top"
 +
| 2018-12-03 || Léa || style="background: #CCFFCC;" | Fix UCTS configuration
 +
||
 +
- a virtual machine was using the IP 10.4.8.4 of the UCTS.... this is why it was not possible to configure it.
 +
 
 +
- I changed it to the IP it should take in the future: 10.1.4.4 and now it works
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-12-03 || Léa || style="background: #CCFFCC;" | Power up
 +
||
 +
- 1): All modules ON
 +
 
 +
-2) All module ON
 +
 
 +
-3) All module ON
 +
 
 +
-4) All module ON
 +
 
 +
-5) All modules ON
 +
 
 +
-6) module 10.1.6.40 off
 +
 
 +
-7) All module ON
 +
 
 +
-7) All module OFF (I think due to the previous increase of the current in the bus bars)
 +
 
 +
-8) All modules ON
 +
||
 +
|- valign="top"
 +
| 2018-11-30 || Léa || style="background: #FFCCCC;" | TIB/UCTS
 +
||
 +
 
 +
- TIB remains in state 2 even when UCTS is configured
 +
||
 +
|- valign="top"
 +
| 2018-11-29 || Léa || style="background: #FFFFCC;" | Data Taking
 +
||
 +
- Procedure of TP synchronised in all the modules
 +
 
 +
- EVB configuration file in /home/dragon/EVB/20181130
 +
 
 +
- First try, EVB connected but one module busy: 10.1.6.10 -> initialised the modules again
 +
 
 +
- Second try, EVB connected, no modules busy, but TIB remains at state 2 even though the UCTS configuration seems ok -> I switched the camera OFF and ON...
 +
 
 +
- Third try, same as before. Now tried disconnecting the cable from the WR switch to TCS07: same problem, TIB remains at state 2 when UCTS is configured.
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-29 || Léa || style="background: #CCFFCC;" | Power up
 +
||
 +
- Fourth startup: ECC and Caco works well, All modules ON
 +
 
 +
- Fifth startup: ECC and Caco works well, All modules ON
 +
 
 +
- sixth startup: ECC and Caco works well, All modules ON
 +
||
 +
|- valign="top"
 +
| 2018-11-30 || Léa || style="background: #FFCCCC;" | Power up
 +
||
 +
- second start up: ECC and Caco works well but module 10.1.6.28 was OFF so I started again
 +
 
 +
-Third startup:  ECC and Caco works well, all modules ON. But human mistake (mine), ECC went to Error state and then no current in the pulse bar -> hard reset
 +
||
 +
|- valign="top"
 +
| 2018-11-30 || Léa || style="background: #FFCCCC;" | Power up
 +
||
 +
- From Caco, switchON(): it went to its state 2, with good communication with the ECC. Then GetCameraStanby(): ECC was fine and went to state ready, all modules ON, but Caco was in an undefined state, so I did a sleep(); Caco recovered its state ready (state=3) and ECC was still ready. I did a second call of the sleep() method to start from a clean environment and everything was fine: Caco went to state safe, and ECC also.
 +
||
 +
|- valign="top"
 +
| 2018-11-29 || Léa || style="background: #CCFFCC;" | Power up
 +
||
 +
- after the second hardreset, Fan ON and ECC went to ready from Cacoo day finished(-:
 +
 
 +
- All modules ON, configuration for TP synchronisation in all the modules seems fine. EVB segfault in GOTOREADY s
 +
||
 +
|- valign="top"
 +
| 2018-11-29 || Léa || style="background: #FFCCCC;" | Fan Off
 +
||
 +
 
 +
Following the hard reset after ECC went to error, the fans were down again, as yesterday morning... It is a heart_beat problem between the PDB and ECC
 +
 
 +
- Second hard reset
 +
||
 +
|- valign="top"
 +
| 2018-11-29 || Léa || style="background: #FFCCCC;" | Power up
 +
||
 +
1) Power up from Caco, powerON and GetCameraStandby(), ECC to Ready and all modules ON. Monitoring issue, so went back to safe until it is fixed
 +
 
 +
2) From Caco, SwithON, then ECC goes to errorstate 4 with _error_heart_bit to 4 without any clear reason. Caco was ok on state 2
 +
 
 +
3) Fixed the _error_heart_bit issue of ECC and tried again to switch on from Caco. Same issue: Caco state fine but ECC went to error state 4 due to _error_heart_beat being true.
 +
 
 +
4) Fixed _error_heart_beat and try directly to switch on from ECC. Works fine, ECC went to Ready but no current in the pulse bar, only the 4 one had current.
 +
 
 +
I did a hardreset
 +
||
 +
|- valign="top"
 +
| 2018-11-29 || Léa || style="background: #CCFFCC;" | WR switch
 +
||
 +
RJ45 port installed on port 9 of the WR switch for the connection to tcs07
 +
||
 +
|- valign="top"
 +
| 2018-11-28 || Daniel K., Léa || style="background: #FFFFCC;" | test of cluscolauncher
 +
||
 +
We tried the connection between Caco and Clusco: all fine. The current monitoring was not active because not the same files were updated. Will be fixed soon and then tested again.
 +
||
 +
|- valign="top"
 +
| 2018-11-28 || Daniel K., Léa || style="background: #FFCCCC;" | fans stopped
 +
||
 +
This morning around 8:45am the fans stopped running before we arrived on site. When we arrived we noticed the ECC was still in safe state (we expected error state, but that was not the case). We checked the rest of the ECC variables and everything looked fine. Using a multimeter we checked that the 400 was properly arriving at the PDB inside the camera. We contacted the ECC experts, who asked for screenshots of the ECC datapoints regarding the PDB for later evaluation of the problem. Then we hard reset the ECC and the fans started just fine.
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-27 || Daniel K., Cristobal (remote) || style="background: #CCFFCC;" | fix of a compilation problem for ClusCo
 +
||
 +
Small fix for compilation, tested and merged to the master branch. Compilation on site works again.
 +
||
 +
|- valign="top"
 +
| 2018-11-27 || Daniel K. || style="background: #FFCCCC;" | test of new ECC version
 +
||
 +
Follow-up, more extensive tests of the control of the individual intelligent relays with the new version of the ECC. No improvement. A detailed description of the tests performed will be emailed to the experts. As a consequence the old ECC version was reinstalled for now.
 +
||
 +
|- valign="top"
 +
| 2018-11-27 || Daniel K., Otger (remote) || style="background: #CCFFCC;" | installation of new version of libcluster
 +
||
 +
Following the successful test last week, the libcluster fixes were merged into the master branch and installed on site.
 +
||
 +
|- valign="top"
 +
| 2018-11-26 || Daniel K. || style="background: #FFCCCC;" | test of new ECC version
 +
||
 +
After some small fixes of the data point "Error description" and of the fan control, the new ECC version was tested. The control of the individual intelligent relays (main update with this version) was unstable. As a consequence the old ECC version was reinstalled for now.
 +
||
 +
|- valign="top"
 +
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" | too high temperature
 +
||
 +
The status of ECC monitoring went from "green" to "red" around 16:30.
 +
The change in the temperatures we are monitoring was quite different from usual.
 +
It may be related to the water pressure of the chiller. It is above 1 and stable as usual, but it was too low in the morning and rose during the day.
 +
 
 +
||
 +
[[Media:bptemp.JPG]]
 +
[[Media:Tempbad.JPG]]
 +
[[Media:ECCTemp.png]]
 +
[[Media:CameraPressure14-16.png]]
 +
 
 +
|- valign="top"
 +
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" |some network interface of osaka sometimes not running
 +
||
 +
some network interface of the osaka server doesn't start running at first every day... We activated p1p2 manually.
 +
 
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" |bad behaviour of mezzanine
 +
||
 +
After configuration of modules(init7 & pulse_injection_all), bad behavior of mezzanine was shown at three modules.
 +
* mod115: L0 & L1 trigger rate was 0; there had been no problem until yesterday.
 +
* mod167: L0 & L1 trigger rate was 0; there had been no problem until yesterday.
 +
* mod226: L0 trigger rate was 65535, but L1 trigger rate had no problem. So there may be a problem only on the line for IPR. This happened occasionally this week.
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFFFCC;" | take data for TP synchronization
 +
||
 +
We took data with ClusCo monitoring for the test pulse synchronization.
 +
 
 +
1) 300Hz, ROI=1024, trigger was generated by mod265, 3000events
 +
* These were almost the same conditions as yesterday; the only difference was that ClusCo monitoring was running.
 +
* The file is /mnt/cs1/store/DragonDaqData/Data20181123/TP300HzTrigMod265RD1024Delay3028_RD1024_FEB...
 +
 
 +
2) 300Hz, ROI=40, trigger was generated by all modules, 3000events
 +
* During the operation, I made a mistake in the DAQ setting, so I restarted after the TIB state went to 5 (so PPS and the 10MHz counter should be synchronized); this result may therefore be worse
 +
* The file is /mnt/cs1/store/DragonDaqData/Data20181123/TP300HzTrigModAllRD40Delay3528_again_RD40_FEB...
 +
 
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFFFCC;" | validation test of ClusCo
 +
||
 +
- the strange value of humidity
 +
 
 +
- SiTCP reset
 +
* This function worked well, but due to the bug in DragonFPGA it worked with only half of the camera and took too much time(~5min) to finish this command.
 +
* Seiya will fix this problem from DragonFW side.
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" | new version of ECC
 +
||
 +
We implemented the new version of ECC. After reboot of ECC the fan didn't start working.
 +
* We did a hardware reset (13:40). But the problem remained and the ECC state went to 4 (error state).
 +
* We changed the default settings of ECC: 1) T_safe_min -> 5, 2) disable light sensor. After a hardware reset (13:50), the situation was the same...
 +
* We further changed the default value of T_safe_min to 2. After a hardware reset (14:00), the result was the same (error state and the fan still stopped)
 +
 
 +
As a result we decided to go back to the current version of ECC. After rebooting the ECC, all functions worked well.
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-23 || Daniel K., Yuki, Seiya || style="background: #FFCCCC;" | monitoring plots were not updated
 +
||
 +
ECC monitoring plots were not updated after 9:30am. We could still get the various values (temperature etc.) in the OPCUA client; only the monitoring plots were not updated.
 +
After the reboot of ECC for the ECC version update, the monitoring plots started to be updated again.
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-22 || Yuki, Seiya || style="background: #CCFFCC;" | take data for TP synchronization study
 +
||
 +
I discussed with Taka, then I tried to take data as below;
 +
* set test pulse frequency for external reference clock
 +
* after that start TP synchronization
 +
 
 +
We took data with the following conditions and managed to synchronize test pulse at all modules finally.
 +
 
 +
0)
 +
* I wanted to take data with 300 Hz at first, but L1_local trigger rate was ~22Hz after initialization even though we set 300Hz as TP frequency.
 +
* So I decided to take data without changing test pulse frequency from the default one(444 444 counts for 10MHz = 22Hz)
 +
 
 +
1) 22Hz, ROI=1024, trigger was generated by mod265, 1000events
 +
* During initialization we didn't change test pulse frequency, so TP frequency was 22 Hz at that time.
 +
* The data is in /mnt/cs1/store/DragonDaqData/Data20181122/Trigger22HzRD1024...
 +
 
 +
2) 300Hz, ROI=1024, trigger was generated by mod1, 3000events
 +
* Before TP synchronization, we changed the test pulse frequency with "SET_TP_FREQUENCY 0 Off 33333" from ClusCo instead of "SET_TP_FREQUENCY 0 On 300". Then L1_local trigger was 300Hz with external reference clock.
 +
* The data is in /mnt/cs1/store/DragonDaqData/Data20181122/Trigger300HzRD1024...
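
The numbers in this entry are consistent with the test pulse period being expressed in counts of the 10 MHz reference clock (444 444 counts for ~22 Hz, 33333 counts for 300 Hz). The Python sketch below only illustrates that conversion; that the third argument of "SET_TP_FREQUENCY 0 Off ..." is such a count is an inference from the quoted values, and the function names are hypothetical.

```python
REF_CLOCK_HZ = 10_000_000  # 10 MHz reference clock quoted in the entry


def tp_frequency_hz(period_counts):
    """Test pulse frequency for a period given in reference clock counts."""
    return REF_CLOCK_HZ / period_counts


def counts_for_frequency(freq_hz):
    """Nearest integer number of clock counts for a wanted frequency."""
    return round(REF_CLOCK_HZ / freq_hz)


print(f"default period 444444 counts -> {tp_frequency_hz(444_444):.2f} Hz")
print(f"300 Hz -> {counts_for_frequency(300)} counts")
```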
 +
 
 +
||
 +
[[Media:TPSynchNov22nd22Hz.gif]]
 +
[[Media:TPSynchNov22nd300Hz.gif]]
 +
 
 +
|- valign="top"
 +
| 2018-11-21 || Seiya || style="background: #FFFFCC;" | home directory of osaka server was full
 +
||
 +
The home directory of osaka (/home) became full today.
 +
 
 +
Osaka ~ > df -h
 +
Filesystem                  Size  Used Avail Use% Mounted on
 +
/dev/mapper/scientific-root  50G  22G  29G  43% /
 +
devtmpfs                    252G    0  252G  0% /dev
 +
tmpfs                        252G    0  252G  0% /dev/shm
 +
tmpfs                        252G  50M  252G  1% /run
 +
tmpfs                        252G    0  252G  0% /sys/fs/cgroup
 +
/dev/sdb                      15T  8.5T  5.3T  62% /mnt/cs1
 +
/dev/sda1                    497M  272M  226M  55% /boot
 +
/dev/mapper/scientific-home  504G  504G  20K 100% /home
 +
tmpfs                        51G  12K  51G  1% /run/user/42
 +
tmpfs                        51G  4.0K  51G  1% /run/user/1000
 +
tmpfs                        51G    0  51G  0% /run/user/1001
 +
tmpfs                        51G    0  51G  0% /run/user/1002
 +
 
 +
Almost all of the files (~80%) are data taken by LegacyDAQ for the tests, in /home/dragon/IACMiniCamSetup/DragonDaqM
 +
 
 +
Osaka DragonDaqM > du -sh .
 +
417G
 +
 
 +
So I moved the data taken by LegacyDAQ to /mnt/cs1/store/DragonDaqData temporarily. (We could transfer those data to the Lustre system (/fefs/ on tcs) later.)
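
A small check like the following could warn before a partition such as /home fills up again. It is a sketch only: the 90% threshold and the function names are assumptions, not part of the on-site setup, and "/" is used as the example path so that it runs anywhere.

```python
import shutil


def usage_percent(total_bytes, used_bytes):
    """Used fraction of a filesystem in percent."""
    return 100.0 * used_bytes / total_bytes


def check_disk(path, warn_at_percent=90.0):
    """Return (percent used, True if at or above the warning threshold)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    percent = usage_percent(usage.total, usage.used)
    return percent, percent >= warn_at_percent


percent, warn = check_disk("/")  # replace "/" with e.g. "/home" on the server
print(f"/ is {percent:.0f}% full" + (" -> WARNING" if warn else ""))
```

Note that the percentage can differ slightly from the Use% column of df, which accounts for reserved blocks.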
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-21 || Daniel K., Seiya || style="background: #FFFFCC;" | take data with LegacyDAQ for EVB tests
 +
||
 +
Julien wants to use raw data of full camera for EVB debug tests.
 +
We took data with LegacyDAQ using a random trigger (300Hz), which is the digital pedestal trigger generated by the TIB.
 +
These files are in /mnt/cs1/store/DragonDaqData/Data20181121.
 +
 
 +
I wanted to take 30 min of data (300Hz*(60*30)=540,000 events), but the disk of the osaka server became full during the test.
 +
The size of each file is ~219MB, which is equivalent to ~168,000 events and ~10min data.
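
The event counts quoted here follow directly from the trigger rate and the run duration. A minimal Python sketch of that arithmetic (function names are illustrative only):

```python
def expected_events(rate_hz, duration_s):
    """Number of events collected at a constant trigger rate."""
    return rate_hz * duration_s


def duration_min(n_events, rate_hz):
    """Run duration in minutes needed to collect n_events at rate_hz."""
    return n_events / rate_hz / 60


print(expected_events(300, 30 * 60))          # planned 30 min run at 300 Hz
print(f"{duration_min(168_000, 300):.1f} min")  # events per file quoted above
```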
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-21 || Seiya || style="background: #FFFFCC;" | how to run again the network interface
 +
||
 +
Some network interfaces on the osaka server sometimes stop running.
 +
 +
p2p2: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
 +
inet 10.1.6.192  netmask 255.255.255.128  broadcast 10.1.6.255
 +
inet6 fe80::a236:9fff:fef0:ccd6  prefixlen 64  scopeid 0x20<link>
 +
ether a0:36:9f:f0:cc:d6  txqueuelen 1000  (Ethernet)
 +
RX packets 68478703  bytes 95858361688 (89.2 GiB)
 +
RX errors 1  dropped 9  overruns 0  frame 1
 +
TX packets 30112278  bytes 1622848602 (1.5 GiB)
 +
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 +
 
 +
When that happens, restart the interface with:
 +
* sudo ifconfig <name of interface> down
 +
* sudo ifconfig <name of interface> up
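
In the ifconfig output of this entry, flags=4099 decodes to UP, BROADCAST and MULTICAST, with the RUNNING bit missing; this is what a stopped interface looks like. The Python sketch below decodes such a flags value using the standard Linux interface flag bits; the helper names are made up for illustration.

```python
# Linux interface flag bits (subset), as defined in <linux/if.h>
IFF_FLAGS = {
    "UP": 0x1,
    "BROADCAST": 0x2,
    "RUNNING": 0x40,
    "MULTICAST": 0x1000,
}


def decode_flags(flags):
    """Return the names of the known flag bits set in an ifconfig flags value."""
    return [name for name, bit in IFF_FLAGS.items() if flags & bit]


def needs_restart(flags):
    """True if the interface is administratively UP but not RUNNING."""
    return bool(flags & IFF_FLAGS["UP"]) and not flags & IFF_FLAGS["RUNNING"]


print(decode_flags(4099), needs_restart(4099))  # p2p2 as shown in this entry
print(decode_flags(4163), needs_restart(4163))  # the same interface once running
```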
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-21 || Otger, Daniel K., Seiya || style="background: #FFCCCC;" | ECC went to error state
 +
||
 +
I used ClusCo@tcs01 for the monitoring; all the plots except "Amp. Temp" were indeed updated.
 +
After that, I did init7 from ClusCo@cacooperator and waited for the "Amp. Temp" plot to update.
 +
At that time, ClusCo@tcs01 showed a timeout, and I realized that I could not ping these modules, the relay current had gone to 0 and the ECC state had gone to error state (4).
 +
I powered up again and ECC status went to 2(ready) as usual, but relay current was 0.
 +
 
 +
Taka explained why relay current was still 0 as below;
 +
 
 +
When ECC goes to Error state, relay modules are also in a strange state. You need to reset relay modules as well.
 +
However, even if you go to "safe" state in ECC, relays are still powered (not bus bars, but relay modules). That means, "safe" does not reset relays.
 +
 
 +
Lea explained why ECC went to error as below;
 +
 
 +
Maybe what is possible also is that you lost the slow control connection during few seconds and then get it back without realising. Then If the modules are ON and that we lost the slow control connection, ECC goes to error and the relay current will remain at 0 as Taka explained.
 +
 
 +
 
 +
We did a hardware reset three times (15:00, 15:50, 16:45), but the situation stayed the same.
 +
This ECC error state seems to be caused by the loss of the CaCo heart beat.
 +
We got by without CaCo (using ECC directly) for data taking today.
 +
 
 +
 
 +
1228269 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] ERROR com.prosysopc.ua.client.UaClient - Exception in ServerStatusListener
 +
java.lang.ClassCastException: cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAVariable$DataInformation cannot be cast to java.lang.Integer
 +
at cat.ifae.cta.cameracontrol.server.base.clients.ecc.OPCUAECCControl$ECCVariableStatus.update(OPCUAECCControl.java:25)
 +
at java.util.Observable.notifyObservers(Observable.java:159)
 +
at cat.ifae.cta.opcua.dataaccess.basicobjects.BasicCallbackVariable$ObservableVariable.setValue(BasicCallbackVariable.java:36)
 +
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly._newStateWarn(OPCUAAssembly.java:533)
 +
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly.consumeMessage(OPCUAAssembly.java:526)
 +
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.statusChanged(OPCUAServerStatusListener.java:59)
 +
at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.onStateChange(OPCUAServerStatusListener.java:33)
 +
at com.prosysopc.ua.client.UaClient.a(Unknown Source)
 +
at com.prosysopc.ua.client.UaClient.updateServerStatus(Unknown Source)
 +
at com.prosysopc.ua.client.UaClient$a.run(Unknown Source)
 +
at java.lang.Thread.run(Thread.java:745)
 +
 
 +
1228371 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] WARN com.prosysopc.ua.client.Subscription - Server sent a previously acknowledged sequence number 0 for Subscription 47786
 +
1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.SecureChannelTcp - 47856 Closed
 +
1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed
 +
1228373 [TcpConnection/Read] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed (expected)
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-21 || Otger, Daniel K., Seiya || style="background: #FFFFCC;" | dhcpd server for TIB restart
 +
||
 +
The DHCPd server for the TIB had stopped because tcs01 was shut down yesterday, so we started it again as follows:
 +
 
 +
[ifae@tcs01 ~]$ sudo service dhcpd status
 +
Redirecting to /bin/systemctl status  dhcpd.service
 +
dhcpd.service - DHCPv4 Server Daemon
 +
Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
 +
Active: inactive (dead)
 +
Docs: man:dhcpd(8)
 +
man:dhcpd.conf(5)
 +
[ifae@tcs01 ~]$ sudo service dhcpd start
 +
Redirecting to /bin/systemctl start  dhcpd.service
 +
[ifae@tcs01 ~]$ sudo service dhcpd status
 +
Redirecting to /bin/systemctl status  dhcpd.service
 +
dhcpd.service - DHCPv4 Server Daemon
 +
Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
 +
Active: active (running) since Wed 2018-11-21 09:14:08 WET; 2s ago
 +
Docs: man:dhcpd(8)
 +
man:dhcpd.conf(5)
 +
Main PID: 453 (dhcpd)
 +
Status: "Dispatching packets..."
 +
CGroup: /system.slice/dhcpd.service
 +
453 /usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -group dhcpd --no-pid
 +
 
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: All rights reserved.
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: For info, please visit https://www.isc.org/software/dhcp/
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in...ig file
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 deleted host decls to leases file.
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 new dynamic host decls to leases file.
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 4 leases to leases file.
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: Listening on LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on  LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
 +
Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on  Socket/fallback/fallback-net
 +
Nov 21 09:14:08 tcs01 systemd[1]: Started DHCPv4 Server Daemon.
 +
Hint: Some lines were ellipsized, use -l to show in full.
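The "Loaded" line in the output above reports the unit as "disabled", which in systemd means it is not started automatically at boot. That would be consistent with the server staying down after the tcs01 shutdown. Running `sudo systemctl enable dhcpd` would change that, but whether this is wanted here is not stated in this log. As a sketch, the two relevant fields can be extracted from the status text like this (the function name `parse_unit_status` is hypothetical):

```python
import re


def parse_unit_status(status_text):
    """Extract enablement and activity state from `systemctl status` text."""
    loaded = re.search(r"Loaded: \S+ \([^;]+; (\w+)", status_text)
    active = re.search(r"Active: (\w+) \((\w+)\)", status_text)
    return {
        "enabled_state": loaded.group(1) if loaded else None,
        "active_state": active.group(1) if active else None,
        "sub_state": active.group(2) if active else None,
    }


before = (
    "Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; "
    "vendor preset: disabled)\n"
    "Active: inactive (dead)\n"
)
print(parse_unit_status(before))
```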
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-20 ||  Daniel K., Seiya|| style="background: #FFFFCC;" | ClusCo monitoring restart
 +
|| The ClusCo monitoring map was not updated after the shutdown of tcs01. We contacted Carlos and Carlos, and they restarted it. Now it works.
 +
* http://wwwae.ciemat.es/~delgadom/CTA/monitor/MODULE/
 +
* http://wwwae.ciemat.es/~delgadom/CTA/monitor/PIXEL/
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-20 ||  || style="background: #FFFFCC;" | TCS01 shutdown
 +
|| One of the memory cards of tcs01 is damaged and will be replaced
 +
by an authorized technician today starting 9am La Palma time.
 +
We will shut down the server before that, and once the card is exchanged we will start it up again.
 +
 
 +
|- valign="top"
 +
| 2018-11-19||  Seiya, Daniel K.|| style="background: #FFFFCC;" | cannot connect with some modules
 +
|| With configuration 2 (100Hz, ROI=1024) we could not connect to some modules (IP 10.1.6.148-173), and they stayed busy (busy state=1).
 +
After the re-initialization, this problem disappeared.
 +
 
 +
|- valign="top"
 +
| 2018-11-19 || Seiya, Daniel K. || style="background: #FFFFCC;" | Test pulse data with DragonDaqM(LegacyDAQ)
 +
|| We took test pulse data with the following conditions:
 +
1) 300Hz, ROI=1024, trigger was generated by mod265 (for reproducing the problem)
 +
*File name is "TP300HzTrigMod265RD1024Delay3028RD1024_***"
 +
 
 +
2) 100Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)
 +
*File name is "TP100HzTrigMod265RD1024Delay3028RD1024_***"
 +
 
 +
3) 300Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)
 +
*I sent each command by hand and checked the registers (register8 & scalar) after PPS disable.
 +
*It seems PPS disable worked well.
 +
*File name is "TP300HzTrigMod265RD1024Delay3028_CHECKEDRD1024***".
 +
 
 +
4) 100Hz, ROI=1024, trigger was generated by mod265
 +
*I set test pulse frequency before PPS synchronization.
 +
 
 +
*File name is "TP300HzTrigMod265RD1024Delay3028_TPconfigSynchroRD1024***".
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-19 || Seiya, Daniel K. || style="background: #FFCCCC;" | 24V supply problem
 +
|| We powered up the camera with the usual procedure, but only one busbar (the 4th one) worked and the others did not. We tried the procedure again with the same result (only the 4th busbar worked). So we switched the camera breaker off and on at around 15:00. The fan did not start at first, so I switched the breaker off and on again, and the fan started. After that we could power up the whole camera.
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-11-13 || Mitsunari, Daniel K. || style="background: #FFFFCC;" | Software deployment
 +
|| All the setup (except the uaexpert for ecc, tib and ucts) to control, monitor and take data with the camera was moved to the LST_CALP iMac (+ 1 screen) of the commissioning container.
 +
||
 +
|- valign="top"
 +
| 2018-11-12 || Mitsunari, Daniel K. || style="background: #FFFFCC;" | Test pulse data with DragonDaqM
 +
|| Test pulse data were taken with DragonDaqM, triggered by module 264, which did not have a test pulse on 11-09.
 +
||
 +
|- valign="top"
 +
| 2018-11-12 || Mitsunari, Satoshi || style="background: #FFCCCC;" | Connect tcs07 to White Rabbit
 +
|| The WR switch management port and the management switch (mgtsw2 port 42) are connected by an Ethernet cable. Mitsunari tried to change the IP of the WR switch to 10.200.10.140, which is in VLAN 1001, but failed. The WR interface file dot-config was not found, contrary to what the WR manual says. Even when we created the file ourselves, it was lost after rebooting.
 +
||
 +
|- valign="top"
 +
| 2018-11-12 || Mitsunari, Daniel K., Carlos Diaz || style="background: #FFFFCC;" | Software deployment
 +
|| Installing and compiling caco, cacoconsole, cacogui on tcs01 under /home/ifae/development. Compiling /home/ifae/clusco on tcs01 and adapting monitoring from CIEMAT. Setting up one additional screen for monitoring to the imac (monitoring computer), adding two forms (one for powering on the camera, one for shutting it down) to be filled by the operators.
 +
||
 +
|- valign="top"
 +
| 2018-11-09 || Mitsunari, Daniel K. || style="background: #FFFFCC;" | Test pulse data with EVB
 +
|| Test pulse data were taken with EVB after waiting 2 s for the PPS to reach all modules. For read depth 40, the DAQ seemed to be successful. For read depth 1024, however, the data were not stored.
 +
||
 +
|- valign="top"
 +
| 2018-11-09 || Mitsunari, Daniel K. || style="background: #CCFFCC;" | Test pulse data with DragonDaqM
 +
|| Test pulse data were taken with DragonDaqM after waiting 2 s for the PPS to reach all modules. The waveform data of six modules besides the central one were checked: five modules had test pulses, but the remaining one (No. 0) did not.
 +
||
 +
|- valign="top"
 +
| 2018-11-03 || Mitsunari || style="background: #FFFFCC;" | Test pulse injection timing
 +
|| Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is spread over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 40, Sampling speed: 1 GHz
 +
||
 +
|- valign="top"
 +
| 2018-11-03 || Mitsunari || style="background: #FFFFCC;" | Test pulse injection timing
 +
|| Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is spread over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 1, Sampling speed: 5 GHz
 +
||
 +
|- valign="top"
 +
| 2018-11-02 || Mitsunari || style="background: #FFFFCC;" | Test pulse data with EVB
 +
|| Data for investigating the test pulse issue were taken with EVB, but this seems to have failed. This should be inspected.<br>
 +
Pulse rate: 300Hz, Read depth: 1024, Event number: ~9000, /fefs/onsite/data/20181102
 +
||
 +
|- valign="top"
 +
| 2018-11-01 || Mitsunari || style="background: #CCFFCC;" | Large data with random trigger
 +
|| Data of ~10^5 events were taken with the pedestal random trigger, EVB, a read depth of 40 slices, and a delay of 3528 ns. The data are stored in /fefs/onsite/data/20181101.
 +
* 1kHz: Run 0001.0275-0001.0288
 +
* 2kHz: Run 0001.0289-0001.0315
 +
||
 +
|- valign="top"
 +
| 2018-11-01 || Mitsunari || style="background: #CCFFCC;" | Avoiding TIB State 255
 +
|| The TIB state can go to 5 without passing through a reset at state 255 by combining a TIB reset at state 0 with configuring the Dragons without resetting the BPs. <br>
 +
* ECC->SetMode(2)
 +
* TIB->Reset()
 +
* TIB->DisablePPS()
 +
* TIB->ResetRun()
 +
* ClusCo->Main->@config/init7_woBPreset
 +
* UCTS->XMLConfiguration
 +
* UCTS->Start()
 +
* TIB->EnableTrigger()
 +
 
 +
Mitsunari repeated this procedure four times and succeeded for all of them. DAQ also seemed to be successful at the last trial. (At the first three trials, DAQ failed because of another reason.)
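
Because the order of the commands matters in this procedure, it can help to write the sequence down as data and check it before running. A minimal sketch, with the step strings copied from the list above and a hypothetical helper `check_order`:

```python
# Steps exactly as listed in the procedure above.
PROCEDURE = [
    "ECC->SetMode(2)",
    "TIB->Reset()",
    "TIB->DisablePPS()",
    "TIB->ResetRun()",
    "ClusCo->Main->@config/init7_woBPreset",
    "UCTS->XMLConfiguration",
    "UCTS->Start()",
    "TIB->EnableTrigger()",
]


def check_order(steps, first, later):
    """Return True if `first` appears before `later` in `steps`."""
    return steps.index(first) < steps.index(later)


# The module configuration must be done before the trigger is enabled.
print(check_order(PROCEDURE,
                  "ClusCo->Main->@config/init7_woBPreset",
                  "TIB->EnableTrigger()"))
```

The check raises `ValueError` if a step is missing from the list, which is intentional in this sketch: a missing step should not pass silently.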
 +
||
 +
|- valign="top"
 +
| 2018-10-31 || Mitsunari || style="background: #FFFFCC;" | TIB State 255 problem
 +
|| init7 without BP reset at the beginning was tested. The first trial failed, that is, the state turned out to be 255. However, in the second trial, when the TIB was reset just after turning on the camera, the TIB state went directly to 5. This behavior should be confirmed later.
 +
||
 +
|- valign="top"
 +
| 2018-10-31 || Mitsunari || style="background: #FFFFCC;" | Check for test pulse synchronization
 +
|| It should be confirmed whether the TenMHz counter value is identical among the modules for each test pulse event. Data for the check were taken with DragonDaqM at 300Hz. The L1 threshold was set so that only the central module sent triggers. The data were stored in /home/dragon/IACMiniCamSetUp/DragonDaqM/Data20181031. The TenMHz counter appeared to be synchronized, but this should be confirmed.
 +
||
 +
|- valign="top"
 +
| 2018-10-31 || Oscar, Mitsunari || style="background: #CCFFCC;" | PDB Fixation
 +
||
 +
 
 +
PDB fixation: the front plate is now fixed through a screw and nut attached to the back plate with a metal adhesive (Pattex Nural 21), plus an additional nut to fix the front plate.
 +
 
 +
We have started Modules twice with one hour break in between. Both times all Dragons and BP went up.
 +
 
 +
||
 +
|- valign="top"
 +
| 2018-10-30 || Taka, Mitsunari, Julien, [[User:DirkHoffmann|Dirk]] || style="background: #CCFFCC;" | Random trigger runs with EVB
 +
||
 +
 
 +
Two runs (#30, #31) taken at various trigger rates as documented in [https://portal.cta-observatory.org/WG/lst/DAQ/SitePages/RunCatalog.aspx#ORM Run Catalog] and [https://cta-north.slack.com/messages/C4JB24UMC/p1540909546003100 Slack].
 +
 
 +
Corrected pixel map implemented (spiral numbering).
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-10-29 || Oscar, Taka, Mitsunari || style="background: #CCFFCC;" | Power up
 +
||
 +
 
 +
The Dragon with IP 10.1.6.28 (3rd column from the left as seen from outside, 5th module from below) was put on the busbar powered by relay 1 instead of relay 0. In exchange, the module in the 4th column, 5th from below, was put on relay 0 instead of relay 1. The camera was powered up only once, and all modules and BPs came up.
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-10-27 || Taka, Mitsunari  || style="background: #FFFFCC;" | Random Trigger
 +
||
 +
 
 +
We took random trigger data. Following Lea's instructions, the random trigger could be produced easily.
 +
With DragonDaqM,
 +
 
 +
300 Hz injection -> 300 Hz DAQ rate.
 +
 
 +
1k Hz-> 783 Hz
 +
 
 +
3k Hz-> 1162 Hz
 +
 
 +
6.5k Hz -> 1303 Hz.
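
The numbers above show how the recorded fraction drops as the injection rate goes up. A short sketch that computes the ratio for each pair of values quoted above (the names `RATES_HZ` and `recorded_fraction` are only illustrative):

```python
# (injected rate, recorded DAQ rate) in Hz, as quoted above for DragonDaqM.
RATES_HZ = [(300, 300), (1000, 783), (3000, 1162), (6500, 1303)]


def recorded_fraction(injected_hz, daq_hz):
    """Fraction of injected triggers that were actually recorded."""
    return daq_hz / injected_hz


for injected, daq in RATES_HZ:
    fraction = recorded_fraction(injected, daq)
    print(f"{injected:>5} Hz injected: {daq:>4} Hz recorded ({100 * fraction:.1f} %)")
```

With these values the recorded fraction goes from 100 % at 300 Hz to about 20 % at 6.5 kHz.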
 +
 
 +
With EVB, we first tried 6.5 kHz. EVB then crashed because its buffer was full.
 +
However, the busy state of the modules was 03, which means that EVB was connected and the modules were busy.
 +
To recover from this state, we had to reboot the Dragons. A few minutes later, Carlos Diaz called us.
 +
The current consumption at the bus bars was ~10 A higher than usual: normally it is 25-27 A, but after rebooting the Dragons it was 35 A.
 +
We shut down the 24V. After 10 min or so, Carlos allowed us to restart.
 +
All Dragons could be reached from cacoserver, but not from Osaka. ip link set p*p* down/up didn't help.
 +
We rebooted Osaka. After that, Osaka could ping all modules but one.
 +
However, EVB didn't work. Later we learned from Dirk and Julien that we had to run
 +
 
 +
sudo modprobe -r ixgbe; sudo modprobe ixgbe
 +
 
 +
||
 +
 
 +
|- valign="top"
 +
| 2018-10-27 || Oscar, Laia , Taka, Mitsunari || style="background: #CCFFCC;" | Power up
 +
||  
 +
 
 +
After checking that the Dragon and BP regulators can withstand input voltages above 30 V, we increased the voltage provided by the Power Supplies to 27.5V (the same for all 8 Power Supplies). 
 +
 
 +
With this configuration, the voltage while ramping up rises to 20.3 V and then only drops to 19.8 V for about 1 ms. This should be completely fine for the Dragons.
 +
 
 +
We powered up the camera with the ECC 10 times. All BPs came up every time. Only one Dragon (always the same one) does not power up at the first attempt after a ~1 hour break (tried two times); after this first power up, all Dragons power up.
 +
||
 +
|- valign="top"
 +
| 2018-10-26 || Taka, Mitsunari  || style="background: #FFFFCC;" | TIB state machine.
 +
||  
 +
 
 +
We tried to solve the "State 255" problem in TIB. Luis Angel suggested to configure modules at state 2. We followed his instruction, but we reached state 255.
 +
So we tried modules configuration at state 0. Same result. We tried module configuration at state 4, resulting in the same state 255.
 +
 
 +
We also tried to check the test pulse position relative to the center of the readout window, but we could not see the test pulse at all. The delay setting in the TIB or the backplane is not correct.
 +
 
 +
||
 +
|- valign="top"
 +
| 2018-10-26 || Oscar, Laia , Taka, Mitsunari || style="background: #FFFFCC;" | Power up
 +
||  
 +
 
 +
The drop in the voltage is due to a current limit in the circuitry of the relay. Increasing the voltage of the power supplies should raise the value of the dip in the voltage so that it does not reach 18V.
 +
 
 +
We measure again the transients for relay 0 with Power Supply at 24.98 V as reference.
 +
We increase the voltage of Power Supplies to 25.25 V, the dip is about 100 mV higher.
 +
||
 +
|- valign="top"
 +
| 2018-10-25 || Taka, Mitsunari Yusuke  || style="background: #CCFFCC;" | Event Mixing
 
|| 
 
We understood the origin of the event mixing: the slow control command "Dragon - Start" was sent after "Enable Trigger" in TIB.
 
"Enable Trigger" should have come after "Dragon Start". This is actually dangerous, because the mistake is noticed only during analysis.
 
+
||
 
|- valign="top"
Line 1,191: Line 1,866:
  
 
The same is observed in relay 1. 
 +
||
 
|- valign="top"
Line 1,199: Line 1,875:
 
Since it was already 5:50 pm (although we had announced that we would use the camera until 5:00 pm), we had to shut down.
 
We kept 230 and 400V on, chiller on, only 24V off.
 +
||
 
|- valign="top"
Line 1,208: Line 1,885:
 
One of the reasons was dead ports in Osaka. Sometimes, ports in Osaka go to sleep for no obvious reason. This is actually a critical problem. We need to investigate further.
 
Finally we gave up taking data with EVB.
 +
||
 
|- valign="top"
Line 1,223: Line 1,901:
 
So, currently, the startup recipe is: 0->1->2->3->4->255->TIB Reset->0->1->2->3->4->5->configure modules->TIB Reset->0->1->2->3->4->5.
 
 +
||
 
|- valign="top"
Line 1,235: Line 1,914:
 
|- valign="top"
 
| 2018-10-15 || [[User:DirkHoffmann|Dirk]] || style="background: #FFCCCC;" | UCTSd dead.
|| <pre>● uctsd.service - Execute the UCTS OPC-UA server
+
||
  Loaded: loaded (/etc/systemd/system/uctsd.service; static; vendor preset: disabled)
+
uctsd.service - Execute the UCTS OPC-UA server
  Active: failed (Result: exit-code) since Di 2018-10-16 13:22:28 WEST; 2h 59min ago
+
Loaded: loaded (/etc/systemd/system/uctsd.service; static; vendor preset: disabled)
  Process: 152844 ExecStart=/home/dragon/ucm_temp/ucts_opcua_server.sh (code=exited, status=134)
+
Active: failed (Result: exit-code) since Di 2018-10-16 13:22:28 WEST; 2h 59min ago
Main PID: 152844 (code=exited, status=134)
+
Process: 152844 ExecStart=/home/dragon/ucm_temp/ucts_opcua_server.sh (code=exited, status=134)
 +
Main PID: 152844 (code=exited, status=134)
  
 
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: (MOS) : Info : 2018-10-13.12:30:47 : Connected to Server : opc.tcp://osaka:48010
Line 1,250: Line 1,930:
 
Okt 16 13:22:28 osaka systemd[1]: uctsd.service: main process exited, code=exited, status=134/n/a
 
Okt 16 13:22:28 osaka systemd[1]: Unit uctsd.service entered failed state.
Okt 16 13:22:28 osaka systemd[1]: uctsd.service failed.</pre>
+
Okt 16 13:22:28 osaka systemd[1]: uctsd.service failed.
 
Restarted.
  
Line 1,319: Line 1,999:
 
|- valign="top"
 
| 2018-10-15 || [[User:Ljouvin|Léa]], [[User:DirkHoffmann|Dirk]], Julien, Taka, Seiya, Mitsunari || style="background: #FFCCCC;" | Slow control and uaexpert disconnection ||
+
- The slow control connection was lost in ready mode, so there was no more current in the pulse bar. After gotosafe and gotoready there was still no current, with a negative value in the pulse bar. We had to switch the 233 and 400 V off and on.
 
- We lost uaexpert, which was completely stuck. To get the monitoring back, the DataLogger files are now written in /home/cacooperator/CoolingSystem/20181015_005 and 20181015_006
Line 1,406: Line 2,086:
 
| 2018-10-14 || [[User:DirkHoffmann|Dirk]] || style="background: #CCCCFF;" | Direct measurement of TX lasers || 
 
INFO: Direct measurements can be done without danger for Photom-211
+
Measurement range: -70 to +5 dBm
 
according to [http://www.graytechnos.com/20prod_opt/10pwmtr/ datasheet]. That is 0.1&nbsp;nW to 3.16&nbsp;mW. 
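
For reference, dBm and milliwatts are related by P[mW] = 10^(dBm/10). A small sketch of the conversion in both directions (the function names are only illustrative):

```python
import math


def dbm_to_mw(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


def mw_to_dbm(milliwatts):
    """Convert a power in milliwatts to dBm."""
    return 10 * math.log10(milliwatts)


for level in (-70, 5):
    print(f"{level:+d} dBm = {dbm_to_mw(level):.3g} mW")
```

For the two ends of the range quoted above, +5 dBm corresponds to about 3.16 mW and -70 dBm to 1e-7 mW, which is 0.1 nW.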
 
|| ||
Line 1,456: Line 2,136:
  
 
- One try with no external trigger and clock, but with the CBP delivering the clock and PPS and using the 10 MHz clock as the default clock for the Dragons. The L1 local trigger was not generated. We have now gone back to a configuration with the Dragons on their local clock, but this issue has to be investigated. Script used in ClusCo: init7_noextTriggerClock_Test.uic
 
+
||
 
|-
 
|  [[User:DirkHoffmann|Dirk]], Taka, Julien, Seiya, Léa || style="background: #FFCCCC;" | Too high temperatures in the Camera ||  
Line 1,462: Line 2,142:
  
 
- During the day, because of high temperatures, we had to gotosafe and wait for the camera to cool down about 11 times, although the BP max temperature never exceeded 34 degrees. The air inside reached a maximum of 26.5.
 +
||
 
|-
  
Line 1,467: Line 2,148:
 
- In the afternoon, we lost the ECC slow control communication 3 times due to the interruption between IT-Container/Driver-Container (miscommunication with the AMC people). The first time, the temperature in the camera was already high and we had to switch OFF the 233 and 400V for safety reasons. The other two times, we got the ECC connection back quite fast and the ECC was in the same state as when the connection was lost, namely state 2 (ready). There was just no more current in the pulse bar, so we had to gotosafe and gotoready both times. After that the current was -247 in the pulse bar. This is not understood for the moment.
 
- The second interruption happened when the Moxa switch was reconnected, probably not correctly configured. It was disconnected again. Presently this impacts AMC '''and''' drive operation, until the Moxa can be reconnected.
 
+
||
 
|-
 
| Léa, Taka, Julien, Seiya, [[User:DirkHoffmann|Dirk]] || style="background: #FFCCCC;" | Discovered slow control fiber lost; fibers changed, connection recovered || Interruption. Using UCTS section for replacement.
 
+
||
 
|-
 
| rowspan="3" | 2018-10-11 || Eric, Armand || style="background: #FFFFCC;" | Cable splicing || UCTS fibres ready and checked. 
 
+
||
 
|-
 
| Léa, Taka, Julien, Seiya, [[User:DirkHoffmann|Dirk]] || style="background: #FFCCCC;" | DATA5-upstream broken || Located between DC-PP. and IC-PP. Eric is going to have a look on Friday, when working on the other (spare) fibres.
 
+
||
 
|-
 
| Léa, Taka, Julien, Seiya, [[User:DirkHoffmann|Dirk]] || style="background: #CCFFCC;" | Found correct order of DATA1-DATA6 || We eventually found that the fibres DATA1-6 were connected to the camera in (exactly) the wrong order, which led to a mismatch of switches/modules with respect to interfaces/addresses in '''osaka'''. 
+
This is an item for our "lessons learned": The indoor fibres had been labelled (switch-interface), but stayed in Mirca. The new fibres had been assembled at ORM, and labels had to be "guessed" in one way or the other.||
 +
||
 
|-
  

Latest revision as of 15:08, 29 January 2019

Enter comments in reverse time order[edit]

Glossary at the end!

Date Actor/Author Action summary Comments Documents
2019-01-29 Taka, Mitsunari Moving to ELOG

We have moved the logbook to the ELOG system at 10.200.100.102:9090.

2019-01-28 Taka, Mitsunari Checking down of Osaka ports

We repeatedly turned the Data switches inside the camera off and on, and each time checked with arp-scan whether the ports on Osaka were alive or dead, 20 times in total. In the first trial, we found that p1p1, p1p2 and p3p1 were dead. p3p1 corresponds to the Data5 fibre, which we reconnected this morning. We swapped the pair (up and down) of the patch cord between the patch panel and the Moxa switch inside the drive container. After that, the connection on p3p1 worked. Including the first test for p1p1 and p1p2, the ports died the following number of times: p1p1: 5, p1p2: 2, p2p1: 1, p2p2: 1, p3p1: 0, p3p2: 0.

2019-01-28 Taka, Armand, Riccardo, Mitsunari Measuring the optical fibre

We checked the optical throughput of the DATA5 fibre and the UCTS fibre. The DATA5 fibre was fine, but the throughput of both fibres of the UCTS cable was too low. We connected the DATA5 cable to the connector for DATA5 on the camera and the DATA5 spare cable to the connector for the DATA5 spare.

2019-01-24 Daniel M., Elena, Mitsunari Implementing a changeable anode current limit and testing the anode current limiter
  • Daniel M. implemented a function which changes the anode current limit to a lower value, which was suggested by Riccardo R., in ClusCo. The limit cannot be changed to a higher value.
  • We supplied 1000 V to all modules and set the limit to 0.010 mA. The maximum anode current then decreased from 0.012 mA to 0.010 mA. The anode current limiter was confirmed to be working.
2019-01-24 Mitsunari Reproducing dead pixel
  • I supplied HV 400V for all pixels.
  • I opened the camera shutter. The mean anode current was 0.12 uA.
  • I increased the HV to 1000 V by 100 V steps and supplied also the nominal HV. The mean anode current was 8.31 uA with the nominal HV. The HV was successfully supplied for all pixels by a precision of about 20 V (one-side full width), except for the off module. The anode current of Pixel 2 of Module 135 did not increase.
  • I supplied 1000 V only for Pixel 2 of Module 135. The read HV value was OK. The max anode current was 0.24 uA, and that of the pixel was lower than or equal to this value. This pixel appeared to be out of order.
  • Log files in TCS01:/var/log/clusco/reports
    • monitor-190124-195808.txt
    • monitor-190124-201256.txt
    • monitor-190124-202103.txt
    • monitor-190124-203336.txt
  • Screen shots of the camera monitor
Nominal HV for all modules
1000 V only for Module 135
1000V only for Pixel 2 of Module 135
2019-01-24 Elena, Mitsunari Opening/closing shutter
  • We confirmed the camera shutter can be opened/closed via the OPCUA client.
  • Opening the shutter via CaCo v1.0 failed.
2019-01-23 Mitsunari, Oscar, Shunsuke Check on HV supply

Checks on the HV supply with the shutter closed. HV was correctly supplied to all pixels except for the one off module.

2019-01-23 Elena Ratescans L0

Checks on the modules that showed wrong L0 output. From a first look it doesn't seem to affect L1 sum.

2019-01-23 Shunsuke, Mitsunari ClusCo monitoring

We confirmed Shunsuke's script for monitoring a ClusCo report worked for the latest ClusCo.

2019-01-23 Elena Start-up

All modules up.

2019-01-22 Elena L0 ratescans
  • We started with a rate scan at L0 without pulse, init0 and starting up temperature ~16. Module 45 does not show the correct behaviour.
  • We power cycled the camera a couple of times because at first the module 10.1.5.13 did not come back. After a L0 ratescan all modules and trigger mezzanines were fine.
  • We injected a 300Hz pulse and took a couple of rate scans at L0 with gain 40 and 32. The scans are mostly fine, but some pixels of some modules show less signal than expected.
2019-01-22 Elena Start-up

At the first start-up the module 10.1.6.28 did not come up. We power cycled and all the modules powered up. Taking a L0 rate scan to check the behaviour of the trigger.

2019-01-21 Elena, Riccardo L0 ratescans
  • We started with a rate scan at L0 without pulse, init0 and starting up temperature ~13. Modules 25, 64, 80, 108, 123, 144, 155, 166, 174, 180, 196, 206, 210, 225, 232, 250 do not show the correct behaviour.
  • We moved to the icrr_dev ClusCo version (in the validation folder). We did a L0 scan without pulse (and init8 reloaded) to confirm the results seen before.
  • We checked the registers 23 and 31 of the L0 ASIC. All the modules had them set to the same values: 23: x970000 and 31: x9f0000.
  • We finally took a L0 rate scan with the pulse injected to all the modules. Modules 25, 34, 64, 80, 108, 110, 123, 144, 155, 166, 174, 180, 196, 206, 210, 225, 232, 250 did not show a good behaviour (2 more than the previous rate scan).
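
The difference between the two module lists in this entry can be checked with a short set comparison. This is only an illustrative sketch; the variable names are ours and the module numbers are copied from the first and last bullets of the entry.

```python
# Modules misbehaving in the first L0 scan (no pulse, init0), as logged.
scan_no_pulse = {25, 64, 80, 108, 123, 144, 155, 166, 174, 180,
                 196, 206, 210, 225, 232, 250}
# Modules misbehaving in the final L0 scan (pulse injected), as logged.
scan_with_pulse = {25, 34, 64, 80, 108, 110, 123, 144, 155, 166, 174,
                   180, 196, 206, 210, 225, 232, 250}

# Modules that appear only in the second scan, and only in the first one.
new_bad = sorted(scan_with_pulse - scan_no_pulse)
recovered = sorted(scan_no_pulse - scan_with_pulse)
print("newly misbehaving:", new_bad)
print("no longer misbehaving:", recovered)
```

The same comparison can be reused for any pair of scans to see quickly whether the set of problematic modules is stable.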


2019-01-21 Elena, Riccardo Start-up

At the first start-up the module 10.1.6.28 did not come up. Power cycled, but the module 10.1.5.16 did not come up. At the third start-up all the modules powered up.

2019-01-18 Oscar, Elena Chiller

Carlos Diaz called us to check an alarm of the chiller. We found on the display a blinking orange message saying: "b 1AC". Carlos indicated (and sent us) a "Chiller Setting Procedure" that we followed to reset the chiller. Once reset, we restarted the camera and waited for a while for the temperature to rise and the chiller to reach its point of operation. All worked well.

      • The Chiller setting Procedure is now hanging on the wall behind the computers but we should collect all these kinds of documents in a useful folder.

Checking the historical monitoring we saw that the humidity rose during the bad weather days. We told Carlos Diaz, who replied that he is looking into it and that he thinks it is because the chiller was not working.


2019-01-18 Oscar, Elena ECC
  • We are forcing ECC to go in error while in Safe by changing minimum Temperature for Safe. We leave the ECC in error for a defined time and then we recover and try to go to ready
    • 10 seconds -> recover to safe ok -> go to ready ok
    • 1 minute -> recover to safe ok -> go to ready ok
    • 5 minutes -> recover to safe ok -> go to ready ok
    • 20 minutes -> recover to safe ok -> go to ready ok
    • 60 minutes -> recover to safe ok -> go to ready ok


2019-01-18 Oscar, Elena L0 Trigger
  • L0 Rate scan after starting the modules at 7 deg (L0_scan_Init8_PIOff_HVOff_7deg_b.dat)
    • Modules 42, 45, 180, 225 and 231 show problems
    • Main register and register 4 for L0 have the expected values for all modules/pixels (19988 and 8651958)
    • After reconfiguring L0 to send majority to L1 (register 4 to 8650870), an L1 scan is done for module 45 with DT599, which gives a 65 kHz rate for one pixel
    • L1 scan shows a rate of 65 kHz until DT~15
  • Power off/on and init at 10 deg (L0_scan_Init8_PIOff_HVOff_10deg.dat)
    • Modules 166 and 210 show problems
    • After reconfiguring L0 to send majority to L1:
      • L1 scan is done for module 45 with L0DT599, which gives 0 Hz for all pixels -> L1 scan goes to 0 Hz at L1DT~2
      • L1 scan is done for module 45 with L0DT517, which gives 65 kHz for one pixel -> L1 scan goes to 0 Hz at L1DT~15
      • L1 scan is done for module 210 with L0DT599, which gives 65 kHz for three pixels -> L1 scan goes to 0 Hz at L1DT~15
  • Power off/on and init at 13 deg (L0_scan_Init8_PIOff_HVOff_13deg.dat)
    • All modules fine at L0 level
    • L1 scan done for modules 45 and 210 with L0DT=599, which gives an L0 rate of 0; for both, the rate goes to 0 at L1DT~2. For module 45 this is consistent with the previous scan
    • L1 scan done for module 210 with L0DT=520, which gives 65 kHz for three pixels -> L1 scan goes to 0 Hz at L1DT~20

This shows that the problem is already at the level of the discriminator, not the LVDS copy. Being discussed with L0 experts.

2019-01-18 Oscar, Elena Start Up

Camera was in safe and modules power up at once.

2019-01-17 Oscar, Elena UAExpert

The address of the UCTS has been changed from 10.1.4.12:48010 to 10.1.4.1:48010 in the project CACO_ECC_TOB_UCTS_2, the one recommended for use. This allows connecting to the OPC-UA server for the UCTS running on TCS01.

2019-01-17 Oscar, Elena ECC
  • We are forcing ECC to go in error while in Safe by changing minimum Temperature for Safe. We leave the ECC in error for a defined time:
    • 10 seconds -> recover to safe ok
    • 1 minute -> recover to safe ok
    • 5 minutes -> recover to safe ok
    • 20 minutes -> recover to safe ok
    • 60 minutes -> recover to safe ok. Then tried to go to Ready and:
      • ECC went to error indicating "IR RS485 Error". Fans were running.
      • We acknowledged the error and the error changed to "Switch, PDB, Cable Error". Fans stopped running.
    • Tomorrow we will force the error and then go to ready for the shorter intervals of time as well.
2019-01-17 Oscar, Elena L0 Trigger
  • We took several L0 scans with no Pulse Injection, no HV:
    • L0_scan_Init6_PIOff_HVOff.dat and L0_scan_Init8_PIOff_HVOff.dat, only running init without powering off-> modules 42, 45, 157, 180 and 225 showed problems
    • L0_scan_Init8_PIOff_HVOff_b.dat: powering on/off (and init) -> module 231 showed problems. Reconfiguring, resetting L0, resetting L0 delays, ... did not solve it
    • L0_scan_Init8_PIOff_HVOff_{c,d,e,f,g}.dat: powering on/off (and init)-> all modules fine. Temperatures between 20 and 25 degrees.
  • We stop and let the camera cool down and take one scan powering up at 7 degrees
    • Module 225 showed problems

RateScanFiles

2019-01-17 Oscar, Elena Start Up

When arriving all was fine and everything worked smoothly to get the camera in ready. No error happened.

2019-01-16 Oscar, Elena ECC
  • Individual Switch On Vertical Bus bar relays
    • Off one by one, ok
    • On one by one, ok
    • Move to Safe and back to ready, ok
    • Off/On one by one, ok
  • PSB relays
    • TIB and UCTS, OK
    • General ShutDown (done twice with same result)
      • Only half camera on
      • PSB1 and PSB2 -> 24 V went off
      • But, ECC went in error saying: "PDB Com Error:TRUE PDB description error: IR RS485 Error", Error Number 0
      • Acknowledge Error, ECC went to Safe for 1 second and then to error again with : "PDB Com Error:TRUE PDB description error: Switch, PDB, Cable Error", Error Number 0
      • Not able to recover
  • PDB relays
    • Ethernet switches, ok
    • Front Fans, ok


2019-01-16 Oscar, Patricia, Elena Ethernet Cabling inside camera
  • Switch 1:
    • Changed Ethernet Cables
      • BP1015 moved from port 3 to port 5
      • BP1008 moved from port 5 to port 7
      • BP1009 moved from port 7 to port 9
      • BP1011 moved from port 9 to port 11
      • BP1013 moved from port 11 to port 13
    • Port 13 was free, now port 3 is free, so both normal and spare optical fibre can be used.
    • Port 2 is now connected to switch 3 instead of switch 2
  • Switch 2:
    • Changed Ethernet Cables
      • BP0716 moved from port 3 to port 4
      • Ethernet cable to Switch 3 In port 2, removed
      • Ethernet cable to Switch 1 in port 4 removed
      • Additional cable to Control Switch (across the camera) connected in port 2
    • Port 3 is free, so both normal and spare optical fibre can be used.
  • Switch 3:
    • Changed Ethernet Cables:
      • BP0401 moved from port 3 to 6
      • BP0408 moved from port 6 to 8
      • BP0409 moved from port 8 to 10
      • BP0411 moved from port 10 to 12
      • BP0405 moved from port 12 to 14
      • BP0407 moved from port 14 to 16
      • BP0403 moved from port 16 to 18
      • BP0309 moved from port 18 to 20
      • BP0312 moved from port 20 to 22
      • BP0305 moved from port 22 to 24
      • BP0310 moved from port 24 to 26
      • BP0306 moved from port 26 to 28
      • BP0308 moved from port 28 to 30
      • BP0107 moved from port 30 to 32
      • BP0108 moved from port 32 to 34
      • BP0203 moved from port 34 to 36
      • BP0102 moved from port 36 to 38
      • BP0103 moved from port 38 to 40
      • BP0205 moved from port 40 to 42
      • BP0105 moved from port 42 to 44
      • BP0106 moved from port 44 to 46
      • BP0202 moved from port 46 to 48
    • Port 48 was free, now port 3 is free, so both normal and spare optical fibre can be used.
    • Port 4 is now connected to switch 1 instead of switch 2
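
As a cross-check of the re-cabling listed in this entry, the port moves can be written as a mapping and the freed and newly occupied ports derived from it. This is a minimal sketch, not a tool used on site: the helper name `port_changes` is ours, and only the moves for Switch 1 and Switch 3 (as logged in this entry) are encoded.

```python
def port_changes(moves):
    """Return (freed, newly_used) ports for a dict {old_port: new_port}."""
    old, new = set(moves), set(moves.values())
    return sorted(old - new), sorted(new - old)

# Switch 1 moves as logged: 3->5, 5->7, 7->9, 9->11, 11->13.
switch1 = {3: 5, 5: 7, 7: 9, 9: 11, 11: 13}
# Switch 3 moves as logged: 3->6, then every even port from 6 to 46 moves up by two.
switch3 = {3: 6, **{p: p + 2 for p in range(6, 47, 2)}}

for name, moves in (("Switch 1", switch1), ("Switch 3", switch3)):
    # No two cables may end up in the same port.
    assert len(set(moves.values())) == len(moves)
    freed, newly_used = port_changes(moves)
    print(name, "freed:", freed, "newly used:", newly_used)
```

For both switches the only freed port is port 3, which agrees with the statements in the list about which port is now free.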

Media:NewCablingInsideCamera.jpg

2019-01-16 Oscar, Elena ECC
  • ECC when arriving in the morning was in error:
    • Error Description: "Control Regulation Chain", minimum temperature for safe was set to 5, at the moment we arrived temperatures were between 6 and 10 in the web monitoring
    • Looking at the history in UA-Expert, sensor 8 is and has been below 5; all the others are not and were not
    • Acknowledge error -> ECC stays on error and Error Number oscillates between 0 and 2
    • Minimum Temperature set to 0 degrees and Acknowledge error, ECC went to safe. It was due to temperature sensor 8, which seems not to be displayed in the web monitoring.
  • A few seconds after recovering ECC to safe, ECC went to error again
    • Error Description: "PDB Com Error:TRUE PDB description error: IR RS485 Error", Error Number 0
    • Set Acknowledge -> ECC to Safe for 1 second and back to Error, and fans went off.
    • Not able to recover it
  • Hard reboot
    • On recovering, ECC went back to error with Error Description: "Control Regulation Chain" and Error Number oscillating between 0 and 2, Fans off
    • Setting minimum temperature to 0 and acknowledging the error brings ECC to safe with Fans turning
    • ECC to ready, ok



2019-01-15 Oscar, Elena Test ECC version
  • ECC V34
    • Installed and started, running for 3 hours without problems
    • Checking functionalities:
      • Error recover, due to too high minimum temperature from safe: OK
        • ECC in safe
        • Minimum temp for Safe set above current temperature->ECC goes to error
        • Minimum temp for Safe set below current temperature -> ECC stays in error
        • Acknowledge error -> ECC goes to Safe
      • Error recover, due to too high minimum temperature from ready:
        • Minimum temp for Ready set above current temperature->ECC goes to error
        • Minimum temp for Ready set below current temperature -> ECC stays in error
        • Acknowledge error -> ECC stays in error while it should go to safe, ErrorResolution says Control Regulation Chain, ErrorNumber 1, ErrorDescription ...empty
        • Rearm 400 V -> ECC stays in error
        • Acknowledge error -> ECC goes to safe
      • Repeat error recover, due to too high minimum temperature from ready, two more times: OK, recovering to safe and then ready without the need of rearming.
      • Turn off, relay by relay:
        • ok for 0,1,2,3
        • When turning off relay 4, ECC went to error, saying: "PDB Com Error:TRUE PDB description error: IR RS485 Error"
        • Not able to recover from that error by rearming 400 V, nor by using the setPDB_contactors method. Fans were turning and they went off when rearming 400 V.


2019-01-15 Oscar, Elena, Patricia Preparing Access to Camera
  • Check how to access the rear part of the camera safely to re-cable the ethernet cables so that the spare fibres are functional at the same time as the main ones.
  • Switch on was easy to re-arrange and it has already been done.
  • Work will continue tomorrow


2019-01-09 Pepa, Elena, Cristobal (remote) validation of calibration software

We tested the IPRscan calibration, it seems to work but some more output is needed to fully validate it. We also understood the problem of no module and pixel monitoring while working with caco.


2019-01-10 Pepa, Elena turning on camera

We turned on the camera 4 times because the module 10.1.6.28 failed to come up 3 times.


2019-01-09 Pepa, Elena, Cristobal (remote) validation of calibration software

We turned on the camera and the first time the module 27 (IP: 10.1.6.28) did not come up; the second time the module 10.1.5.13 did not come up; while the third time all turned on.

We tested the DTN calibration over the whole camera and it works. To make it work, however, we had to activate the InitModules(0,0) function. We know that this function is not activated (with a valid reason) in the master version, but we need it to test the calibration software.

The DTN calibration could not find the noise region for some pixels in the modules 102, 182, 188 and had some trouble with other pixels. We then ran an ipr_scan from ClusCo for the pixels of the module 188 and found out that some pixels have 0 rate in the whole DT range.



2018-12-19 Oscar, Lea, Shunsuke Rate Scans


(Files available through the Documents field)

We do several rate scans with pulse injection. We aim to cross-check results from 2018-12-18. To make sure of the configuration, registers 100, 101 and 102 of L1 are read after each L1 adder configuration (RegistersCheckPulseInjection.txt and RegistersCheckHVNominal.txt)

  • Rate Scans with Pulse Injection with gain 20 (PI14 was meant to mean gain 14, but finally we used 20)
    • They were launched through the uic file: L1Scan_InputsInAdders_20181219.uic
    • No Input in any adder
      • File Name: L1_PI14_step2_OnlyReset_AdderN_20181219_1.dat, N={A,B,C}
    • Local Input only in one adder
      • File Name: L1_PI14_step2_LocalInM_AdderN_20181219_1.dat, M={A,B,C}, N={A,B,C}
    • LocalMode
      • File Name: L1_PI14_step2_LocalMode_AdderN_20181219_1.dat, N={A,B,C}
    • Other Neighbours than local needed for Mod 3, individually added.
      • File Name: L1_PI14_step2_NXInY_AdderY_20181219_1.dat, (X,Y)={(A,2),(A,4),(B,4),(B,5)}
    • Registers were fine for all settings and signal is observed in all adders where it is expected.
  • Rate Scans with nominal HV and shutter closed
    • They were launched through the uic file: L1Scan_Mode3_ShutterClose_HVNominal_20181219.uic
    • Mode 3:
      • File Name: L1_HVNominalShutterClose_step2_Mode3_AdderX_20181219_1.dat, X=(A,B,C)
      • Adder C shows nothing as expected, since no input is linked to it.
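
The file-name patterns in this entry use placeholders (M, N) that stand for the adders A, B and C. A quick way to list the files one should expect is sketched below. It assumes that each placeholder is simply replaced by the adder letter, which is our reading of the naming scheme and not something stated in the log.

```python
from itertools import product

adders = "ABC"

# "No input in any adder": one file per adder N.
only_reset = [f"L1_PI14_step2_OnlyReset_Adder{n}_20181219_1.dat" for n in adders]

# "Local input only in one adder": one file per (M, N) combination.
local_in = [f"L1_PI14_step2_LocalIn{m}_Adder{n}_20181219_1.dat"
            for m, n in product(adders, repeat=2)]

print(len(only_reset), "OnlyReset files,", len(local_in), "LocalIn files")
for name in only_reset + local_in:
    print(name)
```

Comparing such a generated list with the directory content is a simple way to spot a missing scan.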

Everything seems reasonable this time. The strange features observed last night (never anything fully crazy) may have been due to bad configuration. Some of them may have been due to human mistakes, but the fact that some rates were observed for adder C in mode 3 does not seem to be one of these, since that scan was executed by launching the file L1Scan_ShutterClose_20181214.uic.

RateScanFiles RegistersPI RegistersHV MacroPI MacroHV MacroHV20181218

2018-12-19 Oscar, Lea, Shunsuke and Taka (remote) DataTaking for capacitor calibration
  • Taking 1 kHz Testpulse data with ROI 40, DRS internal clock, using external trigger.

Trigger was created from module ID 265.

  • I performed a short run first (~20 seconds). These are (.../onsite/data/20181218/Run00001.0004 and Run00001.0005)
  • The long run was performed with EVB; files are in .../onsite/data/20181219/Run00001.0000 - Run00001.0020.
  • According to Taka's quick analysis, there was no pulse in the ROI. So another data taking was performed with the Legacy DAQ, with just 30 events.


2018-12-19 Oscar, Lea, Shunsuke ECC version: v32 patch

ECC was in error due to a PDB communication error when we arrived this morning

-> Decided to go back to the version V32patch


2018-12-19 Oscar, Lea, Shunsuke, Cristobal (remote) and Elena (remote) Filling MongoDB
  • Test with one BP variable and one pixel variable successful
  • ECC variable being filled to the MongoDB
2018-12-19 Oscar, Lea, Shunsuke Rate Scans

We continue the planned rate scans

  • L1 rate scan, step 2, Pulse Injection (PI) 20 (~5Phe), HV OFF:
    • Only local L0, adders A, B and C
      • No crazy things, all seems reasonable, Adder A finishes at DT ~120
      • Details to be checked
      • File names: L1_NoHV_PI5Phe_ModeLocal_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
      • Third repetition for A and first for B and C are full range, others until it stops
      • Adder B and C have no signal, rate scan done only once. But they should have if the local mode was correctly set.
    • Trigger Mode 3, adders A, B and C
      • No crazy things, all seems reasonable, Adder A and B finishes at DT ~220 and ~230 respectively
      • Details to be checked
      • File names: L1_NoHV_PI5Phe_Mode3_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
      • First repetition is always full range, others until it stops
      • Adder C no signal as expected, hence only one rate scan done
  • L1 rate scan, step 2, Shutter Close, HV Nominal:
    • Only local L0, adders A, B and C
      • No crazy things, all seems reasonable
      • Details to be checked
      • File names: L1_shutterclose_step2_modelocal_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
      • First repetitions are full range, others until rates are 0
    • Trigger Mode 3, adders A, B and C
      • No crazy things, all seems reasonable
      • Details to be checked
      • File names: L1_shutteroclose_step2_mode3_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
      • First repetition is always full range, others until it stops
      • Adder C no signal expected, hence only one rate scan done. Still, 1 Hz rate is observed from time to time for basically any DT value.
  • L1 rate scan, step 2, Shutter Open, HV = 800 V (moon quite bright):
    • Only local L0, adders A, B and C
      • No crazy things, all seems reasonable
      • Details to be checked
      • File names: L1_shutterclose_step2_modelocal_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
      • First repetitions are full range, others until rates are 0
    • Trigger Mode 3, adders A, B and C
      • No crazy things, all seems reasonable
      • Details to be checked
      • File names: L1_shutteroclose_step2_mode3_AdderX_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
      • First repetition is always full range, others until it stops
      • Adder C no signal expected. Still, 1 Hz rate is observed from time to time for basically any DT value.

We will continue tomorrow. One thing we want to check is a rate scan with HV and with no input to any adder, to see if the 1 Hz is still there.

Files


2018-12-18 Oscar, Mau, Michele Calibration Box On ground
  • Calibration IP address and mask set according to LST1NetworkOrganization
    • IP : 10.1.4.69
    • Mask: 255.255.255.128
    • Gateway: 10.1.4.1
  • Communication with UA-expert established and wheels moved
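
The network settings of the calibration box can be sanity-checked with the Python standard library. The sketch below only verifies that the gateway lies in the subnet defined by the logged IP address and mask; it does not touch the device.

```python
import ipaddress

# Values as logged for the calibration box.
iface = ipaddress.ip_interface("10.1.4.69/255.255.255.128")
gateway = ipaddress.ip_address("10.1.4.1")

print("network:", iface.network)
print("usable hosts:", iface.network.num_addresses - 2)
print("gateway in subnet:", gateway in iface.network)
```

With this mask the box sits in the lower half of 10.1.4.x, so any host it should reach directly needs an address in that same half.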
2018-12-18 Léa, Oscar, Sunsuke, Datataking

In 20181218

  • Run00079: Test of ZFW writing speed, periodic frequency from the TIB 1000 Hz
  • Run00080: Test of ZFW writing speed
    • Periodic pedestal from TIB 1000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW  ??? MB/s
    • Periodic pedestal from TIB 2000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW ??? MB/s
    • Periodic pedestal from TIB 3000 Hz, busy rate 0 Hz, buffer filling 1% stable, waiting speed ZFW ??? MB/s
    • Periodic pedestal from TIB 4000 Hz, busy rate 0 Hz, buffer filling 1% stable , waiting speed ZFW ??? MB/s
    • Periodic pedestal from TIB 5000 Hz, busy rate 0 Hz, buffer filling 1% stable , waiting speed ZFW ??? MB/s
    • Periodic pedestal from TIB 6000 Hz, busy rate 0 Hz, buffer filling slowly , waiting speed ZFW ??? MB/s
    • Periodic pedestal from TIB 100 Hz, to free the buffer.
  • It looks like the IP and ports were mixed in the configuration, they should be:
    • tcs03:* 10.200.100.70:13820, 10.200.100.72:13840
    • tcs04:* 10.200.100.71:13830, 10.200.100.73:13850
    • But it did not work. It needs to be investigated.
2018-12-18 Oscar, Shunsuke, Lea, Elena (remote) and Cristobal (remote) Internal Camera Calibrations
  • Launching calibrations from CaCo: some initialisation problems. Cristobal will check with modules at IFAE and we will try again tomorrow.
2018-12-18 Léa, Oscar, Sunsuke, Disconnect 10.1.5.3

- IR 2 switch on from safe: 23.4, 23.5, 23.6

- On ready, all IR ON, current in IR 2: 25.1 - 25.2, both before and after unplugging the module.

Media:UnpoweredCluster.jpg

2018-12-18 Léa, Oscar, Sunsuke, Jean luc and Nadia from remote ECC intelligence relay test

- From safe, not possible to switch on the relay

- GotoReady: IR 2, 3, 6 not switched on

- switch them OFF one by one: 0, 1, 4, 5, 7 went to OFF

- switch them all ON: all the IRs ON

- switch them all OFF (-1, false): All off

- switch them all On (-1, true): All on

- GO to safe then Ready: IR 1, 2, 4, 5 OFF. switch them all ON (-1, true) -> All ON

- GO to safe then Ready: IR 1, 2, 6 OFF, switch them all ON (-1, true) -> All ON

- Bug fixed, running with the new version now
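
The index convention used in these tests, where -1 addresses all intelligent relays (IR) at once, can be mimicked with a small toy model to make the expected outcome of each step explicit. This is not the ECC interface: the function name `set_ir` and the list of eight relays (IR 0 to 7) are assumptions made only for this illustration.

```python
def set_ir(states, index, on):
    """Return new relay states; index -1 addresses every relay (toy model)."""
    if index == -1:
        return [on] * len(states)
    new = list(states)
    new[index] = on
    return new

# State logged after one "Go to safe then Ready": IR 1, 2, 4, 5 OFF.
states = [i not in (1, 2, 4, 5) for i in range(8)]
print("before:", states)

# (-1, true): expected result is all relays ON.
states = set_ir(states, -1, True)
print("all ON:", all(states))

# (-1, false): expected result is all relays OFF.
states = set_ir(states, -1, False)
print("all OFF:", not any(states))
```

Any relay that deviates from this expected behaviour in a real test is what gets written down in the log.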

2018-12-17 Léa Jouvin, Oscar Blanch, Sunsuke Sakurai Rate Scans

We go for the planned rate scans

  • L1 rate scan, step 2, Pulse Injection (PI) OFF, HV OFF:
    • Only local L0, adders A, B and C
      • No crazy things, all seems reasonable
      • Details to be checked
      • File names: L1_NoHV_NoPI_ModeLocal_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition
    • Trigger Mode 3, adders A, B and C
      • Three first rate scans for Adder A showed crazy modules
      • We reconfigured L1 manually: the reset and set of the three adders was done in BBMenu
      • Afterwards, rate scans had no crazy issues.
      • Details to be checked
      • File names: L1_NoHV_NoPI_Mode3_AdderX_step2_N.txt, X={A,B,C} for the adder, N={1,2,3, ...} for repetition

Files

2018-12-17 Léa Jouvin, Oscar Blanch, Dirk (remote), Julien (remote) and Luis Angel (remote) TIB - EVB data

( You can find the Catalogue for the runs taken)

  • We set Analogue random trigger, Analogue Trigger (AT) = 1500 (we got rates between 300 Hz and 3000 Hz)
  • We reset and start a run
  • p1p2 and p2p2 ports at Osaka were not running, we put them up and start a pedestal run
  • Random pedestal run (AT = 1600, since with 1500 we were sometimes getting too high a rate)
    • Rate was about 1000 Hz, but camera rate only 100 Hz. The other 900 Hz were busy rate.
    • The BusyMap for the modules showed almost no busy.
    • The RunNumber is Run 0076 (https://cta.cppm.in2p3.fr/LSTCAM/ZFITS/)
  • Periodic pedestal run (we keep same run while increasing rate)
    • Rate 1 kHz, all Ok
    • Rate 2 KHz, all Ok
    • Rate 4 KHz, buffer filling increasing
    • When buffer at ~80%, rate moved to 100 Hz to free it.
    • Two slowControl triggers in the run "interleaved" at the end
    • The RunNumber is Run00077
  • Random pedestal with AT=1600 and 500 Hz pulse injection in module 265
    • Camera rate about 500Hz from module + ~100 Hz from pedestal, the other pedestals are those mainly contributing to busy rate ~1KHz
    • The RunNumber is Run00078
  • Periodic pedestal rate for speed tests (not kept on disk!)
    • Periodic pedestal from TIB 1000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW 84.5MB/s
    • Periodic pedestal from TIB 2000 Hz, busy rate 0 Hz, buffer not filling (0%), waiting speed ZFW 170 MB/s
    • Periodic pedestal from TIB 3000 Hz, busy rate 0 Hz, buffer filling 1% stable, waiting speed ZFW 255 MB/s
    • Periodic pedestal from TIB 4000 Hz, busy rate 0 Hz, buffer filling slowly increasing , waiting speed ZFW 340 MB/s
    • Periodic pedestal from TIB 100 Hz to free buffer
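
The ZFW speeds logged in the speed tests scale linearly with the pedestal rate, which gives a rough data volume per event. The sketch below makes that arithmetic explicit. It assumes 1 MB = 10^6 bytes and that every triggered event is written (the logged busy rate is 0 Hz); both are our assumptions.

```python
# (TIB periodic pedestal rate in Hz, ZFW speed in MB/s) as logged.
measurements = [(1000, 84.5), (2000, 170.0), (3000, 255.0), (4000, 340.0)]

# MB/s divided by events/s gives MB per event; times 1000 gives kB per event.
kb_per_event = [round(speed / rate * 1000, 1) for rate, speed in measurements]

for (rate, speed), size in zip(measurements, kb_per_event):
    print(f"{rate} Hz, {speed} MB/s -> {size} kB/event")
```

Under these assumptions all four points give about 85 kB per event, so the buffer starting to fill at 4000 Hz points to a throughput limit rather than to a change in event size.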
2018-12-17 Léa Jouvin, Oscar Blanch Rate Scans
  • We do three rate scans, the first one with step 10, the other two with step 2. For DAC 1, and trigger mode 1.
    • All three look fine, with no strange behaviour.
  • We reconfigure the adder and we take a rate scan, this is done two times (for the first time, the rate scan is taken twice without reconfiguring adder)
    • All three look fine, with no strange behaviour.
  • We enable/disable the test pulse (only disable/enable, no reconfiguration) and we take rate scan:
    • Previous ones were finishing at DT~120
    • First one, module 13 stays at 300 Hz until DT~170
    • Second one, module 13 stays at 300 Hz until DT~170
    • Third one, module 13 stays at 300 Hz until DT~170
  • We do full reconfiguration of modules (ClusCo init) and enable again Pulse Injection and we take one rate scan
    • The rate scan finished back at DT~120, with module 13 being the highest one

Since we are not able to reproduce the strange results observed on Friday, we decide to go for the list of rate scans prepared by Gustavo and Elena. If at any point we see strange behaviours we will check it.

Files

2018-12-17 Léa Jouvin, Oscar Blanch Start up

1) 2 modules OFF!: 10.1.5.16 and 10.1.5.3

2) 1 module OFF!: 10.1.5.3

3) 1 module OFF!: 10.1.5.3

2018-12-17 Léa Jouvin, Oscar Blanch Camera Inspection

Visual inspection once the drive tests had finished. We checked that nothing had fallen down. We also checked some connectors and screws. Everything looks fine.

2018-12-14 Léa, Seiya, Sunsuke Data taking

In 20181214:

- Run00001.0000 to Run00001.0008 (Run00073): Camera 27 degrees in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT

Run stopped, started again

- Run00001.0008 to Run00001.0013 (Run00074): Camera 27 degrees in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT

Run stopped, started again

- Run00001.0014 to Run00001.0027 (Run00075): Camera 27 degrees in zenith, HV: 1000 V, L1 DT applied to each module at 20% more than the NSB DT

2018-12-14 Léa, Seiya, Sunsuke Rate scan

During the afternoon, shutter closed

- mode 1: without TP

- mode 3: without TP

- mode 1: TP, 300 Hz, 5 pe -> TP saturation issue for some modules presenting a rate of 300 Hz at high DT

- mode 3: TP, 300 Hz, 5 pe -> module 44 strange as on Wednesday night but others fine

2018-12-14 Léa, Seiya, Sunsuke Start up

1) Two modules off: 10.1.5.3 and 10.1.5.16

2) One module off: 10.1.5.3

2018-12-14 Léa, Seiya, Sunsuke Test IR On/OFF one by one by ECC

1) From safe, impossible to switch ON the relay

2) From ready:

- (-1, false) means all IR OFF: only IR 0, 1, 2, 3, 4 went OFF

- (-1, True) means all ON: all ON but IR 1, which remained at 0, so we did (1, True) again and then IR 1 was also ON.

- Then we switched them all again with (-1,off): all off but IR 6. We did (-1,off) again and then all OFF.

- Again (-1,True): all OFF but IR 6, so we did (-1,True) again and then ECC went to error due to the PDB

-> hard reset

From ready:

- OFF one by one: from 0 to 7, all ok

- ON one by one: all ok until IR 7. We started switching ON from 0 to 7 and for 7 ECC went to error again with the PDB

-> hard reset

- ECC still in Error and the error says PDB communication error so -> hard reset again

- ECC safe but then after one minute went to error

- hard reset again

- ECC safe and seems stabilized


2018-12-13 Léa, Seiya, Sunsuke Start up

1) one module off: 10.1.5.3

2) one module off: 10.1.5.3

3) Two modules off: 10.1.5.3 and 10.1.6.28

4) Two modules off: 10.1.5.3 and 10.1.6.28

5) Two modules off: 10.1.5.3

2018-12-13 Léa, Seiya, Sunsuke Datataking


In 20181213

- Run0001.0000 - Run0001.0001 : TP in all modules, 300 Hz, changing the module that generates the trigger, module by module. But TIB didn't send any trigger during the test, so we tried this test again

- Run0001.0002 - Run0001.0009 : TP in all modules, 300 Hz, changing the module that generates the trigger, module by module.

- Run0001.0010: we wanted to do some random trigger tests from TIB but, for a reason we do not know, EVB didn't receive any trigger whereas the collected rate was 600 Hz

From here, we copied again the config xml file in ~hoffman/20181212

- Run0001.0010: one test with EVB recorData=False to confirm that for pedestalfrequency higher than 6500 Hz, we have a collected rate that doesn't match the camera rate and busy rate

- Run0001.0011 - Run0001.0044: AnalogPedestal Run

In 20181212

- Run0001.0000 : TP in all modules, 1 Hz, changing the module that generates the trigger, module by module

- Run0001.0000: test new pixel id scheme -> Run confusion

- Run0001.0001: Park position, shutter open, HV 1000 V, threshold for L1 DT was 10% of NSB level

- Run0001.0002, Run0001.0003, 00001.00004: Park position, shutter open, Nominal HV 1000 V, first we triggered on noise for the EVB to receive a high rate (1000 Hz) and then we moved the threshold for L1 DT to 10% of NSB level so the trigger rate was between 5 and 10 Hz


- Run00001.00005: Park position, shutter open, Nominal HV 1000 V, threshold for L1 DT at 10% of NSB level so the trigger rate was between 5 and 10 Hz

2018-12-12 UCTS:new MOS on tcs01

- stopped the MOS on Osaka, it is now on tcs01 and we can configure the UCTS properly

2018-12-12/2018-12-13 Léa, Nadia, Julie, Jan Luc ECC test

Short summary of tests done at ORM this week on the ECC:

-Remote loading is now understood and available.

-Release “V32 patch” is available. This version is similar to the one used since September, however it:

· Corrects the issues met with the intelligent relays in the transition alarm to safe.

· Adds more understandable shutter datapoints

· Heart beat with CaCo is temporarily disabled to avoid disconnection with CaCo as seen last week.

This version has been extensively tested on Thursday and was used during the previous night.


-A new ECC version called V34 has been finalized. In addition to the “V32 patch” features,

· More data points are available to improve the camera monitoring & control (individual power control of IR, PSB (avoid), TIB, UCTS, data switches, …)

· A configuration file is available. It allows configuring different parameters without recompiling the ECC (delays, CaCo heart beat …)

· Better alarm identification & recovery is also available

This version has been tested ~1h on the camera. More tests will be done in the coming days to gain enough confidence before using it in the night runs.


-The 3 exe versions (V32, V32 patch, V34) are available for the shifters. A script will be given to the shift leader to facilitate the exe loading.

2018-12-11 Léa, Seiya, Sunsuke start up

1) one module off: 10.1.5.3

2) one module off: 10.1.5.3

3) Two modules off: 10.1.5.3 and 10.1.6.28

4) one module off: 10.1.5.3

2018-12-11 Léa, Seiya, Sunsuke Operation with HV

- Shutter close: 265 modules to 400 V and then nominal HV. Everything went smooth so then:

- Shutter open: 1 module (central one): 400 V then 800 V then nominal HV. Everything smooth so we went for 265

- L1 and L0 scan. For data taking we went from 60 to 40 in the DT in steps of 5. From 40 on, some modules present too high a rate for L1

2018-12-11 Léa, Seiya, Sunsuke (Daniel, Daniela from remote) Monitoring fix

Now we use only ClusCo on tcs01, and L0 and L1 internal and external are also monitored

2018-12-11 Léa, Seiya, Sunsuke Data Taking

In 20181211:

- All TP synchronised, 1 Hz, trigger sent module by module, 10 ns additional delay in TIB and 40 ns in the UP trigger propagation of the trigger for the central BP: from run0001

- run0002 should be deleted; it was to test ZFW writing

- At night with HV ON and shutter open: run 0004

2018-12-11 Léa, Seiya, Sunsuke start up


1) Two modules off: 10.1.5.3 and 10.1.6.28

2) one module off: 10.1.5.3

3) Two modules off: 10.1.5.3 and 10.1.6.28

4) Two modules off: 10.1.5.3 and 10.1.6.28

5) Two modules off: 10.1.5.3 and 10.1.6.28

6) Two modules off: 10.1.5.3 and 10.1.6.28

7) one module off: 10.1.5.3

8) one module off: 10.1.5.3

9) Two modules off: 10.1.5.3 and 10.1.6.28

10) Two modules off: 10.1.5.3 and 10.1.6.28
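
The start-up attempts listed in this entry can be summarised per module with a simple counter. This is just an illustrative tally of the ten logged attempts; the variable names are ours.

```python
from collections import Counter

# Modules found off after each of the ten logged start-up attempts.
attempts = [
    ["10.1.5.3", "10.1.6.28"],  # 1
    ["10.1.5.3"],               # 2
    ["10.1.5.3", "10.1.6.28"],  # 3
    ["10.1.5.3", "10.1.6.28"],  # 4
    ["10.1.5.3", "10.1.6.28"],  # 5
    ["10.1.5.3", "10.1.6.28"],  # 6
    ["10.1.5.3"],               # 7
    ["10.1.5.3"],               # 8
    ["10.1.5.3", "10.1.6.28"],  # 9
    ["10.1.5.3", "10.1.6.28"],  # 10
]

# Count in how many attempts each module stayed off.
off_counts = Counter(ip for attempt in attempts for ip in attempt)
for ip, n in off_counts.most_common():
    print(f"{ip}: off in {n}/{len(attempts)} attempts")
```

Keeping such a tally across days makes it easier to separate a module that never comes up from one that fails intermittently.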

2018-12-10 Shunsuke, Léa, Seiya HV

We perform tests as follows.

- HV Supplying Test for Central Module with shutter closed.

--We supplied 400 V, 500 V, 600 V, 700 V, 800 V, 900V, 1000 V and Nominal Voltage to central module (module:133).

--During the test, HV was switched off by a script. This came from Shunsuke's mistake, but it confirmed that his script works well. No other problems were found.

- HV supply for the central 19 modules, as in the previous test.

- HV supply for all modules, as in the previous test.

- L0 & L1 rate scan with nominal HV applied to all modules, with shutter closed

2018-12-10 Léa, Seiya, Sunsuke Data taking

In 2018/12/10

-TP synchronised in all the modules, trigger sent by all the modules, 10 ns external delay added in the TIB: 0001.0000, 0001.0001 and 0001.0002.

-TP synchronised in all the modules, trigger sent by all the modules, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0004, 0001.0005 and 0001.0006.

-TP synchronised in all the modules, trigger sent only by module 265, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0007 to 0001.0009

-TP synchronised in all the modules, trigger sent only by module 100, 10 ns external delay added in the TIB, 40 ns added in the trigger propagation from CBP to TIB: 0001.0010 to 0001.0012

2018-12-10 Léa, Seiya, Sunsuke start up

1) one module off: 10.1.5.3

2) Two modules off: 10.1.5.3 and 10.1.6.28

3) Two modules off: 10.1.5.3 and 10.1.6.28

4) Two modules off: 10.1.5.3 and 10.1.6.28

5) one module off: 10.1.5.3

6) one module off: 10.1.5.3

7) one module off: 10.1.5.3

2018-12-07 Léa, Seiya, Shunsuke Rate scan

- L0 and L1 scan

- With No TP, L0 from 300 to 650 step=5 and L1 from 0 to 200 step=2

- TP, 300 Hz, gain=40 (50 p.e.): L0 from 400 to 900 step=5 and L1 from 0 to 200 step=2

- TP, 300 Hz, gain=20 (5 p.e.): L0 from 400 to 700 step=5 and L1 from 0 to 200 step=2
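The scan ranges above fix the number of threshold settings per scan. A minimal sketch that counts them, assuming the upper limit of each range is included; the names `scan_points` and `SCANS` are only illustrative and are not part of the scan scripts:

```python
def scan_points(start, stop, step):
    """Threshold settings from start to stop inclusive, in steps of step."""
    return list(range(start, stop + 1, step))

# (start, stop, step) copied from the three scans listed above.
SCANS = {
    "no TP": {"L0": (300, 650, 5), "L1": (0, 200, 2)},
    "TP 300 Hz gain=40": {"L0": (400, 900, 5), "L1": (0, 200, 2)},
    "TP 300 Hz gain=20": {"L0": (400, 700, 5), "L1": (0, 200, 2)},
}

for name, axes in SCANS.items():
    counts = {level: len(scan_points(*limits)) for level, limits in axes.items()}
    print(name, counts)
```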

2018-12-07 Léa, Seiya, Shunsuke Datataking

- TP synchronisation test with the 1 us window of the legacy DAQ -> adding a 10 ns delay for the 10 MHz clock to all modules seems to fix the problem

- Random trigger. 1) Ran for 2 minutes, then several modules remained busy and no more triggers came -> restart

2018-12-06 Léa, Seiya, Shunsuke Osaka interface p1p2

Something strange happened. As is often the case, p1p2 was down, but this time it was impossible to bring it up again. We had to switch OFF the camera.

2018-12-07 Léa, Seiya, Daniel and Daniela (from remote) Test of slow control from Japan

- Pb monitor/slow control seems solved

- temperature monitoring of each pixel in the monitor function

2018-12-07 Léa, Seiya, Daniel and Daniela (from remote) start up

1) Two modules off: 10.1.5.3 and 10.1.6.28

2) One module off: 10.1.5.3

3) One module off: 10.1.5.3

4) Two modules off: 10.1.5.3 and 10.1.6.40

5) One module off: 10.1.5.3

6) Two modules off: 10.1.5.3 and 10.1.6.40

7) One module off: 10.1.5.3

8) One module off: 10.1.5.3

9) One module off: 10.1.5.3

2018-12-06 Léa, Seiya, Shunsuke, Satoshi Module remaining OFF during the whole day

10.1.5.3

2018-12-06 Léa, Seiya, Shunsuke, Satoshi ECC/CAco: Error and undefined

- After one hour with the modules ON, ECC went to error and Caco to an undefined state. This then requires a hard reset of the ECC to get current back in the bus bars

- SwitchOn() from Caco: Caco fine in state 2 but ECC in error state...

- ECC went to error and this time we saw Caco fine in state 3 while ECC was in error, and then Caco going to undefined as expected since ECC was OFF. This ECC error state appeared twice today, in the middle of data taking and after around one and a half hours with the camera switched on.

2018-12-06 Léa, Seiya, Shunsuke, Satoshi Data taking

- With legacy DAQ to have the 1024 ns window to work on BP synchronisation

- Tried random trigger with writing with EVB for 7 minutes: fine, even though at one moment the TIB rate went to 0, but it came back. ECC error, so the camera stopped and we couldn't test more

- Tried a legacy DAQ run with random trigger without writing, but module 10.1.6.38 reached a connected-but-busy state and so no trigger was sent anymore. Tried again. Legacy DAQ data show the same result of a too high busy rate, even higher than with EVB

- Tried a long data run with EVB and random trigger: we ran for one hour, run in 20181206

2018-12-06 Léa, Seiya, Shunsuke, Satoshi start up

1) One module off: 10.1.5.3

2) One module off: 10.1.5.3

3) One module off: 10.1.5.3

4) One module off: 10.1.5.3

5) One module off: 10.1.5.3: Today we will perform operation without this module

Caco to undefined, ECC to error state -> hard reset

6) Two modules OFF: 10.1.5.3 and 10.1.6.28

7) One module off: 10.1.5.3

8) ECC went to error at the switchon() -> hard reset then One module off: 10.1.5.3

9) After one hour and half of Camera on, ECC went to error suddenly -> hard reset. Then Two modules OFF: 10.1.5.3 and 10.1.6.28

10) One module off: 10.1.5.3

11) Human mistake -> current went too high because dragon was reset before DAQ disconnection... -> switch off/on the camera

12) One module off: 10.1.5.3

2018-12-04 Léa, Seiya, Shunsuke, Satoshi start up

1) Multiple start-ups this morning, but with the bus bar current at 0

2) startup in the afternoon with normal version of ECC: ALL modules ON, All pixels ON

3) One module OFF: 10.1.6.28

4) ALL modules ON, All pixels ON

2018-12-04 Léa, Seiya, Shunsuke, Satoshi Data taking

- With legacy DAQ to have the 1024 ns window to work on BP synchronisation

- TP synchronised in all modules and all sending trigger. Some BP settings were updated since this had not been done previously

- TP synchronised in all modules and all sending trigger with default BP setting from the ring distribution

- Random Trigger with EVB, all runs in 20181205

- Random trigger with writing: some TIB crashes where all the rates and the digital pedestal frequency go to 0 at one moment. It really varies from trial to trial, but sometimes we can reach 6500 Hz.

- Random trigger without writing: ran for 13 minutes without any crash, reaching a camera rate of 6500 Hz with 1700 Hz of busy rate.

- Last try: random trigger with writing and no crash for more than 10 minutes, so we are lost now...

- 3 initialization of modules without monitoring and no problem of busy modules

2018-12-05 Léa + Jean Luc and Nadia from remote ECC version

- Installed the current version v32 + small update (delay applied when coming back from error state) with an executable they produced themselves. ECC went to ready but the current in the bus bars was 0 -> hard reset, but then the fan was off, so a second hard reset

- Second try, still 0 in the bus bars -> hard reset and installation of the new ECC version v34 -> fan at 0, so new hard reset -> fan still at 0

- Went back to the old version v32 that we have been using for two months: fan ON but current in the bus bars at 0 -> hard reset

- Fan ON. Tried to kill the ECC program and start it again -> fan OFF. This test indicates that there is a communication issue between ECC and PDB, and that when the PDB loses the ECC connection it goes to a safe mode

- Tried again to kill ECC and start it again -> fan ON, so the PDB got the ECC connection back without a hard reset. Then tried kill/start ECC again -> fan ON for a few seconds and then went to 0. Then tried kill/start ECC again -> fans still at 0 -> hard reset

- New version v32 that cuts the communication with the PDB -> fan ON after the hard reset. Two attempts of kill/start ECC and the fan was still ON, so this clearly indicates that fan OFF comes from a communication problem between ECC and PDB.

- Now coming back to the old version we have been using for two months. Restart -> fan OFF -> hard reset -> fan OFF -> second hard reset -> fan ON

Two things learnt:

- Fan OFF seems to be due to a lack of communication between PDB and ECC. It is possible that the PDB doesn't connect to the ECC after the first hard reset and then goes into safe mode. This may be due to a connection timing problem of the ECC, to be investigated since this problem has only appeared in the last week... Moreover, after a hard reset this is normally the first connection from ECC to PDB, so the PDB should wait as long as it needs and not go into safe mode.

- The ECC executable created remotely seems to work for some functions, like controlling the mode, controlling the fans etc... but the current in the bus bars, for example, always went to 0

2018-12-04 Léa, Seiya, Shunsuke, Satoshi Data taking

- Data taken with TP synchronised but trigger sent by only one module

- Data taken with TP only in central. Up to 15000 Hz there is no busy since we are below the maximal writing speed. At 15 kHz we have: 265 modules * 1344 octets * 15 kHz ≈ 5.3 GB/s ≈ 43 Gbit/s, which is about what we expect with the four links at 10 Gbit/s.
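The data-rate estimate above can be checked with a few lines. This is only a sketch: it assumes that 1344 is the event size per module in bytes (octets), which is the reading that reproduces the quoted 5.3 GB/s, and the constant and function names are illustrative.

```python
N_MODULES = 265
BYTES_PER_MODULE_EVENT = 1344   # assumption: bytes (octets) per module per event
LINK_CAPACITY_GBIT_S = 4 * 10   # four links at 10 Gbit/s

def data_rate_gbit_s(trigger_rate_hz):
    """Total camera data rate in Gbit/s at the given trigger rate."""
    bytes_per_second = N_MODULES * BYTES_PER_MODULE_EVENT * trigger_rate_hz
    return bytes_per_second * 8 / 1e9

rate = data_rate_gbit_s(15_000)
print(f"{rate:.1f} Gbit/s at 15 kHz, link capacity {LINK_CAPACITY_GBIT_S} Gbit/s")
```

With these assumptions the rate at 15 kHz is slightly above the nominal capacity of the four links, consistent with busy appearing around that trigger rate.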

In /fefs/onsite/data/20181204:

- Run0001.0000 to .00101: one TP in the central BP, triggering the whole camera

- Run0001.00103, .00104, .00105: TP in every module synchronised, only module 265 sent triggers, new BP delay: it seems we can now see the pulse in the central module but we don't see it in the others, to investigate...

- Run0001.0106 to .00107, .00108: old BP delay, TP in every module synchronised, all modules sending triggers: no TP visible in the data, to investigate....

- Run0001.0109 to Run0001.0154: random trigger pedestal run. At the pedestal frequency of 3100 Hz, the pedestal frequency, collected rate, camera rate and busy rate all went to 0 at the same time.

2018-12-04 Léa, Seiya, Shunsuke, Satoshi Caco-ECC error/undefined state

- First power up: ECC went to error state whereas Caco was fine

- For one of the power ups, in which we stayed 1h30 in ready mode, ECC went to error and Caco to undefined (we don't know in which order). ECC went to state 1 after disabling the _error_heart_bit, but then there was no current in the bus bars -> hard reset. Then it was impossible to get Caco back into a normal mode even after ECC went to state 1 -> kill and restart Cacolauncher...

- After the second hard reset, the fan was off so we had to do another hard reset... Then everything was OK

2018-12-04 Léa, Seiya, Shunsuke, Satoshi TIB 255 issue fixed

- We forgot to call the reset() method between each initialization of the modules. Now it works fine.

2018-12-04 Léa, Seiya, Shunsuke, Satoshi start up

3)All modules ON and all pixel ON (data taking but then TIB state 255)

4)All modules ON and all pixel ON (data taking but then TIB state 255)

5)All modules ON and all pixel ON (Caco went to undefined and ECC to error state, we don't know in which order): For this start up we stayed one hour and a half in ready mode

7)All modules ON and all pixel ON

2018-12-04 Léa, Seiya, Shunsuke, Satoshi start up

1) SwitchOn() from Caco: Caco went to state 2 as normal, but ECC to state 4 -> hard reset

2) Caco and ECC fine but one module OFF: 10.1.6.28 -> OFF/ON again

6) ECC went to error state and Caco to undefined state (we don't know in which order...). Before the hard reset, we tried to go to state ready directly from ECC, but the current in the bus bars was 0 -> hard reset. Then the fan speeds were at 0, so hard reset again. Then it worked fine

2018-12-03 Dirk CamerasToACTL v1.7

Installed on tcs03 and tcs04 from repository (https://cta.cppm.in2p3.fr/repo/x86_64/) and tested with/by Léa and Seiya.

2018-12-03 Léa,Seiya increase of the bus bars current

Because the BP was reset while EVB was still connected to the modules, the current increased to 40. It is now known that the current increases when the DAQ is connected and no clock is distributed... We still don't know why; under study!

2018-12-03 Léa, Seiya Data Taking

- TP synchronised with legacy and EVB data

- Too many files created compared to the number of ZFW instances

2018-12-03 Léa,Seiya TIB/UCTS

- TIB went to state 255 even after reset, so we shut down and switched off the camera...

- Again, TIB went to 255 5 seconds after reaching state 5, all rate at 1444O.

- TIB went to 255 directly from state 4. The feeling after one day is that after 3 cycles of the TIB going to 5 and then being reset, it goes to 255 and we have to switch off the camera

2018-12-03 Léa Fix UCTS configuration

- A virtual machine was using the IP 10.4.8.4 of the UCTS.... this is why it was not possible to configure it.

- I changed it to the IP it should take in the future, 10.1.4.4, and now it works

2018-12-03 Léa Power up

- 1) All modules ON

- 2) All modules ON

- 3) All modules ON

- 4) All modules ON

- 5) All modules ON

- 6) Module 10.1.6.40 off

- 7) All modules ON

- 7) All modules OFF (I think due to the previous increase of the current in the bus bars)

- 8) All modules ON

2018-11-30 Léa TIB/UCTS

- TIB remains in state 2 even when UCTS is configured

2018-11-29 Léa Data Taking

- Procedure of TP synchronised in all the modules

- EVB configuration file in /home/dragon/EVB/20181130

- First try, EVB connected but one module busy: 10.1.6.10 -> initialised the modules again

- Second try, EVB connected, no module busy, but TIB remains at state 2 even though the UCTS configuration seems OK -> I switched the camera OFF and ON...

- Third try, same as before. Now tried disconnecting the cable from the WR switch to TCS07: same problem, TIB remains at state 2 when UCTS is configured.

2018-11-29 Léa Power up

- Fourth startup: ECC and Caco work well, all modules ON

- Fifth startup: ECC and Caco work well, all modules ON

- Sixth startup: ECC and Caco work well, all modules ON

2018-11-30 Léa Power up

- Second startup: ECC and Caco work well but module 10.1.6.28 was OFF so I started again

- Third startup: ECC and Caco work well, all modules ON. But due to a human mistake (mine), ECC went to error state and then there was no current in the bus bars -> hard reset

2018-11-30 Léa Power up

- From Caco, switchON(): it went to its state 2, with good communication with the ECC. Then GetCameraStandby(): ECC was fine and went to state ready, all modules ON, but Caco was in an undefined state, so I called sleep(); Caco recovered its ready state (state=3) and ECC was still ready. I called the sleep() method a second time to start from a clean environment and everything was fine: Caco went to state safe and ECC also.

2018-11-29 Léa Power up

- After the second hard reset, fan ON and ECC went to ready from Caco. Day finished (-:

- All modules ON, configuration for TP synchronisation in all the modules seems fine. EVB segfault in GOTOREADY s

2018-11-29 Léa Fan Off

Following the hard reset after the ECC went to error, the fans were down again, as yesterday morning... It is a heart_beat problem between the PDB and the ECC

- Second hard reset

2018-11-29 Léa Power up

1) Power up from Caco, powerON and GetCameraStandby(): ECC to Ready and all modules ON. Monitoring issue, so went back to safe until it is fixed

2) From Caco, SwitchON: ECC then goes to error state 4 with _error_heart_bit at 4 without any clear reason. Caco was OK in state 2

3) Fixed the _error_heart_bit issue of the ECC and tried again to switch on from Caco. Same issue: Caco state fine but ECC went to error state 4 due to _error_heart_beat at true.

4) Fixed _error_heart_beat and tried to switch on directly from ECC. Works fine, ECC went to Ready, but no current in the bus bars; only the 4th one had current.

I did a hard reset

2018-11-29 Léa WR switch

RJ45 port installed on port 9 of the WR switch for the connection to tcs07

2018-11-28 Daniel K., Léa test of cluscolauncher

We tried the connection between Caco and Clusco: all fine. The current monitoring was not active because not the same files were updated. Will be fixed soon and then tested again.

2018-11-28 Daniel K., Léa fans stopped

This morning around 8:45am the fans stopped running before we arrived on site. When we arrived we noticed the ECC was still in safe state (we expected the error state, but this was not the case). We checked the rest of the ECC variables and everything looked fine. Using a multimeter we checked that the 400 V was properly arriving at the PDB inside the camera. We contacted the ECC experts, who asked for screenshots of the ECC datapoints regarding the PDB for later evaluation of the problem. Then we hard reset the ECC and the fans started just fine.

2018-11-27 Daniel K., Cristobal (remote) fix of a compilation problem for ClusCo

Small fix for compilation, tested and merged to the master branch. Compilation on site works again.

2018-11-27 Daniel K. test of new ECC version

Further, more extensive tests of the control of the individual intelligent relays with the new version of the ECC. No improvement. A detailed description of the tests performed will be emailed to the experts. As a consequence the old ECC version was reinstalled for now.

2018-11-27 Daniel K., Otger (remote) installation of new version of libcluster

Following the successful test last week, the libcluster fixes were merged into the master branch and installed on site.

2018-11-26 Daniel K. test of new ECC version

After some small fixes of the data point "Error description" and of the fan control, the new version of the ECC was tested. The control of the individual intelligent relays (main update with this version) was unstable. As a consequence the old ECC version was reinstalled for now.

2018-11-23 Daniel K., Yuki, Seiya too high temperature

The status of the ECC monitoring went from "green" to "red" around 16:30. The temperature evolution we are monitoring was quite different from usual. It may be related to the water pressure of the chiller. It is above 1 and stable as usual, but it was too low in the morning and rose during the day.

Media:bptemp.JPG Media:Tempbad.JPG Media:ECCTemp.png Media:CameraPressure14-16.png

2018-11-23 Daniel K., Yuki, Seiya some network interfaces of osaka sometimes not running

Some network interfaces of the osaka server don't start running at first, every day... We activated p1p2 manually.


2018-11-23 Daniel K., Yuki, Seiya bad behaviour of mezzanine

After configuration of the modules (init7 & pulse_injection_all), three modules showed bad mezzanine behaviour.

  • mod115: L0 & L1 trigger rate was 0; there had been no problem until yesterday.
  • mod167: L0 & L1 trigger rate was 0; there had been no problem until yesterday.
  • mod226: L0 trigger rate was 65535, but the L1 trigger rate had no problem. So there may be a problem only on the line for IPR. This has happened several times this week.

2018-11-23 Daniel K., Yuki, Seiya take data for TP synchronization

We took data with ClusCo monitoring for the test pulse synchronization.

1) 300Hz, ROI=1024, trigger was generated by mod265, 3000events

  • This was almost all the same condition as yesterday, the difference was only that ClusCo monitoring was being done.
  • The file is /mnt/cs1/store/DragonDaqData/Data20181123/TP300HzTrigMod265RD1024Delay3028_RD1024_FEB...

2) 300Hz, ROI=40, trigger was generated by all modules, 3000events

  • During the operation I made a mistake (in the DAQ setting), so I restarted after the TIB state went to 5 (so the PPS and 10 MHz counter should be synchronized); this result may therefore be worse
  • The file is /mnt/cs1/store/DragonDaqData/Data20181123/TP300HzTrigModAllRD40Delay3528_again_RD40_FEB...


2018-11-23 Daniel K., Yuki, Seiya validation test of ClusCo

- the strange value of humidity

- SiTCP reset

  • This function worked well, but due to a bug in DragonFPGA it worked for only half of the camera and took too much time (~5 min) to finish.
  • Seiya will fix this problem from the DragonFW side.

2018-11-23 Daniel K., Yuki, Seiya new version of ECC

We installed the new version of ECC. After the reboot of ECC the fan didn't start working.

  • We did a hardware reset (13:40). But the problem remained and the ECC state went to 4 (error state).
  • We changed the default settings of ECC: 1) T_safe_min -> 5, 2) disable light sensor. After a hardware reset (13:50), the situation was the same...
  • We further changed the default value of T_safe_min to 2. After a hardware reset (14:00), the result was the same (error state and fan still stopped)

As a result we decided to go back to the current version of ECC. After the reboot of ECC, all functions worked well.

2018-11-23 Daniel K., Yuki, Seiya monitoring plots were not updated

ECC monitoring plots were not updated after 9:30am. We could get the various values (temperature etc.) in the OPCUA client; only the monitoring plots were not updated. After the ECC reboot for the ECC version update, the monitoring plots started to be updated again.

2018-11-22 Yuki, Seiya take data for TP synchronization study

I discussed with Taka, then I tried to take data as below;

  • set test pulse frequency for external reference clock
  • after that start TP synchronization

We took data with the following conditions and managed to synchronize test pulse at all modules finally.

0)

  • I wanted to take data with 300 Hz at first, but L1_local trigger rate was ~22Hz after initialization even though we set 300Hz as TP frequency.
  • So I decided to take data without changing test pulse frequency from the default one(444 444 counts for 10MHz = 22Hz)

1) 22Hz, ROI=1024, trigger was generated by mod265, 1000events

  • During initialization we didn't change test pulse frequency, so TP frequency was 22 Hz at that time.
  • The data is in /mnt/cs1/store/DragonDaqData/Data20181122/Trigger22HzRD1024...

2) 300Hz, ROI=1024, trigger was generated by mod1, 3000events

  • Before TP synchronization, we changed the test pulse frequency with "SET_TP_FREQUENCY 0 Off 33333" from ClusCo instead of "SET_TP_FREQUENCY 0 On 300". Then L1_local trigger was 300Hz with external reference clock.
  • The data is in /mnt/cs1/store/DragonDaqData/Data20181122/Trigger300HzRD1024...
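The two settings above are consistent with the test pulse period being given in counts of the 10 MHz reference clock: 444 444 counts gives about 22 Hz and 33333 counts gives 300 Hz. A minimal sketch of that conversion follows; that the last argument of "SET_TP_FREQUENCY 0 Off ..." is such a period is inferred from these numbers, not taken from the ClusCo documentation, and the function names are illustrative.

```python
REFERENCE_CLOCK_HZ = 10_000_000  # 10 MHz reference clock

def tp_frequency_hz(period_counts):
    """Test pulse frequency for a period given in 10 MHz clock counts."""
    return REFERENCE_CLOCK_HZ / period_counts

def period_counts_for(frequency_hz):
    """Nearest integer number of 10 MHz counts for the wanted frequency."""
    return round(REFERENCE_CLOCK_HZ / frequency_hz)

print(f"{tp_frequency_hz(444_444):.1f} Hz")  # default period
print(period_counts_for(300))                # period for 300 Hz
```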

Media:TPSynchNov22nd22Hz.gif Media:TPSynchNov22nd300Hz.gif

2018-11-21 Seiya home directory of osaka server was full

The home directory of osaka (/home) became full today.

 Osaka ~ > df -h
 Filesystem                   Size  Used Avail Use% Mounted on
 /dev/mapper/scientific-root   50G   22G   29G  43% /
 devtmpfs                     252G     0  252G   0% /dev
 tmpfs                        252G     0  252G   0% /dev/shm
 tmpfs                        252G   50M  252G   1% /run
 tmpfs                        252G     0  252G   0% /sys/fs/cgroup
 /dev/sdb                      15T  8.5T  5.3T  62% /mnt/cs1
 /dev/sda1                    497M  272M  226M  55% /boot
 /dev/mapper/scientific-home  504G  504G   20K 100% /home
 tmpfs                         51G   12K   51G   1% /run/user/42
 tmpfs                         51G  4.0K   51G   1% /run/user/1000
 tmpfs                         51G     0   51G   0% /run/user/1001
 tmpfs                         51G     0   51G   0% /run/user/1002

Almost all of the files (~80%) are data taken by LegacyDAQ for the tests, located in /home/dragon/IACMiniCamSetup/DragonDaqM

 Osaka DragonDaqM > du -sh .
 417G

So I moved the data taken by LegacyDAQ to /mnt/cs1/store/DragonDaqData temporarily. (We could transfer those data to the Lustre system (/fefs/ on tcs) later.)
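To notice a filling disk earlier, the Size and Used columns of the df -h output can be turned into a simple check. The following is only a sketch: the helper names and the 90% threshold are arbitrary choices, and the ratio used/size is a simplification of the Use% column that df prints.

```python
def parse_size(text):
    """Convert a df -h size such as '504G' or '8.5T' to bytes (binary units)."""
    units = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
    if text[-1] in units:
        return float(text[:-1]) * units[text[-1]]
    return float(text)

def use_fraction(size, used):
    """Fraction of the filesystem in use, from the Size and Used columns."""
    return parse_size(used) / parse_size(size)

def nearly_full(size, used, threshold=0.9):
    """True if the used fraction is at or above the (arbitrary) threshold."""
    return use_fraction(size, used) >= threshold

print("/home   ", nearly_full("504G", "504G"))
print("/mnt/cs1", nearly_full("15T", "8.5T"))
```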

2018-11-21 Daniel K., Seiya take data with LegacyDAQ for EVB tests

Julien wants to use raw data of the full camera for EVB debug tests. We took data with LegacyDAQ using a random trigger (300 Hz), i.e. the digital pedestal trigger generated by the TIB. These files are in /mnt/cs1/store/DragonDaqData/Data20181121.

I wanted to take 30 min of data (300 Hz * (60 * 30) s = 540,000 events), but the disk of the osaka server became full during the test. The size of each file is ~219 MB, which is equivalent to ~168,000 events and ~10 min of data.
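The event counts quoted above follow directly from the 300 Hz trigger rate. A minimal sketch of the two conversions, with illustrative function names:

```python
TRIGGER_RATE_HZ = 300  # random trigger rate used for this data taking

def events_in(minutes):
    """Number of events recorded in the given time at the trigger rate."""
    return TRIGGER_RATE_HZ * 60 * minutes

def minutes_for(events):
    """Data-taking time in minutes needed for the given number of events."""
    return events / TRIGGER_RATE_HZ / 60

print(events_in(30))                   # planned 30 min run
print(f"{minutes_for(168_000):.1f}")   # time covered by ~168,000 events
```

At 300 Hz, ~168,000 events correspond to a little over 9 minutes, in line with the ~10 min quoted.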

2018-11-21 Seiya how to run again the network interface

Some network interfaces of the osaka server sometimes stop running.

 p2p2: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
         inet 10.1.6.192  netmask 255.255.255.128  broadcast 10.1.6.255
         inet6 fe80::a236:9fff:fef0:ccd6  prefixlen 64  scopeid 0x20<link>
         ether a0:36:9f:f0:cc:d6  txqueuelen 1000  (Ethernet)
         RX packets 68478703  bytes 95858361688 (89.2 GiB)
         RX errors 1  dropped 9  overruns 0  frame 1
         TX packets 30112278  bytes 1622848602 (1.5 GiB)
         TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

When this happens, the interface can be restarted with:

  • sudo ifconfig <name of interface> down
  • sudo ifconfig <name of interface> up
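In the ifconfig output above, the stopped interface shows flags=4099<UP,BROADCAST,MULTICAST>, i.e. the RUNNING flag is missing. A minimal sketch of how that can be checked from the first line of the ifconfig output; the function names are illustrative, and the second sample line is a made-up example of a healthy interface, not taken from the log:

```python
import re

def interface_flags(ifconfig_text):
    """Return the set of flag names from an ifconfig 'flags=N<...>' field."""
    match = re.search(r"flags=\d+<([^>]*)>", ifconfig_text)
    return set(match.group(1).split(",")) if match else set()

def is_running(ifconfig_text):
    """True if the interface reports the RUNNING flag."""
    return "RUNNING" in interface_flags(ifconfig_text)

stopped = "p2p2: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500"
healthy = "p1p1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500"  # hypothetical
print(is_running(stopped), is_running(healthy))
```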
2018-11-21 Otger, Daniel K., Seiya ECC went to error state

I used ClusCo@tcs01 for the monitoring; all the plots except "Amp. Temp" were indeed updated. After that, I did init7 from ClusCo@cacooperator and waited for the "Amp. Temp" plot to update. At that point ClusCo@tcs01 showed a timeout, and I realized that I could not ping the modules, the relay current had gone to 0 and the ECC had gone to error state (4). I powered up again and the ECC status went to 2 (ready) as usual, but the relay current was 0.

Taka explained why relay current was still 0 as below;

When ECC goes to Error state, relay modules are also in a strange state. You need to reset relay modules as well. However, even if you go to "safe" state in ECC, relays are still powered (not bus bars, but relay modules). That means, "safe" does not reset relays.

Lea explained why ECC went to error as below;

Maybe what is possible also is that you lost the slow control connection during few seconds and then get it back without realising. Then If the modules are ON and that we lost the slow control connection, ECC goes to error and the relay current will remain at 0 as Taka explained.


We did a hardware reset three times (15:00, 15:50, 16:45), but the situation was the same. This ECC error state seems to be caused by loss of the CaCo heart beat. We got by without CaCo (using the ECC directly) for data taking today.


 1228269 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] ERROR com.prosysopc.ua.client.UaClient - Exception in ServerStatusListener
 java.lang.ClassCastException: cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAVariable$DataInformation cannot be cast to java.lang.Integer
     at cat.ifae.cta.cameracontrol.server.base.clients.ecc.OPCUAECCControl$ECCVariableStatus.update(OPCUAECCControl.java:25)
     at java.util.Observable.notifyObservers(Observable.java:159)
     at cat.ifae.cta.opcua.dataaccess.basicobjects.BasicCallbackVariable$ObservableVariable.setValue(BasicCallbackVariable.java:36)
     at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly._newStateWarn(OPCUAAssembly.java:533)
     at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAAssembly.consumeMessage(OPCUAAssembly.java:526)
     at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.statusChanged(OPCUAServerStatusListener.java:59)
     at cat.ifae.cta.opcua.dataaccess.uaobjects.OPCUAServerStatusListener.onStateChange(OPCUAServerStatusListener.java:33)
     at com.prosysopc.ua.client.UaClient.a(Unknown Source)
     at com.prosysopc.ua.client.UaClient.updateServerStatus(Unknown Source)
     at com.prosysopc.ua.client.UaClient$a.run(Unknown Source)
     at java.lang.Thread.run(Thread.java:745)

 1228371 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] WARN com.prosysopc.ua.client.Subscription - Server sent a previously acknowledged sequence number 0 for Subscription 47786
 1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.SecureChannelTcp - 47856 Closed
 1228372 [PublishTask-com.prosysopc.ua.client.UaClient@166f6c4f] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed
 1228373 [TcpConnection/Read] INFO org.opcfoundation.ua.transport.tcp.io.TcpConnection - /10.1.4.66:4841 Closed (expected)

2018-11-21 Otger, Daniel K., Seiya dhcpd server for TIB restart

DHCPd server for TIB stopped due to the shutdown of tcs01 yesterday, so we activated the server as below,

 [ifae@tcs01 ~]$ sudo service dhcpd status
 Redirecting to /bin/systemctl status dhcpd.service
 dhcpd.service - DHCPv4 Server Daemon
    Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
    Active: inactive (dead)
      Docs: man:dhcpd(8)
            man:dhcpd.conf(5)
 [ifae@tcs01 ~]$ sudo service dhcpd start
 Redirecting to /bin/systemctl start dhcpd.service
 [ifae@tcs01 ~]$ sudo service dhcpd status
 Redirecting to /bin/systemctl status dhcpd.service
 dhcpd.service - DHCPv4 Server Daemon
    Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
    Active: active (running) since Wed 2018-11-21 09:14:08 WET; 2s ago
      Docs: man:dhcpd(8)
            man:dhcpd.conf(5)
  Main PID: 453 (dhcpd)
    Status: "Dispatching packets..."
    CGroup: /system.slice/dhcpd.service
            453 /usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -group dhcpd --no-pid

 Nov 21 09:14:08 tcs01 dhcpd[453]: All rights reserved.
 Nov 21 09:14:08 tcs01 dhcpd[453]: For info, please visit https://www.isc.org/software/dhcp/
 Nov 21 09:14:08 tcs01 dhcpd[453]: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in...ig file
 Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 deleted host decls to leases file.
 Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 0 new dynamic host decls to leases file.
 Nov 21 09:14:08 tcs01 dhcpd[453]: Wrote 4 leases to leases file.
 Nov 21 09:14:08 tcs01 dhcpd[453]: Listening on LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
 Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on LPF/ens1f0/a0:36:9f:eb:51:34/10.1.0.0/16
 Nov 21 09:14:08 tcs01 dhcpd[453]: Sending on Socket/fallback/fallback-net
 Nov 21 09:14:08 tcs01 systemd[1]: Started DHCPv4 Server Daemon.
 Hint: Some lines were ellipsized, use -l to show in full.

2018-11-20 Daniel K., Seiya ClusCo monitoring restart

The ClusCo monitoring map was not updated after the shutdown of tcs01. We contacted Carlos and Carlos and they restarted it. Now it works.

2018-11-20 TCS01 shutdown

One of the memory cards of tcs01 is damaged and will be replaced by an authorized technician today starting 9am La Palma time. We will shut down the server before that and, once the card is exchanged, start it up again.

2018-11-19 Seiya, Daniel K. cannot connect with some modules

With configuration 2 (100 Hz, ROI=1024) we could not connect to some modules (IP 10.1.6.148-173) and they remained busy (busy state=1).

After re-initialization, this problem disappeared.

2018-11-19 Seiya, Daniel K. Test pulse data with DragonDaqM(LegacyDAQ)

We took test pulse data with the following conditions;

1) 300Hz, ROI=1024, trigger was generated by mod265 (for reproducing the problem)

  • File name is "TP300HzTrigMod265RD1024Delay3028RD1024_***"

2) 100Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)

  • File name is "TP100HzTrigMod265RD1024Delay3028RD1024_***"

3) 300Hz, ROI=1024, trigger was generated by mod265 (suggested by Taka)

  • I sent each command by hand and checked the registers (register8 & scalar) after PPS disable.
  • It seems PPS disable worked well.
  • File name is "TP300HzTrigMod265RD1024Delay3028_CHECKEDRD1024***".

4) 100Hz, ROI=1024, trigger was generated by mod265

  • I set test pulse frequency before PPS synchronization.
  • File name is "TP300HzTrigMod265RD1024Delay3028_TPconfigSynchroRD1024***".

2018-11-19 Seiya, Daniel K. 24V supply problem

We powered up the camera with the usual procedure, but only one bus bar (the 4th one) worked and the others didn't. We tried the procedure again, but the result was the same (only the 4th bus bar worked). So we switched the camera breaker off and on around 15:00. The fan didn't start at first, so I switched the breaker off and on again and the fan started to work. After that we could power up the whole camera.
2018-11-13 Mitsunari, Daniel K. Software deployment

All the setup (except the uaexpert for ecc, tib and ucts) to control, monitor and take data with the camera was moved to the LST_CALP iMac (+ 1 screen) of the commissioning container.

2018-11-12 Mitsunari, Daniel K. Test pulse data with DragonDaqM

Test pulse data were taken by DragonDaqM, triggering with module 264, which did not have a test pulse on 11-09.

2018-11-12 Mitsunari, Satoshi Connect tcs07 to White Rabbit

The WR switch management port and the Management switch (mgtsw2 port 42) are connected by an Ethernet cable. Mitsunari tried to change the IP of the WR switch to 10.200.10.140, which is in VLAN 1001, but failed. The WR interface file dot-config was not found, contrary to the WR manual. Even when we created the file ourselves, it was lost after rebooting.

2018-11-12 Mitsunari, Daniel K., Carlos Diaz Software deployment

Installing and compiling caco, cacoconsole, cacogui on tcs01 under /home/ifae/development. Compiling /home/ifae/clusco on tcs01 and adapting monitoring from CIEMAT. Setting up one additional screen for monitoring to the imac (monitoring computer), adding two forms (one for powering on the camera, one for shutting it down) to be filled by the operators.

2018-11-09 Mitsunari, Daniel K. Test pulse data with EVB

Test pulse data were taken by EVB, waiting 2 s for the PPS to reach all modules. For read depth 40, the DAQ seemed to be successful. For read depth 1024, however, the data were not stored.

2018-11-09 Mitsunari, Daniel K. Test pulse data with DragonDaqM

Test pulse data were taken by DragonDaqM, waiting 2 s for the PPS to reach all modules. The waveform data of six modules besides the central one were checked; five modules had test pulses, while the remaining one (No. 0) did not.

2018-11-03 Mitsunari Test pulse injection timing

Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is spread over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 40, Sampling speed: 1 GHz

2018-11-03 Mitsunari Test pulse injection timing

Test pulse data were taken with an L1 threshold at which all modules can produce a camera trigger. According to the data, the timing of the test pulse injection is spread over ~70 ms. Test pulse injection rate: 1 Hz, Read depth: 1, Sampling speed: 5 GHz

2018-11-02 Mitsunari Test pulse data with EVB

Data for investigating the test pulse issue were taken with EVB, but the data taking seems to have failed. This should be inspected.

Pulse rate: 300Hz, Read depth: 1024, Event number: ~9000, /fefs/onsite/data/20181102

2018-11-01 Mitsunari Large data with random trigger

Data of ~10^5 events were taken with the pedestal random trigger, EVB, a read depth of 40 slices and a delay of 3528 ns. The data are stored in /fefs/onsite/data/20181101.
  • 1kHz: Run 0001.0275-0001.0288
  • 2kHz: Run 0001.0289-0001.0315
2018-11-01 Mitsunari Avoiding TIB State 255 The TIB state can reach 5 without a reset at state 255 by combining a TIB reset at state 0 with configuring the Dragons without resetting the BPs.
  • ECC->SetMode(2)
  • TIB->Reset()
  • TIB->DisablePPS()
  • TIB->ResetRun()
  • ClusCo->Main->@config/init7_woBPreset
  • UCTS->XMLConfiguration
  • UCTS->Start()
  • TIB->EnableTrigger()

Mitsunari repeated this procedure four times and succeeded for all of them. DAQ also seemed to be successful at the last trial. (At the first three trials, DAQ failed because of another reason.)

2018-10-31 Mitsunari TIB State 255 problem init7 without BP reset at the beginning was tested. The first trial failed, i.e. the state ended up at 255. In the second trial, however, the TIB state went directly to 5 when the TIB was reset just after turning on the camera. This behavior should be confirmed later.
2018-10-31 Mitsunari Check for test pulse synchronization It should be confirmed whether the TenMHz counter value is identical among the modules for each test pulse event. Data for the check were taken by DragonDaqM at 300 Hz. The L1 threshold was set so that only the central module sent triggers. The data were stored in /home/dragon/IACMiniCamSetUp/DragonDaqM/Data20181031. The TenMHz counter appeared to be synchronized, but this should be confirmed.
2018-10-31 Oscar, Mitsunari PDB Fixation

PDB fixation: the front plate is now fixed through a screw and nut attached to the back plate with a metal-bonding compound (Pattex Nural 21), plus an additional nut to fix the front plate.

We started the modules twice with a one-hour break in between. Both times all Dragons and BPs went up.

2018-10-30 Taka, Mitsunari, Julien, Dirk Random trigger runs with EVB

Two runs (#30, #31) taken at various trigger rates as documented in Run Catalog and Slack.

Corrected pixel map implemented (spiral numbering).

2018-10-29 Oscar, Taka, Mitsunari Power up

The Dragon with IP 10.1.6.28 (3rd column from the left seen from outside, 5th module from below) was put on the busbar powered by relay 1 instead of relay 0. In exchange, the module in the 4th column, 5th from below, was put on relay 0 instead of relay 1. The camera was powered up only once and all modules and BPs went up.

2018-10-27 Taka, Mitsunari Random Trigger

We took random-trigger data. Following the instructions from Léa, the random trigger could easily be produced. With DragonDaqM:

300 Hz injection -> 300 Hz DAQ rate

1 kHz -> 783 Hz

3 kHz -> 1162 Hz

6.5 kHz -> 1303 Hz
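
The DragonDaqM rates above can be turned into a recorded fraction per injection rate. This is only an illustrative sketch: the numbers are copied from this entry, while the function name and the dictionary layout are assumptions made for the example.

```python
# Injected random-trigger rate (Hz) -> rate recorded by DragonDaqM (Hz),
# copied from the log entry above.
measured = {300: 300, 1000: 783, 3000: 1162, 6500: 1303}


def daq_efficiency(injected_hz, recorded_hz):
    """Fraction of injected triggers that ended up in the DAQ."""
    return recorded_hz / injected_hz


for injected, recorded in measured.items():
    eff = daq_efficiency(injected, recorded)
    print(f"{injected:>5} Hz injected -> {recorded:>4} Hz recorded ({eff:.1%})")
```

The recorded fraction falls from 100% at 300 Hz to about 20% at 6.5 kHz, which is consistent with the DAQ rate saturating somewhat above 1.3 kHz in this configuration.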

With EVB, we first tried 6.5 kHz. EVB then crashed because its buffer was full. The busy state of the modules was 03, which means EVB was connected and the modules were busy. To recover from this state, we had to reboot the Dragons. A few minutes later, Carlos Diaz called us: the current consumption at the bus bars was ~10 A higher than usual. It is normally 25-27 A, but after rebooting the Dragons it was 35 A. We shut down the 24V. After 10 min or so, Carlos allowed us to restart. All Dragons could be reached from cacoserver, but not from Osaka. ip link set p*p* down/up didn't help. We rebooted Osaka. After that, Osaka could ping all modules but one. However, EVB didn't work. Later we learned from Dirk and Julien that we had to do

sudo modprobe -r ixgbe; sudo modprobe ixgbe

2018-10-27 Oscar, Laia , Taka, Mitsunari Power up

After checking that the Dragon and BP regulators can stand an input voltage above 30 V, we increased the voltage provided by the Power Supplies to 27.5 V (the same for all 8 Power Supplies).

With this configuration, the voltage while ramping up increases to 20.3 V and then decreases only to 19.8 V for about 1 ms. This should be completely fine for the Dragons.

We powered up the camera with the ECC 10 times. All BPs went up every time. Only one Dragon (always the same) does not power up the first time after a ~1 hour break (tried two times); after this first power-up all Dragons power up.

2018-10-26 Taka, Mitsunari TIB state machine.

We tried to solve the "State 255" problem in TIB. Luis Angel suggested to configure modules at state 2. We followed his instruction, but we reached state 255. So we tried modules configuration at state 0. Same result. We tried module configuration at state 4, resulting in the same state 255.

We also tried to move the test pulse position to the center of the readout window, but we could not see the test pulse at all. The delay setting in the TIB or the backplane is not correct.

2018-10-26 Oscar, Laia , Taka, Mitsunari Power up

The drop in the voltage is due to a current limit in the circuitry of the relay. Increasing the voltage of the power supplies should raise the minimum of the voltage dip so that it does not reach 18 V.

We measured the transients for relay 0 again, with the Power Supply at 24.98 V as reference. When we increased the voltage of the Power Supplies to 25.25 V, the dip was about 100 mV higher.

2018-10-25 Taka, Mitsunari, Yusuke Event Mixing

We understood the origin of EventMixing. It is due to the slow control command "Dragon - Start" after "Enable Trigger" in TIB. "Enable Trigger" should have been after "Dragon Start". This is dangerous actually. Mistake will be noticed only during analysis.
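
Because this ordering mistake only shows up during analysis, a command sequence could be checked before it is sent. The sketch below is hypothetical: the command labels are plain strings chosen for illustration and do not correspond to any actual ClusCo or TIB interface.

```python
# Illustrative labels only; not real slow control identifiers.
DRAGON_START = "Dragon Start"
ENABLE_TRIGGER = "Enable Trigger"


def order_is_safe(commands):
    """Return True if ENABLE_TRIGGER is never issued before DRAGON_START."""
    dragons_started = False
    for cmd in commands:
        if cmd == DRAGON_START:
            dragons_started = True
        elif cmd == ENABLE_TRIGGER and not dragons_started:
            # Trigger enabled while the Dragons are not yet started:
            # this is the ordering that produced event mixing.
            return False
    return True


print(order_is_safe([DRAGON_START, ENABLE_TRIGGER]))   # safe order
print(order_is_safe([ENABLE_TRIGGER, DRAGON_START]))   # order that caused mixing
```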

2018-10-25 Oscar , Laia Power up

No water was found inside the camera. We measured the voltage at the output of the Redundancy modules: 24.98 V. We connected a current sensor between the master bus bar and relay 0. We powered up relay 0 and measured the transient for both current and voltage:

- The voltage shows a drop of around 1.5 V once it arrives at 20 V, which is recovered afterwards (4 ms); it then keeps increasing until about 24.5 V.

- The current increases steadily, with a small change of slope when the drop in the voltage happens. It also shows a drop of about 30% when the voltage reaches 24.5 V, from which it recovers after about 80 ms.

The voltage reduction for 4 ms brings the voltage very close to 18 V, and sometimes it may go slightly below.

The same is observed in relay 1.

2018-10-21 Taka, Mitsunari, Yusuke Timing Calibration.

We tried to see the test pulse in the center of the window, but we did not succeed. DAQ was done with EthDisp from Taka's macbook through the slow control network. We need to understand the delay in the TIB and the backplane. Since it was already 5:50 pm (though we had announced that we would use the camera until 5:00 pm), we had to shut down. We kept 230 and 400V on and the chiller on; only 24V was switched off.

2018-10-21 Taka, Mitsunari, Yusuke Event Mixing Test

To confirm the event mixing problem again, we took data with the LegacyDaq. After init7.uic, we injected the test pulse in the central module at 300 Hz. The TIB could see the rate properly. We took 20000 events. After that, we tried to take data with EVB, but it was not successful: EVB could not connect to all modules. We had the same problem a few times in a row. One of the reasons was dead ports in Osaka. Sometimes, ports in Osaka go to sleep without obvious reason. This is actually a critical problem, and we need to investigate further. Finally we gave up taking data with EVB.

2018-10-21 Taka, Mitsunari, Yusuke TIB/UCTS study

After power up, we tried to initialize the TIB, but the state didn't reach "5". After state 4, if we enabled the trigger, the state went to 255. We knew that the RJ45 cable on the WR had been damaged by the rack door, so we replaced it with a new cable. We also used a different port in the WR (port 8->5), and we reset the TIB. Then, with the standard procedure, the state reached 5. We were happy. Just to be sure, we changed back to the damaged cable and retried. The state was again 5. So the reason was not the cable; the "Reset" of the TIB was the key.

After initializing the PMT modules, the TIB didn't work well: it didn't send back the trigger. Since the temperature was too high (BP 35 deg.), we had to switch off the 24V. During this break, we changed the WR port from 5 back to 8.

After power up, we repeated the procedure. Again, the TIB didn't send back the trigger, but a TIB reset helped. So, currently, the startup recipe is 0->1->2->3->4->255->TIB Reset->0->1->2->3->4->5->configure modules->TIB Reset->0->1->2->3->4->5.

2018-10-21 Taka, Mitsunari, Yusuke Restart the Camera.

Before powering up 400V for the first time since last Tuesday, we examined the camera visually. The camera is properly parked. There was water condensation on the camera body. The platform is not perfectly closed: there was a 2 cm gap between left and right, but it is not dangerous for us. At 11:45, we applied 400 V by switching on the breaker at the Drive container. After 15 min of stabilization, we started 24V from the ECC (state ready). Then we realized that the TIB and UCTS did not respond to ping. This was because dhcpd on tcs01 was dead. Also, uctsd on Osaka was dead. We restarted dhcpd and uctsd and switched the 24V off and on. Then the TIB and UCTS could be booted.

2018-10-15 Dirk UCTSd dead.

uctsd.service - Execute the UCTS OPC-UA server
Loaded: loaded (/etc/systemd/system/uctsd.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Di 2018-10-16 13:22:28 WEST; 2h 59min ago
Process: 152844 ExecStart=/home/dragon/ucm_temp/ucts_opcua_server.sh (code=exited, status=134)
Main PID: 152844 (code=exited, status=134)

Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: (MOS) : Info : 2018-10-13.12:30:47 : Connected to Server : opc.tcp://osaka:48010
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]:
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: (MOS) : Info : 2018-10-13.12:30:47 : Verification of MOS version with lappweb
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]:
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: ********************
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: Press CTRL-C to shutdown server
Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: /home/dragon/ucm_temp/ucts_opcua_server.sh: line 9: 152847 Aborted (core dumped) ./MOS_Device -d /MOS/plugins/Plugin_UCTS/UCTS.xml
Okt 16 13:22:28 osaka systemd[1]: uctsd.service: main process exited, code=exited, status=134/n/a
Okt 16 13:22:28 osaka systemd[1]: Unit uctsd.service entered failed state.
Okt 16 13:22:28 osaka systemd[1]: uctsd.service failed.

Restarted.

2018-10-15 Taka MOXA Switch connected SLOW control connection intact. Drive network can be used from remote tomorrow.
2018-10-16 Léa, Dirk, Julien, Taka, Saiya, Mitsunari Modules disconnection

- Twice today, after around 25 minutes, around 15 modules were no longer powered although the ECC was in state 2 and there was current in the pulse bar. At the next switch-on, all were powered.

2018-10-16 Léa, Dirk, Julien, Taka, Saiya, Mitsunari uaexpert disconnection

- Again, we lost uaexpert, which was completely stuck. To get the monitoring back, the DataLogger files are now written to /home/cacooperator/CoolingSystem/20181016_003 and 20181016_004

2018-10-16 Léa, Dirk, Julien, taka, Saiya, Mitsunari TIB issues

- The TIB goes from state 0 to 4, but when we then enable the trigger it goes to state 255, as does the alarms vector

2018-10-16 Léa, Dirk, Julien, taka, Saiya, Mitsunari Small run summary

- 7 GotoSafe and GoToReady for the ECC due to too high temperatures so switch ON/Off of the Modules/BP:

1) All modules ON, 2 BPs OFF associated to module 10.1.6.12 and 10.1.6.27

2) All modules ON, 2 BPs OFF associated to module 10.1.6.24 and 10.1.6.27

3) All modules ON, 1 BP OFF associated to module 10.1.7.171

4) All modules and BPs ON

5) All modules and BPs ON

6) All modules and BPs ON

7) All modules and BPs ON

8) All modules and BPs ON

9) All modules and BPs ON

10) All modules and BPs ON

11) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.149

12) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.149

13) ALL modules ON

14) ALL modules ON

2018-10-15 Dirk Charging Walkie-Talkies with our private mini-USB adapters, while waiting for the real charger to reappear Alternative: $8.99 on Amazon


2018-10-15 Léa, Dirk, Julien, Taka, Saiya, Mitsunari Slow control and uaexpert disconnection

- Slow control connection lost in ready mode, so there was no more current in the pulse bar. After GoToSafe and GoToReady there was still no current, with a negative value in the pulse bar. We had to switch the 233 and 400 V off and on.

- We lost uaexpert, which was completely stuck. To get the monitoring back, the DataLogger files are now written to /home/cacooperator/CoolingSystem/20181015_005 and 20181015_006

2018-10-15 Léa, Dirk, Julien, taka, Saiya, Mitsunari Small run summary

- 7 GotoSafe and GoToReady for the ECC due to too high temperatures so switch ON/Off of the Modules/BP:

- Cut busy propagation from BP, dragon on local clock

1) 1 module OFF: 10.1.6.28, 2 BPs OFF associated to module 10.1.6.24 and 10.1.6.27

2) 1 module OFF: 10.1.6.28, 3 BPs OFF associated to module 10.1.6.24, 10.1.6.27 and 10.1.7.147

3) 1 module OFF: 10.1.6.28, 1 BP OFF associated to module 10.1.7.147

4) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.149

5) All modules ON and All BPs ON

- Problem of internal/external trigger clock for the Dragon fixed: for DRS4, the reference clock is now the 10 MHz external clock.

- Configuration of UCTS and TIB: the last two runs were taken with the TIB, i.e. with external clock, external trigger and busy propagation.

6) 1 module OFF: 10.1.5.16, 3 BPs OFF associated to module 10.1.6.27, 10.1.7.146 and 10.1.7.149

7) 1 module OFF: 10.1.6.28, 1 BP OFF associated to module 10.1.6.28

8) All modules ON

2018-10-15 Léa, Taka, Dirk, Daniel SLOW Control lost

While Camera was on, the SLOW control connection was interrupted in the Drive container to prepare connection of Drive/AMC network.

Consequently the EMC went to SAFE. But also the UaExpert interface was stuck (which is the current base for Camera monitoring). The setup was then restored as well as we could, including DataLogger function.

2018-10-15 Léa, Taka, Julien, Seiya, Dirk Writing speed limitation in data taking

  • 1 ZFW: validated 300 MB/s writing speed
  • 8 ZFW: validated 8 × 300 MB/s writing speed
  • 16 ZFW: 16 × 150 MB/s writing speed

The limitation with 16 ZFW may be due to the disk. To be investigated.
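
The aggregate throughput implied by these figures can be worked out quickly. This is a small sketch that assumes all three values are per-instance rates in MB/s (the entry states the unit explicitly only for the 8-instance case); the function name is made up for the example.

```python
def aggregate_mb_per_s(n_writers, per_writer_mb_per_s):
    """Total writing speed for n parallel ZFW instances."""
    return n_writers * per_writer_mb_per_s


# (number of ZFW instances, per-instance speed in MB/s) from the entry above.
configs = [(1, 300), (8, 300), (16, 150)]

for n, speed in configs:
    total = aggregate_mb_per_s(n, speed)
    print(f"{n:>2} ZFW x {speed} MB/s = {total} MB/s")
```

Under that assumption, 8 and 16 instances give the same total of 2400 MB/s, which would point to a shared limit such as the disk rather than to the individual writers.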

2018-10-15 Léa, Dirk, Julien Slow control disconnection + disconnect from OPC-UA
2018-10-14 Léa, Dirk, Julien Small run summary

Runs0016-0019

- No TIB/UCTS

- Cut busy propagation from BP, dragon on local clock

- 7 GotoSafe and GoToReady for the ECC due to too high temperatures so 7 switch ON/Off of the Modules/BP

1) All modules ON, 2 BPs OFF associated to module 10.1.6.24 and 10.1.6.27

2) All modules ON, didn't check the BPs

3) All modules ON, didn't check the BPs

4) All modules ON, 1 BP OFF associated to module 10.1.7.147

5) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.171

6) All modules ON, 1 BP OFF associated to module 10.1.7.147

7) All modules ON, 1 BP OFF associated to module 10.1.7.147


2018-10-14 Eric, Dirk All (DATA) fibres straight now!
  • There are straight and crossed fibre patch cords (AB->AB and AB->BA)! They are obviously used indifferently and mixed on our site. :-(
  • The fibres in the IC-PP that are connected to the couplers are all yellow! (No colour code to trace them.)
  • We have chosen the same convention as on the transceivers for input/output of the LC connectors
  • Problem/drawback: All fibres at the IC-PP are now reversed. Need to think/investigate that (last?) point.
  • All PP boxes now closed and secure. Should not be touched any more without agreement by INFRA experts!
Cisco-Transceiver 13927.jpg
2018-10-14 Dirk Direct measurement of TX lasers

INFO: Direct measurements can be done without danger for the Photom-211: measurement range -70 to +5 dBm according to the datasheet. That is 0.1 nW to 3.16 mW.
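
For reference, dBm values convert to linear power as P[mW] = 10^(dBm/10). The short sketch below applies this to the datasheet range quoted above; the helper name is illustrative.

```python
def dbm_to_mw(dbm):
    """Convert optical power from dBm to milliwatts (0 dBm = 1 mW)."""
    return 10 ** (dbm / 10)


for dbm in (5, 0, -70):
    mw = dbm_to_mw(dbm)
    print(f"{dbm:>4} dBm = {mw:.3g} mW = {mw * 1e6:.3g} nW")
```

The upper end of the range, +5 dBm, is about 3.16 mW, and the lower end, -70 dBm, is 1e-7 mW, i.e. 0.1 nW.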

2018-10-14 Léa, Dirk, Julien First full-cam data run up to 15kHz! That is what we would have liked to see last week.

Now it's Champagne time. :-)

2018-10-14 Julien, Eric Fibres DATAsp1-6 tested Optically, between DC and Cam. Data1-6spare measurements 20181014.jpg
2018-10-13 Léa, Dirk, Julien Run0015 Still no UCTS(/TIB); fibre broken between DC and Cam. Running with half-cam and two additional missing modules (BP problem): 6.24, 6.27.

- r0015 all events (at runstart), 300Hz and 10kHz, but ZFW problems (testing with 16 instances).

ALL Door knob! Falling apart from the CC door. Urgent action needed. (Bigger screw?)
Taka, Seiya, Mitsunari Fiber Check We checked the optical connection between DC and Camera because some labels were lost due to UV damage. We checked Data2, Data 6, SlowControl and UCTS. Only UCTS had a problem (no splicing at Drive PP). The rest were OK.
Taka, Seiya, Mitsunari Labeling fibers We labeled the optical fibers of the data (DATA 1 - 6) at the patch panels in the Drive container and in the IT container. The spare cables have not been done yet because the ribbon ran out.
Seiya, Mitsunari Connection validation We validated 12 optical fiber connections (No. 1-6, 13-18) from the Drive container to the IT container. Strength is -35 to -38 dBm.
2018-10-12 Léa, Taka, Julien, Seiya, Dirk Runs0012 sqq. No UCTS/TIB today.

These runs have 3 modules missing (as identified in the preparatory phase: 6.21, 6.25, 6.28). Otherwise, according to a quick check, EventNb=TriggerNb for all runs today. See RunCatalog for details.

Dirk Creation of logbook
Dirk, Taka, Julien, Seiya, Léa Data acquisition

- Problem with ClusCo on tcs01. The root propagation for the BPs for the trigger doesn't work. Using exactly the same script, it works on CacoOperator.

- We validate for 3 fibers the new connections to the dataswitch fiber. Eric is fixing the one missing or broken. So for now only the right part of the camera is used for data acquisition

- No TIB/ UCTS

- A few runs were taken with no external trigger from the TIB. 3 modules didn't appear busy but didn't send any data. In those tests the busy from the CBP was cut. The missing modules have to be investigated in more detail, but this was not possible due to many slow control disconnection problems and the high temperature in the camera. Script used in ClusCo: init7_noextTrigger_Test.uic. To not cut the busy, the script name is: init7_noextTrigger_noBusyCut_Test.uic

- One try with no external trigger and clock, but with the CBP delivering the clock and PPS and using the 10 MHz clock as default clock for the Dragons. The L1 local trigger was not generated. For now we have gone back to a configuration with the Dragons on their local clock, but this issue has to be investigated. Script used in ClusCo: init7_noextTriggerClock_Test.uic

Dirk, Taka, Julien, Seiya, Léa Too high temperatures in the Camera

- Limits fixed at 27 for the air temperature inside and 35 for the BP temperature. The pressure also raised some alarms.

- During the day, due to high temperatures, we had to GoToSafe and wait for the camera to cool almost 11 times, but the BP max temperature never went above 34 degrees. The air inside reached 26.5 at most.

Dirk, Taka, Julien, Seiya, Léa ECC lost connection (2 times!)

- In the afternoon, ECC slow control communication was lost 3 times due to interruptions between the IT container and the Drive container (miscommunication with the AMC people). The first time, the temperature in the camera was already high, so we had to switch OFF the 233 and 400V for safety reasons. The other two times, we got the ECC connection back quite fast and the ECC was in the same state as when the connection was lost, i.e. state 2 (ready). There was just no more current in the pulse bar, so we had to GoToSafe and GoToReady both times. After that the current in the pulse bar read -247. Not understood for the moment.

- The second interruption happened when the Moxa switch was reconnected, probably not correctly configured. It was disconnected again. Presently this impacts AMC and drive operation, until the Moxa can be reconnected.

Léa, Taka, Julien, Seiya, Dirk Discovered SLOW control fiber lost, fibers changed connection recover Interruption. Using UCTS section for replacement.
2018-10-11 Eric, Armand Cable splicing UCTS fibres ready and checked.
Léa, Taka, Julien, Seiya, Dirk DATA5-upstream broken Located between DC-PP. and IC-PP. Eric is going to have a look on Friday, when working on the other (spare) fibres.
Léa, Taka, Julien, Seiya, Dirk Found correct order of DATA1-DATA6 We eventually found that the fibres DATA1-6 were connected to the camera in (exactly) the wrong order, which led to a mismatch of switches/modules with respect to interfaces/addresses in osaka.

This is an item for our "lessons learned": the indoor fibres had been labelled (switch-interface), but stayed in Mirca. The new fibres were assembled at ORM, and the labels had to be "guessed" in one way or another.

Glossary

  • CC = Commissioning Container (present LST1 Control Room)
  • DC = Drive Container
  • IC = IT-Container