Difference between revisions of "Logbook CameraCommissioning ORM Oct18"
(→Enter comments in reverse time order) |
DirkHoffmann (talk | contribs) (Adding entry for yesterday) |
||
Line 14: | Line 14: | ||
We have started Modules twice with one hour break in between. Both times all Dragons and BP went up. | We have started Modules twice with one hour break in between. Both times all Dragons and BP went up. | ||
+ | |||
+ | |- valign="top" | ||
+ | | 2018-10-30 || Taka, Mitsunari, Julien, [[User:DirkHoffmann|Dirk]] || style="background: #CCFFCC;" | Random trigger runs with EVB | ||
+ | || | ||
+ | |||
+ | Two runs (#30, #31) taken at various trigger rates as documented in [https://portal.cta-observatory.org/WG/lst/DAQ/SitePages/RunCatalog.aspx#ORM Run Catalog] and [https://cta-north.slack.com/archives/C4JB24UMC/p1540909546003100 Slack]. | ||
+ | |||
+ | Corrected pixel map implemented (spiral numbering). | ||
+ | |||
|- valign="top" | |- valign="top" |
Revision as of 12:43, 31 October 2018
Enter comments in reverse time order
Glossary at the end!
Date | Actor/Author | Action summary | Comments | Documents | |
---|---|---|---|---|---|
2018-10-31 | Oscar, Mitsunari | PDB Fixation |
PDB fixation: the front plate is now fixed by a screw and nut attached to the back plate with a metal adhesive (Pattex Nural 21), plus an additional nut to hold the front plate. We started the modules twice with a one-hour break in between. Both times all Dragons and BPs went up. | ||
2018-10-30 | Taka, Mitsunari, Julien, Dirk | Random trigger runs with EVB |
Two runs (#30, #31) taken at various trigger rates as documented in Run Catalog and Slack. Corrected pixel map implemented (spiral numbering).
| ||
2018-10-29 | Oscar, Taka, Mitsunari | Power up |
The Dragon with IP 10.1.6.28 (3rd column from the left as seen from outside, 5th module from below) was put on the busbar powered by relay 1 instead of relay 0. In exchange, the module in the 4th column, 5th from below, was put on relay 0 instead of relay 1. The camera was powered up only once and all modules and BPs went up.
| ||
2018-10-27 | Taka, Mitsunari | Random Trigger |
We took random trigger data. Following the instructions with Lea, the random trigger could be produced easily. With DragonDaqM: 300 Hz injection -> 300 Hz DAQ rate, 1 kHz -> 783 Hz, 3 kHz -> 1162 Hz, 6.5 kHz -> 1303 Hz. With EVB, we first tried 6.5 kHz. EVB then crashed because of a full buffer, but the busy state of the modules was 03, which means that EVB was connected and the modules were busy. To recover from this state, we had to reboot the Dragons. A few minutes later, Carlos Diaz called us: the current consumption at the bus bars was ~10 A higher than usual (normally 25-27 A, but 35 A after rebooting the Dragons). We shut down the 24 V. After 10 min or so, Carlos allowed us to restart. All Dragons could be reached from cacoserver, but not from Osaka. ip link set p*p* down/up didn't help. We rebooted Osaka. Then Osaka could ping all modules but one. However, EVB didn't work. Later we learned from Dirk and Julien that we had to run sudo modprobe -r ixgbe; sudo modprobe ixgbe
| ||
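To put numbers on the DragonDaqM saturation logged in the 2018-10-27 Random Trigger entry, the recorded fraction can be computed per injected rate. This is only a sketch: the function and variable names are illustrative, and it assumes that the logged "300 Daq rate" is in Hz.

```python
# Injected random-trigger rate vs. recorded DAQ rate (both in Hz),
# copied from the 2018-10-27 DragonDaqM measurements in this logbook.
measurements = [(300, 300), (1000, 783), (3000, 1162), (6500, 1303)]


def daq_efficiency(injected_hz, recorded_hz):
    """Fraction of injected triggers that were actually recorded."""
    return recorded_hz / injected_hz


# Map each injected rate to its recorded fraction.
efficiencies = {inj: daq_efficiency(inj, rec) for inj, rec in measurements}

for inj, rec in measurements:
    print(f"{inj:>5} Hz injected -> {rec:>4} Hz recorded ({efficiencies[inj]:.1%})")
```

The recorded fraction falls from 100% at 300 Hz to about 20% at 6.5 kHz.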
2018-10-27 | Oscar, Laia , Taka, Mitsunari | Power up |
After checking that the Dragon and BP regulators can stand input voltages above 30 V, we increased the voltage provided by the power supplies to 27.5 V (the same for all 8 power supplies). With this configuration, the voltage while ramping up increases to 20.3 V and then only decreases to 19.8 V for about 1 ms. This should be completely fine for the Dragons. We powered up the camera with the ECC 10 times. All BPs went up every time. Only one Dragon (always the same) does not power up the first time after a ~1 hour break (tried two times); after this first power-up all Dragons power up. | ||
2018-10-26 | Taka, Mitsunari | TIB state machine. |
We tried to solve the "State 255" problem in TIB. Luis Angel suggested configuring the modules at state 2. We followed his instructions, but we reached state 255. So we tried module configuration at state 0. Same result. We tried module configuration at state 4, resulting in the same state 255. We also tried to see the test pulse at the center of the readout window, but we could not see the test pulse at all. The delay setting in the TIB or backplane is not correct.
| ||
2018-10-26 | Oscar, Laia , Taka, Mitsunari | Power up |
The drop in the voltage is due to a current limit in the circuitry of the relay. Increasing the voltage of the power supplies should raise the bottom of the voltage dip so that it does not reach 18 V. We measured the transients for relay 0 again, with the power supply at 24.98 V as reference. We increased the power supply voltage to 25.25 V; the dip is about 100 mV higher. | ||
2018-10-25 | Taka, Mitsunari, Yusuke | Event Mixing |
We understood the origin of EventMixing. It is caused by sending the slow control command "Dragon - Start" after "Enable Trigger" in TIB. "Enable Trigger" should have come after "Dragon Start". This is actually dangerous: the mistake will only be noticed during analysis.
| ||
2018-10-25 | Oscar , Laia | Power up |
No water was found inside the camera. We measured the voltage at the output of the redundancy modules: 24.98 V. We connected a current sensor between the master bus bar and relay 0. We powered up relay 0 and measured the transient for both current and voltage: - Voltage shows a drop of around 1.5 V once it reaches 20 V, which is recovered afterwards (4 ms), and then keeps increasing until about 24.5 V. - Current increases steadily, with a small slope change when the voltage drop happens. It also shows a drop of about 30% when the voltage reaches 24.5 V, from which it recovers after about 80 ms. The voltage reduction for 4 ms brings the voltage very close to 18 V, and sometimes it may go slightly below. The same is observed in relay 1. | ||
2018-10-21 | Taka, Mitsunari, Yusuke | Timing Calibration. |
We tried to see the test pulse in the center of the window, but we did not succeed. DAQ was with EthDisp from Taka's MacBook through the slow control network. We need to understand the delay in the TIB and backplane. Since it was already 5:50 pm (although we had announced that we would use the camera until 5:00 pm), we had to shut down. We kept 230 V and 400 V on, chiller on, only 24 V off. | ||
2018-10-21 | Taka, Mitsunari, Yusuke | Event Mixing Test |
To confirm the event mixing problem again, we took data with the LegacyDaq. After init7.uic, we injected the test pulse in the central module at 300 Hz. TIB could see the rate properly. We took 20000 events. After that, we tried to take data with EVB, but it was not successful. EVB could not connect to all modules. We had the same problem a few times in a row. One of the reasons was dead ports in Osaka. Sometimes, ports in Osaka sleep without obvious reason. This is actually a critical problem. We need to investigate further. Finally we gave up taking data with EVB. | ||
2018-10-21 | Taka, Mitsunari, Yusuke | TIB/UCTS study |
After power up, we tried to initialize TIB, but the state didn't reach "5". After state 4, if we enabled the trigger, the state went to 255. We knew that the RJ45 cable on the WR was damaged by the rack door. We replaced it with a new cable. We also used a different port on the WR (port 8->5). And we reset TIB. Then, with the standard procedure, the state reached 5. We were happy. Just to be sure, we changed back to the damaged cable and retried. The state was again 5. So the reason was not the cable; the "Reset" of TIB was the key. After initializing the PMT modules, TIB didn't work well. It didn't send back the trigger. Since the temperature was too high (BP 35 deg.), we had to switch off the 24V. During this break, we changed the WR port from 5 to 8. After power up, we repeated the procedure. Again, TIB didn't send back the trigger. But a TIB reset helped. So, currently, the startup recipe is 0->1->2->3->4->255->TIB Reset->0->1->2->3->4->5->configure modules -> TIB Reset -> 0 -> 1 ->2 ->3 ->4 ->5.
| ||
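The TIB startup recipe from the 2018-10-21 TIB/UCTS study can be written down as data, which makes the three passes through the state machine easier to see. This is purely illustrative: the names below are made up for this sketch and do not correspond to any TIB or slow control API.

```python
# Hypothetical labels for the two non-numeric steps of the recipe.
TIB_RESET = "TIB Reset"
CONFIGURE_MODULES = "configure modules"

# Startup recipe exactly as logged:
# 0->1->2->3->4->255->TIB Reset->0->...->5->configure modules->TIB Reset->0->...->5
STARTUP_RECIPE = [
    0, 1, 2, 3, 4, 255, TIB_RESET,
    0, 1, 2, 3, 4, 5, CONFIGURE_MODULES, TIB_RESET,
    0, 1, 2, 3, 4, 5,
]


def split_phases(recipe):
    """Split the recipe into the passes separated by TIB resets."""
    phases = [[]]
    for step in recipe:
        if step == TIB_RESET:
            phases.append([])  # a reset starts a new pass from state 0
        else:
            phases[-1].append(step)
    return phases


phases = split_phases(STARTUP_RECIPE)
for number, phase in enumerate(phases, start=1):
    print(f"pass {number}: " + " -> ".join(str(step) for step in phase))
```

Written this way, the first pass ends in state 255, the second ends with the module configuration, and only the third pass ends in state 5 with configured modules.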
2018-10-21 | Taka, Mitsunari, Yusuke | Restart the Camera. |
Before powering up 400V for the first time since last Tuesday, we examined the camera visually. The camera is properly parked. There was water condensation on the camera body. The platform is not perfectly closed: there was a 2 cm gap between left and right, but it is not dangerous for us. At 11:45, we applied 400 V by switching on the breaker at the Drive container. After 15 min of stabilization, we started 24V from the ECC (state ready). Then we realized that TIB and UCTS did not respond to ping. It was because dhcpd on tcs01 was dead. Also, uctsd on Osaka was dead. We restarted dhcpd and uctsd and switched the 24V off and on. Then TIB and UCTS could be booted.|
| ||
2018-10-15 | Dirk | UCTSd dead. | ● uctsd.service - Execute the UCTS OPC-UA server Loaded: loaded (/etc/systemd/system/uctsd.service; static; vendor preset: disabled) Active: failed (Result: exit-code) since Di 2018-10-16 13:22:28 WEST; 2h 59min ago Process: 152844 ExecStart=/home/dragon/ucm_temp/ucts_opcua_server.sh (code=exited, status=134) Main PID: 152844 (code=exited, status=134) Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: (MOS) : Info : 2018-10-13.12:30:47 : Connected to Server : opc.tcp://osaka:48010 Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: (MOS) : Info : 2018-10-13.12:30:47 : Verification of MOS version with lappweb Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: ******************** Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: Press CTRL-C to shutdown server Okt 16 13:22:28 osaka ucts_opcua_server.sh[152844]: /home/dragon/ucm_temp/ucts_opcua_server.sh: line 9: 152847 Aborted (core dumped) ./MOS_Device -d /MOS/plugins/Plugin_UCTS/UCTS.xml Okt 16 13:22:28 osaka systemd[1]: uctsd.service: main process exited, code=exited, status=134/n/a Okt 16 13:22:28 osaka systemd[1]: Unit uctsd.service entered failed state. Okt 16 13:22:28 osaka systemd[1]: uctsd.service failed. Restarted. | ||
2018-10-15 | Taka | MOXA Switch connected | SLOW control connection intact. Drive network can be used from remote tomorrow. | ||
2018-10-16 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | Modules disconnection |
- It happened twice today that, after around 25 minutes, around 15 modules were no longer powered although the ECC was in state 2 and there was current in the pulse bar. At the next switch-ON, ALL were powered | ||
2018-10-16 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | uaexpert disconnection |
- Again, we lost uaexpert, which was completely stuck. To get the monitoring back, the DataLogger files are now written to /home/cacooperator/CoolingSystem/20181016_003 and 20181016_004 | ||
2018-10-16 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | TIB issues |
- TIB goes from state 0 to 4, but when we enable the trigger it goes to state 255, as does the alarms vector | ||
2018-10-16 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | Small run summary |
- 7 GotoSafe and GoToReady for the ECC due to too high temperatures, so switch ON/OFF of the Modules/BP: 1) All modules ON, 2 BPs OFF associated to module 10.1.6.12 and 10.1.6.27 2) All modules ON, 2 BPs OFF associated to module 10.1.6.24 and 10.1.6.27 3) All modules ON, 1 BP OFF associated to module 10.1.7.171 4) All modules and BPs ON 5) All modules and BPs ON 6) All modules and BPs ON 7) All modules and BPs ON 8) All modules and BPs ON 9) All modules and BPs ON 10) All modules and BPs ON 11) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.149 12) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.149 13) ALL modules ON 14) ALL modules ON
| ||
2018-10-15 | Dirk | Charging Walkie-Talkies | with our private mini-USB adapters, while waiting for the real charger to reappear | Alternative: $8.99 on Amazon
| |
2018-10-15 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | Slow control and uaexpert disconnection |
- Slow control connection lost in ready mode, so there was no more current in the pulse bar. After GoToSafe and GoToReady there was still no current, with a negative value in the pulse bar. We had to switch the 233 and 400 V off and on. - We lost uaexpert, which was completely stuck. To get the monitoring back, the DataLogger files are now written to /home/cacooperator/CoolingSystem/20181015_005 and 20181015_006 | ||
2018-10-15 | Léa, Dirk, Julien, taka, Saiya, Mitsunari | Small run summary |
- 7 GotoSafe and GoToReady for the ECC due to too high temperatures, so switch ON/OFF of the Modules/BP: - Cut busy propagation from BP, dragon on local clock 1) 1 module OFF: 10.1.6.28, 2 BPs OFF associated to module 10.1.6.24 and 10.1.6.27 2) 1 module OFF: 10.1.6.28, 3 BPs OFF associated to module 10.1.6.24, 10.1.6.27 and 10.1.7.147 3) 1 module OFF: 10.1.6.28, 1 BP OFF associated to module 10.1.7.147 4) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.149 5) All modules ON and all BPs ON - Problem of internal/external trigger clock for the Dragon fixed: for DRS4, the reference clock is now the 10 MHz external clock. - Configuration of UCTS and TIB; the last two runs were taken with the TIB, so with external clock, external trigger and busy propagation. 6) 1 module OFF: 10.1.5.16, 3 BPs OFF associated to module 10.1.6.27 and 10.1.7.146 and 10.1.7.149 7) 1 module OFF: 10.1.6.28, 1 BP OFF associated to module 10.1.6.28 8) ALL modules ON
| ||
2018-10-15 | Léa, Taka, Dirk, Daniel | SLOW Control lost |
While Camera was on, the SLOW control connection was interrupted in the Drive container to prepare connection of Drive/AMC network. Consequently the EMC went to SAFE. But also the UaExpert interface was stuck (which is the current base for Camera monitoring). The setup was then restored as well as we could, including DataLogger function. | ||
2018-10-15 | Léa, Taka, Julien, Seiya, Dirk | writing speed limitation in data taking |
1 ZFW: validated 300 MB/s writing speed. 8 ZFW: validated 8 × 300 MB/s writing speed. 16 ZFW: 16 × 150 MB/s writing speed. Possibly a problem due to the disk. To be investigated. | ||
2018-10-15 | Léa, Dirk, Julien | Slow control disconnection + disconnect from OPC-UA | |||
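The aggregate throughput behind the ZFW figures in the 2018-10-15 writing speed entry can be checked with a few lines. This sketch assumes that all logged speeds are per-instance rates in MB/s (one value is written as "300 MB" in the original entry); the variable names are illustrative.

```python
# (number of ZFW instances, per-instance writing speed in MB/s),
# as logged in the 2018-10-15 writing speed entry.
zfw_tests = [(1, 300), (8, 300), (16, 150)]

# Total writing speed summed over all instances of each test.
aggregate = {n: n * speed for n, speed in zfw_tests}

for n, speed in zfw_tests:
    print(f"{n:>2} ZFW x {speed} MB/s = {aggregate[n]} MB/s total")
```

With these numbers, 8 and 16 instances reach the same total, which is consistent with a shared bottleneck such as the disk, as the entry suspects.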
2018-10-14 | Léa, Dirk, Julien | Small run summary | - No TIB/UCTS
- Cut busy propagation from BP, dragon on local clock - 7 GotoSafe and GoToReady for the ECC due to too high temperatures, so 7 switch ON/OFF of the Modules/BP 1) All modules ON, 2 BPs OFF associated to module 10.1.6.24 and 10.1.6.27 2) All modules ON, didn't check the BPs 3) All modules ON, didn't check the BPs 4) All modules ON, 1 BP OFF associated to module 10.1.7.147 5) All modules ON, 2 BPs OFF associated to module 10.1.7.147 and 10.1.7.171 6) All modules ON, 1 BP OFF associated to module 10.1.7.147 7) All modules ON, 1 BP OFF associated to module 10.1.7.147
|
||
2018-10-14 | Eric, Dirk | All (DATA) fibres straight now! |
|
||
2018-10-14 | Dirk | Direct measurement of TX lasers |
INFO: Direct measurements can be done without danger for the Photom-211: measurement range -70 to +5 dBm according to the datasheet. That is 0.1 nW to 3.16 mW. |
||
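The dBm values quoted for the Photom-211 range and for the fibre validation (-35 to -38 dBm) can be converted to absolute power with the standard definition P = 1 mW × 10^(dBm/10). The helper name below is illustrative.

```python
def dbm_to_watts(dbm):
    """Convert a power level in dBm to watts (0 dBm is defined as 1 mW)."""
    return 1e-3 * 10 ** (dbm / 10)


# Photom-211 measurement range from the datasheet, and the fibre
# signal strengths logged for the DC to IC connections.
for level_dbm in (5, -35, -38, -70):
    print(f"{level_dbm:+4d} dBm = {dbm_to_watts(level_dbm):.3e} W")
```

The range limits come out as about 3.16 mW for +5 dBm and 0.1 nW for -70 dBm; the measured fibre levels of -35 to -38 dBm correspond to roughly 0.32 to 0.16 µW.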
2018-10-14 | Léa, Dirk, Julien | First full-cam data run up to 15kHz! | That is what we would have liked to see last week.
Now it's Champagne time. :-) |
||
2018-10-14 | Julien, Eric | Fibres DATAsp1-6 tested | Optically, between DC and Cam. | ||
2018-10-13 | Léa, Dirk, Julien | Run0015 | Still no UCTS(/TIB); fibre broken between DC and Cam. Running with half-cam and two additional missing modules (BP problem): 6.24, 6.27.
- r0015 all events (at runstart), 300Hz and 10kHz, but ZFW problems (testing with 16 instances). |
||
ALL | Door knob! | Falling apart from the CC door. Urgent action needed. (Bigger screw?) | |||
Taka, Seiya, Mitsunari | Fiber Check | We checked the optical connection between DC and Camera because some labels were lost due to UV damage. We checked Data2, Data 6, SlowControl and UCTS. Only UCTS had a problem (no splicing at Drive PP). The rest were OK. | |||
Taka, Seiya, Mitsunari | Labeling fibers | We labeled the optical fibers of the data (DATA 1 - 6) at the patch panels in the Drive container and in the IT container. The spare cables have not been done yet because the ribbon ran out. | |||
Seiya, Mitsunari | Connection validation | We validated 12 optical fiber connections (No. 1-6, 13-18) from the Drive container to the IT container. Signal strength is -35 to -38 dBm. | |||
2018-10-12 | Léa, Taka, Julien, Seiya, Dirk | Runs0012 sqq. | No UCTS/TIB today.
These runs have 3 modules missing (as identified in the preparatory phase: 6.21, 6.25, 6.28). According to a quick check, all EventNb=TriggerNb otherwise for all runs today. See RunCatalog for details. |
||
Dirk | Creation of logbook | ||||
Dirk, Taka, Julien, Seiya, Léa | Data acquisition |
- Problem with ClusCo on tcs01. The root propagation for the BPs for the trigger doesn't work. Using exactly the same script, it works on CacoOperator. - We validated the new connections to the data switch for 3 fibers. Eric is fixing the missing or broken ones. So for now only the right part of the camera is used for data acquisition. - No TIB/UCTS. - A few runs were taken with no external trigger from TIB. 3 modules didn't appear busy but didn't send any data. In those tests the busy from the CBP was cut. Those missing modules have to be investigated in more detail, but this was not possible due to many slow control disconnection problems and high temperature in the camera. Script used in ClusCo: init7_noextTrigger_Test.uic. To not cut the busy, the script is: init7_noextTrigger_noBusyCut_Test.uic - One try with no external trigger and clock, but with the CBP delivering the clock and PPS and using the 10 MHz clock as default clock for the Dragons. The L1 local trigger was not generated. Now we have gone back to a configuration with the Dragons on their local clock, but this issue has to be investigated. Script used in ClusCo: init7_noextTriggerClock_Test.uic | |||
Dirk, Taka, Julien, Seiya, Léa | Too high temperatures in the Camera |
- Limits are fixed at 27 for the inside air temperature and 35 for the BP temperature. Pressure also raised some alarms. - During the day, due to high temperatures, we had to GoToSafe and wait for the camera to cool almost 11 times, but the BP max temperature never exceeded 34 degrees. The inside air reached 26.5 at most. | |||
Dirk, Taka, Julien, Seiya, Léa | ECC lost connection (2 times!) |
- In the afternoon, ECC slow control communication was lost 3 times due to the interruption between IT-Container and Drive-Container. Miscommunication with AMC people... The first time, the temperature in the camera was already high, so we had to switch OFF the 233 and 400V for safety reasons. The two other times, we got the ECC connection back quite fast and the ECC was in the same state as when the connection was lost, meaning state 2 (ready). There was just no more current in the pulse bar, so we had to GoToSafe and GoToReady both times. After that the current was -247 in the pulse bar... Not understood for the moment. - The second interruption happened when the Moxa switch was reconnected, probably not correctly configured. It was disconnected again. Presently this impacts AMC and drive operation, until the Moxa can be reconnected. | |||
Léa, Taka, Julien, Seiya, Dirk | Discovered SLOW control fiber lost; fibers changed, connection recovered | Interruption. Using UCTS section for replacement. | |||
2018-10-11 | Eric, Armand | Cable splicing | UCTS fibres ready and checked. | ||
Léa, Taka, Julien, Seiya, Dirk | DATA5-upstream broken | Located between DC-PP. and IC-PP. Eric is going to have a look on Friday, when working on the other (spare) fibres. | |||
Léa, Taka, Julien, Seiya, Dirk | Found correct order of DATA1-DATA6 | We eventually found that the fibres DATA1-6 were connected in (exactly) the wrong order to the camera, which led to a mismatch of switches/modules with respect to interfaces/addresses in osaka.
This is an item for our "lessons learned": the indoor fibres had been labelled (switch-interface), but stayed in Mirca. The new fibres had been assembled at ORM, and labels had to be "guessed" in one way or the other. |
Glossary
- CC = Commissioning Container (present LST1 Control Room)
- DC = Drive Container
- IC = IT-Container